Details
Bug
Resolution: Duplicate
Major
None
Lustre 2.12.5
None
Centos 7, 3.10.0-1127.8.2.el7_lustre.x86_64, ZFS
3
9223372036854775807
Description
Immediately after a failover from one node to the other (performed manually using pcs resource move), the new node crashed:
[82973.263903] Lustre: meteo0-MDT0000: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900
[82973.263907] Lustre: Skipped 1 previous similar message
[82973.273770] LustreError: 28190:0:(osp_md_object.c:167:osp_md_create()) ASSERTION( attr->la_valid & LA_TYPE ) failed:
[82973.275269] LustreError: 28190:0:(osp_md_object.c:167:osp_md_create()) LBUG
[82973.276678] Pid: 28190, comm: lod0000_rec0001 3.10.0-1127.8.2.el7_lustre.x86_64 #1 SMP Mon Jun 8 13:48:45 UTC 2020
[82973.276679] Call Trace:
[82973.276686] [<ffffffffc10487cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[82973.276698] [<ffffffffc104887c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[82973.276703] [<ffffffffc1995b7a>] osp_md_create+0x42a/0x470 [osp]
[82973.276715] [<ffffffffc1151334>] llog_osd_get_cat_list+0x8d4/0xbd0 [obdclass]
[82973.276740] [<ffffffffc18ca359>] lod_sub_prep_llog+0xb9/0x783 [lod]
[82973.276758] [<ffffffffc188f82b>] lod_sub_recovery_thread+0x1cb/0xc80 [lod]
[82973.276764] [<ffffffff9f2c6691>] kthread+0xd1/0xe0
[82973.276769] [<ffffffff9f992d1d>] ret_from_fork_nospec_begin+0x7/0x21
[82973.276774] [<ffffffffffffffff>] 0xffffffffffffffff
[82973.276793] Kernel panic - not syncing: LBUG
[82973.278196] CPU: 29 PID: 28190 Comm: lod0000_rec0001 Kdump: loaded Tainted: P OE ------------ 3.10.0-1127.8.2.el7_lustre.x86_64 #1
[82973.281009] Hardware name: Supermicro X10DRi/X10DRI-T, BIOS 3.1 09/14/2018
[82973.282404] Call Trace:
[82973.282990] Lustre: meteo0-MDT0000: in recovery but waiting for the first client to connect
[82973.282992] Lustre: Skipped 1 previous similar message
[82973.286683] [<ffffffff9f97ffa5>] dump_stack+0x19/0x1b
[82973.288065] [<ffffffff9f979541>] panic+0xe8/0x21f
[82973.289403] [<ffffffffc10488cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
[82973.290714] [<ffffffffc1995b7a>] osp_md_create+0x42a/0x470 [osp]
[82973.292031] [<ffffffffc1151334>] llog_osd_get_cat_list+0x8d4/0xbd0 [obdclass]
[82973.293333] [<ffffffffc18ca359>] lod_sub_prep_llog+0xb9/0x783 [lod]
[82973.294630] [<ffffffffc118de6c>] ? keys_fill+0xfc/0x180 [obdclass]
[82973.295895] [<ffffffffc188f82b>] lod_sub_recovery_thread+0x1cb/0xc80 [lod]
[82973.297134] [<ffffffffc188f660>] ? lod_obd_get_info+0x9d0/0x9d0 [lod]
[82973.298364] [<ffffffff9f2c6691>] kthread+0xd1/0xe0
[82973.299582] [<ffffffff9f2c65c0>] ? insert_kthread_work+0x40/0x40
[82973.300792] [<ffffffff9f992d1d>] ret_from_fork_nospec_begin+0x7/0x21
[82973.301984] [<ffffffff9f2c65c0>] ? insert_kthread_work+0x40/0x40
This was fixed in 2.14 via patch https://review.whamcloud.com/40655 "LU-14039 obdclass: set LA_TYPE when update_log init".
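
When checking whether a crash like this one duplicates a known issue, the ASSERTION line is usually the most useful search key. The following is a minimal sketch, assuming the console log uses the same LustreError format as the trace above; the names `LustreAssertion` and `find_assertion` are hypothetical helpers and not part of any Lustre tooling.

```python
import re
from typing import NamedTuple, Optional


class LustreAssertion(NamedTuple):
    """Fields pulled out of a 'LustreError: ... ASSERTION( ... ) failed' line."""
    pid: int
    source_file: str
    line: int
    function: str
    expression: str


# Assumed format, taken from the log in this ticket:
#   LustreError: <pid>:<n>:(<file>:<line>:<func>()) ASSERTION( <expr> ) failed:
_ASSERT_RE = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"ASSERTION\( (?P<expr>.*?) \) failed"
)


def find_assertion(console_log: str) -> Optional[LustreAssertion]:
    """Return the first failed assertion found in the log text, or None."""
    match = _ASSERT_RE.search(console_log)
    if match is None:
        return None
    return LustreAssertion(
        pid=int(match.group("pid")),
        source_file=match.group("file"),
        line=int(match.group("line")),
        function=match.group("func"),
        expression=match.group("expr"),
    )


# Sample input: the assertion line copied from the trace in this ticket.
SAMPLE = (
    "[82973.273770] LustreError: 28190:0:(osp_md_object.c:167:osp_md_create()) "
    "ASSERTION( attr->la_valid & LA_TYPE ) failed:"
)

if __name__ == "__main__":
    print(find_assertion(SAMPLE))
```

Searching the tracker for the extracted function name and expression (here `osp_md_create` and `attr->la_valid & LA_TYPE`) is one way to locate an existing ticket for the same failure.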