Details
Bug
Resolution: Fixed
Minor
Lustre 2.7.0
3
15171
Description
Running racer with 2 MDTs, migration disabled, and the changes http://review.whamcloud.com/#/c/5936/ and http://review.whamcloud.com/#/c/11319/, I see this:
[ 196.021505] LustreError: 9497:0:(lod_dev.c:67:lod_fld_lookup()) ASSERTION( fid_is_sane(fid) ) failed: Invalid FID [0x0:0x0:0x0]
[ 196.023832] LustreError: 9497:0:(lod_dev.c:67:lod_fld_lookup()) LBUG
[ 196.024871] Pid: 9497, comm: mdt00_009
[ 196.025538]
[ 196.025539] Call Trace:
[ 196.026221] [<ffffffffa02be8c5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[ 196.027358] [<ffffffffa02beec7>] lbug_with_loc+0x47/0xb0 [libcfs]
[ 196.028357] [<ffffffffa0d55265>] lod_fld_lookup+0x255/0x400 [lod]
[ 196.029366] [<ffffffffa0d68433>] lod_object_init+0x103/0x3c0 [lod]
[ 196.030422] [<ffffffffa0457f98>] lu_object_alloc+0xd8/0x320 [obdclass]
[ 196.031520] [<ffffffffa04596d8>] lu_object_find_at+0x208/0x360 [obdclass]
[ 196.032762] [<ffffffffa0459846>] lu_object_find+0x16/0x20 [obdclass]
[ 196.033930] [<ffffffffa0ca3e36>] mdt_object_find+0x56/0x170 [mdt]
[ 196.035219] [<ffffffffa0cabfd2>] mdt_object_find_lock+0x42/0x170 [mdt]
[ 196.036535] [<ffffffffa0cc9368>] mdt_lock_slaves+0x228/0x520 [mdt]
[ 196.037802] [<ffffffffa0cc9806>] mdt_attr_set+0x1a6/0x4d0 [mdt]
[ 196.038965] [<ffffffffa0ccaf1b>] mdt_reint_setattr+0x33b/0xfb0 [mdt]
[ 196.040135] [<ffffffffa0691a0e>] ? lustre_pack_reply_flags+0xae/0x1f0 [ptlrpc]
[ 196.041440] [<ffffffffa0cc13b1>] mdt_reint_rec+0x41/0xe0 [mdt]
[ 196.042645] [<ffffffffa0ca6c63>] mdt_reint_internal+0x4c3/0x7c0 [mdt]
[ 196.043970] [<ffffffffa0ca74eb>] mdt_reint+0x6b/0x120 [mdt]
[ 196.044996] [<ffffffffa06f2325>] tgt_request_handle+0x245/0xad0 [ptlrpc]
[ 196.046165] [<ffffffffa06a2dc1>] ptlrpc_main+0xce1/0x1970 [ptlrpc]
[ 196.047279] [<ffffffffa06a20e0>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]
[ 196.048355] [<ffffffff8109eab6>] kthread+0x96/0xa0
[ 196.049169] [<ffffffff8100c30a>] child_rip+0xa/0x20
[ 196.050003] [<ffffffff81554710>] ? _spin_unlock_irq+0x30/0x40
[ 196.050996] [<ffffffff8100bb10>] ? restore_args+0x0/0x30
[ 196.051906] [<ffffffff8109ea20>] ? kthread+0x0/0xa0
[ 196.052756] [<ffffffff8100c300>] ? child_rip+0x0/0x20
[ 196.053630]
[ 196.055757] Kernel panic - not syncing: LBUG
[ 196.056077] Pid: 9497, comm: mdt00_009 Not tainted 2.6.32-431.5.1.el6.lustre.x86_64 #1
[ 196.056758] Call Trace:
I assume that this is due to the way the full LMV xattr for a striped directory is synthesized when needed. I have seen similar crashes on the client.
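To illustrate the suspected failure mode (this is not the actual Lustre code or the eventual fix), here is a minimal standalone sketch. It assumes a stripe FID array that may contain an all-zero entry, like the [0x0:0x0:0x0] in the assertion, and shows a check that reports the bad stripe instead of handing it to a lookup. The names struct demo_fid, demo_fid_is_zero() and demo_check_stripes() are hypothetical, and the zero test is a much weaker stand-in for the real fid_is_sane().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a Lustre FID: sequence, object id, version. */
struct demo_fid {
	uint64_t f_seq;
	uint32_t f_oid;
	uint32_t f_ver;
};

/* Hypothetical helper: true for a FID whose sequence and object id are
 * both zero, as in the [0x0:0x0:0x0] FID from the log above. */
static bool demo_fid_is_zero(const struct demo_fid *fid)
{
	return fid->f_seq == 0 && fid->f_oid == 0;
}

/* Hypothetical helper: scan the stripe FIDs before any of them is looked up.
 * Returns 0 if every FID is non-zero; otherwise stores the index of the
 * first zero FID in *bad and returns -1, so the caller can fail the
 * request instead of tripping an assertion deeper in the stack. */
static int demo_check_stripes(const struct demo_fid *stripes, size_t count,
			      size_t *bad)
{
	assert(stripes != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		if (demo_fid_is_zero(&stripes[i])) {
			*bad = i;
			return -1;
		}
	}
	return 0;
}
```

If the assumption about the synthesized LMV xattr is right, a check of this kind on the slave FIDs in the mdt_lock_slaves() path would turn the LBUG into an error return, though it would not explain why the zero FID is there in the first place.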
Note that no fault injection was used here.