Details
- Bug
- Resolution: Fixed
- Critical
- Lustre 2.6.0
- 3
- 14748
Description
Running racer on 2 nodes with MDSCOUNT=1 and http://review.whamcloud.com/#/c/5936/ (add more operations to racer) applied, I frequently hit the following LBUG:
[ 803.128945] Lustre: lustre-MDT0000: Client 045a894d-d2a7-e744-df1f-a695740583d8 (at 192.168.122.108@tcp) reconnecting
[ 803.134578] LustreError: 5113:0:(lu_object.h:852:lu_object_attr()) ASSERTION( ((o)->lo_header->loh_attr & LOHA_EXISTS) != 0 ) failed:
[ 803.136583] LustreError: 5113:0:(lu_object.h:852:lu_object_attr()) LBUG
[ 803.137789] Pid: 5113, comm: mdt00_005
[ 803.138439]
[ 803.138440] Call Trace:
[ 803.139129] [<ffffffffa02be8c5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[ 803.140296] [<ffffffffa02beec7>] lbug_with_loc+0x47/0xb0 [libcfs]
[ 803.141337] [<ffffffffa0c034ac>] mdt_attr_get_complex+0xaec/0xb80 [mdt]
[ 803.142457] [<ffffffffa0c20425>] mdt_reconstruct_setattr+0xd5/0x410 [mdt]
[ 803.143631] [<ffffffffa04737e0>] ? lu_ucred+0x20/0x30 [obdclass]
[ 803.144677] [<ffffffffa0c1fde5>] mdt_reconstruct+0x45/0x120 [mdt]
[ 803.145722] [<ffffffffa0bfdbfa>] mdt_reint_internal+0x6fa/0x7c0 [mdt]
[ 803.146820] [<ffffffffa0bfe24b>] mdt_reint+0x6b/0x120 [mdt]
[ 803.147818] [<ffffffffa06e9c45>] tgt_request_handle+0x245/0xad0 [ptlrpc]
[ 803.148977] [<ffffffffa069ce31>] ptlrpc_main+0xcf1/0x1880 [ptlrpc]
[ 803.150053] [<ffffffffa069c140>] ? ptlrpc_main+0x0/0x1880 [ptlrpc]
[ 803.151093] [<ffffffff8109eab6>] kthread+0x96/0xa0
[ 803.151886] [<ffffffff8100c30a>] child_rip+0xa/0x20
[ 803.152714] [<ffffffff81554710>] ? _spin_unlock_irq+0x30/0x40
[ 803.153677] [<ffffffff8100bb10>] ? restore_args+0x0/0x30
[ 803.154572] [<ffffffff8109ea20>] ? kthread+0x0/0xa0
[ 803.155403] [<ffffffff8100c300>] ? child_rip+0x0/0x20
[ 803.156261]
When I hit this I see the following sequence of events:
1. The MDT crashes.
2. The client node stays up.
3. I restart the MDT.
4. The client resends the setattr.
5. GOTO 1.
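One reading of the trace is that the resent setattr goes through mdt_reconstruct_setattr(), which fetches attributes for an object that no longer has LOHA_EXISTS set, so the assertion in lu_object_attr() fires again after every restart. The following is a minimal user-space sketch of that idea, not the Lustre code and not necessarily the fix that landed: the struct layouts, the LOHA_EXISTS bit value, and the helpers lu_object_attr_strict() and reconstruct_setattr_guarded() are hypothetical stand-ins. It only illustrates checking existence before reading attributes so that a replayed request gets an error reply instead of an LBUG.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified stand-ins for the types named in the assertion message.
 * The field and flag names come from the log line; the layout and the
 * bit value are assumptions made for this sketch. */
#define LOHA_EXISTS 0x1u

struct lu_object_header {
	uint32_t loh_attr;
};

struct lu_object {
	struct lu_object_header *lo_header;
};

/* Nonzero when the object is marked as existing. */
static int lu_object_exists(const struct lu_object *o)
{
	return (o->lo_header->loh_attr & LOHA_EXISTS) != 0;
}

/* Asserting variant: models the check that fails in the trace.
 * assert() stands in for the kernel-side assertion macro. */
static uint32_t lu_object_attr_strict(const struct lu_object *o)
{
	assert(lu_object_exists(o));
	return o->lo_header->loh_attr;
}

/* Hypothetical guarded reconstruct path: test for existence first and
 * return -ENOENT, so the strict accessor is never reached for a
 * missing object. */
static int reconstruct_setattr_guarded(const struct lu_object *o,
					uint32_t *attr_out)
{
	if (!lu_object_exists(o))
		return -ENOENT;

	*attr_out = lu_object_attr_strict(o);
	return 0;
}
```

With a guard of this kind the resend in step 4 would be answered with an error, which would break the crash/restart/resend cycle described above. Whether -ENOENT is the right reply for a reconstructed setattr is a separate question that this sketch does not settle.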
Issue Links
- is related to
  - LU-6085 racer stuck on mutex_lock in ll_setattr_raw() (Resolved)
  - LU-5388 Interop 2.5.2<->2.6 failure on test suite replay-dual test_24: (lu_object.h:855:lu_object_attr()) ASSERTION( ((o)->lo_header->loh_attr & LOHA_EXISTS) != 0 ) failed (Resolved)
  - LU-6088 racer test_1: dir_create.sh mutex deadlock in sys_open->do_lookup (Resolved)
  - LU-5340 oops in cl_object_top() after reconstructed setattr on removed directory (Resolved)
- is related to
  - LU-5125 lu_object_attr() should return 0 for non-existing objects (Closed)