[LU-5697] ll_statahead_interpret()) ASSERTION( entry != ((void *)0) ) Created: 01/Oct/14  Updated: 24/Jan/17  Resolved: 24/Jan/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: None

Type: Bug Priority: Blocker
Reporter: Johann Lombardi (Inactive) Assignee: WC Triage
Resolution: Won't Fix Votes: 0
Labels: soak

Severity: 3
Rank (Obsolete): 15945

 Description   

LBUG hit during soak testing on a lola client while IOR & compilebench were running:

<4>Lustre: soaked-OST003b-osc-ffff88102fe96400: Connection to soaked-OST003b (at 192.168.1.105@o2ib10) was lost; in progress operations using this service will wait for recovery to complete
<6>Lustre: soaked-OST003b-osc-ffff88102fe96400: Connection restored to soaked-OST003b (at 192.168.1.105@o2ib10)
<0>LustreError: 3649:0:(statahead.c:734:ll_statahead_interpret()) ASSERTION( entry != ((void *)0) ) failed: 
<0>LustreError: 3649:0:(statahead.c:734:ll_statahead_interpret()) LBUG
<4>Pid: 3649, comm: ptlrpcd_23
<4>
<4>Call Trace:
<4> [<ffffffffa0696895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4> [<ffffffffa0696e97>] lbug_with_loc+0x47/0xb0 [libcfs]
<4> [<ffffffffa0e1ff67>] ll_statahead_interpret+0x3d7/0x440 [lustre]
<4> [<ffffffffa0d14a5a>] mdc_intent_getattr_async_interpret+0x20a/0x550 [mdc]
<4> [<ffffffffa0a2c893>] ptlrpc_check_set+0x343/0x1d20 [ptlrpc]
<4> [<ffffffff81084a1b>] ? try_to_del_timer_sync+0x7b/0xe0
<4> [<ffffffffa0a59a83>] ptlrpcd_check+0x533/0x550 [ptlrpc]
<4> [<ffffffffa0a5a0cb>] ptlrpcd+0x33b/0x3f0 [ptlrpc]
<4> [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
<4> [<ffffffffa0a59d90>] ? ptlrpcd+0x0/0x3f0 [ptlrpc]
<4> [<ffffffff8109abf6>] kthread+0x96/0xa0
<4> [<ffffffff8100c20a>] child_rip+0xa/0x20
<4> [<ffffffff8109ab60>] ? kthread+0x0/0xa0
<4> [<ffffffff8100c200>] ? child_rip+0x0/0x20

A crash dump is available on lola-26, under /var/crash/127.0.0.1-2014-10-01-12:24:42.
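
For comparing occurrences of this LBUG across nodes and builds, the console lines can be reduced to their assertion site (file, line, function). The following is a minimal sketch, not part of the original report. It assumes the `pid:N:(file:line:func())` prefix seen in the log above; the helper names and the `ext_pid` label for the second numeric field are hypothetical.

```python
import re

# Hypothetical helper: matches the "pid:N:(file:line:func())" prefix seen in
# the LustreError console lines above. The name "ext_pid" for the second
# numeric field is an assumption made for illustration.
LUSTRE_MSG = re.compile(
    r"^<(?P<level>\d)>LustreError: "
    r"(?P<pid>\d+):(?P<ext_pid>\d+):"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"(?P<msg>.*)$"
)


def parse_lustre_error(line):
    """Return a dict of fields for a LustreError line, or None if it does not match."""
    match = LUSTRE_MSG.match(line.rstrip("\n"))
    if match is None:
        return None
    record = match.groupdict()
    for key in ("level", "pid", "ext_pid", "line"):
        record[key] = int(record[key])
    return record


def assertion_sites(lines):
    """Collect the distinct (file, line, function) triples of failed assertions."""
    sites = set()
    for line in lines:
        record = parse_lustre_error(line)
        if record and record["msg"].startswith("ASSERTION("):
            sites.add((record["file"], record["line"], record["func"]))
    return sorted(sites)
```

Feeding the dmesg output from several crash dumps through `assertion_sites()` shows whether they hit the same source line or whether the line number shifted between builds.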



 Comments   
Comment by Johann Lombardi (Inactive) [ 01/Oct/14 ]

Got another occurrence on lola-27:

<6>Lustre: soaked-OST003b-osc-ffff88102fd21400: Connection restored to soaked-OST003b (at 192.168.1.105@o2ib10)
<3>LustreError: 11-0: soaked-MDT0000-mdc-ffff88102fd21400: Communicating with 192.168.1.108@o2ib10, operation ldlm_enqueue failed with -71.
<3>LustreError: Skipped 1 previous similar message
<3>LustreError: 3644:0:(mdc_locks.c:1210:mdc_intent_getattr_async_interpret()) ldlm_cli_enqueue_fini: -71
<0>LustreError: 3654:0:(statahead.c:734:ll_statahead_interpret()) ASSERTION( entry != ((void *)0) ) failed: 
<0>LustreError: 3654:0:(statahead.c:734:ll_statahead_interpret()) LBUG
<4>Pid: 3654, comm: ptlrpcd_23
<4>
<4>Call Trace:
<4> [<ffffffffa0696895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4> [<ffffffffa0696e97>] lbug_with_loc+0x47/0xb0 [libcfs]
<4> [<ffffffffa0e1ff67>] ll_statahead_interpret+0x3d7/0x440 [lustre]
<4> [<ffffffffa0d14a5a>] mdc_intent_getattr_async_interpret+0x20a/0x550 [mdc]
<4> [<ffffffffa0a2c893>] ptlrpc_check_set+0x343/0x1d20 [ptlrpc]
<4> [<ffffffff81084a1b>] ? try_to_del_timer_sync+0x7b/0xe0
<4> [<ffffffffa0a59a83>] ptlrpcd_check+0x533/0x550 [ptlrpc]
<4> [<ffffffffa0a5a0cb>] ptlrpcd+0x33b/0x3f0 [ptlrpc]
<4> [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
<4> [<ffffffffa0a59d90>] ? ptlrpcd+0x0/0x3f0 [ptlrpc]
<4> [<ffffffff8109abf6>] kthread+0x96/0xa0
<4> [<ffffffff8100c20a>] child_rip+0xa/0x20
<4> [<ffffffff8109ab60>] ? kthread+0x0/0xa0
<4> [<ffffffff8100c200>] ? child_rip+0x0/0x20

Crash dump available on lola-27:/var/crash/127.0.0.1-2014-10-01-12:24:42.

Comment by John Hammond [ 13/Oct/14 ]

Also seen on shadow during racer, running d1d02bc (http://review.whamcloud.com/#/c/11887/3, MERGED 2014-09-15).

/export/scratch/dumps/shadow-46vm1.shadow.whamcloud.com/10.1.6.28-2014-09-16-04:02:55/vmcore-dmesg.txt

<3>LustreError: 3326:0:(vvp_io.c:1221:vvp_io_init()) lustre: refresh file layout [0x200000402:0x197:0x0] error -34.
<3>LustreError: 3326:0:(vvp_io.c:1221:vvp_io_init()) Skipped 35334 previous similar messages
<3>LustreError: 11-0: lustre-MDT0000-mdc-ffff88007a001c00: Communicating with 10.1.6.34@tcp, operation mds_getxattr failed with -34.
<3>LustreError: Skipped 85125 previous similar messages
<3>LustreError: 3373:0:(vvp_io.c:1221:vvp_io_init()) lustre: refresh file layout [0x200000402:0x197:0x0] error -34.
<3>LustreError: 3373:0:(vvp_io.c:1221:vvp_io_init()) Skipped 85092 previous similar messages
<3>LustreError: 11-0: lustre-MDT0000-mdc-ffff880079f84800: Communicating with 10.1.6.34@tcp, operation mds_getxattr failed with -34.
<3>LustreError: Skipped 204201 previous similar messages
<3>LustreError: 3373:0:(vvp_io.c:1221:vvp_io_init()) lustre: refresh file layout [0x200000402:0x197:0x0] error -34.
<3>LustreError: 3373:0:(vvp_io.c:1221:vvp_io_init()) Skipped 204168 previous similar messages
<0>LustreError: 1957:0:(statahead.c:735:ll_statahead_interpret()) ASSERTION( entry != ((void *)0) ) failed: 
<0>LustreError: 1957:0:(statahead.c:735:ll_statahead_interpret()) LBUG
<4>Pid: 1957, comm: ptlrpcd_0
<4>
<4>Call Trace:
<4> [<ffffffffa03dd895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4> [<ffffffffa03dde97>] lbug_with_loc+0x47/0xb0 [libcfs]
<4> [<ffffffffa0a65f97>] ll_statahead_interpret+0x3d7/0x440 [lustre]
<4> [<ffffffffa095aa5a>] mdc_intent_getattr_async_interpret+0x20a/0x550 [mdc]
<4> [<ffffffffa0729a43>] ptlrpc_check_set+0x343/0x1d20 [ptlrpc]
<4> [<ffffffff81084a1b>] ? try_to_del_timer_sync+0x7b/0xe0
<4> [<ffffffffa0755cd3>] ptlrpcd_check+0x533/0x550 [ptlrpc]
<4> [<ffffffffa075631b>] ptlrpcd+0x33b/0x3f0 [ptlrpc]
<4> [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
<4> [<ffffffffa0755fe0>] ? ptlrpcd+0x0/0x3f0 [ptlrpc]
<4> [<ffffffff8109abf6>] kthread+0x96/0xa0
<4> [<ffffffff8100c20a>] child_rip+0xa/0x20
<3>LustreError: 22332:0:(file.c:3251:ll_inode_revalidate_fini()) lustre: revalidate FID [0x200000401:0xa80:0x0] error: rc = -116
<4> [<ffffffff8109ab60>] ? kthread+0x0/0xa0
<4> [<ffffffff8100c200>] ? child_rip+0x0/0x20
<4>
<0>Kernel panic - not syncing: LBUG
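
The logs in this ticket report several negative return codes (-71, -34, -116). As a reading aid, here is a small sketch that maps them to Linux errno names. It is not part of the original report: the table is hard-coded from the Linux errno definitions so the result does not depend on the machine running the script, and the function name is hypothetical.

```python
import re

# Linux errno values, hard-coded so the lookup does not depend on the
# platform running this script. Only the codes seen in this ticket are listed.
LINUX_ERRNO = {
    34: ("ERANGE", "Numerical result out of range"),
    71: ("EPROTO", "Protocol error"),
    116: ("ESTALE", "Stale file handle"),
}

# A negative integer preceded by whitespace, e.g. "failed with -71." or "rc = -116".
RETURN_CODE = re.compile(r"(?<=\s)-(\d+)\b")


def decode_return_codes(line):
    """Return (code, name, description) for each negative return code in a log line."""
    decoded = []
    for match in RETURN_CODE.finditer(line):
        number = int(match.group(1))
        name, text = LINUX_ERRNO.get(number, ("UNKNOWN", "not in this table"))
        decoded.append((-number, name, text))
    return decoded
```

Read this way, the lola-27 enqueue failure is EPROTO, the layout refresh and mds_getxattr failures on shadow are ERANGE, and the revalidate error is ESTALE. Whether any of these is related to the assertion is not established by the logs alone.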

Comment by Cliff White (Inactive) [ 24/Jan/17 ]

Dead for two years, never repeated. Closing.
