[LU-2579] Test failure on test suite ost-pools: test_6: @@@@@@ FAIL: LBUG/LASSERT detected Created: 05/Jan/13  Updated: 16/Jan/13  Resolved: 16/Jan/13

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 1.8.8
Fix Version/s: None

Type: Bug Priority: Blocker
Reporter: Maloo Assignee: Jian Yu
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 6019

 Description   

This issue was created by maloo for liuying <emoly.liu@intel.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/221f9d24-5744-11e2-8772-52540035b04c.

Error: 'LBUG/LASSERT detected'
Failure Rate: 1.00% of last 100 executions [all branches]

The MDS console showed:

00:13:34:Lustre: DEBUG MARKER: == ost-pools test 6: getstripe/setstripe ============================================================= 00:13:33 (1357373613)
00:13:34:Lustre: DEBUG MARKER: lctl pool_new lustre.testpool
00:13:45:Lustre: DEBUG MARKER: lctl pool_list lustre
00:13:45:Lustre: DEBUG MARKER: lctl pool_add lustre.testpool lustre-OST[0000-0006/1]
00:14:07:LustreError: 3588:0:(lov_request.c:694:lov_update_create_set()) error creating fid 0x80015 sub-object on OST idx 0/7: rc = -107
00:14:39:Lustre: Service thread pid 3665 was inactive for 40.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
00:14:39:Pid: 3665, comm: ll_mdt_08
00:14:39:
00:14:39:Call Trace:
00:14:39: [<ffffffff8006389f>] schedule_timeout+0x8a/0xad
00:14:39: [<ffffffff8009a41d>] process_timeout+0x0/0x5
00:14:39: [<ffffffff88a99695>] osc_create+0xc75/0x13d0 [osc]
00:14:39: [<ffffffff8008ee84>] default_wake_function+0x0/0xe
00:14:39: [<ffffffff88b48edb>] qos_remedy_create+0x45b/0x570 [lov]
00:14:39: [<ffffffff88b42df3>] lov_fini_create_set+0x243/0x11e0 [lov]
00:14:39: [<ffffffff88b36b72>] lov_create+0x1552/0x1860 [lov]
00:14:39: [<ffffffff88b377a8>] lov_iocontrol+0x928/0xf0f [lov]
00:14:39: [<ffffffff8008ee84>] default_wake_function+0x0/0xe
00:14:39: [<ffffffff88ca8b21>] mds_finish_open+0x1fa1/0x4370 [mds]
00:14:39: [<ffffffff80009860>] __d_lookup+0xb0/0xff
00:14:39: [<ffffffff8000d543>] dput+0x2c/0x114
00:14:39: [<ffffffff88c88fad>] mds_verify_child+0x2dd/0x870 [mds]
00:14:39: [<ffffffff889a79a0>] ldlm_blocking_ast+0x0/0x2a0 [ptlrpc]
00:14:39: [<ffffffff88cafd41>] mds_open+0x2f01/0x386b [mds]
00:14:39: [<ffffffff8886ccfd>] libcfs_debug_vmsg2+0x70d/0x970 [libcfs]
00:14:39: [<ffffffff8898a86c>] _ldlm_lock_debug+0x57c/0x6e0 [ptlrpc]
00:14:39: [<ffffffff889d15f1>] lustre_swab_buf+0x81/0x170 [ptlrpc]
00:14:39: [<ffffffff8000d543>] dput+0x2c/0x114
00:14:39: [<ffffffff88c860a5>] mds_reint_rec+0x365/0x550 [mds]
00:14:39: [<ffffffff88cb0c6e>] mds_update_unpack+0x1fe/0x280 [mds]
00:14:39: [<ffffffff88c78eda>] mds_reint+0x35a/0x420 [mds]
00:14:39: [<ffffffff88c77dea>] fixup_handle_for_resent_req+0x5a/0x2c0 [mds]
00:14:39: [<ffffffff88c82bee>] mds_intent_policy+0x49e/0xc10 [mds]
00:14:39: [<ffffffff88992270>] ldlm_resource_putref_internal+0x230/0x460 [ptlrpc]
00:14:39: [<ffffffff8898feb6>] ldlm_lock_enqueue+0x186/0xb20 [ptlrpc]
00:14:39: [<ffffffff8898c7fd>] ldlm_lock_create+0x9bd/0x9f0 [ptlrpc]
00:14:39: [<ffffffff889b4870>] ldlm_server_blocking_ast+0x0/0x83d [ptlrpc]
00:14:39: [<ffffffff889b1b39>] ldlm_handle_enqueue+0xc09/0x1210 [ptlrpc]
00:14:39: [<ffffffff88c81b2e>] mds_handle+0x40ce/0x4cf0 [mds]
00:14:39: [<ffffffff88869868>] libcfs_ip_addr2str+0x38/0x40 [libcfs]
00:14:39: [<ffffffff88869c7e>] libcfs_nid2str+0xbe/0x110 [libcfs]
00:14:39: [<ffffffff889dcaf5>] ptlrpc_server_log_handling_request+0x105/0x130 [ptlrpc]
00:14:39: [<ffffffff889df874>] ptlrpc_server_handle_request+0x984/0xe00 [ptlrpc]
00:14:39: [<ffffffff889dffd5>] ptlrpc_wait_event+0x2e5/0x310 [ptlrpc]
00:14:39: [<ffffffff8008d2a9>] __wake_up_common+0x3e/0x68
00:14:39: [<ffffffff889e0f16>] ptlrpc_main+0xf16/0x10e0 [ptlrpc]
00:14:39: [<ffffffff8005dfb1>] child_rip+0xa/0x11
00:14:39: [<ffffffff889e0000>] ptlrpc_main+0x0/0x10e0 [ptlrpc]
00:14:39: [<ffffffff8005dfa7>] child_rip+0x0/0x11


 Comments   
Comment by Jian Yu [ 14/Jan/13 ]

The console log on the OSS node showed:

00:13:40:Lustre: DEBUG MARKER: == ost-pools test 6: getstripe/setstripe ============================================================= 00:13:33 (1357373613)
00:14:01:LustreError: 6862:0:(filter.c:3225:filter_handle_precreate()) ASSERTION(diff >= 0) failed: lustre-OST0000: 33 - 513 = -480
00:14:01:LustreError: 6862:0:(filter.c:3225:filter_handle_precreate()) LBUG
00:14:01:Pid: 6862, comm: ll_ost_creat_01
00:14:02:
00:14:02:Call Trace:
00:14:02: [<ffffffff888646a1>] libcfs_debug_dumpstack+0x51/0x60 [libcfs]
00:14:02: [<ffffffff88864bda>] lbug_with_loc+0x7a/0xd0 [libcfs]
00:14:03: [<ffffffff88c662b4>] filter_create+0x1174/0x17c0 [obdfilter]
00:14:03: [<ffffffff8001a98d>] vsnprintf+0x5df/0x627
00:14:03: [<ffffffff889d10b4>] lustre_msg_add_version+0x34/0x110 [ptlrpc]
00:14:03: [<ffffffff88864164>] set_ptldebug_header+0x34/0xa0 [libcfs]
00:14:03: [<ffffffff889d3ec9>] lustre_pack_reply+0x29/0xb0 [ptlrpc]
00:14:03: [<ffffffff88c1e801>] ost_handle+0x1281/0x55c0 [ost]
00:14:03: [<ffffffff88869868>] libcfs_ip_addr2str+0x38/0x40 [libcfs]
00:14:03: [<ffffffff889df874>] ptlrpc_server_handle_request+0x984/0xe00 [ptlrpc]
00:14:03: [<ffffffff889dffd5>] ptlrpc_wait_event+0x2e5/0x310 [ptlrpc]
00:14:03: [<ffffffff8008d2a9>] __wake_up_common+0x3e/0x68
00:14:03: [<ffffffff889e0f16>] ptlrpc_main+0xf16/0x10e0 [ptlrpc]
00:14:03: [<ffffffff8005dfb1>] child_rip+0xa/0x11
00:14:03: [<ffffffff889e0000>] ptlrpc_main+0x0/0x10e0 [ptlrpc]
00:14:03: [<ffffffff8005dfa7>] child_rip+0x0/0x11
00:14:03:
00:14:03:LustreError: dumping log to /tmp/lustre-log.1357373633.6862
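
For context on the assertion above: the MDS create request carries the last object id it wants to exist on the OST (33 here), while the OST's own record of its last precreated object is already higher (513), so the computed delta goes negative and the LASSERT fires. Below is a minimal standalone sketch of that check, using hypothetical names in place of the real obdfilter code in filter_handle_precreate() (it is an illustration, not the b1_8 source):

/* Hypothetical, simplified sketch -- not the actual obdfilter code.
 * last_id_wanted stands in for the last object id requested by the MDS,
 * last_id_created for the OST's record of its last precreated object. */
#include <assert.h>
#include <stdio.h>

static int handle_precreate(long long last_id_wanted, long long last_id_created)
{
        long long diff = last_id_wanted - last_id_created;

        /* In the failing run: 33 - 513 = -480, so this check trips and
         * the OST create thread LBUGs instead of returning an error. */
        assert(diff >= 0);

        printf("precreating %lld objects\n", diff);
        return 0;
}

int main(void)
{
        return handle_precreate(33, 513);  /* reproduces the negative delta */
}
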
Comment by Jian Yu [ 14/Jan/13 ]

Lustre Branch: b1_8
Lustre Build: http://build.whamcloud.com/job/lustre-b1_8/241

The sanity-quota test 8 also failed with the same issue:

CMD: client-28vm3,client-28vm4,client-28vm5 rc=\$([ -f /proc/sys/lnet/catastrophe ] && echo \$(< /proc/sys/lnet/catastrophe) || echo 0);
if [ \$rc -ne 0 ]; then echo \$(hostname): \$rc; fi
exit \$rc;
client-28vm4.lab.whamcloud.com: 1
 sanity-quota test_8: @@@@@@ FAIL: LBUG/LASSERT detected
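
The check above works because libcfs records any LBUG in a global flag that is exported as /proc/sys/lnet/catastrophe, which the test harness then reads on every node after the test. A rough sketch of that mechanism follows (simplified; the real code is lbug_with_loc() in libcfs and its exact signature differs):

/* Simplified illustration, not the libcfs source. The flag below is what
 * the harness reads back through /proc/sys/lnet/catastrophe. */
unsigned int libcfs_catastrophe;

void lbug_with_loc(const char *file, const char *func, int line)
{
        libcfs_catastrophe = 1;  /* mark this node as having hit an LBUG */
        /* ... dump the stack and debug log, then hang or panic the thread ... */
}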

The console log on the OSS node showed:

22:44:01:Lustre: DEBUG MARKER: == sanity-quota test 8: Run dbench with quota enabled ============= 22:43:59 (1357886639)
22:44:12:LustreError: 27582:0:(filter.c:3225:filter_handle_precreate()) ASSERTION(diff >= 0) failed: lustre-OST0000: 65 - 513 = -448
22:44:12:LustreError: 27582:0:(filter.c:3225:filter_handle_precreate()) LBUG
22:44:12:Pid: 27582, comm: ll_ost_creat_03
22:44:12:
22:44:12:Call Trace:
22:44:12: [<ffffffff888646a1>] libcfs_debug_dumpstack+0x51/0x60 [libcfs]
22:44:12: [<ffffffff88864bda>] lbug_with_loc+0x7a/0xd0 [libcfs]
22:44:12: [<ffffffff88c662b4>] filter_create+0x1174/0x17c0 [obdfilter]
22:44:12: [<ffffffff8001a98d>] vsnprintf+0x5df/0x627
22:44:12: [<ffffffff889d10b4>] lustre_msg_add_version+0x34/0x110 [ptlrpc]
22:44:12: [<ffffffff88864164>] set_ptldebug_header+0x34/0xa0 [libcfs]
22:44:12: [<ffffffff889d3ec9>] lustre_pack_reply+0x29/0xb0 [ptlrpc]
22:44:12: [<ffffffff88c1e801>] ost_handle+0x1281/0x55c0 [ost]
22:44:12: [<ffffffff88864164>] set_ptldebug_header+0x34/0xa0 [libcfs]
22:44:12: [<ffffffff88869868>] libcfs_ip_addr2str+0x38/0x40 [libcfs]
22:44:12: [<ffffffff889df874>] ptlrpc_server_handle_request+0x984/0xe00 [ptlrpc]
22:44:12: [<ffffffff889dffd5>] ptlrpc_wait_event+0x2e5/0x310 [ptlrpc]
22:44:12: [<ffffffff8008d2a9>] __wake_up_common+0x3e/0x68
22:44:12: [<ffffffff889e0f16>] ptlrpc_main+0xf16/0x10e0 [ptlrpc]
22:44:12: [<ffffffff8005dfb1>] child_rip+0xa/0x11
22:44:12: [<ffffffff889e0000>] ptlrpc_main+0x0/0x10e0 [ptlrpc]
22:44:12: [<ffffffff8005dfa7>] child_rip+0x0/0x11
22:44:12:
22:44:12:LustreError: dumping log to /tmp/lustre-log.1357886643.27582

Maloo report: https://maloo.whamcloud.com/test_sets/58e311de-5d79-11e2-8199-52540035b04c

Comment by Jian Yu [ 14/Jan/13 ]

The patch for LU-1129 is needed on the Lustre b1_8 branch.

The patch for the b1_8 branch is at http://review.whamcloud.com/5013.
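
For illustration only, the general idea of that kind of fix is to degrade the negative-delta assertion into an error path. The sketch below is not the LU-1129 patch itself (see the review link above for the real change); it only shows the shape of the approach, reusing the hypothetical names from the earlier sketch:

/* Illustration only -- NOT the actual LU-1129 / review 5013 patch. */
#include <errno.h>
#include <stdio.h>

static int handle_precreate_fixed(long long last_id_wanted, long long last_id_created)
{
        long long diff = last_id_wanted - last_id_created;

        if (diff < 0) {
                fprintf(stderr, "requested last id %lld is behind last created %lld\n",
                        last_id_wanted, last_id_created);
                return -EINVAL;  /* fail this create request instead of LBUGging the OST */
        }
        return 0;
}
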

Comment by Jian Yu [ 16/Jan/13 ]

The patch was landed on the Lustre b1_8 branch.
