[LU-2579] Test failure on test suite ost-pools: test_6: @@@@@@ FAIL: LBUG/LASSERT detected Created: 05/Jan/13 Updated: 16/Jan/13 Resolved: 16/Jan/13 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 1.8.8 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Maloo | Assignee: | Jian Yu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 6019 |
| Description |
|
This issue was created by maloo for liuying <emoly.liu@intel.com>. It relates to the following test suite run: https://maloo.whamcloud.com/test_sets/221f9d24-5744-11e2-8772-52540035b04c.
MDS console showed:
00:13:34:Lustre: DEBUG MARKER: == ost-pools test 6: getstripe/setstripe ============================================================= 00:13:33 (1357373613)
00:13:34:Lustre: DEBUG MARKER: lctl pool_new lustre.testpool
00:13:45:Lustre: DEBUG MARKER: lctl pool_list lustre
00:13:45:Lustre: DEBUG MARKER: lctl pool_add lustre.testpool lustre-OST[0000-0006/1]
00:14:07:LustreError: 3588:0:(lov_request.c:694:lov_update_create_set()) error creating fid 0x80015 sub-object on OST idx 0/7: rc = -107
00:14:39:Lustre: Service thread pid 3665 was inactive for 40.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
00:14:39:Pid: 3665, comm: ll_mdt_08
00:14:39:
00:14:39:Call Trace:
00:14:39: [<ffffffff8006389f>] schedule_timeout+0x8a/0xad
00:14:39: [<ffffffff8009a41d>] process_timeout+0x0/0x5
00:14:39: [<ffffffff88a99695>] osc_create+0xc75/0x13d0 [osc]
00:14:39: [<ffffffff8008ee84>] default_wake_function+0x0/0xe
00:14:39: [<ffffffff88b48edb>] qos_remedy_create+0x45b/0x570 [lov]
00:14:39: [<ffffffff88b42df3>] lov_fini_create_set+0x243/0x11e0 [lov]
00:14:39: [<ffffffff88b36b72>] lov_create+0x1552/0x1860 [lov]
00:14:39: [<ffffffff88b377a8>] lov_iocontrol+0x928/0xf0f [lov]
00:14:39: [<ffffffff8008ee84>] default_wake_function+0x0/0xe
00:14:39: [<ffffffff88ca8b21>] mds_finish_open+0x1fa1/0x4370 [mds]
00:14:39: [<ffffffff80009860>] __d_lookup+0xb0/0xff
00:14:39: [<ffffffff8000d543>] dput+0x2c/0x114
00:14:39: [<ffffffff88c88fad>] mds_verify_child+0x2dd/0x870 [mds]
00:14:39: [<ffffffff889a79a0>] ldlm_blocking_ast+0x0/0x2a0 [ptlrpc]
00:14:39: [<ffffffff88cafd41>] mds_open+0x2f01/0x386b [mds]
00:14:39: [<ffffffff8886ccfd>] libcfs_debug_vmsg2+0x70d/0x970 [libcfs]
00:14:39: [<ffffffff8898a86c>] _ldlm_lock_debug+0x57c/0x6e0 [ptlrpc]
00:14:39: [<ffffffff889d15f1>] lustre_swab_buf+0x81/0x170 [ptlrpc]
00:14:39: [<ffffffff8000d543>] dput+0x2c/0x114
00:14:39: [<ffffffff88c860a5>] mds_reint_rec+0x365/0x550 [mds]
00:14:39: [<ffffffff88cb0c6e>] mds_update_unpack+0x1fe/0x280 [mds]
00:14:39: [<ffffffff88c78eda>] mds_reint+0x35a/0x420 [mds]
00:14:39: [<ffffffff88c77dea>] fixup_handle_for_resent_req+0x5a/0x2c0 [mds]
00:14:39: [<ffffffff88c82bee>] mds_intent_policy+0x49e/0xc10 [mds]
00:14:39: [<ffffffff88992270>] ldlm_resource_putref_internal+0x230/0x460 [ptlrpc]
00:14:39: [<ffffffff8898feb6>] ldlm_lock_enqueue+0x186/0xb20 [ptlrpc]
00:14:39: [<ffffffff8898c7fd>] ldlm_lock_create+0x9bd/0x9f0 [ptlrpc]
00:14:39: [<ffffffff889b4870>] ldlm_server_blocking_ast+0x0/0x83d [ptlrpc]
00:14:39: [<ffffffff889b1b39>] ldlm_handle_enqueue+0xc09/0x1210 [ptlrpc]
00:14:39: [<ffffffff88c81b2e>] mds_handle+0x40ce/0x4cf0 [mds]
00:14:39: [<ffffffff88869868>] libcfs_ip_addr2str+0x38/0x40 [libcfs]
00:14:39: [<ffffffff88869c7e>] libcfs_nid2str+0xbe/0x110 [libcfs]
00:14:39: [<ffffffff889dcaf5>] ptlrpc_server_log_handling_request+0x105/0x130 [ptlrpc]
00:14:39: [<ffffffff889df874>] ptlrpc_server_handle_request+0x984/0xe00 [ptlrpc]
00:14:39: [<ffffffff889dffd5>] ptlrpc_wait_event+0x2e5/0x310 [ptlrpc]
00:14:39: [<ffffffff8008d2a9>] __wake_up_common+0x3e/0x68
00:14:39: [<ffffffff889e0f16>] ptlrpc_main+0xf16/0x10e0 [ptlrpc]
00:14:39: [<ffffffff8005dfb1>] child_rip+0xa/0x11
00:14:39: [<ffffffff889e0000>] ptlrpc_main+0x0/0x10e0 [ptlrpc]
00:14:39: [<ffffffff8005dfa7>] child_rip+0x0/0x11 |
| Comments |
| Comment by Jian Yu [ 14/Jan/13 ] |
|
Console log on the OSS node showed that:
00:13:40:Lustre: DEBUG MARKER: == ost-pools test 6: getstripe/setstripe ============================================================= 00:13:33 (1357373613)
00:14:01:LustreError: 6862:0:(filter.c:3225:filter_handle_precreate()) ASSERTION(diff >= 0) failed: lustre-OST0000: 33 - 513 = -480
00:14:01:LustreError: 6862:0:(filter.c:3225:filter_handle_precreate()) LBUG
00:14:01:Pid: 6862, comm: ll_ost_creat_01
00:14:02:
00:14:02:Call Trace:
00:14:02: [<ffffffff888646a1>] libcfs_debug_dumpstack+0x51/0x60 [libcfs]
00:14:02: [<ffffffff88864bda>] lbug_with_loc+0x7a/0xd0 [libcfs]
00:14:03: [<ffffffff88c662b4>] filter_create+0x1174/0x17c0 [obdfilter]
00:14:03: [<ffffffff8001a98d>] vsnprintf+0x5df/0x627
00:14:03: [<ffffffff889d10b4>] lustre_msg_add_version+0x34/0x110 [ptlrpc]
00:14:03: [<ffffffff88864164>] set_ptldebug_header+0x34/0xa0 [libcfs]
00:14:03: [<ffffffff889d3ec9>] lustre_pack_reply+0x29/0xb0 [ptlrpc]
00:14:03: [<ffffffff88c1e801>] ost_handle+0x1281/0x55c0 [ost]
00:14:03: [<ffffffff88869868>] libcfs_ip_addr2str+0x38/0x40 [libcfs]
00:14:03: [<ffffffff889df874>] ptlrpc_server_handle_request+0x984/0xe00 [ptlrpc]
00:14:03: [<ffffffff889dffd5>] ptlrpc_wait_event+0x2e5/0x310 [ptlrpc]
00:14:03: [<ffffffff8008d2a9>] __wake_up_common+0x3e/0x68
00:14:03: [<ffffffff889e0f16>] ptlrpc_main+0xf16/0x10e0 [ptlrpc]
00:14:03: [<ffffffff8005dfb1>] child_rip+0xa/0x11
00:14:03: [<ffffffff889e0000>] ptlrpc_main+0x0/0x10e0 [ptlrpc]
00:14:03: [<ffffffff8005dfa7>] child_rip+0x0/0x11
00:14:03:
00:14:03:LustreError: dumping log to /tmp/lustre-log.1357373633.6862 |
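For triage across several console logs, the assertion message above can be pulled apart mechanically. The following is a minimal sketch, not part of the Lustre test framework: the function name `parse_precreate_assertion` and the field names `lhs`/`rhs` are made up for illustration, and the regular expression only assumes the message layout seen in this ticket. The ticket does not say what the two operands represent, so the sketch does not interpret them.

```python
import re

# Matches the message layout seen in this ticket, e.g.
# "ASSERTION(diff >= 0) failed: lustre-OST0000: 33 - 513 = -480".
# The layout is an assumption drawn from the log lines quoted here.
ASSERT_RE = re.compile(
    r"ASSERTION\(diff >= 0\) failed: (?P<target>\S+): "
    r"(?P<lhs>\d+) - (?P<rhs>\d+) = (?P<diff>-?\d+)"
)


def parse_precreate_assertion(line):
    """Return the fields of a 'diff >= 0' assertion line, or None if absent."""
    match = ASSERT_RE.search(line)
    if match is None:
        return None
    lhs = int(match.group("lhs"))
    rhs = int(match.group("rhs"))
    diff = int(match.group("diff"))
    return {
        "target": match.group("target"),
        "lhs": lhs,
        "rhs": rhs,
        "diff": diff,
        # True when the logged difference agrees with lhs - rhs.
        "consistent": lhs - rhs == diff,
    }
```

Applied to the two failures reported in this ticket, such a helper would make it easy to see that both hit the same assertion on the same target with the same right-hand operand (513).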
| Comment by Jian Yu [ 14/Jan/13 ] |
|
Lustre Branch: b1_8
The sanity-quota test 8 also failed with the same issue:
CMD: client-28vm3,client-28vm4,client-28vm5 rc=\$([ -f /proc/sys/lnet/catastrophe ] && echo \$(< /proc/sys/lnet/catastrophe) || echo 0);
if [ \$rc -ne 0 ]; then echo \$(hostname): \$rc; fi
exit \$rc;
client-28vm4.lab.whamcloud.com: 1
sanity-quota test_8: @@@@@@ FAIL: LBUG/LASSERT detected
Console log on the OSS node showed that:
22:44:01:Lustre: DEBUG MARKER: == sanity-quota test 8: Run dbench with quota enabled ============= 22:43:59 (1357886639)
22:44:12:LustreError: 27582:0:(filter.c:3225:filter_handle_precreate()) ASSERTION(diff >= 0) failed: lustre-OST0000: 65 - 513 = -448
22:44:12:LustreError: 27582:0:(filter.c:3225:filter_handle_precreate()) LBUG
22:44:12:Pid: 27582, comm: ll_ost_creat_03
22:44:12:
22:44:12:Call Trace:
22:44:12: [<ffffffff888646a1>] libcfs_debug_dumpstack+0x51/0x60 [libcfs]
22:44:12: [<ffffffff88864bda>] lbug_with_loc+0x7a/0xd0 [libcfs]
22:44:12: [<ffffffff88c662b4>] filter_create+0x1174/0x17c0 [obdfilter]
22:44:12: [<ffffffff8001a98d>] vsnprintf+0x5df/0x627
22:44:12: [<ffffffff889d10b4>] lustre_msg_add_version+0x34/0x110 [ptlrpc]
22:44:12: [<ffffffff88864164>] set_ptldebug_header+0x34/0xa0 [libcfs]
22:44:12: [<ffffffff889d3ec9>] lustre_pack_reply+0x29/0xb0 [ptlrpc]
22:44:12: [<ffffffff88c1e801>] ost_handle+0x1281/0x55c0 [ost]
22:44:12: [<ffffffff88864164>] set_ptldebug_header+0x34/0xa0 [libcfs]
22:44:12: [<ffffffff88869868>] libcfs_ip_addr2str+0x38/0x40 [libcfs]
22:44:12: [<ffffffff889df874>] ptlrpc_server_handle_request+0x984/0xe00 [ptlrpc]
22:44:12: [<ffffffff889dffd5>] ptlrpc_wait_event+0x2e5/0x310 [ptlrpc]
22:44:12: [<ffffffff8008d2a9>] __wake_up_common+0x3e/0x68
22:44:12: [<ffffffff889e0f16>] ptlrpc_main+0xf16/0x10e0 [ptlrpc]
22:44:12: [<ffffffff8005dfb1>] child_rip+0xa/0x11
22:44:12: [<ffffffff889e0000>] ptlrpc_main+0x0/0x10e0 [ptlrpc]
22:44:12: [<ffffffff8005dfa7>] child_rip+0x0/0x11
22:44:12:
22:44:12:LustreError: dumping log to /tmp/lustre-log.1357886643.27582
Maloo report: https://maloo.whamcloud.com/test_sets/58e311de-5d79-11e2-8199-52540035b04c |
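The CMD line in the comment above detects the LBUG by reading /proc/sys/lnet/catastrophe on each node and treating a missing file as 0. A rough Python equivalent of that per-node check is sketched below for readers who want to reproduce it outside the test framework. The function name `read_catastrophe` is hypothetical, and the `path` parameter exists only so the logic can be exercised against an ordinary file.

```python
import os


def read_catastrophe(path="/proc/sys/lnet/catastrophe"):
    """Return the integer flag stored at `path`, or 0 if the file is absent.

    Mirrors the shell check quoted in this ticket: a missing file counts
    as 0, and any nonzero value marks the node as having hit an
    LBUG/LASSERT.
    """
    if not os.path.isfile(path):
        return 0
    with open(path) as handle:
        text = handle.read().strip()
    # An empty file is treated as 0 here; that is a choice of this sketch.
    return int(text) if text else 0
```

A caller could then report `hostname: value` for every node where the returned value is nonzero, matching the `client-28vm4.lab.whamcloud.com: 1` line in the output above.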
| Comment by Jian Yu [ 14/Jan/13 ] |
|
The patch for the b1_8 branch is in http://review.whamcloud.com/5013. |
| Comment by Jian Yu [ 16/Jan/13 ] |
|
The patch has landed on the Lustre b1_8 branch. |