[LU-6830] sanity-lfsck test_31b: (3) Fail to start LFSCK for namespace
Created: 09/Jul/15  Updated: 15/Jul/16  Resolved: 15/Jul/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: WC Triage
Resolution: Incomplete
Votes: 0
Labels: None
Environment:

client and server: lustre-master build # 3094 RHEL7 DNE


Severity: 3
Rank (Obsolete): 9223372036854775807

Description

This issue was created by maloo for sarah_lw <wei3.liu@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/2790b194-2640-11e5-8b33-5254006e85c2.

The sub-test test_31b failed with the following error:

(3) Fail to start LFSCK for namespace

test log

== sanity-lfsck test 31b: The LFSCK can find/repair the name entry with bad name hash (2) ============ 04:09:33 (1436414973)
#####
For the name entry under a striped directory, if the name
hash does not match the shard, then the LFSCK will repair
the bad name entry
#####
Inject failure stub on client to simulate the case that
some name entry should be inserted into other non-second
shard, but inserted into the secod shard by wrong
total: 4 creates in 0.01 seconds: 296.58 creates/second
Trigger namespace LFSCK to repair bad name hash
CMD: shadow-42vm3 /usr/sbin/lctl lfsck_start -M lustre-MDT0000 -t namespace -r -A
pdsh@shadow-42vm6: shadow-42vm3: mcmd: connect failed: No route to host
 sanity-lfsck test_31b: @@@@@@ FAIL: (3) Fail to start LFSCK for namespace 
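
The pdsh error in the log above suggests that the lfsck_start command never reached shadow-42vm3, so the reported failure may reflect a node connectivity problem rather than an LFSCK defect. The following is a minimal triage sketch under that assumption. The helper name `classify_failure` and the list of transport error strings are hypothetical and are not part of the Lustre test framework; only the "No route to host" string is taken from this ticket.

```python
import re

# Hypothetical list of remote-shell transport errors. Only the first entry
# appears in this ticket's log; the others are illustrative assumptions.
TRANSPORT_ERRORS = (
    "No route to host",
    "Connection refused",
    "Connection timed out",
)

# Matches pdsh error lines of the form seen in the test log:
#   pdsh@<source host>: <target host>: <message>
PDSH_LINE = re.compile(r"^pdsh@(?P<src>[^:]+): (?P<dst>[^:]+): (?P<msg>.*)$")


def classify_failure(log_text):
    """Return ("transport", host) if a pdsh connect error is found,
    otherwise ("test", None)."""
    for line in log_text.splitlines():
        match = PDSH_LINE.match(line.strip())
        if match and any(err in match.group("msg") for err in TRANSPORT_ERRORS):
            return "transport", match.group("dst")
    return "test", None


if __name__ == "__main__":
    sample = (
        "CMD: shadow-42vm3 /usr/sbin/lctl lfsck_start "
        "-M lustre-MDT0000 -t namespace -r -A\n"
        "pdsh@shadow-42vm6: shadow-42vm3: mcmd: connect failed: "
        "No route to host\n"
    )
    print(classify_failure(sample))
```

A "transport" result only indicates that the remote command could not be delivered. It does not explain why the MDS node was unreachable, which would still require the console logs.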

MDS dmesg

[ 7075.100332] Lustre: DEBUG MARKER: == sanity-lfsck test 31a: The LFSCK can find/repair the name entry with bad name hash (1) ============ 04:09:23 (1436414963)
[ 7075.302542] Pid: 12872, comm: mdt_out00_003
[ 7075.302878] 
Call Trace:
[ 7075.303192]  [<ffffffffa061a843>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
[ 7075.303739]  [<ffffffffa0bd2292>] osd_trans_start+0x642/0x670 [osd_ldiskfs]
[ 7075.304307]  [<ffffffffa0a2b68d>] out_tx_end+0x9d/0x5e0 [ptlrpc]
[ 7075.306409] Pid: 11543, comm: mdt_out00_001
[ 7075.307162]  [<ffffffffa0a2f0e2>] out_handle+0xf12/0x19a0 [ptlrpc]
[ 7075.307182]  [<ffffffffa097a1f0>] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[ 7075.307207]  [<ffffffffa0a2529b>] tgt_request_handle+0x88b/0x1100 [ptlrpc]
[ 7075.307230]  [<ffffffffa09ccfbb>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[ 7075.307252]  [<ffffffffa09ca078>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[ 7075.307256]  [<ffffffff810a9662>] ? default_wake_function+0x12/0x20
[ 7075.307258]  [<ffffffff810a0898>] ? __wake_up_common+0x58/0x90
[ 7075.307280]  [<ffffffffa09d0900>] ptlrpc_main+0xc00/0x1f60 [ptlrpc]
[ 7075.307282]  [<ffffffff810ad8b6>] ? __dequeue_entity+0x26/0x40
[ 7075.307303]  [<ffffffffa09cfd00>] ? ptlrpc_main+0x0/0x1f60 [ptlrpc]
[ 7075.307305]  [<ffffffff8109739f>] kthread+0xcf/0xe0
[ 7075.307306]  [<ffffffff810972d0>] ? kthread+0x0/0xe0
[ 7075.307310]  [<ffffffff81614f7c>] ret_from_fork+0x7c/0xb0
[ 7075.307311]  [<ffffffff810972d0>] ? kthread+0x0/0xe0

[ 7075.325258] 
Call Trace:
[ 7075.325569]  [<ffffffffa061a843>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
[ 7075.326139]  [<ffffffffa0bd2292>] osd_trans_start+0x642/0x670 [osd_ldiskfs]
[ 7075.326724]  [<ffffffffa0a2b68d>] out_tx_end+0x9d/0x5e0 [ptlrpc]
[ 7075.327229]  [<ffffffffa0a2f0e2>] out_handle+0xf12/0x19a0 [ptlrpc]
[ 7075.327693]  [<ffffffffa097a1f0>] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[ 7075.329028]  [<ffffffffa0a2529b>] tgt_request_handle+0x88b/0x1100 [ptlrpc]
[ 7075.330478]  [<ffffffffa09ccfbb>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[ 7075.331261]  [<ffffffffa09ca078>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[ 7075.332394]  [<ffffffffa09d0900>] ptlrpc_main+0xc00/0x1f60 [ptlrpc]
[ 7075.333023]  [<ffffffffa09cfd00>] ? ptlrpc_main+0x0/0x1f60 [ptlrpc]
[ 7075.333847]  [<ffffffff8109739f>] kthread+0xcf/0xe0
[ 7075.334218]  [<ffffffff810972d0>] ? kthread+0x0/0xe0
[ 7075.334604]  [<ffffffff81614f7c>] ret_from_fork+0x7c/0xb0
[ 7075.335635]  [<ffffffff810972d0>] ? kthread+0x0/0xe0



Comments
Comment by Andreas Dilger [ 15/Jul/16 ]

No console logs are available, and the failure is too old to debug. It might be related to DCO-4019 or LDEV-419.
