[LU-4133] Failure on test suite racer test_1 Created: 22/Oct/13  Updated: 22/Dec/17  Resolved: 22/Dec/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: WC Triage
Resolution: Cannot Reproduce
Votes: 0
Labels: None
Environment:

server: lustre-b2_5 build #2 RHEL6
client: lustre-b2_5 build #2 SLES11 SP3


Severity: 3
Rank (Obsolete): 11206

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/9aefe8ee-381c-11e3-844f-52540035b04c.

The sub-test test_1 failed with the following error:

test failed to respond and timed out

server: lustre-b2_5 build #2 RHEL6
client: lustre-b2_5 build #2 SLES11 SP3

client2 dmesg shows:

[32700.552019] ls              D ffff88002b280500     0 23724  20156 0x00000000
[32700.552019]  ffff88002a879c00 0000000000000086 0000000000000ab2 ffff88002a878010
[32700.552019]  0000000000011780 0000000000011780 ffff88002a879fd8 ffff88002a879fd8
[32700.552019]  0000000000011780 ffff88002b280500 ffff88002a879b68 ffff88002ba7a0c0
[32700.552019] Call Trace:
[32700.552019]  [<ffffffff8145d00f>] __mutex_lock_slowpath+0xdf/0x150
[32700.552019]  [<ffffffff8145ca9a>] mutex_lock+0x1a/0x40
[32700.552019]  [<ffffffff81165978>] do_lookup+0x278/0x3a0
[32700.552019]  [<ffffffff811675d4>] link_path_walk+0x184/0x8a0
[32700.552019]  [<ffffffff81167ebb>] path_openat+0xbb/0x420
[32700.552019]  [<ffffffff8116835c>] do_filp_open+0x4c/0xc0
[32700.552019]  [<ffffffff81158eff>] do_sys_open+0x17f/0x250
[32700.552019]  [<ffffffff81466012>] system_call_fastpath+0x16/0x1b
[32700.552019]  [<00007f11aae8eeb0>] 0x7f11aae8eeaf
[32700.552019] ls              D ffff88002d39a1c0     0 23725  20156 0x00000000
[32700.552019]  ffff88002acb3c00 0000000000000086 0000000000000187 ffff88002acb2010
[32700.552019]  0000000000011780 0000000000011780 ffff88002acb3fd8 ffff88002acb3fd8
[32700.552019]  0000000000011780 ffff88002d39a1c0 ffff88002acb3b68 ffff88002bba0640
[32700.552019] Call Trace:
[32700.552019]  [<ffffffff8145d00f>] __mutex_lock_slowpath+0xdf/0x150
[32700.552019]  [<ffffffff8145ca9a>] mutex_lock+0x1a/0x40
[32700.552019]  [<ffffffff81165978>] do_lookup+0x278/0x3a0
[32700.552019]  [<ffffffff811675d4>] link_path_walk+0x184/0x8a0
[32700.552019]  [<ffffffff81167ebb>] path_openat+0xbb/0x420
[32700.552019]  [<ffffffff8116835c>] do_filp_open+0x4c/0xc0
[32700.552019]  [<ffffffff81158eff>] do_sys_open+0x17f/0x250
[32700.552019]  [<ffffffff81466012>] system_call_fastpath+0x16/0x1b
[32700.552019]  [<00007f783e148eb0>] 0x7f783e148eaf
[32700.552019] ls              S 0000000000000000     0 23726  20156 0x00000000
[32700.552019]  ffff88002b3876e8 0000000000000086 ffffffffff0a0200 ffff88002b386010
[32700.552019]  0000000000011780 0000000000011780 ffff88002b387fd8 ffff88002b387fd8
[32700.552019]  0000000000011780 ffff88002a80a040 0000000000000001 ffff88002b94c2c0
[32700.552019] Call Trace:
[32700.552019]  [<ffffffff8145c6d0>] schedule_timeout+0x1b0/0x2a0
[32700.552019]  [<ffffffffa0841385>] ptlrpc_set_wait+0x2f5/0x8f0 [ptlrpc]
[32700.552019]  [<ffffffffa0841a01>] ptlrpc_queue_wait+0x81/0x220 [ptlrpc]
[32700.552019]  [<ffffffffa0823f58>] ldlm_cli_enqueue+0x388/0x770 [ptlrpc]
[32700.552019]  [<ffffffffa0970c5b>] mdc_enqueue+0x30b/0xf80 [mdc]
[32700.552019]  [<ffffffffa0971a71>] mdc_intent_lock+0x1a1/0x730 [mdc]
[32700.552019]  [<ffffffffa0ba0e71>] lmv_intent_open+0x1a1/0x9a0 [lmv]
[32700.552019]  [<ffffffffa0ba194a>] lmv_intent_lock+0x2da/0x3b0 [lmv]
[32700.552019]  [<ffffffffa0ae9029>] ll_lookup_it+0x4c9/0xb00 [lustre]
[32700.552019]  [<ffffffffa0ae96e1>] ll_lookup_nd+0x81/0x3e0 [lustre]
[32700.552019]  [<ffffffff811641e2>] d_alloc_and_lookup+0x42/0x80
[32700.552019]  [<ffffffff811659a5>] do_lookup+0x2a5/0x3a0
[32700.552019]  [<ffffffff81166b72>] do_last+0x102/0x800
[32700.552019]  [<ffffffff81167ed9>] path_openat+0xd9/0x420
[32700.552019]  [<ffffffff8116835c>] do_filp_open+0x4c/0xc0
[32700.552019]  [<ffffffff81158eff>] do_sys_open+0x17f/0x250
[32700.552019]  [<ffffffff81466012>] system_call_fastpath+0x16/0x1b
[32700.552019]  [<00007ffc92860eb0>] 0x7ffc92860eaf
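
When triaging a task dump like the one above, it can help to reduce each task to its state and innermost kernel frame. The following is a minimal sketch, not part of the original report or of any Lustre tooling. The function names `summarize_tasks` and `blocked_by_frame` are hypothetical, and the regular expressions assume the exact line layout shown in this dmesg excerpt; other kernel versions may print task headers differently.

```python
import re
from collections import defaultdict

# Task header line, e.g.
# "[32700.552019] ls   D ffff88002b280500     0 23724  20156 0x00000000"
# Assumed fields: comm, state, address, free stack, pid, parent pid, flags.
TASK_RE = re.compile(
    r"^\[\s*[\d.]+\]\s+(\S+)\s+([A-Z])\s+[0-9a-f]{16}\s+\d+\s+(\d+)\s+\d+\s+0x[0-9a-f]+\s*$"
)

# Kernel frame line, e.g.
# "[32700.552019]  [<ffffffffa0841385>] ptlrpc_set_wait+0x2f5/0x8f0 [ptlrpc]"
# Frames without a symbol name (the final user-space address) do not match.
FRAME_RE = re.compile(
    r"^\[\s*[\d.]+\]\s+\[<[0-9a-f]+>\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(\w+)\])?"
)


def summarize_tasks(dmesg_text):
    """Return one dict per task: comm, state, pid, and (symbol, module) frames."""
    tasks = []
    current = None
    for line in dmesg_text.splitlines():
        header = TASK_RE.match(line)
        if header:
            current = {
                "comm": header.group(1),
                "state": header.group(2),
                "pid": int(header.group(3)),
                "frames": [],
            }
            tasks.append(current)
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None:
            # Module is None for frames in the core kernel.
            current["frames"].append((frame.group(1), frame.group(2)))
    return tasks


def blocked_by_frame(tasks, state="D"):
    """Group PIDs of tasks in the given state by their innermost frame symbol."""
    groups = defaultdict(list)
    for task in tasks:
        if task["state"] == state and task["frames"]:
            groups[task["frames"][0][0]].append(task["pid"])
    return dict(groups)


if __name__ == "__main__":
    import sys

    parsed = summarize_tasks(sys.stdin.read())
    for task in parsed:
        top = task["frames"][0][0] if task["frames"] else "(no frames)"
        print(f'{task["comm"]} pid={task["pid"]} state={task["state"]} top={top}')
```

Fed the excerpt above on standard input, this sketch should list PIDs 23724 and 23725 in state D with `__mutex_lock_slowpath` as the innermost frame, and PID 23726 in state S with `schedule_timeout`, which matches what the raw trace shows.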


 Comments   
Comment by Oleg Drokin [ 22/Oct/13 ]

The real problem here is that client1 rebooted suddenly a few minutes into the racer test, but we don't have a console log to see what actually happened.

Comment by Andreas Dilger [ 22/Dec/17 ]

Close old bug that has not been seen in a long time.
