[LU-5287] (ldlm_lib.c:2253:target_queue_recovery_request()) ASSERTION( req->rq_export->exp_lock_replay_needed ) failed - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Fixed
Priority: Critical
Fix Version/s: Lustre 2.7.0, Lustre 2.5.4
Affects Version/s: Lustre 2.6.0, Lustre 2.7.0
Labels:
- llnl
- ost

Severity: 3
Rank (Obsolete): 14750

Description

Running racer with 2 clients and MDSCOUNT=1 on 2.5.60-90-g37432a8 plus http://review.whamcloud.com/#/c/5936/, I see this LBUG when restarting a crashed OST while some clients are still mounted.

[  230.089707] Lustre: Skipped 75 previous similar messages
[  231.775205] Lustre: 2151:0:(client.c:1924:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1404323793/real 1404323793]  req@ffff8801f78fc110 x1472540086110788/t0(0) o400->lustre-OST0001-osc-MDT0000@0@lo:28/4 lens 224/224 e 1 to 1 dl 1404323837 ref 1 fl Rpc:X/c0/ffffffff rc 0/-1
[  237.775938] Lustre: lustre-OST0001: Denying connection for new client cc64d6dc-4180-e700-9f7e-ce147524a8f0 (at 0@lo), waiting for all 4 known clients (2 recovered, 1 in progress, and 1 evicted) to recover in 0:36
[  237.781858] Lustre: Skipped 3 previous similar messages
[  242.801254] LustreError: 2880:0:(ldlm_lib.c:2253:target_queue_recovery_request()) ASSERTION( req->rq_export->exp_lock_replay_needed ) failed: 
[  242.805102] LustreError: 2880:0:(ldlm_lib.c:2253:target_queue_recovery_request()) LBUG
[  242.807953] Pid: 2880, comm: ll_ost00_007
[  242.809274] 
[  242.809276] Call Trace:
[  242.810585]  [<ffffffffa02b98c5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[  242.812764]  [<ffffffffa02b9ec7>] lbug_with_loc+0x47/0xb0 [libcfs]
[  242.814689]  [<ffffffffa064ea0c>] target_queue_recovery_request+0xbac/0xc10 [ptlrpc]
[  242.816347]  [<ffffffffa06e122f>] tgt_handle_recovery+0x38f/0x520 [ptlrpc]
[  242.817666]  [<ffffffffa06e6b8d>] tgt_request_handle+0x18d/0xad0 [ptlrpc]
[  242.818987]  [<ffffffffa0699e31>] ptlrpc_main+0xcf1/0x1880 [ptlrpc]
[  242.820261]  [<ffffffffa0699140>] ? ptlrpc_main+0x0/0x1880 [ptlrpc]
[  242.821440]  [<ffffffff8109eab6>] kthread+0x96/0xa0
[  242.822360]  [<ffffffff8100c30a>] child_rip+0xa/0x20
[  242.823303]  [<ffffffff81554710>] ? _spin_unlock_irq+0x30/0x40
[  242.824390]  [<ffffffff8100bb10>] ? restore_args+0x0/0x30
[  242.825391]  [<ffffffff8109ea20>] ? kthread+0x0/0xa0
[  242.826315]  [<ffffffff8100c300>] ? child_rip+0x0/0x20
[  242.827283]
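
When triaging console logs like the one above, it can help to pull the failed assertion's file, line, function, and expression out mechanically. The following is a minimal sketch, not part of Lustre or its tooling: the regular expression is an assumption based only on the `LustreError: pid:0:(file:line:func()) ASSERTION( expr ) failed` line format visible in this report, and `find_assertions` is a hypothetical helper name.

```python
import re

# Assumed line format, taken from the console log in this report:
#   LustreError: <pid>:<n>:(<file>:<line>:<func>()) ASSERTION( <expr> ) failed
ASSERT_RE = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"ASSERTION\( (?P<expr>.+?) \) failed"
)


def find_assertions(log_text):
    """Return one dict per failed-assertion line found in log_text."""
    hits = []
    for match in ASSERT_RE.finditer(log_text):
        hit = match.groupdict()
        hit["pid"] = int(hit["pid"])
        hit["line"] = int(hit["line"])
        hits.append(hit)
    return hits


# Two lines copied from the log above; only the first is an ASSERTION line.
sample = (
    "[  242.801254] LustreError: 2880:0:(ldlm_lib.c:2253:"
    "target_queue_recovery_request()) "
    "ASSERTION( req->rq_export->exp_lock_replay_needed ) failed: \n"
    "[  242.805102] LustreError: 2880:0:(ldlm_lib.c:2253:"
    "target_queue_recovery_request()) LBUG\n"
)

for hit in find_assertions(sample):
    print(hit["file"], hit["line"], hit["func"], hit["expr"])
```

The trailing LBUG line is deliberately not matched, so each crash is reported once, keyed by the assertion expression.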

Issue Links

duplicates: LU-5572 replay-single test_73b: import is not in FULL state (Closed)

is related to: LU-5651 ASSERTION( req->rq_export->exp_lock_replay_needed ) failed (Resolved)

People

Assignee: Niu Yawei (Inactive)

Reporter: John Hammond

Votes: 0

Watchers: 11

Dates

Created: 02/Jul/14 6:06 PM

Updated: 19/Sep/16 2:51 AM

Resolved: 06/Nov/14 9:49 PM