[LU-4844] ost_prolong_lock_one()) ASSERTION( lock->l_export == opd->opd_exp ) Created: 31/Mar/14  Updated: 14/Jun/14  Resolved: 11/Apr/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Critical
Reporter: Christopher Morrone Assignee: Lai Siyao
Resolution: Duplicate Votes: 0
Labels: None

Issue Links:
Duplicate
duplicates LU-2232 LustreError: 9120:0:(ost_handler.c:16... Resolved
Severity: 3
Rank (Obsolete): 13347

Description

While testing Lustre 2.4.0-28chaos (see github.com/chaos/lustre) on an ldiskfs filesystem, we hit the following assertion on an OSS:

2014-03-30 10:14:22 Lustre: lc2-OST000b: deleting orphan objects from 0x0:70377775 to 0x0:70380721
2014-03-30 10:14:22 Lustre: lc2-OST000b: Recovery over after 1:45, of 143 clients 143 recovered and 0 were evicted.
2014-03-30 10:20:28 LNetError: 2672:0:(o2iblnd_cb.c:2635:kiblnd_rejected()) 10.1.1.161@o2ib9 rejected: o2iblnd fatal error
2014-03-30 10:20:28 LNetError: 2672:0:(o2iblnd_cb.c:2635:kiblnd_rejected()) Skipped 19 previous similar messages
2014-03-30 10:21:03 LustreError: 0:0:(ldlm_lockd.c:403:waiting_locks_callback()) ### lock callback timer expired after 150s: evicting client at 192.168.121.90@o2i
2014-03-30 10:21:06 LustreError: 18813:0:(client.c:1049:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff88018e191400 x1463232807590408/t0(0) o104->lc2-OST0017
2014-03-30 10:21:06 LustreError: 18813:0:(client.c:1049:ptlrpc_import_delay_req()) Skipped 8 previous similar messages
2014-03-30 10:21:06 LustreError: 18813:0:(ldlm_lockd.c:736:ldlm_handle_ast_error()) ### client (nid 192.168.121.90@o2ib2) returned 0 from blocking AST ns: filter-
2014-03-30 10:21:06 LustreError: 18813:0:(ldlm_lockd.c:736:ldlm_handle_ast_error()) Skipped 1 previous similar message
2014-03-30 10:21:07 LustreError: 18937:0:(ldlm_lib.c:2734:target_bulk_io()) @@@ bulk GET failed: rc -107  req@ffff8801097bbc00 x1463677594325660/t0(0) o4->e5fffb3
2014-03-30 10:21:07 Lustre: lc2-OST0017: Bulk IO write error with e5fffb36-4dc9-0a2e-f74b-66de9283e46f (at 192.168.121.90@o2ib2), client will retry: rc -107
2014-03-30 10:21:07 Lustre: Skipped 1 previous similar message
2014-03-30 10:21:07 LustreError: 18937:0:(ldlm_lib.c:2734:target_bulk_io()) Skipped 1 previous similar message
2014-03-30 10:21:09 LustreError: 5121:0:(ldlm_lib.c:2734:target_bulk_io()) @@@ bulk GET failed: rc -107  req@ffff880053a75c00 x1463677594328176/t0(0) o4->e5fffb36
2014-03-30 10:21:09 Lustre: lc2-OST0017: Bulk IO write error with e5fffb36-4dc9-0a2e-f74b-66de9283e46f (at 192.168.121.90@o2ib2), client will retry: rc -107
2014-03-30 10:21:09 Lustre: Skipped 1 previous similar message
2014-03-30 10:21:13 LustreError: 18877:0:(ldlm_lib.c:2734:target_bulk_io()) @@@ bulk GET failed: rc -107  req@ffff8800341cdc00 x1463677594328180/t0(0) o4->e5fffb3
2014-03-30 10:21:13 Lustre: lc2-OST0017: Bulk IO write error with e5fffb36-4dc9-0a2e-f74b-66de9283e46f (at 192.168.121.90@o2ib2), client will retry: rc -107
2014-03-30 10:21:13 LustreError: 18877:0:(ldlm_lib.c:2734:target_bulk_io()) Skipped 3 previous similar messages
2014-03-30 10:21:17 LustreError: 18964:0:(ldlm_lib.c:2734:target_bulk_io()) @@@ bulk GET failed: rc -107  req@ffff88006e5a7400 x1463677594328172/t0(0) o4->e5fffb3
2014-03-30 10:21:17 Lustre: lc2-OST0017: Bulk IO write error with e5fffb36-4dc9-0a2e-f74b-66de9283e46f (at 192.168.121.90@o2ib2), client will retry: rc -107
2014-03-30 10:21:17 Lustre: Skipped 3 previous similar messages
2014-03-30 10:21:17 LustreError: 18829:0:(ost_handler.c:1909:ost_prolong_lock_one()) ASSERTION( lock->l_export == opd->opd_exp ) failed: 
2014-03-30 10:21:17 LustreError: 18834:0:(ost_handler.c:1909:ost_prolong_lock_one()) ASSERTION( lock->l_export == opd->opd_exp ) failed: 
2014-03-30 10:21:17 LustreError: 18834:0:(ost_handler.c:1909:ost_prolong_lock_one()) LBUG
2014-03-30 10:21:17 Pid: 18834, comm: ll_ost_io00_019
2014-03-30 10:21:17 
2014-03-30 10:21:17 Call Trace:
2014-03-30 10:21:17  [<ffffffffa032d8f5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
2014-03-30 10:21:17  [<ffffffffa032def7>] lbug_with_loc+0x47/0xb0 [libcfs]
2014-03-30 10:21:17  [<ffffffffa0f473b7>] ost_prolong_lock_one+0xe7/0x170 [ost]
2014-03-30 10:21:17  [<ffffffffa07bf579>] ? __ldlm_handle2lock+0x39/0x320 [ptlrpc]
2014-03-30 10:21:17  [<ffffffffa0f474dc>] ost_prolong_locks+0x9c/0x340 [ost]
2014-03-30 10:21:17  [<ffffffffa0f4caab>] ost_rw_hpreq_check+0x25b/0x500 [ost]
2014-03-30 10:21:17  [<ffffffffa080d620>] ? lustre_swab_niobuf_remote+0x0/0x30 [ptlrpc]
2014-03-30 10:21:17  [<ffffffffa081cb53>] ptlrpc_main+0x1113/0x1700 [ptlrpc]
2014-03-30 10:21:17  [<ffffffffa081ba40>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
2014-03-30 10:21:17  [<ffffffff8100c10a>] child_rip+0xa/0x20
2014-03-30 10:21:17  [<ffffffffa081ba40>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
2014-03-30 10:21:17  [<ffffffffa081ba40>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
2014-03-30 10:21:17  [<ffffffff8100c100>] ? child_rip+0x0/0x20

It looks like this has been seen in the past by multiple people in LU-2232, but that ticket was closed without a resolution.
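When comparing this occurrence against the reports in LU-2232, it can help to pull the failed assertions out of a console log mechanically. The following is a minimal sketch, not part of Lustre or its tooling: the function name `find_assertions`, the `SAMPLE` list, and the regular expressions are hypothetical and assume the line layout seen in the log above (timestamp, `LustreError:`, `pid:0:(file:line:function())`, message).

```python
import re

# Assumed layout, taken from the console lines quoted in this ticket:
#   <date> <time> LustreError: <pid>:<n>:(<file>:<line>:<func>()) <message>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"(?P<msg>.*)$"
)

# Captures the condition text between "ASSERTION(" and ") failed".
ASSERT_RE = re.compile(r"^ASSERTION\(\s*(?P<cond>.*?)\s*\)\s*failed")


def find_assertions(lines):
    """Return one dict per LustreError line that reports a failed ASSERTION."""
    hits = []
    for raw in lines:
        line_match = LINE_RE.match(raw.strip())
        if line_match is None:
            continue  # not a LustreError line in the assumed layout
        assert_match = ASSERT_RE.match(line_match.group("msg"))
        if assert_match is None:
            continue  # a LustreError line, but not an assertion failure
        hits.append({
            "timestamp": line_match.group("ts"),
            "pid": int(line_match.group("pid")),
            "location": f"{line_match.group('file')}:{line_match.group('line')}",
            "function": line_match.group("func"),
            "condition": assert_match.group("cond"),
        })
    return hits


# Three lines copied from the log in this ticket.
SAMPLE = [
    "2014-03-30 10:21:17 LustreError: 18829:0:(ost_handler.c:1909:ost_prolong_lock_one()) ASSERTION( lock->l_export == opd->opd_exp ) failed: ",
    "2014-03-30 10:21:17 LustreError: 18834:0:(ost_handler.c:1909:ost_prolong_lock_one()) ASSERTION( lock->l_export == opd->opd_exp ) failed: ",
    "2014-03-30 10:21:17 LustreError: 18834:0:(ost_handler.c:1909:ost_prolong_lock_one()) LBUG",
]

if __name__ == "__main__":
    for hit in find_assertions(SAMPLE):
        print(hit)
```

The sketch skips the `LBUG` line on purpose, since it carries no assertion text. Grouping the results by `location` and `condition` is one way to check whether two logs hit the same assertion.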



Comments
Comment by Peter Jones [ 01/Apr/14 ]

Lai,

Could you please assist with this one?

Thanks,
Peter

Comment by Lai Siyao [ 11/Apr/14 ]

This is a duplicate of LU-2232.
