Details
- Bug
- Resolution: Duplicate
- Critical
- None
- None
- None
- 3
- 13347
Description
While testing Lustre 2.4.0-28chaos (see github.com/chaos/lustre) on an ldiskfs filesystem, we hit the following assertion on an OSS:
2014-03-30 10:14:22 Lustre: lc2-OST000b: deleting orphan objects from 0x0:70377775 to 0x0:70380721
2014-03-30 10:14:22 Lustre: lc2-OST000b: Recovery over after 1:45, of 143 clients 143 recovered and 0 were evicted.
2014-03-30 10:20:28 LNetError: 2672:0:(o2iblnd_cb.c:2635:kiblnd_rejected()) 10.1.1.161@o2ib9 rejected: o2iblnd fatal error
2014-03-30 10:20:28 LNetError: 2672:0:(o2iblnd_cb.c:2635:kiblnd_rejected()) Skipped 19 previous similar messages
2014-03-30 10:21:03 LustreError: 0:0:(ldlm_lockd.c:403:waiting_locks_callback()) ### lock callback timer expired after 150s: evicting client at 192.168.121.90@o2i
2014-03-30 10:21:06 LustreError: 18813:0:(client.c:1049:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff88018e191400 x1463232807590408/t0(0) o104->lc2-OST0017
2014-03-30 10:21:06 LustreError: 18813:0:(client.c:1049:ptlrpc_import_delay_req()) Skipped 8 previous similar messages
2014-03-30 10:21:06 LustreError: 18813:0:(ldlm_lockd.c:736:ldlm_handle_ast_error()) ### client (nid 192.168.121.90@o2ib2) returned 0 from blocking AST ns: filter-
2014-03-30 10:21:06 LustreError: 18813:0:(ldlm_lockd.c:736:ldlm_handle_ast_error()) Skipped 1 previous similar message
2014-03-30 10:21:07 LustreError: 18937:0:(ldlm_lib.c:2734:target_bulk_io()) @@@ bulk GET failed: rc -107 req@ffff8801097bbc00 x1463677594325660/t0(0) o4->e5fffb3
2014-03-30 10:21:07 Lustre: lc2-OST0017: Bulk IO write error with e5fffb36-4dc9-0a2e-f74b-66de9283e46f (at 192.168.121.90@o2ib2), client will retry: rc -107
2014-03-30 10:21:07 Lustre: Skipped 1 previous similar message
2014-03-30 10:21:07 LustreError: 18937:0:(ldlm_lib.c:2734:target_bulk_io()) Skipped 1 previous similar message
2014-03-30 10:21:09 LustreError: 5121:0:(ldlm_lib.c:2734:target_bulk_io()) @@@ bulk GET failed: rc -107 req@ffff880053a75c00 x1463677594328176/t0(0) o4->e5fffb36
2014-03-30 10:21:09 Lustre: lc2-OST0017: Bulk IO write error with e5fffb36-4dc9-0a2e-f74b-66de9283e46f (at 192.168.121.90@o2ib2), client will retry: rc -107
2014-03-30 10:21:09 Lustre: Skipped 1 previous similar message
2014-03-30 10:21:13 LustreError: 18877:0:(ldlm_lib.c:2734:target_bulk_io()) @@@ bulk GET failed: rc -107 req@ffff8800341cdc00 x1463677594328180/t0(0) o4->e5fffb3
2014-03-30 10:21:13 Lustre: lc2-OST0017: Bulk IO write error with e5fffb36-4dc9-0a2e-f74b-66de9283e46f (at 192.168.121.90@o2ib2), client will retry: rc -107
2014-03-30 10:21:13 LustreError: 18877:0:(ldlm_lib.c:2734:target_bulk_io()) Skipped 3 previous similar messages
2014-03-30 10:21:17 LustreError: 18964:0:(ldlm_lib.c:2734:target_bulk_io()) @@@ bulk GET failed: rc -107 req@ffff88006e5a7400 x1463677594328172/t0(0) o4->e5fffb3
2014-03-30 10:21:17 Lustre: lc2-OST0017: Bulk IO write error with e5fffb36-4dc9-0a2e-f74b-66de9283e46f (at 192.168.121.90@o2ib2), client will retry: rc -107
2014-03-30 10:21:17 Lustre: Skipped 3 previous similar messages
2014-03-30 10:21:17 LustreError: 18829:0:(ost_handler.c:1909:ost_prolong_lock_one()) ASSERTION( lock->l_export == opd->opd_exp ) failed:
2014-03-30 10:21:17 LustreError: 18834:0:(ost_handler.c:1909:ost_prolong_lock_one()) ASSERTION( lock->l_export == opd->opd_exp ) failed:
2014-03-30 10:21:17 LustreError: 18834:0:(ost_handler.c:1909:ost_prolong_lock_one()) LBUG
2014-03-30 10:21:17 Pid: 18834, comm: ll_ost_io00_019
2014-03-30 10:21:17
2014-03-30 10:21:17 Call Trace:
2014-03-30 10:21:17 [<ffffffffa032d8f5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
2014-03-30 10:21:17 [<ffffffffa032def7>] lbug_with_loc+0x47/0xb0 [libcfs]
2014-03-30 10:21:17 [<ffffffffa0f473b7>] ost_prolong_lock_one+0xe7/0x170 [ost]
2014-03-30 10:21:17 [<ffffffffa07bf579>] ? __ldlm_handle2lock+0x39/0x320 [ptlrpc]
2014-03-30 10:21:17 [<ffffffffa0f474dc>] ost_prolong_locks+0x9c/0x340 [ost]
2014-03-30 10:21:17 [<ffffffffa0f4caab>] ost_rw_hpreq_check+0x25b/0x500 [ost]
2014-03-30 10:21:17 [<ffffffffa080d620>] ? lustre_swab_niobuf_remote+0x0/0x30 [ptlrpc]
2014-03-30 10:21:17 [<ffffffffa081cb53>] ptlrpc_main+0x1113/0x1700 [ptlrpc]
2014-03-30 10:21:17 [<ffffffffa081ba40>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
2014-03-30 10:21:17 [<ffffffff8100c10a>] child_rip+0xa/0x20
2014-03-30 10:21:17 [<ffffffffa081ba40>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
2014-03-30 10:21:17 [<ffffffffa081ba40>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
2014-03-30 10:21:17 [<ffffffff8100c100>] ? child_rip+0x0/0x20
It looks like this has been seen in the past by multiple people in LU-2232, but that ticket was closed without a resolution.
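When triaging a long console log like the one in this description, it can help to pull out only the ASSERTION and LBUG entries along with the source location that raised them. The following is a minimal sketch, not a Lustre tool: the regular expression is inferred solely from the LustreError lines shown in this ticket, and the names `LUSTRE_ERROR` and `find_fatal` are hypothetical.

```python
import re

# Assumed layout, inferred only from the lines in this ticket:
# "<date> <time> LustreError: <pid>:<n>:(<file>:<line>:<func>()) <message>"
LUSTRE_ERROR = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"(?P<msg>.*)$"
)


def find_fatal(lines):
    """Return parsed LustreError entries whose message is an ASSERTION or LBUG."""
    hits = []
    for raw in lines:
        m = LUSTRE_ERROR.match(raw.strip())
        if m and (m["msg"].startswith("ASSERTION") or m["msg"] == "LBUG"):
            hits.append(m.groupdict())
    return hits


# Three lines copied from the log in this ticket.
sample = [
    "2014-03-30 10:21:17 Lustre: Skipped 3 previous similar messages",
    "2014-03-30 10:21:17 LustreError: 18834:0:(ost_handler.c:1909:ost_prolong_lock_one()) "
    "ASSERTION( lock->l_export == opd->opd_exp ) failed:",
    "2014-03-30 10:21:17 LustreError: 18834:0:(ost_handler.c:1909:ost_prolong_lock_one()) LBUG",
]

for hit in find_fatal(sample):
    print(hit["pid"], hit["file"], hit["line"], hit["func"], hit["msg"])
```

Grouping the hits by `file`, `line`, and `func` makes it easier to compare against other reports of the same assertion, since line numbers differ between Lustre versions while the function name stays the same.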
Issue Links
- duplicates
- LU-2232 LustreError: 9120:0:(ost_handler.c:1673:ost_prolong_lock_one()) LBUG
- Resolved