Details
- Bug
- Resolution: Fixed
- Minor
- Lustre 2.1.0
- None
- 3
- 4922
Description
This issue was created by maloo for sarah <sarah@whamcloud.com>
This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/4f87c578-77b4-11e0-9b1b-52540025f9af.
03:33:24:Lustre: DEBUG MARKER: == recovery-small test 28: handle error adding new clients (bug 6086) ================================ 20:33:24 (1304652804)
03:33:35:Lustre: 12843:0:(client.c:1775:ptlrpc_expire_one_request()) @@@ Request x1368023838678515 sent from lustre-MDT0000 to NID 192.168.4.23@o2ib has timed out for slow reply: [sent 1304652804] [real_sent 1304652804] [current 1304652815] [deadline 11s] [delay 0s] req@ffff810037e66c00 x1368023838678515/t0(0) o-1->Àx<9a>^V^D<81>ÿÿ^G@NET_0x50000c0a80417_UUID:15/16 lens 296/192 e 0 to 1 dl 1304652815 ref 1 fl Rpc:XN/ffffffff/ffffffff rc 0/-1
03:33:35:LustreError: 138-a: lustre-MDT0000: A client on nid 192.168.4.23@o2ib was evicted due to a lock blocking callback time out: rc -107
03:33:36:LustreError: 12843:0:(mdt_handler.c:2813:mdt_recovery()) operation 41 on unconnected MDS from 12345-192.168.4.23@o2ib
03:33:36:LustreError: 12843:0:(ldlm_lib.c:2118:target_send_reply_msg()) @@@ processing error (107) req@ffff81041baef800 x1368025347789244/t0(0) o-1><?>@<?>:0/0 lens 192/0 e 0 to 0 dl 1304652857 ref 1 fl Interpret:/ffffffff/ffffffff rc -107/-1
03:33:36:LustreError: 12843:0:(obd_support.h:455:obd_fail_check_set()) *** obd_fail_loc=12f ***
03:33:36:LustreError: 12843:0:(mdt_recovery.c:549:mdt_client_new()) no room for 0 clients - fix LR_MAX_CLIENTS
03:33:55:Lustre: Failing over lustre-MDT0000
03:33:55:Lustre: Skipped 8 previous similar messages
03:33:55:Lustre: 21999:0:(quota_master.c:793:close_quota_files()) quota[0] is off already
03:33:55:Lustre: 21999:0:(quota_master.c:793:close_quota_files()) Skipped 1 previous similar message
03:33:55:Lustre: mdd_obd-lustre-MDT0000-0: shutting down for failover; client state will be preserved.
03:33:55:LustreError: 21999:0:(ldlm_request.c:1169:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
03:33:55:LustreError: 21999:0:(ldlm_request.c:1796:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
03:33:55:LustreError: 5682:0:(service.c:2686:ptlrpc_unregister_service()) ASSERTION(service->srv_n_queued_reqs == 0) failed
03:33:55:LustreError: 5682:0:(service.c:2686:ptlrpc_unregister_service()) LBUG
03:33:55:Pid: 5682, comm: obd_zombid
03:33:55:
03:33:55:Call Trace:
03:33:55: [<ffffffff8877c5f1>] libcfs_debug_dumpstack+0x51/0x60 [libcfs]
03:33:55: [<ffffffff8877cb2a>] lbug_with_loc+0x7a/0xd0 [libcfs]
03:33:55: [<ffffffff88787960>] cfs_tracefile_init+0x0/0x10a [libcfs]
03:33:55: [<ffffffff889412e3>] ptlrpc_unregister_service+0x4e3/0xbd0 [ptlrpc]
03:33:55: [<ffffffff8002e244>] __wake_up+0x38/0x4f
03:33:55: [<ffffffff88c390bb>] mgs_cleanup+0xeb/0x220 [mgs]
03:33:55: [<ffffffff8884b2cf>] class_decref+0x43f/0x5b0 [obdclass]
03:33:55: [<ffffffff88c392d6>] mgs_destroy_export+0xe6/0xf0 [mgs]
03:33:55: [<ffffffff88831302>] obd_zombie_impexp_cull+0x402/0x4f0 [obdclass]
03:33:55: [<ffffffff88838847>] obd_zombie_impexp_thread+0x1f7/0x2a0 [obdclass]
03:33:55: [<ffffffff8008cf99>] default_wake_function+0x0/0xe
03:33:55: [<ffffffff8005dfb1>] child_rip+0xa/0x11
03:33:55: [<ffffffff88838650>] obd_zombie_impexp_thread+0x0/0x2a0 [obdclass]
03:33:55: [<ffffffff8005dfa7>] child_rip+0x0/0x11
03:33:55:
03:33:55:Kernel panic - not syncing: LBUG
03:33:55: <7>APIC error on CPU0: 00(04)