Details
Bug
Resolution: Unresolved
Minor
None
Lustre 2.4.0
3
5555
Description
After reboot of an OSS, we see the following on the console:
2012-11-12 16:25:54 Lustre: Lustre: Build Version: 2.3.54-4chaos-3surya1-3surya1--PRISTINE-2.6.32-220.23.1.2chaos.ch5.x86_64
2012-11-12 16:25:55 LustreError: 6447:0:(mgc_request.c:248:do_config_log_add()) failed processing sptlrpc log: -2
2012-11-12 16:25:55 LustreError: 11-0: lstest-MDT0000-osp-OST0193: Communicating with 172.20.5.2@o2ib500, operation mds_connect failed with -11
2012-11-12 16:25:55 Lustre: lstest-OST0193: Will be in recovery for at least 5:00, or until 275 clients reconnect.
2012-11-12 16:25:56 LustreError: 6528:0:(ldlm_lockd.c:824:ldlm_server_blocking_ast()) ### BUG 6063: lock collide during recovery ns: filter-ffff8807fef4c000 lock: ffff880ff005dcc0/0xbdf5847332a7d090 lrc: 3/0,0 mode: PW/PW res: 10773567/0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x80010020 nid: 172.20.4.149@o2ib500 remote: 0x9c0890bf799c59d0 expref: 4 pid: 6531 timeout 0
2012-11-12 16:25:56 LustreError: 137-5: UUID 'lstest-OST0194_UUID' is not available for connect (no target)
2012-11-12 16:26:02 LustreError: 137-5: UUID 'lstest-OST0194_UUID' is not available for connect (no target)
2012-11-12 16:26:06 LustreError: 137-5: UUID 'lstest-OST0194_UUID' is not available for connect (no target)
2012-11-12 16:26:20 LustreError: 137-5: UUID 'lstest-OST0194_UUID' is not available for connect (no target)
2012-11-12 16:26:20 LustreError: 11-0: lstest-MDT0000-osp-OST0193: Communicating with 172.20.5.2@o2ib500, operation mds_connect failed with -11
2012-11-12 16:26:31 LustreError: 137-5: UUID 'lstest-OST0194_UUID' is not available for connect (no target)
2012-11-12 16:26:45 LustreError: 11-0: lstest-MDT0000-osp-OST0193: Communicating with 172.20.5.2@o2ib500, operation mds_connect failed with -11
2012-11-12 16:26:51 LustreError: 137-5: UUID 'lstest-OST0194_UUID' is not available for connect (no target)
2012-11-12 16:27:01 LustreError: 137-5: UUID 'lstest-OST0194_UUID' is not available for connect (no target)
2012-11-12 16:27:01 LustreError: Skipped 2 previous similar messages
2012-11-12 16:27:10 LustreError: 11-0: lstest-MDT0000-osp-OST0193: Communicating with 172.20.5.2@o2ib500, operation mds_connect failed with -11
2012-11-12 16:27:35 LustreError: 11-0: lstest-MDT0000-osp-OST0193: Communicating with 172.20.5.2@o2ib500, operation mds_connect failed with -11
2012-11-12 16:27:51 Lustre: lstest-OST0193: Client 7877635e-a0c6-353f-51e9-47e6f0ef5fb2 (at 172.20.17.2@o2ib500) reconnecting, waiting for 275 clients in recovery for 3:04
2012-11-12 16:27:51 Lustre: lstest-OST0193: Client 7877635e-a0c6-353f-51e9-47e6f0ef5fb2 (at 172.20.17.2@o2ib500) refused reconnection, still busy with 1 active RPCs
2012-11-12 16:27:54 Lustre: lstest-OST0193: Client df4b9103-f4bf-8082-3f31-a1512a4dda76 (at 172.20.17.7@o2ib500) reconnecting, waiting for 275 clients in recovery for 3:01
2012-11-12 16:27:54 Lustre: lstest-OST0193: Client df4b9103-f4bf-8082-3f31-a1512a4dda76 (at 172.20.17.7@o2ib500) refused reconnection, still busy with 1 active RPCs
2012-11-12 16:27:57 Lustre: lstest-OST0193: Client 4028a636-dc0d-66a7-557b-f4d960ae30a7 (at 172.20.17.9@o2ib500) reconnecting, waiting for 275 clients in recovery for 2:58
2012-11-12 16:27:57 Lustre: lstest-OST0193: Client 4028a636-dc0d-66a7-557b-f4d960ae30a7 (at 172.20.17.9@o2ib500) refused reconnection, still busy with 1 active RPCs
2012-11-12 16:27:58 Lustre: lstest-OST0193: Client 6b5359f5-fbbd-23ea-2c3e-9f96a635e074 (at 172.20.17.12@o2ib500) reconnecting, waiting for 275 clients in recovery for 2:57
2012-11-12 16:27:58 Lustre: lstest-OST0193: Client 6b5359f5-fbbd-23ea-2c3e-9f96a635e074 (at 172.20.17.12@o2ib500) refused reconnection, still busy with 1 active RPCs
2012-11-12 16:28:00 LustreError: 11-0: lstest-MDT0000-osp-OST0193: Communicating with 172.20.5.2@o2ib500, operation mds_connect failed with -11
These messages continue for a while, and recovery does not go well at all. See the attached oss_grove403_console.txt file for more console log output.
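For anyone triaging the attached log, a rough tally of the repeating messages may make the pattern easier to see. The following is only a sketch: the function names and the three category labels are made up for illustration, and it assumes every console entry begins with a `YYYY-MM-DD HH:MM:SS` timestamp, as in the excerpt above.

```python
import re
from collections import Counter

# Matches the timestamp that starts each console entry, e.g. "2012-11-12 16:25:54".
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def split_entries(text):
    """Split console text into entries, one per leading timestamp.

    Line breaks inside an entry (e.g. a wrapped "Skipped 2 previous
    similar messages") are collapsed to single spaces.
    """
    starts = [m.start() for m in TIMESTAMP.finditer(text)]
    bounds = starts + [len(text)]
    return [" ".join(text[a:b].split()) for a, b in zip(bounds, bounds[1:])]


def summarize(text):
    """Count three message kinds seen in the excerpt (labels are hypothetical)."""
    counts = Counter()
    for entry in split_entries(text):
        if "operation mds_connect failed with -11" in entry:
            counts["mds_connect_failed"] += 1
        elif "is not available for connect" in entry:
            counts["target_not_available"] += 1
        elif "refused reconnection" in entry:
            counts["refused_reconnection"] += 1
    return counts
```

Calling `summarize(open("oss_grove403_console.txt").read())` on the attachment would give per-category counts. Entries suppressed by "Skipped N previous similar messages" are not added back in, so the totals are lower bounds.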