Details
Type: Bug
Resolution: Fixed
Priority: Major
Description
New clients were unable to establish a connection to the MDT because an llog context had not been set up properly, and this persisted even after recovery had been aborted. The clients were permanently getting -11 (-EAGAIN) errors from the server:
(service.c:2298:ptlrpc_server_handle_request()) Handling RPC req@ffff8cdd37ad0d80 pname:cluuid+ref:pid:xid:nid:opc:job mdt09_001:0+-99:4093:x1719493089340032:12345-10.16.172.159@tcp:38:
(service.c:2303:ptlrpc_server_handle_request()) got req 1719493089340032
(tgt_handler.c:736:tgt_request_handle()) Process entered
(ldlm_lib.c:1100:target_handle_connect()) Process entered
(ldlm_lib.c:1360:target_handle_connect()) lfs02-MDT0003: connection from 16778a5c-5128-4231-8b45-426adc7e94b6@10.16.172.159@tcp t55835524055 exp (null) cur 51537 last 0
(obd_class.h:831:obd_connect()) Process entered
(mdt_handler.c:6671:mdt_obd_connect()) Process entered
(lod_dev.c:2136:lod_obd_get_info()) lfs02-MDT0003-mdtlov: lfs02-MDT0001-osp-MDT0003 is not ready.
(lod_dev.c:2145:lod_obd_get_info()) Process leaving (rc=18446744073709551605 : -11 : fffffffffffffff5)
(ldlm_lib.c:1446:target_handle_connect()) Process leaving via out (rc=18446744073709551605 : -11 : 0xfffffffffffffff5)
(service.c:2347:ptlrpc_server_handle_request()) Handled RPC req@ffff8cdd37ad0d80 pname:cluuid+ref:pid:xid:nid:opc:job mdt09_001:0+-99:4093:x1719493089340032:12345-10.16.172.159@tcp:38: Request processed in 86us (124us total) trans 0 rc -11/-11
This corresponds to the following block of code in lod_obd_get_info(), where the message being printed is the second "is not ready" one, because ctxt->loc_handle is missing:
lod_foreach_mdt(d, tgt) {
        struct llog_ctxt *ctxt;

        if (!tgt->ltd_active)
                continue;

        ctxt = llog_get_context(tgt->ltd_tgt->dd_lu_dev.ld_obd,
                                LLOG_UPDATELOG_ORIG_CTXT);
        if (!ctxt) {
                CDEBUG(D_INFO, "%s: %s is not ready.\n",
                       obd->obd_name,
                       tgt->ltd_tgt->dd_lu_dev.ld_obd->obd_name);
                rc = -EAGAIN;
                break;
        }
        if (!ctxt->loc_handle) {
                CDEBUG(D_INFO, "%s: %s is not ready.\n",
                       obd->obd_name,
                       tgt->ltd_tgt->dd_lu_dev.ld_obd->obd_name);
                rc = -EAGAIN;
                llog_ctxt_put(ctxt);
                break;
        }
        llog_ctxt_put(ctxt);
}
It would be useful to distinguish those two messages more clearly, e.g. "ctxt is not ready" and "handle is not ready", as minor differences in line numbers would make it difficult to distinguish them in the logs.
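To illustrate the suggestion, here is a minimal standalone sketch of the same check with two distinct messages. It is not Lustre code: struct mock_ctxt, struct mock_tgt, and mock_check_targets() are hypothetical stand-ins for the llog context, the LOD target, and the loop in lod_obd_get_info(), and the message is written to a buffer instead of going through CDEBUG(). The wording "ctxt is not ready" and "handle is not ready" is taken from the proposal above; whatever wording an actual patch uses may differ.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct llog_ctxt; not a Lustre type. */
struct mock_ctxt {
	void *loc_handle;	/* NULL models a context whose handle was never set up */
};

/* Hypothetical stand-in for a LOD target; not a Lustre type. */
struct mock_tgt {
	const char *name;
	int active;
	struct mock_ctxt *ctxt;	/* NULL models llog_get_context() returning NULL */
};

/*
 * Mirror the control flow of the original loop: skip inactive targets,
 * stop at the first target that is not ready, and return -EAGAIN.
 * The only difference is that each failure case gets its own message.
 */
static int mock_check_targets(const char *obd_name,
			      const struct mock_tgt *tgts, size_t count,
			      char *msg, size_t msg_len)
{
	size_t i;

	if (msg_len > 0)
		msg[0] = '\0';

	for (i = 0; i < count; i++) {
		const struct mock_tgt *tgt = &tgts[i];

		if (!tgt->active)
			continue;

		if (!tgt->ctxt) {
			/* First case: no llog context at all. */
			snprintf(msg, msg_len, "%s: %s ctxt is not ready.",
				 obd_name, tgt->name);
			return -EAGAIN;
		}

		if (!tgt->ctxt->loc_handle) {
			/* Second case: context exists but has no handle. */
			snprintf(msg, msg_len, "%s: %s handle is not ready.",
				 obd_name, tgt->name);
			return -EAGAIN;
		}
	}

	return 0;
}
```

With messages like these, the log line in the trace above would identify the missing loc_handle directly, without needing to match lod_dev.c line numbers against a particular source version.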
The root problem is that the MDT0003-MDT0001 connection was not completely set up because of abort_recovery_mdt (itself needed because of a different recovery error, LU-15761), and the MDS never retries establishing this connection, leaving the filesystem permanently unusable for new clients. Running "lctl --device NN recover" reconnected the import, but did not actually re-establish the llog context. Mounting with "-o abort_recov_mdt" resulted in the problem moving to MDT0000 (only the first bad llog context is printed before breaking out of the loop).
I think there are two issues to be addressed here:
1) The MDS should try to reconnect and rebuild the llog connection in this case, at least on "recover" if not automatically. There didn't appear to be any permanent reason why these llog connections were not working, just fallout from abort_recovery_mdt.
2) Is it strictly necessary to block client mounting if not all MDT-MDT connections are established? Or is this no different than any other case where the MDT loses a connection after it is mounted? The MDT recovery had already been aborted, so allowing new clients to connect shouldn't cause any issues. Maybe this issue would be moot if (1) was fixed, but blocking new clients otherwise seems counterproductive. The filesystem was apparently fully functional for clients that had mounted before the MDT recovery (both MDT0003/MDT0001 and MDT0001/MDT0003 remote directory creation worked fine).