Details
Type: Bug
Resolution: Fixed
Priority: Major
Description
When reconnecting an idle import (in ptlrpc_request_alloc_internal and ptlrpc_disconnect_idle_interpret), we set the import state to IMP_NEW and release the import lock, then immediately call ptlrpc_connect_import, which takes the import lock again.
However, this leaves a small window, between releasing the lock and retaking it, in which other threads can see the import in the IMP_NEW state.
Requests sent in this window can fail, like this:
ptlrpc_import_delay_req()) @@@ Uninitialized import. req@[...]x1638589897143744/t0(0) o101->[....]-OST0000-osc-[...]@o2ib:28/4 lens 328/400 e 0 to 0 dl 0 ref 2 fl Rpc:/0/ffffffff rc 0/-1
The error comes from this check in ptlrpc_import_delay_req:
	} else if (imp->imp_state == LUSTRE_IMP_NEW) {
		DEBUG_REQ(D_ERROR, req, "Uninitialized import.");
		*status = -EIO;
The solution is to keep holding the import lock from the state change through the connect, so this window never exists.
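
The window and the fix can be sketched with a small single-threaded model. This is not the Lustre patch: struct import, imp_lock, imp_unlock, delay_req_check, connect_locked, reconnect_racy and reconnect_fixed are all hypothetical names, the lock is modeled as a plain flag, and the "observer" callback stands in for a concurrent sender that happens to run at the earliest point where the lock is free.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the import states. */
enum imp_state { IMP_IDLE, IMP_NEW, IMP_CONNECTING };

struct import {
	bool locked;            /* models the import lock as a flag */
	enum imp_state state;
};

static void imp_lock(struct import *imp)
{
	assert(!imp->locked);   /* a real lock would block here instead */
	imp->locked = true;
}

static void imp_unlock(struct import *imp)
{
	assert(imp->locked);
	imp->locked = false;
}

/* Models a sender checking the import: IMP_NEW is treated as an error. */
static int delay_req_check(struct import *imp)
{
	int rc;

	imp_lock(imp);
	rc = (imp->state == IMP_NEW) ? -EIO : 0;
	imp_unlock(imp);
	return rc;
}

/* Models the connect step; the caller must already hold the lock. */
static void connect_locked(struct import *imp)
{
	assert(imp->locked);
	imp->state = IMP_CONNECTING;
}

typedef int (*observer_t)(struct import *);

/* Racy ordering: the lock is dropped between the state change and connect. */
static int reconnect_racy(struct import *imp, observer_t observer)
{
	int seen;

	imp_lock(imp);
	imp->state = IMP_NEW;
	imp_unlock(imp);

	seen = observer(imp);   /* the window: IMP_NEW is visible here */

	imp_lock(imp);
	connect_locked(imp);
	imp_unlock(imp);
	return seen;
}

/* Fixed ordering: the lock is held across the state change and connect. */
static int reconnect_fixed(struct import *imp, observer_t observer)
{
	imp_lock(imp);
	imp->state = IMP_NEW;
	connect_locked(imp);
	imp_unlock(imp);

	/* Earliest point at which a sender could take the lock. */
	return observer(imp);
}
```

In this model, calling reconnect_racy with delay_req_check as the observer returns -EIO, while reconnect_fixed returns 0, because the sender can only take the lock after the import has already left IMP_NEW.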