[LU-12559] Reconnecting from idle exposes import in IMP_NEW state, resulting in EIO Created: 16/Jul/19 Updated: 04/Oct/19 Resolved: 17/Sep/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.13.0, Lustre 2.12.3 |
| Type: | Bug | Priority: | Major |
| Reporter: | Patrick Farrell (Inactive) | Assignee: | Patrick Farrell (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Description |
|
When reconnecting idle imports (in ptlrpc_request_alloc_internal and ptlrpc_disconnect_idle_interpret), we set the import state to IMP_NEW and then release the import lock before immediately calling ptlrpc_connect_import, which takes the import lock again right away. This leaves a small gap in which an import in the IMP_NEW state is exposed. Requests sent in this interval can fail, like this:

ptlrpc_import_delay_req()) @@@ Uninitialized import. req@[...]x1638589897143744/t0(0) o101->[....]-OST0000-osc-[...]@o2ib:28/4 lens 328/400 e 0 to 0 dl 0 ref 2 fl Rpc:/0/ffffffff rc 0/-1

The error comes from this check in ptlrpc_import_delay_req():

    } else if (imp->imp_state == LUSTRE_IMP_NEW) {
            DEBUG_REQ(D_ERROR, req, "Uninitialized import.");
            *status = -EIO;

The solution is to not release the import lock, so this gap does not exist. |
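
The race described above can be illustrated with a small, single-threaded model. This is only a sketch, not the Lustre code or the merged patch: struct import, sender_status(), connect_import() and reconnect_from_idle() are hypothetical stand-ins, the lock is modelled as a plain flag, and it is assumed that a sender inspects the import state under the same lock and that connecting moves the import out of the NEW state. The "racing sender" is simulated at the worst possible moment for each variant.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the import states; not the real Lustre enum. */
enum imp_state { IMP_IDLE, IMP_NEW, IMP_CONNECTING };

struct import {
	enum imp_state state;
	bool locked;		/* models the import lock being held */
};

/* Status a sender gets when it inspects the import (models the
 * "Uninitialized import" check that returns -EIO). */
static int sender_status(const struct import *imp)
{
	return imp->state == IMP_NEW ? -EIO : 0;
}

/* Models the connect step: runs with the lock held, moves the import
 * out of IMP_NEW, then drops the lock. */
static void connect_import(struct import *imp, bool lock_already_held)
{
	if (!lock_already_held)
		imp->locked = true;	/* re-take the lock */
	assert(imp->locked);
	imp->state = IMP_CONNECTING;
	imp->locked = false;
}

/* Returns the status seen by a sender racing with the reconnect.
 * keep_lock == false: old behaviour, lock dropped before connecting.
 * keep_lock == true:  lock held across the state change and connect. */
static int reconnect_from_idle(struct import *imp, bool keep_lock)
{
	int racing_status;

	imp->locked = true;
	imp->state = IMP_NEW;

	if (!keep_lock) {
		imp->locked = false;
		/* Gap: the sender can take the lock here and sees IMP_NEW. */
		racing_status = sender_status(imp);
		connect_import(imp, false);
	} else {
		connect_import(imp, true);
		/* The sender had to wait for the lock, so it only sees the
		 * state after the connect step has run. */
		racing_status = sender_status(imp);
	}
	return racing_status;
}
```

Calling reconnect_from_idle() with keep_lock set to false yields -EIO for the racing sender, while keep_lock set to true yields 0, because the IMP_NEW state is never visible outside the lock.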
| Comments |
| Comment by Gerrit Updater [ 16/Jul/19 ] |
|
Patrick Farrell (pfarrell@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/35530 |
| Comment by Gerrit Updater [ 17/Sep/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35530/ |
| Comment by Peter Jones [ 17/Sep/19 ] |
|
Landed for 2.13 |
| Comment by Gerrit Updater [ 17/Sep/19 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36215 |
| Comment by Gerrit Updater [ 04/Oct/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36215/ |