Details
Bug
Resolution: Fixed
Blocker
Lustre 2.3.0
None
3
4475
Description
Lustre 2.1 client fails to connect to Lustre 2.2 server
> c0-0c2s6n1 LustreError: 11-0: an error occurred while communicating with 10.149.3.5@o2ib. The mgs_config_read operation failed with -524
> c0-0c2s6n1 LustreError: 4645:0:(mgc_request.c:1917:mgc_process_config()) Cannot process recover llog -524
> c0-0c2s6n1 LustreError: 15c-8: MGC10.149.3.5@o2ib: The configuration from log 'snxs2-client' failed (-524). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
> c0-0c2s6n1 LustreError: 4645:0:(llite_lib.c:983:ll_fill_super()) Unable to process log: -524
The race can be reproduced with the following patch:
diff --git a/lustre/ptlrpc/import.c b/lustre/ptlrpc/import.c
index 2953352..a69e6b9 100644
--- a/lustre/ptlrpc/import.c
+++ b/lustre/ptlrpc/import.c
@@ -805,6 +805,7 @@ static int ptlrpc_connect_interpret(const struct lu_env *env,
         } else {
                 IMPORT_SET_STATE(imp, LUSTRE_IMP_FULL);
                 ptlrpc_activate_import(imp);
+                OBD_FAIL_TIMEOUT(0x5555, 2);
         }
         GOTO(finish, rc = 0);
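For readers unfamiliar with the code path, here is a toy model of the kind of ordering problem the patch widens. It is not Lustre code and may not match the real failure path: `toy_import`, `toy_config_read`, `toy_connect` and `TOY_FLAG_CONFIG_READ` are hypothetical names, and the assumption is that the import is marked FULL before the negotiated connect flags are stored, so a request that runs in between sees a usable import with no feature bits. The only borrowed fact is that 524 is the Linux kernel-internal `ENOTSUPP` value, matching the -524 in the log above. The interleaving is replayed deterministically instead of with real threads.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#ifndef ENOTSUPP
#define ENOTSUPP 524            /* kernel-internal errno, not in userspace errno.h */
#endif

#define TOY_FLAG_CONFIG_READ 0x1ULL   /* hypothetical negotiated feature bit */

enum toy_state { TOY_CONNECTING, TOY_FULL };

struct toy_import {                   /* hypothetical stand-in for an import */
    enum toy_state state;
    uint64_t connect_flags;
};

enum toy_order { TOY_ACTIVATE_THEN_FLAGS, TOY_FLAGS_THEN_ACTIVATE };

/* A request that depends on a feature bit negotiated at connect time. */
static int toy_config_read(const struct toy_import *imp)
{
    assert(imp != NULL);
    if (imp->state != TOY_FULL)
        return -EAGAIN;               /* import not usable yet */
    if (!(imp->connect_flags & TOY_FLAG_CONFIG_READ))
        return -ENOTSUPP;             /* feature looks unsupported */
    return 0;
}

/* Replay one connect with the chosen ordering of the two updates. */
static int toy_connect(enum toy_order order)
{
    struct toy_import imp = { TOY_CONNECTING, 0 };
    int rc;

    if (order == TOY_FLAGS_THEN_ACTIVATE)
        imp.connect_flags = TOY_FLAG_CONFIG_READ;

    imp.state = TOY_FULL;             /* import becomes visible as usable */

    /* This is where the injected delay sits: a waiting request runs now. */
    rc = toy_config_read(&imp);

    if (order == TOY_ACTIVATE_THEN_FLAGS)
        imp.connect_flags = TOY_FLAG_CONFIG_READ;   /* too late for that request */

    return rc;
}
```

In this model, `toy_connect(TOY_ACTIVATE_THEN_FLAGS)` returns -524 while `toy_connect(TOY_FLAGS_THEN_ACTIVATE)` returns 0, which is why storing the flags before the import is activated (or otherwise serializing the two steps) closes the window.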
Attachments
Activity
Fix Version/s | New: Lustre 2.1.4 [ 10158 ] |
Fix Version/s | New: Lustre 2.3.0 [ 10117 ] |
Fix Version/s | New: Lustre 2.4.0 [ 10154 ] |
Resolution | New: Fixed [ 1 ] |
Status | Original: Open [ 1 ] | New: Resolved [ 5 ] |
Summary | Original: 2.1 client fails to mount 2.2 filesystem | New: Race in setting connection flags and using them on 2.x client connect |
Assignee | Original: WC Triage [ wc-triage ] | New: Bob Glossman [ bogl ] |
Affects Version/s | New: Lustre 2.3.0 [ 10117 ] |
Priority | Original: Minor [ 4 ] | New: Blocker [ 1 ] |
Description | Original: same text as the Description above, with the patch inline | New: same text, with the patch wrapped in {noformat} markers |