Details
- Bug
- Resolution: Fixed
- Minor
- None
- Lustre 2.12.8
- lustre-2.12.8_6.llnl
  3.10.0-1160.53.1.1chaos.ch6.x86_64
  RHEL7.9
  zfs-0.7.11-9.8llnl
- 3
- 9223372036854775807
Description
We upgraded a lustre server cluster from lustre-2.12.7_2.llnl to lustre-2.12.8_6.llnl.
The node on which the MGS runs, copper1, began reporting "new MDS connections" from NIDs that are assigned to client nodes:
Lustre: MGS: Received new MDS connection from 192.168.128.68@o2ib38, keep former export from same NID
Lustre: MGS: Received new MDS connection from 192.168.128.8@o2ib42, keep former export from same NID
Lustre: MGS: Received new MDS connection from 192.168.131.78@o2ib39, keep former export from same NID
Lustre: MGS: Received new MDS connection from 192.168.132.204@o2ib39, keep former export from same NID
Lustre: MGS: Received new MDS connection from 192.168.134.127@o2ib27, keep former export from same NID
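For triage, the NIDs in these console messages can be pulled out and grouped by LNet network, so they can be checked against the list of known client NIDs. The following is a minimal sketch, not part of the original report; the helper names `extract_nids` and `group_by_network` are hypothetical, and the regular expression assumes the message wording shown above.

```python
import re
from collections import defaultdict

# Matches the MGS console message quoted in this ticket and captures the NID.
# Assumption: the wording "Received new MDS connection from <NID>," is stable.
MDS_CONN_RE = re.compile(r"Received new MDS connection from ([^,\s]+),")


def extract_nids(log_text):
    """Return every NID named in a 'new MDS connection' message, in order."""
    return MDS_CONN_RE.findall(log_text)


def group_by_network(nids):
    """Group NID addresses by their LNet network (the part after '@')."""
    groups = defaultdict(list)
    for nid in nids:
        addr, _, net = nid.partition("@")
        groups[net].append(addr)
    return dict(groups)


if __name__ == "__main__":
    sample = (
        "Lustre: MGS: Received new MDS connection from 192.168.128.68@o2ib38, "
        "keep former export from same NID\n"
        "Lustre: MGS: Received new MDS connection from 192.168.131.78@o2ib39, "
        "keep former export from same NID\n"
    )
    for net, addrs in group_by_network(extract_nids(sample)).items():
        print(net, addrs)
```

Because the pattern does not depend on line breaks, it also works on console output where several messages have been joined onto one line.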
The clients' connect flags include "mds_mds_connection":
[root@quartz7:lustre]# head */*/connect_flags
==> mgc/MGC172.19.3.1@o2ib600/connect_flags <==
flags=0x2000011005002020
flags2=0x0
version
barrier
adaptive_timeouts
mds_mds_connection
full20
imp_recov
bulk_mbits
The clients are running lustre-2.12.7_2.llnl, which does not include the patch "LU-13356 client: don't use OBD_CONNECT_MNE_SWAB".
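The hex value in connect_flags can be decoded bit by bit to confirm which bit produces the "mds_mds_connection" name. The sketch below is illustrative only. The bit-to-name table is an assumption, chosen so that the seven set bits of 0x2000011005002020 line up with the seven names printed above; it is not copied from the Lustre headers and covers only those flags. The function name `decode_connect_flags` is hypothetical.

```python
# Assumed bit-to-name mapping, limited to the flags printed in this ticket.
# It should be checked against the OBD_CONNECT_* definitions in the Lustre
# source before being relied on.
FLAG_NAMES = {
    0x20: "version",
    0x2000: "barrier",
    0x1000000: "adaptive_timeouts",
    0x4000000: "mds_mds_connection",
    0x1000000000: "full20",
    0x10000000000: "imp_recov",
    0x2000000000000000: "bulk_mbits",
}


def decode_connect_flags(flags):
    """Split a connect-flags word into known flag names and leftover bits.

    Returns (names, unknown), where names is ordered from the lowest set
    bit to the highest and unknown is a mask of set bits that are not in
    FLAG_NAMES.
    """
    names = []
    unknown = 0
    bit = 1
    while bit <= flags:
        if flags & bit:
            if bit in FLAG_NAMES:
                names.append(FLAG_NAMES[bit])
            else:
                unknown |= bit
        bit <<= 1
    return names, unknown


if __name__ == "__main__":
    names, unknown = decode_connect_flags(0x2000011005002020)
    print(names)
    print(hex(unknown))
```

Under this mapping the client's MGC import sets bit 0x4000000. If, as the LU-13356 patch title suggests, older clients set OBD_CONNECT_MNE_SWAB on that same bit, the MGS would read it as an MDS-to-MDS connection, which would be consistent with the messages above.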
Shutting down the servers and restoring them to lustre-2.12.7_2.llnl did not change the symptoms.
Patch stacks are:
https://github.com/LLNL/lustre/releases/tag/2.12.8_6.llnl
https://github.com/LLNL/lustre/releases/tag/2.12.7_2.llnl
This was seen during the same Lustre server update in which we saw LU-15541, but it appears to be a separate issue.
Issue Links
- is related to LU-13356 lctl conf_param hung on the MGS node (Resolved)