Details

Type: Bug
Resolution: Fixed
Priority: Minor
Fix Version/s: None
Affects Version/s: Lustre 2.12.8
Labels:
- llnl
Environment:
lustre-2.12.8_6.llnl
3.10.0-1160.53.1.1chaos.ch6.x86_64
RHEL7.9
zfs-0.7.11-9.8llnl

Severity:
3

Description

We upgraded a lustre server cluster from lustre-2.12.7_2.llnl to lustre-2.12.8_6.llnl.

The node on which the MGS runs, copper1, began reporting "new MDS connections" from NIDs that are assigned to client nodes:

Lustre: MGS: Received new MDS connection from 192.168.128.68@o2ib38, keep former export from same NID
Lustre: MGS: Received new MDS connection from 192.168.128.8@o2ib42, keep former export from same NID
Lustre: MGS: Received new MDS connection from 192.168.131.78@o2ib39, keep former export from same NID
Lustre: MGS: Received new MDS connection from 192.168.132.204@o2ib39, keep former export from same NID
Lustre: MGS: Received new MDS connection from 192.168.134.127@o2ib27, keep former export from same NID
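For triage, the reporting NIDs can be pulled out of the console log and grouped by LNet network, which makes it easier to check them against the client NID assignments. The following is a minimal sketch, not part of the original report. The function name `nids_by_network` and the `SAMPLE` variable are hypothetical, and the regular expression assumes the message format shown in the five lines above.

```python
import re
from collections import defaultdict

# Matches the MGS console message quoted in this ticket.
# Assumption: the address is a dotted IPv4 address followed by "@<network>".
MDS_CONN_RE = re.compile(
    r"MGS: Received new MDS connection from (?P<addr>[\d.]+)@(?P<net>\w+),"
)


def nids_by_network(log_text):
    """Return {network: [address, ...]} for every matching log line."""
    groups = defaultdict(list)
    for match in MDS_CONN_RE.finditer(log_text):
        groups[match.group("net")].append(match.group("addr"))
    return dict(groups)


# The five messages from the description, used as sample input.
SAMPLE = """\
Lustre: MGS: Received new MDS connection from 192.168.128.68@o2ib38, keep former export from same NID
Lustre: MGS: Received new MDS connection from 192.168.128.8@o2ib42, keep former export from same NID
Lustre: MGS: Received new MDS connection from 192.168.131.78@o2ib39, keep former export from same NID
Lustre: MGS: Received new MDS connection from 192.168.132.204@o2ib39, keep former export from same NID
Lustre: MGS: Received new MDS connection from 192.168.134.127@o2ib27, keep former export from same NID
"""

if __name__ == "__main__":
    for net, addrs in sorted(nids_by_network(SAMPLE).items()):
        print(net, addrs)
```

On a live system the same function could be fed the output of `dmesg` or the syslog file instead of `SAMPLE`.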

The clients' connect flags include "mds_mds_connection":

[root@quartz7:lustre]# head */*/connect_flags
==> mgc/MGC172.19.3.1@o2ib600/connect_flags <==
flags=0x2000011005002020
flags2=0x0
version
barrier
adaptive_timeouts
mds_mds_connection
full20
imp_recov
bulk_mbits
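The `flags` word above can be checked by hand against the names printed below it. The following is a minimal sketch of that decoding, not part of the original report. The bit values are an assumption based on the `OBD_CONNECT_*` definitions in Lustre's `lustre_idl.h` and should be verified against the tree in use; they do sum to the reported `0x2000011005002020`. The names `CONNECT_FLAGS` and `decode_connect_flags` are hypothetical.

```python
# Bit values for the seven flag names shown in the connect_flags output.
# Assumption: these follow the OBD_CONNECT_* definitions in lustre_idl.h;
# only the flags that appear in this ticket are listed.
CONNECT_FLAGS = {
    0x20: "version",
    0x2000: "barrier",
    0x1000000: "adaptive_timeouts",
    0x4000000: "mds_mds_connection",
    0x1000000000: "full20",
    0x10000000000: "imp_recov",
    0x2000000000000000: "bulk_mbits",
}


def decode_connect_flags(value):
    """Return (names, leftover) where leftover holds any bits not in the table."""
    names = []
    leftover = value
    for bit, name in sorted(CONNECT_FLAGS.items()):
        if value & bit:
            names.append(name)
            leftover &= ~bit
    return names, leftover


if __name__ == "__main__":
    names, leftover = decode_connect_flags(0x2000011005002020)
    print("\n".join(names))
    print("unrecognized bits: %#x" % leftover)
```

A nonzero leftover would mean the import negotiated a flag that is missing from this partial table, so the table would need to be extended before drawing conclusions.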

The clients are running lustre-2.12.7_2.llnl, which does not include the patch "LU-13356 client: don't use OBD_CONNECT_MNE_SWAB".

Shutting down the servers and restoring them to lustre-2.12.7_2.llnl did not change the symptoms.

Patch stacks are:
https://github.com/LLNL/lustre/releases/tag/2.12.8_6.llnl
https://github.com/LLNL/lustre/releases/tag/2.12.7_2.llnl

This was seen during the same Lustre server update in which we saw LU-15541, but it appears to be a separate issue.

Issue Links

is related to

LU-13356 lctl conf_param hung on the MGS node (Resolved)

People

Assignee: Lai Siyao

Reporter: Olaf Faaland

Dates

Created: 09/Feb/22 2:08 AM

Updated: 01/Apr/22 10:50 PM

Resolved: 21/Mar/22 8:05 PM