Details

Type: Improvement
Resolution: Fixed
Priority: Minor
Fix Version/s: Lustre 2.17.0
Affects Version/s: Lustre 2.16.0, Lustre 2.15.2
Labels:
- lad24dd
- medium
- usability
- utils

Description

When the MGS has multiple failover hosts, and each host has multiple NIDs, the mount command line can become excessively long:

# mount -t lustre 172.16.0.26@o2ib,172.16.0.27@o2ib:172.16.0.30@o2ib,172.16.0.31@o2ib:172.16.0.28@o2ib,172.16.0.29@o2ib:172.16.0.24@o2ib,172.16.0.25@o2ib:/testfs /lustre/testfs/client

Not only is this very inconvenient to type and prone to error, but it is also unsightly in "df" and "mount" output because of the very long "device" name. This will become significantly worse once IPv6 NIDs are in use (each address will be about twice as long and harder to parse visually), and the command-line length may even limit the number of NIDs that can be specified.

It is already possible (though uncommonly used, for some reason) to use DNS hostname lookup for the MGS NID in the case of TCP hostnames. The mount.lustre command will do DNS hostname resolution and pass the numeric NIDs to the kernel:

# mount -t lustre mgs-fs1-1a@o2ib,mgs-fs1-1b@o2ib:mgs-fs1-2a@o2ib,mgs-fs1-2b@o2ib:mgs-fs1-3a@o2ib,mgs-fs1-3b@o2ib:mgs-fs1-4a@o2ib,mgs-fs1-4b@o2ib:/testfs /lustre/testfs/client

This is an improvement, since it avoids the need to manually specify all of the IP addresses, but it still requires that all of the MGS hostnames are supplied at mount time. If the network type is not specified, then it defaults to @tcp0, but it is possible to use @o2ib since the hostname will also resolve to an IP address.
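The hostname-to-NID step described above can be sketched in Python. This is an illustration only, not the mount.lustre implementation: the function names (`first_address`, `parse_mgs_spec`, `resolve_nid`, `resolve_device`) are hypothetical, the sketch handles only hostnames and IPv4 addresses, and it assumes `:` separates failover hosts while `,` separates the NIDs of one host, as in the command lines above.

```python
import socket


def first_address(host):
    """Return the first IPv4 address that getaddrinfo() reports for host."""
    infos = socket.getaddrinfo(host, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET the sockaddr is (address, port).
    return infos[0][4][0]


def parse_mgs_spec(device):
    """Split 'nid,nid:nid,nid:/fsname' into (failover_groups, fsname)."""
    mgs_part, sep, fsname = device.rpartition(":/")
    if not sep or not mgs_part or not fsname:
        raise ValueError(f"not an MGS device string: {device!r}")
    # ':' separates failover hosts, ',' separates the NIDs of one host.
    groups = [group.split(",") for group in mgs_part.split(":")]
    return groups, fsname


def resolve_nid(nid, lookup=first_address):
    """Turn 'host@net' into 'address@net'; a missing net defaults to tcp0."""
    host, sep, net = nid.partition("@")
    if not sep:
        net = "tcp0"
    return f"{lookup(host)}@{net}"


def resolve_device(device, lookup=first_address):
    """Resolve every NID in a device string, keeping the group layout."""
    groups, fsname = parse_mgs_spec(device)
    resolved = [",".join(resolve_nid(nid, lookup) for nid in group)
                for group in groups]
    return ":".join(resolved) + ":/" + fsname
```

The `lookup` parameter exists so the parsing can be exercised without DNS, for example by passing the `__getitem__` method of a dictionary that maps test hostnames to addresses.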

A few improvements could be made to shorten this command-line:

allow host or NID ranges to be specified:
```
# mount -t lustre 172.16.0.[24-30/2]@o2ib,172.16.0.[25-31/2]@o2ib:/testfs /lustre/testfs/client
# mount -t lustre mgs-fs1-[1-4]a@o2ib,mgs-fs1-[1-4]b@o2ib:/testfs /lustre/testfs/client
```
This will be significantly less to type than the full NID list, but is somewhat opaque and would still need to be expanded internally by mount.lustre to the full NID list for passing to the kernel. The full list would still be visible in the "mount" and "df" output.
allow round-robin DNS for the MGS interfaces from a single hostname, which would avoid the need to specify multiple MGS hostnames on the command-line:
```
# host mgs-fs1
mgs-fs1 has address 141.193.213.10
mgs-fs1 has address 141.193.213.11
# mount -t lustre mgs-fs1@o2ib:/testfs /lustre/testfs/client
```
This should work for both IPv4 and IPv6 hostname resolution, but requires that mount.lustre and/or libcfs_str2nid() are able to resolve a single name to multiple NIDs.
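Both proposals can be sketched in a few lines of Python. This is only an illustration and says nothing about how mount.lustre or libcfs_str2nid() would implement them: `expand_ranges()` and `resolve_all()` are hypothetical names, the `[start-end/step]` grammar is inferred from the two range examples above, and how NIDs expanded from several comma-separated patterns would be paired into failover groups is left open.

```python
import re
import socket
from itertools import product

# Matches '[start-end]' or '[start-end/step]' with decimal numbers.
_RANGE = re.compile(r"\[(\d+)-(\d+)(?:/(\d+))?\]")


def expand_ranges(pattern):
    """Expand every bracketed range in pattern into a list of strings."""
    # re.split() with three capture groups yields
    # [literal, start, end, step, literal, start, end, step, ..., literal].
    parts = _RANGE.split(pattern)
    literals = parts[0::4]
    choices = []
    for i in range(1, len(parts), 4):
        start, end = int(parts[i]), int(parts[i + 1])
        step = int(parts[i + 2] or 1)  # step is None when '/step' is absent
        if step < 1 or start > end:
            raise ValueError(f"bad range in {pattern!r}")
        # Note: leading zeros in the range bounds are not preserved.
        choices.append([str(n) for n in range(start, end + 1, step)])
    results = []
    for combo in product(*choices):
        pieces = [literals[0]]
        for value, literal in zip(combo, literals[1:]):
            pieces.append(value)
            pieces.append(literal)
        results.append("".join(pieces))
    return results


def resolve_all(nid, getaddrinfo=socket.getaddrinfo):
    """Return one NID per distinct address that the host part resolves to."""
    host, sep, net = nid.partition("@")
    if not sep:
        net = "tcp0"
    nids = []
    # No address family is given, so IPv4 and IPv6 results are both accepted.
    for *_, sockaddr in getaddrinfo(host, None, type=socket.SOCK_STREAM):
        candidate = f"{sockaddr[0]}@{net}"
        if candidate not in nids:
            nids.append(candidate)
    return nids
```

With this grammar, `expand_ranges("172.16.0.[24-30/2]@o2ib")` yields the four NIDs ending in .24, .26, .28 and .30, and a pattern without brackets is returned unchanged as a one-element list. `resolve_all()` takes the resolver as a parameter so that it can be tested against a fixed address list instead of live DNS.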

All of the "mount.lustre" improvements only shorten the input name given on the command line. However, /proc/mounts would still show the full list of MGS NIDs for df and mount output.

One (possibly controversial) option would be to shorten the command-line displayed in /proc/mounts to only list the NID(s) of the currently-active MGS. That would keep the command-line down to ~2 NIDs, but would mean that copying the shortened displayed command-line to other nodes would potentially cause them problems if the MGS failed over, because they would only know one set of NIDs.
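The trade-off can be made concrete with a small Python sketch. It is purely illustrative: `display_device()` is a hypothetical helper, not an existing Lustre function, and the NIDs are the ones from the first mount example in this description.

```python
def display_device(groups, fsname, active=None):
    """Format MGS failover groups as a device string.

    groups is a list of NID lists, one per failover host. If active is
    an index, only that host's NIDs are shown, as proposed above.
    """
    shown = groups if active is None else [groups[active]]
    return ":".join(",".join(group) for group in shown) + ":/" + fsname


groups = [
    ["172.16.0.26@o2ib", "172.16.0.27@o2ib"],
    ["172.16.0.30@o2ib", "172.16.0.31@o2ib"],
    ["172.16.0.28@o2ib", "172.16.0.29@o2ib"],
    ["172.16.0.24@o2ib", "172.16.0.25@o2ib"],
]
full = display_device(groups, "testfs")
short = display_device(groups, "testfs", active=1)
print(short)  # 172.16.0.30@o2ib,172.16.0.31@o2ib:/testfs
```

The shortened string drops the other three failover hosts entirely, which is why a client mounted from a copy of it would have no alternative NIDs to try after an MGS failover.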

Another option (in the case of hostnames or round-robin DNS) would be to internally pass the original MGS name as an "MGS display name" option to the kernel mount(2) call (e.g. "--mgs-display=mgs-cli1@o2ib"). This would be shown by /proc/mounts and should work equally well for other clients, since they should also be able to resolve the hostname similarly.

Attachments

Issue Links

is related to

LU-17629 Regressions with `lnetctl ping`

Resolved

LU-19915 keep client configurations on MDTs/OSTs in sync with MGS0

Open

LU-18417 Finish IPv6 support

Open

LU-17186 Replace deprecated gethostby*() calls with get*info() to support resolv ordering

Resolved

is related to

LU-10391 LNET: Support IPv6

Resolved

LU-17379 try MGS NIDs more quickly at initial mount

Resolved

LU-16722 MGS config log restructuring

Open

LU-19412 implement DNS multi-address name resolution

Open

LU-107 Lustre init scripts with heartbeat v1 integration

Resolved

Sub-Tasks

Progress

try MGS NIDs more quickly at initial mount

Resolved

Mikhail Pershin

Activity

People

Assignee: Chakshu Kansal

Reporter: Andreas Dilger

Votes: 0

Watchers: 12

Dates

Created: 14/Apr/23 6:12 PM

Updated: 30/Apr/26 4:42 PM

Resolved: 13/Sep/25 9:54 PM