Details
- Type: Bug
- Resolution: Fixed
- Priority: Minor
- Affects Version: None
- Fix Version: Lustre 2.11.0
- Environment: Separate MGS and MDS, or separate MGT and MDT
- Severity: 3
Description
conf-sanity test 33a fails when run on a Lustre file system configured with a separate MDS and MGS, even when the separate MDT and MGT are on the same node. The error message is:
conf-sanity test_33a: @@@@@@ FAIL: mount -t lustre failed
In the test log, we see that mounting the MDS for the newly created file system fails:
CMD: onyx-50vm7 mkdir -p /mnt/lustre-fs2mds; mount -t lustre /dev/lvm-Role_MDS/S1 /mnt/lustre-fs2mds
onyx-50vm7: mount.lustre: mount /dev/mapper/lvm--Role_MDS-S1 at /mnt/lustre-fs2mds failed: Operation already in progress
onyx-50vm7: The target service is already running. (/dev/mapper/lvm--Role_MDS-S1)
This failure looks like it is due to the "--mgs" flag used when the MDT for the new file system is formatted. When an MGS already exists, specifying that this node will also be the MGS for the new file system causes problems.
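As a rough illustration of the difference (not the committed fix; the fsname and MGS NID below are placeholders):

# Problematic: formats the new MDT as an MGS even though an MGS already runs on this node
mkfs.lustre --fsname=fs2 --mgs --mdt --index=0 /dev/lvm-Role_MDS/S1
# When a separate MGS already exists, point the new MDT at it instead
mkfs.lustre --fsname=fs2 --mdt --index=0 --mgsnode=<mgs_nid>@tcp /dev/lvm-Role_MDS/S1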
In addition, the lctl call in the following code needs to be run on the MGS:
do_facet $SINGLEMDS "$LCTL conf_param $FSNAME2.sys.timeout=200" ||
	error "$LCTL conf_param $FSNAME2.sys.timeout=200 failed"
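A minimal sketch of the kind of change implied, assuming the standard mgs facet name from test-framework.sh (again, an illustration rather than the landed patch):

# Run conf_param on the MGS facet instead of the MDS facet
do_facet mgs "$LCTL conf_param $FSNAME2.sys.timeout=200" ||
	error "$LCTL conf_param $FSNAME2.sys.timeout=200 failed"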
We see similar failures for conf-sanity tests 43b, 53b, and 54b.
Logs that capture test 33a failures are at:
https://testing.hpdd.intel.com/test_sets/5176f130-729c-11e7-a0a2-5254006e85c2
https://testing.hpdd.intel.com/test_sets/8df40f4c-729e-11e7-a0a2-5254006e85c2
Note: This ticket's description was modified to state the problem more clearly rather than only prescribe a solution.