Details
- Bug
- Resolution: Fixed
- Critical
- Lustre 2.10.0
- None
- Lustre: Build Version: 2.10.0_5_gbb3c407
- 3
- 9223372036854775807
Description
When mount -t lustre ... fails to mount a target, the exit code of mount does not reflect the failure:
# mount -t lustre zfs_pool_scsi0QEMU_QEMU_HARDDISK_disk13/MGS /mnt/MGS
e2label: No such file or directory while trying to open zfs_pool_scsi0QEMU_QEMU_HARDDISK_disk13/MGS
Couldn't find valid filesystem superblock.
# echo $?
0
This of course wreaks havoc on systems such as IML, which rely on the exit code of each step in the process of starting a filesystem to decide whether to continue with the subsequent steps.
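To illustrate why a misleading exit code matters, here is a minimal sketch in C of a caller that gates each step on the exit status of the previous one. The helpers run_step() and run_steps() are hypothetical names made up for this example; they are not part of IML or Lustre, and the commands passed in are placeholders.

```c
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Hypothetical helper (not IML or Lustre code): run one shell command
 * and return its exit status, or -1 if it could not be run or did not
 * exit normally.
 */
static int run_step(const char *cmd)
{
    int status = system(cmd);

    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/*
 * Hypothetical helper: run the steps in order and stop at the first one
 * that reports a non-zero exit status. Returns the number of steps that
 * reported success.
 */
static int run_steps(const char *const steps[], int nsteps)
{
    int done = 0;

    for (int i = 0; i < nsteps; i++) {
        if (run_step(steps[i]) != 0)
            break;          /* later steps are skipped */
        done++;
    }
    return done;
}
```

A step that prints an error but still exits with 0, as the mount command above does, is indistinguishable from a successful step to a caller like this, so the later steps run against a target that was never mounted.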
Issue Links
- is duplicated by: LU-9853 mount.lustre noisy on mount (Resolved)
utopiabound:
It's probably moot, but just for clarity, the ZFS version is whatever is built by Jenkins with b2_10. e2fsprogs is most recent GA and O/S is RHEL 7.4. I doubt any of these are particularly relevant though.
I think you got much more than just "sort of close". I think you got an exact reproduction. The names of pools, etc. are, I think, quite irrelevant.
The question remains, though: when the e2label call exists only in the ldiskfs OSD code path, in ldiskfs_read_ldd(), why is it being hit for a ZFS-formatted target?
My reading of the code is that by the time osd_read_ldd() is supposed to call either zfs_read_ldd() or ldiskfs_read_ldd(), the format of the target is known and stored in ldd->ldd_mount_type, so only the relevant one of the two should be called, not both. So why are we getting an error from the e2label call that exists only in ldiskfs_read_ldd()?
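The dispatch described above could be sketched roughly as follows. This is a simplified illustration of that reading, not the actual Lustre source: every sketch_-prefixed type, constant and function is a hypothetical stand-in, and only the ldd_mount_type field name is taken from the comment above.

```c
/* Hypothetical stand-ins; these are not the real Lustre definitions. */
enum sketch_mount_type {
    SKETCH_MT_LDISKFS,
    SKETCH_MT_ZFS
};

struct sketch_ldd {
    enum sketch_mount_type ldd_mount_type;  /* target format, known before dispatch */
};

/* Call counters so the example can show which reader was invoked. */
static int sketch_ldiskfs_calls;
static int sketch_zfs_calls;

/* Stand-in for the ldiskfs reader, the only path that would run e2label. */
static int sketch_ldiskfs_read_ldd(const char *dev)
{
    (void)dev;
    sketch_ldiskfs_calls++;
    return 0;
}

/* Stand-in for the ZFS reader; it never touches e2label. */
static int sketch_zfs_read_ldd(const char *dev)
{
    (void)dev;
    sketch_zfs_calls++;
    return 0;
}

/* Dispatch on the stored mount type: exactly one backend reader is called. */
static int sketch_osd_read_ldd(const char *dev, const struct sketch_ldd *ldd)
{
    switch (ldd->ldd_mount_type) {
    case SKETCH_MT_LDISKFS:
        return sketch_ldiskfs_read_ldd(dev);
    case SKETCH_MT_ZFS:
        return sketch_zfs_read_ldd(dev);
    default:
        return -1;  /* unknown format */
    }
}
```

If the real code follows this shape, a target whose mount type is ZFS should never reach the ldiskfs reader, which is why the e2label error in the output above is surprising.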