Details
-
Bug
-
Resolution: Unresolved
-
Critical
-
None
-
Lustre 2.4.0
-
3
-
9506
Description
A script was erroneously running tunefs.lustre --erase-params against a running MGS. I expected it to refuse to run (just as mkfs.lustre refuses to run on a device that is a running target). Instead, it runs. The first and second runs appear to succeed; after the third run, the MGS appears to be corrupted.
After some experimentation, I think this only happens when tunefs.lustre is passed a device node symlink. On this system, the symlink looked like this:
# ls -l /dev/disk/by-id/scsi-1dev.target0
lrwxrwxrwx 1 root root 9 Jul 30 10:10 /dev/disk/by-id/scsi-1dev.target0 -> ../../sdb
Running against the /dev/sdb path seems to be safe (gives a 17 status code):
# tunefs.lustre --erase-params /dev/sdb ; echo $?
checking for existing Lustre data: found
Reading CONFIGS/mountdata
Read previous values:
Target: MGS
Index: unassigned
Lustre FS:
Mount type: ldiskfs
Flags: 0x74
(MGS needs_index first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:
Permanent disk data:
Target: MGS
Index: unassigned
Lustre FS:
Mount type: ldiskfs
Flags: 0x74
(MGS needs_index first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:
17
...while running against the symlink seems to be unsafe (note that it returns 0 twice, before returning an error code and junk output):
# tunefs.lustre --erase-params /dev/disk/by-id/scsi-1dev.target0 ; echo $?
checking for existing Lustre data: found
Reading CONFIGS/mountdata
Read previous values:
Target: MGS
Index: unassigned
Lustre FS:
Mount type: ldiskfs
Flags: 0x74
(MGS needs_index first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:
Permanent disk data:
Target: MGS
Index: unassigned
Lustre FS:
Mount type: ldiskfs
Flags: 0x74
(MGS needs_index first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:
Writing CONFIGS/mountdata
0
[root@storage-0 log]# tunefs.lustre --erase-params /dev/disk/by-id/scsi-1dev.target0 ; echo $?
checking for existing Lustre data: found
Reading CONFIGS/mountdata
Read previous values:
Target: MGS
Index: unassigned
Lustre FS:
Mount type: ldiskfs
Flags: 0x74
(MGS needs_index first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:
Permanent disk data:
Target: MGS
Index: unassigned
Lustre FS:
Mount type: ldiskfs
Flags: 0x74
(MGS needs_index first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:
Writing CONFIGS/mountdata
0
[root@storage-0 log]# tunefs.lustre --erase-params /dev/disk/by-id/scsi-1dev.target0 ; echo $?
checking for existing Lustre data: found
Reading CONFIGS/mountdata
Read previous values:
Target:
Index: 10
Lustre FS:
Mount type: h
Flags: 0
()
Persistent mount opts:
Parameters:�
tunefs.lustre FATAL: must set target type: MDT,OST,MGS
tunefs.lustre: exiting with 22 (Invalid argument)
22
At the least, this tool should resolve symlinks so that it cannot be run against a running target. Ideally, it would also use multiple mount protection (MMP) to be safe even when run from a different server while the target is mounted somewhere else.
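As an interim workaround for scripts that call tunefs.lustre, the symlink resolution suggested above could be sketched roughly as follows. This is not how tunefs.lustre or mkfs.lustre implement their check. is_device_mounted is a hypothetical helper name, and the sketch assumes that a mounted target appears in /proc/mounts with its device path in the first field and that readlink -f (GNU coreutils) is available. It only covers mounts on the local node, so it is no substitute for MMP.

```shell
#!/bin/sh
# Hypothetical helper, not part of the Lustre tools.
# Returns 0 if the device (after resolving symlinks) is listed as a mount
# source in the mounts file, 1 if it is not, 2 if the path cannot be resolved.
# The optional second argument (default /proc/mounts) exists for testing.
is_device_mounted() {
    target=$(readlink -f -- "$1") || return 2
    mounts_file=${2:-/proc/mounts}

    while read -r src _rest; do
        # Skip pseudo filesystems whose source is not an absolute path.
        case $src in
            /*) ;;
            *) continue ;;
        esac
        # Resolve the mount source too, since it may itself be a symlink.
        resolved=$(readlink -f -- "$src" 2>/dev/null) || continue
        if [ "$resolved" = "$target" ]; then
            return 0
        fi
    done < "$mounts_file"

    return 1
}
```

A calling script might guard the command with something like: if is_device_mounted "$dev"; then echo "refusing: $dev is mounted" >&2; exit 1; fi, and only then run tunefs.lustre against "$dev".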