[LU-10520] Cannot format large MDT with ldiskfs Created: 16/Jan/18 Updated: 18/Apr/19 Resolved: 08/Mar/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.0, Lustre 2.11.0 |
| Fix Version/s: | Lustre 2.11.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Joe Grund | Assignee: | Yang Sheng |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Taken from: https://github.com/intel-hpdd/intel-manager-for-lustre/issues/450
Description
When trying to create an MDT on a 29TB volume, the mke2fs command receives ^extents instead of extents, which causes the command to fail.
Repro
Create an MDT on a large LUN.
modprobe osd_ldiskfs: 0
mkfs.lustre --mdt --mgsnode=172.21.61.200@tcp0 --mgsnode=172.21.61.206@tcp0 --failnode=172.21.61.200@tcp0 --reformat --index=0 --mkfsoptions=-I 512 -i 2048 -J size=2048 --backfstype=ldiskfs --fsname=BIGSI01 /dev/mapper/mpathb: 1
Permanent disk data:
Target: BIGSI01:MDT0000
Index: 0
Lustre FS: BIGSI01
Mount type: ldiskfs
Flags: 0x61
(MDT first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters: mgsnode=172.21.61.200@tcp:172.21.61.206@tcp failover.node=172.21.61.200@tcp
device size = 30501008MB
formatting backing filesystem ldiskfs on /dev/mapper/mpathb
target name BIGSI01:MDT0000
4k blocks 7808258048
options -I 512 -i 2048 -J size=2048 -q -O dirdata,uninit_bg,^extents,mmp,dir_nlink,quota,huge_file,64bit,flex_bg -E lazy_journal_init -F
mkfs_cmd = mke2fs -j -b 4096 -L BIGSI01:MDT0000 -I 512 -i 2048 -J size=2048 -q -O dirdata,uninit_bg,^extents,mmp,dir_nlink,quota,huge_file,64bit,flex_bg -E lazy_journal_init -F /dev/mapper/mpathb 7808258048
Found a gpt partition table in /dev/mapper/mpathb
Extents MUST be enabled for a 64-bit filesystem. Pass -O extents to rectify.
mkfs.lustre FATAL: Unable to build fs /dev/mapper/mpathb (256)
mkfs.lustre FATAL: mkfs failed 256
If I manually try the command and remove the "^" character, the command succeeds. |
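The failure in the log above comes from the feature list passed to `mke2fs -O`: `64bit` is requested while `extents` is switched off by its `^` prefix. As a minimal sketch of that check and of the manual workaround, the following parses the feature string copied from the log. The function names are hypothetical and exist only for illustration; the check is a simplification and not the actual mke2fs or mkfs.lustre code.

```python
# Feature list copied from the mke2fs command line in the log above.
FEATURES = "dirdata,uninit_bg,^extents,mmp,dir_nlink,quota,huge_file,64bit,flex_bg"


def parse_features(option_string):
    """Split an mke2fs -O list into (enabled, disabled) feature sets."""
    enabled, disabled = set(), set()
    for name in option_string.split(","):
        name = name.strip()
        if not name:
            continue
        if name.startswith("^"):
            disabled.add(name[1:])  # a leading '^' turns the feature off
        else:
            enabled.add(name)
    return enabled, disabled


def has_64bit_extents_conflict(option_string):
    """True when 64bit is requested while extents is explicitly disabled.

    Hypothetical helper: it only models the condition reported in the log.
    """
    enabled, disabled = parse_features(option_string)
    return "64bit" in enabled and "extents" in disabled


def enable_extents(option_string):
    """Manual workaround from the report: drop the '^' in front of extents."""
    return ",".join(
        "extents" if name.strip() == "^extents" else name
        for name in option_string.split(",")
    )


if __name__ == "__main__":
    print(has_64bit_extents_conflict(FEATURES))
    print(enable_extents(FEATURES))
```

Running the check on the original string reports the conflict; running it on the output of `enable_extents` does not, which matches the observation that the command succeeds once the "^" is removed.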
| Comments |
| Comment by Joe Grund [ 16/Jan/18 ] |
|
OK, found where the "^extents" comes from: it is built into the mkfs.lustre binary from lustre-2.10.2-1.src.rpm. Specifically, it seems to come from here:
The line is the same for all the 2.10 releases, so this might not be entirely related to the issue (otherwise it would probably have shown up in more places). Still, I am not sure that line 601 is supposed to look like that, since line 600, which adds "uninit_bg" to the options list, is:
lustre/utils/mount_utils_ldiskfs.c:600: append_unique(anchor, ",", "uninit_bg", NULL, maxbuflen); |
| Comment by Peter Jones [ 16/Jan/18 ] |
|
Yang Sheng Could you please look into this? Thanks Peter |
| Comment by Andreas Dilger [ 16/Jan/18 ] |
|
There is no point in formatting an MDT filesystem larger than about 8-16TB unless the new Data-on-MDT (DoM) feature is used, and that feature will not be available until the Lustre 2.11 release.
Without DoM, the MDT only holds inodes (1KB in size for 2.10 and later) plus directories, xattrs, and some Lustre log files (2KB per inode on average), and there is an upper limit of 4B inodes, so 4B * 2KB = 8TB. A larger MDT is largely a waste of space, since the extra space above 8TB cannot be used until the DoM feature is available. If you are formatting this very large MDT in anticipation of the DoM feature and are aware of this limitation, that is OK.
We need to make a patch to libmount_utils_ldiskfs.c to enable the extents feature only for MDT filesystems over 16TB in size. |
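The arithmetic and the proposed rule in the comment above can be sketched as follows. This is only an illustration under stated assumptions: the constant and function names are hypothetical, sizes are taken as binary units (2^32 inodes, 2048 bytes per inode, 4096-byte blocks as in the log), and the code is not the patch that eventually landed.

```python
TIB = 2**40

BLOCK_SIZE = 4096        # "4k blocks" in the mkfs.lustre log
MAX_INODES = 2**32       # the "4B inodes" upper limit
BYTES_PER_INODE = 2048   # the "average 2KB per inode" figure

# Largest MDT size that inodes alone can fill: 4B * 2KB = 8TB.
MAX_USEFUL_MDT_BYTES = MAX_INODES * BYTES_PER_INODE

# Proposed threshold from the comment: enable extents only above 16TB.
EXTENTS_THRESHOLD_BYTES = 16 * TIB


def extents_option(block_count, block_size=BLOCK_SIZE):
    """Return the -O token a size-aware formatter could pass to mke2fs.

    Hypothetical helper that mirrors the rule stated in the comment.
    """
    if block_count * block_size > EXTENTS_THRESHOLD_BYTES:
        return "extents"
    return "^extents"


if __name__ == "__main__":
    print(MAX_USEFUL_MDT_BYTES // TIB)                # useful size in TiB
    print(extents_option(7808258048))                 # the 29TB LUN from the log
    print(extents_option(2 * TIB // BLOCK_SIZE))      # a 2TB LUN
```

With the block count from the log (7808258048 blocks of 4096 bytes), the device is above the 16TB threshold, so this rule would select `extents`; a 2TB device stays on `^extents`.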
| Comment by Louis Bailleul [ 16/Jan/18 ] |
|
Hi, Thanks for the quick clarification. I suspected an issue with the size of the MDT, as reducing the LUN from 29TB to 2TB allows it to format properly. I still get the odd "^extents" in the mke2fs parameter list, though (is this a typo, or does '^' have a special meaning?).
One last thing: creating a 29TB zpool and creating an MDT on top of it works, even if, as you mentioned, this mostly wastes space while running 2.10. |
| Comment by Yang Sheng [ 17/Jan/18 ] |
|
Hi, Louis, The '^' prefix means the feature is disabled. In other words, we disable the extents feature on MDTs by default. Thanks, |
| Comment by Andreas Dilger [ 17/Jan/18 ] |
|
Note that formatting a ZFS MDT of 29 TB is not necessarily a waste of space, since ZFS dynamically allocates inodes in the filesystem, though it uses about twice as much space per inode (4KB vs 2KB) compared to ldiskfs. That means if it is used to create a lot of files (as the MDT is traditionally used), it could hold about 7B inodes, or if it was used to hold 64KB files for DoM, it could hold about 450M files. |
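The two capacity estimates in the comment above can be reproduced with a short calculation. This sketch assumes decimal units (1 TB = 10^12 bytes, 1 KB = 10^3 bytes), which is an assumption made here to match the rounded figures, and it ignores all filesystem overhead.

```python
TB = 10**12   # decimal units, assumed for this estimate
KB = 10**3

MDT_BYTES = 29 * TB

# Metadata-only use: about 4KB of space per inode on ZFS.
zfs_inodes = MDT_BYTES // (4 * KB)

# DoM use: each file is assumed to occupy 64KB on the MDT.
dom_files = MDT_BYTES // (64 * KB)

if __name__ == "__main__":
    print(f"{zfs_inodes:,} inodes")
    print(f"{dom_files:,} files of 64KB")
```

Both results land close to the rounded "about 7B inodes" and "about 450M files" quoted in the comment.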
| Comment by Gerrit Updater [ 26/Jan/18 ] |
|
Yang Sheng (yang.sheng@intel.com) uploaded a new patch: https://review.whamcloud.com/31037 |
| Comment by Sebastien Buisson (Inactive) [ 31/Jan/18 ] |
Hi, Cheers, |
| Comment by Gerrit Updater [ 08/Mar/18 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/31037/ |
| Comment by Peter Jones [ 08/Mar/18 ] |
|
Landed for 2.11 |