[LU-12204] mke2fs in e2fsprogs-1.44.5.wc1 fails for large device Created: 19/Apr/19  Updated: 16/Jul/19  Resolved: 16/Jul/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Shuichi Ihara Assignee: Dongyang Li
Resolution: Fixed Votes: 0
Labels: None
Environment:

e2fsprogs-1.44.5.wc1, rhel7.5, master branch


Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

If mke2fs formats a large device (e.g. 900TB), it fails because the device is reported as too big.

# ls -l /dev/ddn/scratch0_ost0003 
lrwxrwxrwx 1 root root 6 Apr 19 11:19 /dev/ddn/scratch0_ost0003 -> ../sdd
# fdisk -l /dev/sdd

Disk /dev/sdd: 952451.9 GB, 952451947560960 bytes, 232532213760 sectors
Units = sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 2097152 bytes / 2097152 bytes
# mke2fs -V
mke2fs 1.44.5.wc1 (15-Dec-2018)
	Using EXT2FS Library version 1.44.5.wc1
# mkfs.lustre --ost --servicenode=192.168.0.2@tcp --fsname=scratch0 --index=3 --mgsnode=192.168.0.1@tcp --mkfsoptions='-E lazy_itable_init=0,lazy_journal_init=0 -m1 -J size=4096 -O meta_bg' --reformat --backfstype=ldiskfs /dev/ddn/scratch0_ost0003

   Permanent disk data:
Target:     scratch0:OST0003
Index:      3
Lustre FS:  scratch0
Mount type: ldiskfs
Flags:      0x1062
              (OST first_time update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: failover.node=192.168.0.2@tcp mgsnode=192.168.0.1@tcp

device size = 908328960MB
formatting backing filesystem ldiskfs on /dev/ddn/scratch0_ost0003
	target name   scratch0:OST0003
	4k blocks     232532213760
	options        -m1 -J size=4096  -I 512 -i 1048576 -q -O meta_bg,extents,uninit_bg,mmp,dir_nlink,quota,huge_file,64bit,flex_bg -G 256 -E lazy_itable_init=0,lazy_journal_init=0 -F
mkfs_cmd = mke2fs -j -b 4096 -L scratch0:OST0003  -m1 -J size=4096  -I 512 -i 1048576 -q -O meta_bg,extents,uninit_bg,mmp,dir_nlink,quota,huge_file,64bit,flex_bg -G 256 -E lazy_itable_init=0,lazy_journal_init=0 -F /dev/ddn/scratch0_ost0003 232532213760
   mke2fs: Size of device (0x3624000000 blocks) /dev/ddn/scratch0_ost0003 too big to create
   	a filesystem using a blocksize of 4096.

mkfs.lustre FATAL: Unable to build fs /dev/ddn/scratch0_ost0003 (256)

mkfs.lustre FATAL: mkfs failed 256

However, mke2fs in 1.42.13.wc6 works well.

# mke2fs -V
mke2fs 1.42.13.wc6 (05-Feb-2017)
	Using EXT2FS Library version 1.42.13.wc6

# mkfs.lustre --ost --servicenode=192.168.0.2@tcp --fsname=scratch0 --index=3 --mgsnode=192.168.0.1@tcp --mkfsoptions='-E lazy_itable_init=0,lazy_journal_init=0 -m1 -J size=4096 -O meta_bg' --reformat --backfstype=ldiskfs /dev/ddn/scratch0_ost0003

   Permanent disk data:
Target:     scratch0:OST0003
Index:      3
Lustre FS:  scratch0
Mount type: ldiskfs
Flags:      0x1062
              (OST first_time update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: failover.node=192.168.0.2@tcp mgsnode=192.168.0.1@tcp

device size = 908328960MB
formatting backing filesystem ldiskfs on /dev/ddn/scratch0_ost0003
	target name   scratch0:OST0003
	4k blocks     232532213760
	options        -m1 -J size=4096  -I 512 -i 1048576 -q -O meta_bg,extents,uninit_bg,mmp,dir_nlink,quota,huge_file,64bit,flex_bg -G 256 -E lazy_itable_init=0,lazy_journal_init=0 -F
mkfs_cmd = mke2fs -j -b 4096 -L scratch0:OST0003  -m1 -J size=4096  -I 512 -i 1048576 -q -O meta_bg,extents,uninit_bg,mmp,dir_nlink,quota,huge_file,64bit,flex_bg -G 256 -E lazy_itable_init=0,lazy_journal_init=0 -F /dev/ddn/scratch0_ost0003 232532213760
Writing CONFIGS/mountdata


 Comments   
Comment by Andreas Dilger [ 19/Apr/19 ]

Dongyang, can you please take a look at this?

Comment by Dongyang Li [ 25/Apr/19 ]

The check was introduced in upstream 1.43.4:

commit 101ef2e93c253ae62320628e8958067d2d2a4e2a
Author: Jan Kara <jack@suse.cz>
Date:   Tue Oct 25 14:08:59 2016 -0400

    mke2fs: Avoid crashes / infinite loops for absurdly large devices
    
    When a device reports absurdly high size, some arithmetics in mke2fs can
    overflow (e.g. number of block descriptors) and we end in an infinite
    loop. Fix that by checking and refusing insanely large devices.
    
    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>

diff --git a/misc/mke2fs.c b/misc/mke2fs.c
index d98e71e0..6a83bd9f 100644
--- a/misc/mke2fs.c
+++ b/misc/mke2fs.c
@@ -2089,6 +2089,18 @@ profile_error:
                        EXT2_BLOCK_SIZE(&fs_param));
                exit(1);
        }
+       /*
+        * Guard against group descriptor count overflowing... Mostly to avoid
+        * strange results for absurdly large devices.
+        */
+       if (fs_blocks_count > ((1ULL << (fs_param.s_log_block_size + 3 + 32)) - 1)) {
+               fprintf(stderr, _("%s: Size of device (0x%llx blocks) %s "
+                                 "too big to create\n\t"
+                                 "a filesystem using a blocksize of %d.\n"),
+                       program_name, fs_blocks_count, device_name,
+                       EXT2_BLOCK_SIZE(&fs_param));
+               exit(1);
+       }
 
        ext2fs_blocks_count_set(&fs_param, fs_blocks_count);

 

The block size is 2 ^ (10 + s_log_block_size), so with our 4096-byte block size, s_log_block_size is 2. The check is therefore really testing whether fs_blocks_count > ((1ULL << (2 + 3 + 32)) - 1), which means the target can be no bigger than 2^37 blocks, i.e. 512T. With the 64bit feature enabled (which we have), the target should be allowed to be a lot bigger than that.
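The arithmetic above can be checked with a small sketch in C. It only mirrors the expression from the quoted diff. The helper names (block_size_bytes, quoted_check_limit, quoted_check_rejects) are made up for illustration and are not part of e2fsprogs.

```c
#include <assert.h>
#include <stdint.h>

/* ext2/ext4 store the block size as a log2 offset from 1 KiB:
 * block size in bytes = 1024 << s_log_block_size. */
static uint64_t block_size_bytes(unsigned s_log_block_size)
{
    assert(s_log_block_size <= 6);   /* keep the shifts well inside 64 bits */
    return 1024ULL << s_log_block_size;
}

/* Largest block count accepted by the check exactly as quoted in the diff. */
static uint64_t quoted_check_limit(unsigned s_log_block_size)
{
    assert(s_log_block_size <= 6);
    return (1ULL << (s_log_block_size + 3 + 32)) - 1;
}

/* Nonzero when the quoted check would refuse a device of this many blocks. */
static int quoted_check_rejects(uint64_t fs_blocks_count,
                                unsigned s_log_block_size)
{
    return fs_blocks_count > quoted_check_limit(s_log_block_size);
}
```

For the device in this ticket, quoted_check_rejects(232532213760ULL, 2) is nonzero: the limit is 2^37 - 1 = 137438953471 blocks, which at 4096 bytes per block is just under 512 TiB, while the device has 0x3624000000 = 232532213760 blocks.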

I'm confused about how the check is supposed to work here, and about the magic numbers 3 and 32.

Andreas can you please shed some light on this?

Thanks

Comment by Andreas Dilger [ 25/Apr/19 ]

I suspect this is a bug? If s_log_block_size=2 instead of 12, as one would expect, then this limit is 2^10 = 1024x too small. I think the limit is intended to be the number of blocks in 2^32 block groups, each of which can hold blocksize * 8 blocks (= number of bits in a block bitmap).
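The intended limit described here can be worked through in C as follows. This is a sketch of the reasoning only, under the assumption that the limit is 2^32 groups times blocksize * 8 blocks per group. It is not the text of the patch, and the helper names (max_blocks_per_group, intended_limit) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One block bitmap block tracks one group, one bit per block,
 * so a group holds at most blocksize * 8 blocks. */
static uint64_t max_blocks_per_group(unsigned s_log_block_size)
{
    assert(s_log_block_size <= 6);   /* keep the shifts well inside 64 bits */
    return (1024ULL << s_log_block_size) * 8;
}

/* Largest block count that fits in 2^32 block groups. */
static uint64_t intended_limit(unsigned s_log_block_size)
{
    /* log2 of the block size in bytes: 12 for 4096-byte blocks */
    unsigned block_size_bits = 10 + s_log_block_size;

    assert(s_log_block_size <= 6);
    return (1ULL << (block_size_bits + 3 + 32)) - 1;
}
```

With 4096-byte blocks (s_log_block_size = 2) this gives 32768 blocks per group and a limit of 2^47 - 1 = 140737488355327 blocks, 1024 times the limit in the quoted check, so the 232532213760-block device from this ticket fits comfortably.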

Comment by Andreas Dilger [ 25/Apr/19 ]

I will push a patch upstream.

Comment by Gerrit Updater [ 25/Apr/19 ]

Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34762
Subject: LU-12204 mke2fs: fix check for absurdly large devices
Project: tools/e2fsprogs
Branch: master-lustre
Current Patch Set: 1
Commit: a231aadd0f5f6a1f44e6ae0a951f778aaa51f6d2

Comment by Shuichi Ihara [ 26/Apr/19 ]

Thanks Andreas. Patch https://review.whamcloud.com/34762 works and mke2fs succeeds.

# mke2fs -V
mke2fs 1.44.5.wc1 (15-Dec-2018)
	Using EXT2FS Library version 1.44.5.wc1

# time mkfs.lustre --ost --servicenode=127.0.0.2@tcp --fsname=scratch0 --index=3 --mgsnode=127.0.0.2@tcp --mkfsoptions='-E lazy_itable_init=0,lazy_journal_init=0,stripe_width=512,stride=512 -m1 -J size=4096 -O meta_bg,^resize_inode' --backfstype=ldiskfs --reformat /dev/ddn/scratch0_ost0003

   Permanent disk data:
Target:     scratch0:OST0003
Index:      3
Lustre FS:  scratch0
Mount type: ldiskfs
Flags:      0x1062
              (OST first_time update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: failover.node=127.0.0.2@tcp mgsnode=127.0.0.2@tcp

device size = 908328960MB
formatting backing filesystem ldiskfs on /dev/ddn/scratch0_ost0003
	target name   scratch0:OST0003
	kilobytes     930128855040
	options        -m1 -J size=4096  -I 512 -i 1048576 -q -O meta_bg,^resize_inode,extents,uninit_bg,mmp,dir_nlink,quota,huge_file,64bit,flex_bg -G 256 -E lazy_itable_init=0,lazy_journal_init=0,stripe_width=512,stride=512 -F
mkfs_cmd = mke2fs -j -b 4096 -L scratch0:OST0003  -m1 -J size=4096  -I 512 -i 1048576 -q -O meta_bg,^resize_inode,extents,uninit_bg,mmp,dir_nlink,quota,huge_file,64bit,flex_bg -G 256 -E lazy_itable_init=0,lazy_journal_init=0,stripe_width=512,stride=512 -F /dev/ddn/scratch0_ost0003 930128855040k
Writing CONFIGS/mountdata

real	7m28.018s
user	0m35.627s
sys	5m9.036s

Comment by Gerrit Updater [ 07/Jun/19 ]

Andreas Dilger (adilger@whamcloud.com) merged in patch https://review.whamcloud.com/34762/
Subject: LU-12204 mke2fs: fix check for absurdly large devices
Project: tools/e2fsprogs
Branch: master-lustre
Current Patch Set:
Commit: dec51e82bc03b0eb5619c99863ae5d914b814ea7

Comment by Peter Jones [ 16/Jul/19 ]

Can this ticket be considered RESOLVED in 1.45.2-wc1?

Comment by Dongyang Li [ 16/Jul/19 ]

Correct, Peter. I will close it as RESOLVED.
