[LU-13241] mount.lustre for large filesystem runs slow debugfs commands Created: 11/Feb/20 Updated: 31/Jan/22 Resolved: 18/Mar/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.14.0 |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Andreas Dilger | Assignee: | Dongyang Li |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Description |
|
Running strace on mount.lustre shows an interesting reason why it spends a huge amount of time on a large filesystem. mount.lustre invokes several 'debugfs' commands internally, and they took a very long time here. Here is an example:
10056 10:35:21.590086 execve("/sbin/debugfs", ["debugfs", "-c", "-R", "stat CONFIGS/mountdata", "/dev/sda"]
10056 10:39:00.257586 +++ exited with 0 +++
10134 10:39:00.259343 execve("/bin/sh", ["sh", "-c", "debugfs -c -R 'dump /CONFIGS/mou"...]
10134 10:42:39.116052 +++ exited with 0 +++
Total mount time was 982 sec in this case, and 419 sec of that went to just those two debugfs calls. These two calls only read the CONFIGS/mountdata file from the filesystem, which could be done directly via libext2fs, along with many of the other operations in libmount_utils_ldiskfs.c, rather than launching external binaries to do the work. We should take care not to load the whole filesystem metadata if that can be avoided. |
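For anyone repeating this measurement, per-process wall time can be pulled out of timestamped strace output with a few lines of Python. This is a minimal sketch, not part of any Lustre tooling: the helper name `elapsed_per_pid` is made up for illustration, the execve arguments are trimmed to `...`, and it assumes every line starts with `PID HH:MM:SS.ffffff` and that the trace does not cross midnight.

```python
from datetime import datetime

# Lines copied from the strace excerpt in the description (arguments trimmed).
TRACE = """\
10056 10:35:21.590086 execve("/sbin/debugfs", ...
10056 10:39:00.257586 +++ exited with 0 +++
10134 10:39:00.259343 execve("/bin/sh", ...
10134 10:42:39.116052 +++ exited with 0 +++
"""


def elapsed_per_pid(trace):
    """Return {pid: seconds between the first and last line seen for that pid}."""
    first, last = {}, {}
    for line in trace.splitlines():
        if not line.strip():
            continue
        pid, stamp = line.split()[:2]
        t = datetime.strptime(stamp, "%H:%M:%S.%f")
        first.setdefault(pid, t)  # keep the earliest timestamp per pid
        last[pid] = t             # overwrite so the latest timestamp wins
    return {pid: (last[pid] - first[pid]).total_seconds() for pid in first}


if __name__ == "__main__":
    for pid, secs in elapsed_per_pid(TRACE).items():
        print(f"pid {pid}: {secs:.1f} s")
```

For the four lines quoted in the description, each debugfs invocation comes out at a little under 219 seconds.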
| Comments |
| Comment by Andreas Dilger [ 11/Feb/20 ] |
|
It may be that commit v1.45.4-16-ge6069a05 "Teach ext2fs_open2() to honor the EXT2_FLAG_SUPER_ONLY flag" in upstream e2fsprogs could improve the performance of debugfs here, since it avoids reading the group descriptors. |
| Comment by Dongyang Li [ 17/Feb/20 ] |
|
That commit will reduce the time for tune2fs and dumpe2fs -h, so we should see an improvement when enabling the quota and mmp features. debugfs doesn't use that flag, and if we are reading/writing the mountdata we need to read the group descriptors anyway. I'm working on making libmount_utils_ldiskfs use libext2fs, but we still need to call ext2fs_open() as you mentioned. I need to test whether there's any improvement after that. |
| Comment by Andreas Dilger [ 17/Feb/20 ] |
|
We shouldn't need to read the group descriptors or bitmaps just to read the mountdata file, only when writing it (usually only right after formatting, when most of the filesystem is empty). |
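To give a sense of why skipping the group descriptors matters at this scale, here is a rough back-of-the-envelope sketch. It assumes common ext4/ldiskfs defaults that are not stated in this ticket (4 KiB blocks, 32768 blocks per group, 64-byte descriptors with the 64bit feature), and it uses the 1.2PB OST size mentioned later in this ticket, taken here as 1.2 * 2^50 bytes, purely as an example. The function name `group_desc_stats` is hypothetical.

```python
# Assumed defaults, not taken from this ticket:
BLOCK_SIZE = 4096          # bytes per filesystem block
BLOCKS_PER_GROUP = 32768   # 8 * BLOCK_SIZE, i.e. one bitmap block per group
DESC_SIZE = 64             # bytes per group descriptor with the 64bit feature


def group_desc_stats(fs_bytes):
    """Return (number of block groups, descriptor table size in bytes)."""
    blocks = fs_bytes // BLOCK_SIZE
    groups = -(-blocks // BLOCKS_PER_GROUP)  # ceiling division
    return groups, groups * DESC_SIZE


if __name__ == "__main__":
    groups, table_bytes = group_desc_stats(int(1.2 * 2**50))
    print(f"{groups} groups, {table_bytes / 2**20:.0f} MiB of descriptors")
```

Under these assumptions the descriptor table runs to roughly ten million entries and several hundred MiB, before counting any bitmaps. That is the metadata a superblock-only open would not have to read.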
| Comment by Gerrit Updater [ 21/Feb/20 ] |
|
Li Dongyang (dongyangli@ddn.com) uploaded a new patch: https://review.whamcloud.com/37656 |
| Comment by Dongyang Li [ 21/Feb/20 ] |
|
It looks like libext2fs doesn't handle quota, so if we use libext2fs to write the ldd, e2fsck will complain about inconsistent usage and quota accounting on the target, so I left the write part out. We also need to install e2fsprogs-devel on the build box from now on; I'm not sure if Jenkins is set up that way.
|
| Comment by Andreas Dilger [ 21/Feb/20 ] |
|
Running e2fsck definitely handles file quota, but maybe only at the end. There is a quota library internal to e2fsprogs, but it isn't included as part of the -devel install. |
| Comment by Dongyang Li [ 21/Feb/20 ] |
|
Correct, I noticed there's a quota library but it's internal to e2fsprogs. I actually got ldiskfs_write_ldd working with libext2fs as well, with all the lookup/allocate inode/link parent/write inode dance. However, e2fsck complains about the quota on the just-created fs. Maybe we can still write the ldd with libext2fs and use e2fsck to fix the quota at the end of mkfs.lustre? |
| Comment by Gerrit Updater [ 03/Mar/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37656/ |
| Comment by Shuichi Ihara [ 09/Mar/20 ] |
|
Although the patch has already landed, here are test results without and with patch https://review.whamcloud.com/37656/ on 1 x 1.2PB OST, with the filesystem 'fake'-filled to 50% by patch https://review.whamcloud.com/#/c/37329/.
Average mount time (3 runs each):
Without patch: 72.3 sec (77s, 70s, 70s)
With patch: 48.6 sec (51s, 48s, 47s)
So the patch reduced mount time by about 33%. |
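The averages and the percentage can be rechecked from the six raw timings quoted above. A minimal sketch, with the helper name `reduction_pct` chosen only for illustration:

```python
from statistics import mean

# Mount times in seconds, as reported in the comment above.
without_patch = [77, 70, 70]
with_patch = [51, 48, 47]


def reduction_pct(before, after):
    """Percentage drop in mean time going from `before` to `after`."""
    return 100.0 * (mean(before) - mean(after)) / mean(before)


if __name__ == "__main__":
    print(f"without patch: {mean(without_patch):.1f} s")
    print(f"with patch:    {mean(with_patch):.1f} s")
    print(f"reduction:     {reduction_pct(without_patch, with_patch):.1f} %")
```

This gives means of roughly 72.3 s and 48.7 s (the comment truncates the latter to 48.6) and a reduction in mean mount time of a little under 33%.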
| Comment by Peter Jones [ 18/Mar/20 ] |
|
Landed for 2.14 |
| Comment by Gerrit Updater [ 23/Mar/20 ] |
|
Li Dongyang (dongyangli@ddn.com) uploaded a new patch: https://review.whamcloud.com/38027 |
| Comment by Gerrit Updater [ 23/Mar/20 ] |
|
Li Dongyang (dongyangli@ddn.com) uploaded a new patch: https://review.whamcloud.com/38028 |
| Comment by Gerrit Updater [ 28/May/20 ] |
|
Andreas Dilger (adilger@whamcloud.com) merged in patch https://review.whamcloud.com/38027/ |