
mount.lustre for large filesystem runs slow debugfs commands

Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.14.0

    Description

      Running strace on mount.lustre shows an interesting reason why it spends a huge amount of time on a large filesystem. mount.lustre invokes several 'debugfs' commands internally, and they took a huge amount of time here. Here is an example.

      10056 10:35:21.590086 execve("/sbin/debugfs", ["debugfs", "-c", "-R", "stat CONFIGS/mountdata", "/dev/sda"]
      10056 10:39:00.257586 +++ exited with 0 +++
      
      10134 10:39:00.259343 execve("/bin/sh", ["sh", "-c", "debugfs -c -R 'dump /CONFIGS/mou"...]
      10134 10:42:39.116052 +++ exited with 0 +++
      

      Total mount time was 982 sec in this run, and those two debugfs calls alone accounted for 419 sec of it (about 43%).
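To measure how long each traced process lived, the timestamps in a trace like the excerpt above can be subtracted per PID. The following is a minimal sketch, not part of mount.lustre or of the patch: the function name `process_durations` is made up for illustration, and it assumes the trace has the `PID HH:MM:SS.usec ...` layout shown in the excerpt and does not cross midnight.

```python
from datetime import datetime
from typing import Dict

# The four lines copied from the strace excerpt in the description.
STRACE_LINES = """\
10056 10:35:21.590086 execve("/sbin/debugfs", ["debugfs", "-c", "-R", "stat CONFIGS/mountdata", "/dev/sda"]
10056 10:39:00.257586 +++ exited with 0 +++
10134 10:39:00.259343 execve("/bin/sh", ["sh", "-c", "debugfs -c -R 'dump /CONFIGS/mou"...]
10134 10:42:39.116052 +++ exited with 0 +++
"""


def process_durations(trace: str) -> Dict[int, float]:
    """Return seconds between the first execve and the exit line of each PID."""
    starts = {}
    durations = {}
    for line in trace.splitlines():
        parts = line.split(maxsplit=2)
        # Skip blank or unrelated lines that do not start with a numeric PID.
        if len(parts) < 3 or not parts[0].isdigit():
            continue
        pid, stamp, rest = int(parts[0]), parts[1], parts[2]
        when = datetime.strptime(stamp, "%H:%M:%S.%f")
        if rest.startswith("execve("):
            # Keep only the first execve seen for a PID.
            starts.setdefault(pid, when)
        elif rest.startswith("+++ exited") and pid in starts:
            durations[pid] = (when - starts[pid]).total_seconds()
    return durations


if __name__ == "__main__":
    for pid, seconds in process_durations(STRACE_LINES).items():
        print(f"pid {pid}: {seconds:.1f} s")
```

On the two calls in the excerpt, each one comes out at a little under 219 seconds.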

      These two calls just read the CONFIGS/mountdata file from the filesystem. That could be done directly via libext2fs, as could many of the other operations in libmount_utils_ldiskfs.c, rather than launching external binaries to do the work. We should take care not to load the whole filesystem metadata, if that can be avoided.

          Activity

            [LU-13241] mount.lustre for large filesystem runs slow debugfs commands
            pjones Peter Jones added a comment -

            Landed for 2.14


            sihara Shuichi Ihara added a comment -

            Although the patch has already landed, here are test results without and with patch https://review.whamcloud.com/37656/

            Setup: 1 x 1.2PB OST, with the filesystem 'fake'-filled to 50% using patch https://review.whamcloud.com/#/c/37329/.

            Average mount time (3 runs each)
            Without patch: 72.3 sec (77s,70s,70s)
            With    patch: 48.6 sec (51s,48s,47s)

            So the patch cut mount time by about 33%.
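
The averages and the improvement figure in this comment can be rechecked from the three runs quoted for each case. This is only an arithmetic sketch; the function name `summarize` is made up for illustration, and the inputs are the per-run times from the comment.

```python
from statistics import mean

# Per-run mount times in seconds, as quoted in the comment.
without_patch = [77, 70, 70]
with_patch = [51, 48, 47]


def summarize(before, after):
    """Return (mean before, mean after, fractional time reduction, speed ratio)."""
    avg_before = mean(before)
    avg_after = mean(after)
    reduction = (avg_before - avg_after) / avg_before
    ratio = avg_before / avg_after
    return avg_before, avg_after, reduction, ratio


if __name__ == "__main__":
    avg_before, avg_after, reduction, ratio = summarize(without_patch, with_patch)
    print(f"without patch: {avg_before:.1f} s")
    print(f"with patch:    {avg_after:.1f} s")
    print(f"mount time reduced by {reduction:.0%} ({ratio:.2f}x faster)")
```

The means come out at about 72.3 s and 48.7 s (the comment truncates the latter to 48.6), which is a reduction in mount time of roughly 33%, or about 1.49 times faster.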

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37656/
            Subject: LU-13241 utils: use libext2fs for ldiskfs operations
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 7dc8aa7e7848f800c54eb18ecd59d665484ce822

            dongyang Dongyang Li added a comment -

            Correct, I noticed there's a quota library, but it's internal to e2fsprogs. I actually got ldiskfs_write_ldd working with libext2fs as well, with the whole lookup/allocate inode/link parent/write inode dance. However, e2fsck complains about the quota on the just-created fs. Maybe we can still write the ldd with libext2fs and use e2fsck to fix the quota at the end of mkfs.lustre?

            adilger Andreas Dilger added a comment -

            Running e2fsck definitely handles file quota, but maybe only at the end. There is a quota library internal to e2fsprogs, but it isn't included as part of the -devel install.

            dongyang Dongyang Li added a comment -

            Looks like libext2fs doesn't handle quota, so if we use libext2fs to write the ldd, e2fsck will complain about inconsistent usage and quota accounting on the target. I therefore left the write part out.

            We also need to install e2fsprogs-devel on the build box from now on. I'm not sure if Jenkins is set up that way.

            gerrit Gerrit Updater added a comment -

            Li Dongyang (dongyangli@ddn.com) uploaded a new patch: https://review.whamcloud.com/37656
            Subject: LU-13241 utils: use libext2fs for ldiskfs operations
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: bd12ea6a75e185f7d7c457164a0e0c528c26a824

            adilger Andreas Dilger added a comment -

            We shouldn't need to read the group descriptors or bitmaps just to read the mountdata file, only when writing it (usually only right after formatting, when most of the filesystem is empty).

            dongyang Dongyang Li added a comment -

            That commit will reduce the time for tune2fs and dumpe2fs -h, so we should see an improvement when enabling the quota and mmp features.

            debugfs doesn't use that flag, and if we are reading/writing the mountdata, we need to read the group descriptors anyway.

            I'm working on making libmount_utils_ldiskfs use libext2fs, but we still need to do ext2fs_open() like you mentioned. I need to test whether there's any improvement after that.

            adilger Andreas Dilger added a comment -

            It may be that commit v1.45.4-16-ge6069a05 "Teach ext2fs_open2() to honor the EXT2_FLAG_SUPER_ONLY flag" in upstream e2fsprogs could improve the performance of debugfs in this case, since it will avoid reading the group descriptors.

            People

              dongyang Dongyang Li
              adilger Andreas Dilger