improve mount time on huge ldiskfs filesystem

    Description

      during Lustre server startup a few small files need to be updated (e.g. the config backup).
      at this point the buddy/bitmap cache is empty, but mballoc wants to find a big chunk of free space for group preallocation and reads the bitmaps one by one.
      on a huge filesystem this can take a very long time.
      one possible workaround is to disable preallocation during mount.
      the long-term plan is to limit scanning and prefetch bitmaps, but this needs more effort and will be tracked separately (LU-12970).
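
      To give a feel for the cost described above, here is a minimal toy model, not the ldiskfs/mballoc code. It assumes a default layout in which one bitmap block describes one group (128 MiB per group with 4 KiB blocks) and a plain linear scan over groups; `group_count`, `cold_scan`, and the free-extent numbers are hypothetical names and values chosen for illustration. Real mballoc uses buddy data and several search criteria, so this only sketches why a large preallocation request on a cold cache can touch many bitmaps while a small request may not.

      ```python
      import math


      def group_count(fs_bytes, block_size=4096):
          """Number of block groups, assuming one bitmap block per group.

          One bitmap block tracks 8 * block_size blocks, so a group spans
          8 * block_size * block_size bytes (128 MiB for 4 KiB blocks).
          """
          return math.ceil(fs_bytes / (8 * block_size * block_size))


      def cold_scan(largest_free, wanted):
          """Linear scan with an empty cache: every visited group costs one bitmap read.

          largest_free[g] is the largest free extent (in blocks) in group g.
          Returns (group index or None, number of bitmap reads).
          """
          for reads, largest in enumerate(largest_free, start=1):
              if largest >= wanted:
                  return reads - 1, reads
          return None, len(largest_free)


      # Hypothetical layout: 999 fragmented groups, then one group with a big free extent.
      groups = [8] * 999 + [2048]

      print(cold_scan(groups, 2048))   # big preallocation request: (999, 1000)
      print(cold_scan(groups, 4))      # small request, preallocation off: (0, 1)
      print(group_count(280 * 2**40))  # groups in a 280 TiB filesystem: 2293760
      ```

      In this model the small request is satisfied by the first bitmap read, which is the intuition behind disabling preallocation during mount; how close a real filesystem comes to either extreme depends on its fragmentation.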

          Activity

            [LU-12988] improve mount time on huge ldiskfs filesystem

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37539/
            Subject: LU-12988 ldiskfs: skip non-loaded groups at cr=0/1
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: de994667dda925109e862edadb4aa4feaecd0e6b

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37811/
            Subject: LU-12988 ldiskfs: port ext4-mballoc-prefetch.patch to RHEL 8.1
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 896e12c2e4fc98cbc15c675ec2894e9511aa92a7

            Jian Yu (yujian@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37811
            Subject: LU-12988 ldiskfs: port ext4-mballoc-prefetch.patch to RHEL 8.1
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 89fd17997924aafb78ab0cc24debaf12b0e17b87

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37633/
            Subject: LU-12988 ldiskfs: mballoc to prefetch groups
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: b7cd65a3d1d665f1bee5eb8ad3b989b12be7de08

            Alex, please see LU-13290; there is another performance regression in ldiskfs. I think there are two major performance regressions in ldiskfs. Although the patch at https://review.whamcloud.com/#/c/37619 showed odd behavior, we also need to fix the other regression, which is caused by LU-12988.

            sihara Shuichi Ihara added the comment above

            thanks for the report. did performance get back to normal with the mballoc-prefetch patch reverted, or is it still below expectation?

            bzzz Alex Zhuravlev added the comment above

            Although patch https://review.whamcloud.com/#/c/37619 was just reverted for another reason, that patch also caused a big performance regression on a large OST (280TB).
            During IOR (FPP, 1MB), there were only a few IOs before it stalled, then a few more IOs and another stall, as shown below. Is the new revised patch aware of this?

            procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
             r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
             1  0      0 45456188 9647936 1017916    0    0    15 10610   27   10  0  2 98  0  0
             0  0      0 45456312 9647936 1017916    0    0     0     4  205  200  0  0 100  0  0
             0  0      0 45456312 9647936 1017916    0    0     0     4  150  142  0  0 100  0  0
             0  0      0 45456312 9647936 1017916    0    0     0     4  131  113  0  0 100  0  0
             0  0      0 45456312 9647936 1017916    0    0     0     4  269  360  0  0 100  0  0
             0  0      0 45456288 9647936 1017916    0    0     0     0  370  527  0  0 100  0  0
             1  1      0 45440948 9648208 1018852    0    0     0 1955144 19798 24610  0  5 93  2  0
             9  1      0 45375612 9648208 1018852    0    0     0 3232300 33648 35086  0 27 67  6  0
             9  0      0 45379668 9648208 1018852    0    0     0  1868 14431 8833  0 50 46  3  0
             9  0      0 45379460 9648216 1018844    0    0     0    16 8694  844  0 50 50  0  0
             9  0      0 45379460 9648216 1018852    0    0     0     0 8641  800  0 50 50  0  0
             9  0      0 45379460 9648224 1018852    0    0     0  1396 9604  838  0 50 50  0  0
             8  0      0 45379460 9648224 1018852    0    0     0 16388 12385  909  0 47 53  0  0
             2  0      0 45380196 9648224 1018856    0    0     0 79060 11568 1360  0 41 59  0  0
             6  0      0 45378456 9648232 1018852    0    0     0 640900 14445 7692  0 30 68  2  0
             6  0      0 45378784 9648232 1018856    0    0     0     0 7422  751  0 38 62  0  0
             6  0      0 45378784 9648232 1018856    0    0     0     4 7304  717  0 38 63  0  0
             4  0      0 45378784 9648232 1018856    0    0     0 61820 7182 1135  0 35 65  0  0
             7  0      0 45378832 9648240 1018860    0    0     0 692700 10136 5355  0 27 72  1  0
             7  0      0 45378832 9648240 1018860    0    0     0   284 10282  844  0 38 62  0  0

            sihara Shuichi Ihara added the comment above
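
            One way to pick the stall intervals out of a `vmstat` capture like the one above is to flag samples where processes are runnable but almost nothing is being written. The sketch below is only an illustration: `parse_vmstat`, `stalled_samples`, and both thresholds are hypothetical choices, and the sample rows are copied from the report. A low `bo` with a non-zero `r` is one possible reading of "stuck", not a diagnosis.

            ```python
            SAMPLE = """\
            procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
             r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
             0  0      0 45456312 9647936 1017916    0    0     0     4  205  200  0  0 100  0  0
             9  1      0 45375612 9648208 1018852    0    0     0 3232300 33648 35086  0 27 67  6  0
             9  0      0 45379460 9648216 1018844    0    0     0    16 8694  844  0 50 50  0  0
             9  0      0 45379460 9648216 1018852    0    0     0     0 8641  800  0 50 50  0  0
            """


            def parse_vmstat(text):
                """Parse vmstat output into a list of dicts keyed by column name."""
                rows, header = [], None
                for line in text.splitlines():
                    fields = line.split()
                    if not fields or fields[0] == "procs":
                        continue  # skip blank lines and the group banner
                    if fields[0] == "r":
                        header = fields  # column names: r b swpd free ...
                        continue
                    if header and len(fields) == len(header):
                        rows.append(dict(zip(header, map(int, fields))))
                return rows


            def stalled_samples(rows, bo_threshold=1000, min_runnable=1):
                """Indexes of samples with runnable processes but little block output.

                Both thresholds are assumptions; tune them for the workload.
                """
                return [
                    i
                    for i, row in enumerate(rows)
                    if row["r"] >= min_runnable and row["bo"] < bo_threshold
                ]


            print(stalled_samples(parse_vmstat(SAMPLE)))  # [2, 3]
            ```

            The idle sample at index 0 is not flagged because nothing is runnable there; only the samples with `r` at 9 and `bo` near zero are reported.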

            Alex Zhuravlev (bzzz@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37633
            Subject: LU-12988 ldiskfs: mballoc to prefetch groups
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: d8e16c01a5d2d4b6448c3a766bf83ade35268f9d

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37619/
            Subject: LU-12988 ldiskfs: revert prefetch patch
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 2c5700fcb4cb15056dc901fedf97001d9b9fd845

            Alex Zhuravlev (bzzz@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37619
            Subject: LU-12988 ldiskfs: revert prefetch patch
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 353e0756371f92a7f0f2da80b2a803b278126bec

            People

              bzzz Alex Zhuravlev
              bzzz Alex Zhuravlev