Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.14.0

    Description

      We have observed slow mmap read performance in some applications.

      The problem arises when the access pattern is neither sequential nor strided:
      reads stay adjacent within a small range, and then the application seeks to a random position.

      So the pattern could be something like this:

      [1M data] [hole..] [0.5M data] [hole]......[0.7M data]......[1M data]

      So every time the application accesses data, it touches not just 4K but
      something close to, say, a 1M range. If we can predict this kind of behavior, then on the next page miss we can use the previous range size for prefetching.
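
      The idea described above can be illustrated with a toy model. This is only a sketch of the heuristic, not the code in the merged patches: the class `RangePrefetcher`, the helper `simulate`, the 4K page size, and the cluster offsets are all hypothetical names and values chosen for illustration. On each page miss, the model prefetches as many pages as the previous cluster of adjacent accesses covered.

```python
PAGE = 4096  # assumed page size in bytes


class RangePrefetcher:
    """Toy model: on a miss, prefetch the size of the previous access cluster."""

    def __init__(self):
        self.cached = set()      # page indexes currently held in the cache
        self.run_start = None    # lowest page of the current cluster
        self.run_end = None      # highest page of the current cluster
        self.last_run_pages = 1  # size of the previous cluster, in pages
        self.misses = 0

    def access(self, page):
        adjacent = (self.run_start is not None
                    and self.run_start - 1 <= page <= self.run_end + 1)
        if adjacent:
            # Still inside the same small range: grow the current cluster.
            self.run_start = min(self.run_start, page)
            self.run_end = max(self.run_end, page)
        else:
            # Random seek: remember how large the finished cluster was.
            if self.run_start is not None:
                self.last_run_pages = self.run_end - self.run_start + 1
            self.run_start = self.run_end = page
        if page not in self.cached:
            # Page miss: prefetch a window sized like the previous cluster.
            self.misses += 1
            self.cached.update(range(page, page + self.last_run_pages))


def simulate(clusters):
    """clusters: list of (first_page, page_count); returns the miss count."""
    prefetcher = RangePrefetcher()
    for first, count in clusters:
        for page in range(first, first + count):
            prefetcher.access(page)
    return prefetcher.misses


# Roughly the pattern from the description: 1M, 0.5M, 0.7M, 1M of data
# at scattered offsets (offsets are made up for the example).
pattern = [(0, 256), (10_000, 128), (50_000, 179), (20_000, 256)]
total_pages = sum(count for _, count in pattern)
print(f"pages touched: {total_pages}, misses with prefetch: {simulate(pattern)}")
```

      In this model the first cluster still misses on every page, because nothing has been learned yet. Later clusters take only one or two misses each, which is the effect the description is aiming for.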

          Activity

            [LU-13669] improve mmap performances

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/41228/
            Subject: LU-13669 llite: make readahead aware of hints
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 7542820698696ed5853ded30c9bf7fd5a78f0937

            gerrit Gerrit Updater added a comment

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38916/
            Subject: LU-13669 llite: try to improve mmap performance
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 0c5ad4b6df5bf35b291842fc6d42c2720246a026

            gerrit Gerrit Updater added a comment

            Wang Shilong (wshilong@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/39926
            Subject: LU-13669 readahead: increase ra window progressively
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: fc72ad754bc8fcc86ac96661329d9b074e3703e8

            gerrit Gerrit Updater added a comment

            Here are benchmark results from Ihara:

            master
            [root@amd01 ~]#  echo 3 > /proc/sys/vm/drop_caches; fio --name=randread --directory=/ai400/fio --rw=randread --ioengine=mmap --bs=128K --numjobs=32 --filesize=200G --filename=randread --time_based --status-interval=10s --runtime=30s --allow_file_create=1 --group_reporting --disable_lat=1 --disable_clat=1 --disable_slat=1 --disk_util=0 --aux-path=/tmp --randrepeat=0 --unique_filename=0 --fallocate=0 
            [root@amd01 ~]# lctl get_param llite.*.stats
            llite.ai400-ffff8d855cf6a800.stats=
            snapshot_time             1594299686.416386775 secs.nsecs
            open                      49 samples [usec] 0 1401 22183
            close                     49 samples [usec] 6 246 1117
            mmap                      32 samples [usec] 7 17 381
            page_fault                1867904 samples [usec] 0 37698 956414944
            getattr                   34 samples [usec] 1 25 107
            inode_permission          214 samples [usec] 0 22 407
            master + patch
            [root@amd01 ~]# echo 3 > /proc/sys/vm/drop_caches; fio --name=randread --directory=/ai400/fio --rw=randread --ioengine=mmap --bs=128K --numjobs=32 --filesize=200G --filename=randread --time_based --status-interval=10s --runtime=30s --allow_file_create=1 --group_reporting --disable_lat=1 --disable_clat=1 --disable_slat=1 --disk_util=0 --aux-path=/tmp --randrepeat=0 --unique_filename=0 --fallocate=0
            [root@amd01 ~]# lctl get_param llite.*.stats
            llite.ai400-ffff8d862cbc6000.stats=
            snapshot_time             1594299943.816443708 secs.nsecs
            open                      63 samples [usec] 1 4007 118809
            close                     63 samples [usec] 4 250 1410
            mmap                      32 samples [usec] 8 51 659
            page_fault                17773312 samples [usec] 0 6543 933220595
            getattr                   34 samples [usec] 2 224 440
            inode_permission          256 samples [usec] 0 276 1292
            

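            Tested over the first 30 seconds, with mostly uncached mmap reads.
            Average page_fault latency: 512 usec (master) vs 52 usec (master + patch),
            roughly a 10x latency reduction.

            Maximum page_fault latency: 37698 usec vs 6543 usec,
            so the maximum latency is also much lower.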
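
            The averages quoted above can be recomputed from the page_fault lines in the two stats dumps. The sketch below assumes the columns after the unit are min, max, and sum, which is consistent with the 512 usec and 52 usec figures in this comment. The helper `parse_stats_line` is a hypothetical name, not part of any Lustre tool.

```python
def parse_stats_line(line):
    """Parse 'name N samples [unit] min max sum' into a dict with an average."""
    fields = line.split()
    name, count = fields[0], int(fields[1])
    unit = fields[3].strip("[]")
    lo, hi, total = (int(value) for value in fields[4:7])
    return {"name": name, "count": count, "unit": unit,
            "min": lo, "max": hi, "sum": total, "avg": total / count}


# page_fault lines copied from the two runs above.
master = parse_stats_line(
    "page_fault                1867904 samples [usec] 0 37698 956414944")
patched = parse_stats_line(
    "page_fault                17773312 samples [usec] 0 6543 933220595")

for label, stats in (("master", master), ("master + patch", patched)):
    print(f"{label}: avg {stats['avg']:.1f} {stats['unit']}, "
          f"max {stats['max']} {stats['unit']}, {stats['count']} faults")

print(f"average latency ratio: {master['avg'] / patched['avg']:.2f}x")
print(f"page faults served ratio: {patched['count'] / master['count']:.2f}x")
```

            Besides the lower average and maximum latency, the patched run also served about 9.5 times as many page faults in the same 30 second window.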

            wshilong Wang Shilong (Inactive) added a comment

            Wang Shilong (wshilong@ddn.com) uploaded a new patch: https://review.whamcloud.com/38916
            Subject: LU-13669 llite: try to improve mmap performance
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 84116d4451917e861671079ddf7ef57c722565c2

            gerrit Gerrit Updater added a comment

            People

              wshilong Wang Shilong (Inactive)
              wshilong Wang Shilong (Inactive)