Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-16321

Rewrite regression discovered using obdfilter-survey on OST with Lustre 2.15 but also impacts fiemap logic performance

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.16.0
    • None
    • None
    • 2
    • 9223372036854775807

    Description

      On 2.12, we observe a 30 GiB/s throughput, while on 2.15, we see 25 GiB/s.

      Both perf captures and ftrace data (with the help of git blame) have led me down the path to a root cause:

      commit d0337cab8e845efcdbfb9e26e573feb18f28e303
      Author: Mr NeilBrown <neilb@suse.de>
      Date:   Wed Dec 9 13:00:16 2020 +1100
       
          LU-14195 osd: don't use set_fs() for ->fiemap() calls.
       
          ->fiemap() only accesses kernel-space data, so does not need, and
          never has needed, set_fs() calls.
          In Linux 5.10, these calls are deprecated.
          So remove the unnecessary code.

      If I revert the above commit for LU-14195, the obdfilter-survey rewrite throughput is immediately restored.

      Reverting this commit, performance is actually a bit improved to 31 GiB/s.

      Note - Example obdfilter-survey command:

       tests_str="write rewrite" nobjlo=1 nobjhi=1 thrlo=1024 thrhi=1024 rszlo=4096 rszhi=4096 size=524288 obdfilter-survey

      After speaking with Alexander Boyko about this exact issue and the impact it will have once we move to a post 5.10 Linux kernel:

      Petros Koutoupis  8:06 AM
      Hello. I do have a question though. Does it make sense to you how d0337cab8 would impact the obdflilter-survey rewrites (by that much too)?
      
      Alexander Boyko  8:18 AM
      yeap I had a discussion with team. obdfilter executes from ioctl and doesnot have KERNEL_DS
      
      Petros Koutoupis  8:18 AM
      I ask only because in 5.10 set_fs is deprecated
      ok
      
      Alexander Boyko  8:20 AM
      so without a revert it fall to copy_from_user, it returns EFAULT and fiemap handle this as worst case
      this should not affect general Lustre IO 

      More comments from Alexander:

      Unfortunately the LU-14195 patch has a more serious impact on whole Lustre, not only obdfilter-survey. Yesterday I discussed it with Andrew Perepechko, the current idea is

       ldiskfs_fiemap..()->..->copy_to_user() 

      returns EFAULT. EFAULT leads to unmapped block logic, so Lustre would call block allocator, calculate grants, call quota logic for overwrite. All Lustre ioctl requests are affected. There is no simple way for fixing it at new kernels, since all parts with set_fs() were changed to ITER_KVEC. But using set_fs() for fiemap is only Lustre trick, and internal fiemap logic does not support iterator. I think LU-14195 should be reverted until 5.x kernel.

      Attachments

        Activity

          People

            koutoupis Petros Koutoupis
            koutoupis Petros Koutoupis
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: