Lustre / LU-6823

Performance regression on servers with LU-5264

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.5.3
    • Labels: None
    • Severity: 3

    Description

      Since the introduction of LU-5264, we have hit a large performance regression on our filesystem with some user codes that do lots of IOPS. This has a huge impact on the MDS, and consequently on all the Lustre clients. On the MDS, the ptlrpcd, mdt and ldlm threads are overloaded, waiting in _spin_lock().

      The perf report recorded during a slow-down window of the FS is attached. Here is a sample from perf.report-dso:

      #
      # Overhead             Shared Object
      # ........  ........................
      #
          98.08%  [kernel.kallsyms]
                  |
                  --- _spin_lock
                     |
                     |--97.33%-- 0xffffffffa05c12dc
                     |          |
                     |          |--53.24%-- 0xffffffffa075b629
                     |          |          kthread
                     |          |          child_rip
                     |          |
                     |          |--45.63%-- 0xffffffffa075b676
                     |          |          kthread
                     |          |          child_rip
                     |          |
                     |           --1.12%-- 0xffffffffa076aa58
                     |                     kthread
                     |                     child_rip
                     |
      

      Callers:

      crash> kmem 0xffffffffa05c12dc
      ffffffffa05c12dc (t) lu_context_exit+188 [obdclass] 
      
         VM_STRUCT                 ADDRESS RANGE               SIZE
      ffff881877c91a40  ffffffffa0571000 - ffffffffa06a5000  1261568
      
            PAGE         PHYSICAL      MAPPING       INDEX CNT FLAGS
      ffffea0055920d28 1872df3000                0  e4d1079  1 140000000000000
      crash> kmem 0xffffffffa075b676
      ffffffffa075b676 (t) ptlrpc_main+2806 [ptlrpc] ../debug/lustre-2.5.3.90/lustre/ptlrpc/service.c: 2356
      
         VM_STRUCT                 ADDRESS RANGE               SIZE
      ffff881877c91400  ffffffffa06fb000 - ffffffffa0895000  1679360
      
            PAGE         PHYSICAL      MAPPING       INDEX CNT FLAGS
      ffffea004b685228 158b853000                0      110  1 140000000000000
      crash> kmem 0xffffffffa075b629
      ffffffffa075b629 (t) ptlrpc_main+2729 [ptlrpc] ../debug/lustre-2.5.3.90/lustre/ptlrpc/service.c: 2534
      
         VM_STRUCT                 ADDRESS RANGE               SIZE
      ffff881877c91400  ffffffffa06fb000 - ffffffffa0895000  1679360
      
            PAGE         PHYSICAL      MAPPING       INDEX CNT FLAGS
      ffffea004b685228 158b853000                0      110  1 140000000000000
      

      This _spin_lock() was introduced by LU-5264, see http://review.whamcloud.com/#/c/13103/ . I can't find any backport to b2_5 in Gerrit, but it seems that our engineering team got the agreement to use this patch with Lustre 2.5.
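      To illustrate the pattern we are hitting, here is a minimal userspace sketch (a simulation only, not Lustre code; the thread counts and the keys_guard name are just stand-ins): every service thread ends up in lu_context_exit() for each RPC, and if that path takes a single global spinlock, all the ptlrpc/mdt/ldlm threads serialize on it.

      /*
       * Simulation of the contention pattern (NOT Lustre code):
       * many "service threads" each handle requests, and the context-exit
       * step takes one global spinlock, so all threads serialize on it.
       * Build: gcc -O2 -pthread contention_sketch.c -o contention_sketch
       */
      #include <pthread.h>
      #include <stdio.h>

      #define NR_THREADS  64          /* roughly "many MDS service threads" */
      #define NR_REQUESTS 100000      /* per-thread request count */

      static pthread_spinlock_t keys_guard;   /* stand-in for the global lock */
      static volatile unsigned long nr_exits;

      /* Stand-in for lu_context_exit(): per-request work done under the lock. */
      static void context_exit(void)
      {
              pthread_spin_lock(&keys_guard);
              nr_exits++;                     /* walk/clean per-context keys */
              pthread_spin_unlock(&keys_guard);
      }

      static void *service_thread(void *arg)
      {
              (void)arg;
              for (int i = 0; i < NR_REQUESTS; i++) {
                      /* ... handle one RPC (lock-free part elided) ... */
                      context_exit();         /* every request ends up here */
              }
              return NULL;
      }

      int main(void)
      {
              pthread_t tid[NR_THREADS];

              pthread_spin_init(&keys_guard, PTHREAD_PROCESS_PRIVATE);
              for (int i = 0; i < NR_THREADS; i++)
                      pthread_create(&tid[i], NULL, service_thread, NULL);
              for (int i = 0; i < NR_THREADS; i++)
                      pthread_join(tid[i], NULL);
              printf("done: %lu context exits\n", nr_exits);
              return 0;
      }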

      For now, as a workaround, we have removed both the LU-5264 and LU-6049 patches.

      My guess is that this can happen on the OSSes as well.

      Could you help us fix this? Are we missing some patches on top of 2.5.3.90? Is there any other conflicting patch?

      I have attached some debug logs (dump_log.tgz) captured on the MDS during the observed issue, in case they help.

      If you need further information, please let me know.

      Attachments

        1. dump_log.tgz
          3.20 MB
        2. perf.report.gz
          2.45 MB
        3. perf.report-dso
          1.69 MB

        Issue Links

          Activity


            adilger Andreas Dilger added a comment - Closing as a duplicate of LU-6800.

            bruno.travouillon Bruno Travouillon (Inactive) added a comment - Indeed, this is a duplicate of LU-6800. Gaëtan is working on this issue with me, feel free to contact him if you need further information. A crash dump is available at the customer site if you need more "real-world" data.
            bfaccini Bruno Faccini (Inactive) added a comment - edited

            Hello Bruno,
            This is likely a duplicate of LU-6800, and my LU-5264 patch is the culprit here.
            A first fix (changing the spin-lock into a rw-lock) is currently under testing, along with other possible ways to fix/improve this.
            Will let you know ASAP how it goes.
            Also, thanks for your "real-world" profiling info.

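            For illustration only, here is a minimal userspace sketch of the spin-lock to rw-lock direction mentioned above (an assumed simulation, not the actual patch): the hot per-request exit path takes the lock as a reader and runs concurrently, while the rare key registration/quiesce path is the only writer.

            /*
             * Sketch of the proposed direction (NOT the actual fix): replace
             * the global spinlock with a rwlock so the hot per-request path
             * runs concurrently as readers, while the rare key quiesce path
             * takes the lock exclusively.
             * Build: gcc -O2 -pthread rwlock_sketch.c -o rwlock_sketch
             */
            #include <pthread.h>
            #include <stdio.h>

            static pthread_rwlock_t keys_guard = PTHREAD_RWLOCK_INITIALIZER;
            static int keys_version;        /* stand-in for the global key table */

            /* Hot path: runs on every RPC, many threads in parallel. */
            static void context_exit(void)
            {
                    pthread_rwlock_rdlock(&keys_guard);
                    (void)keys_version;     /* read-only walk of the key table */
                    pthread_rwlock_unlock(&keys_guard);
            }

            /* Cold path: rare, e.g. a module quiescing one of its keys. */
            static void key_quiesce(void)
            {
                    pthread_rwlock_wrlock(&keys_guard);
                    keys_version++;         /* exclusive update of the key table */
                    pthread_rwlock_unlock(&keys_guard);
            }

            int main(void)
            {
                    context_exit();
                    key_quiesce();
                    printf("keys_version=%d\n", keys_version);
                    return 0;
            }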

            People

              Assignee: bfaccini Bruno Faccini (Inactive)
              Reporter: bruno.travouillon Bruno Travouillon (Inactive)
              Votes: 0
              Watchers: 5

              Dates

                Created:
                Updated:
                Resolved: