[LU-1057] low performance maybe related to quota

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.3.0, Lustre 2.1.4
    • Affects Version/s: None
    • Environment: Lustre 2.1 with Bull patches, bullxlinux6.1 x86_64 (based on Red Hat 6.1), server bullx S6010-4
    • Severity: 3
    • Rank: 4513

    Description

      When running a performance test (sequential data IOs, 15 tasks, each writing to its own file) on a Lustre file system installed with Lustre 2.1 plus a few Bull patches, I observe very low throughput compared to what I usually measure on the same hardware.

      Write bandwidth varies between 150 MB/s and 500 MB/s when running as a standard user. With the exact same parameters and configuration, but running as the root user, I get around 2000 MB/s write bandwidth, which is the value I usually observe.
      With the root user, I suppose the OBD_BRW_NOQUOTA flag is set (although I have not been able to confirm that from the source code), which makes the request processing skip the lquota_chkdq() quota check in osc_queue_async_io().
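
      For reference, this is how I understand that code path; a simplified sketch from my reading of the 2.1 client code, not the verbatim source:

        /* sketch: quota check in osc_queue_async_io(); root IO carries
         * OBD_BRW_NOQUOTA, so the whole block is skipped for root */
        if (!(cmd & OBD_BRW_NOQUOTA)) {
                unsigned int qid[MAXQUOTAS];

                qid[USRQUOTA] = attr.cat_uid;
                qid[GRPQUOTA] = attr.cat_gid;
                if (lquota_chkdq(quota_interface, cli, qid) == NO_QUOTA)
                        rc = -EDQUOT;   /* user or group over quota */
        }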

      Profiling of the Lustre client indicates that more than 50% of the time is spent in the osc_quota_chkdq() routine. So the problem seems related to the quota subsystem, which certainly explains why the root user is not impacted. I will attach the profiling reports to this ticket.

      The Lustre client is a bullx S6010-4, which has 128 cores and a large NUMIOA factor. The same performance measurement on a bullx S6010, which has only 32 cores and a smaller NUMIOA factor, gives around 3000 MB/s write bandwidth, so it is not impacted by this issue.

      I have recompiled the lquota module after removing the cfs_spin_lock()/cfs_spin_unlock() calls on qinfo_list_lock in the osc_quota_chkdq() routine, and performance is back to the expected level. Note that the qinfo_hash[] table on the Lustre client is empty, since quotas are disabled.
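
      For clarity, a simplified sketch of the routine as I read it (not the verbatim 2.1 source); the two marked lines are the ones I removed in my experiment:

        static int osc_quota_chkdq(struct client_obd *cli,
                                   const unsigned int qid[])
        {
                unsigned int id;
                int cnt, rc = QUOTA_OK;

                cfs_spin_lock(&qinfo_list_lock);           /* <-- removed */
                for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
                        struct osc_quota_info *oqi;

                        id = (cnt == USRQUOTA) ? qid[USRQUOTA] : qid[GRPQUOTA];
                        /* qinfo_hash[] is empty since quotas are disabled,
                         * so this lookup never finds anything */
                        oqi = find_qinfo(cli, id, cnt);
                        if (oqi) {
                                rc = NO_QUOTA;
                                break;
                        }
                }
                cfs_spin_unlock(&qinfo_list_lock);         /* <-- removed */

                return rc;
        }

      Every page queued for asynchronous write goes through this routine, so on a 128-core machine the cache line holding qinfo_list_lock bounces between sockets even though the lookup never finds anything.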

      How many asynchronous IO requests can be generated by only 15 writing tasks? Are there so many requests in parallel that qinfo_list_lock becomes a contention point?
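
      As a rough illustration (figures assumed, not measured): the client queues one asynchronous IO per dirty page, so if each of the 15 tasks writes, say, 10 GiB in 4 KiB pages, that is 15 × (10 × 2^30 / 2^12) ≈ 39 million calls to osc_queue_async_io(), each of them taking and releasing qinfo_list_lock.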

      Is there more latency in the spin_lock()/spin_unlock() routines when the NUMIOA factor is high?

      Attachments

        1. oprofile.client.S6010.report.txt
          267 kB
          Sebastien Buisson
        2. oprofile.client.S6010-4.report.txt
          243 kB
          Sebastien Buisson
        3. oprofile.client.S6010-4.root.report.txt
          241 kB
          Sebastien Buisson

        Activity

          pjones Peter Jones made changes -
          Fix Version/s New: Lustre 2.1.4 [ 10158 ]

          pichong Gregoire Pichon added a comment -

          I have backported the patch to b2_1: http://review.whamcloud.com/#change,4184.

          The tests show that the contention on quota (the osc_quota_chkdq() routine) has been fixed.

          Could this patch be reviewed?

          Thanks.
          jlevi Jodi Levi (Inactive) made changes -
          Resolution New: Fixed [ 1 ]
          Status Original: Open [ 1 ] New: Resolved [ 5 ]

          jlevi Jodi Levi (Inactive) added a comment -

          Please let me know if there is outstanding work on this ticket.
          jlevi Jodi Levi (Inactive) made changes -
          Fix Version/s New: Lustre 2.3.0 [ 10117 ]

          hongchao.zhang Hongchao Zhang added a comment -

          The patch has been merged (commit 1b044fecb42c1f72ca2d2bc2bf80a4345b9ccf11).

          hongchao.zhang Hongchao Zhang added a comment -

          Status update: the updated patch using cfs_hash_t is under test.
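
          For context, the cfs_hash_t approach replaces the single qinfo_list_lock-protected qinfo_hash[] with a libcfs hash table that takes per-bucket locks, so concurrent osc_quota_chkdq() lookups no longer serialize on one global spinlock. A rough sketch of the kind of setup involved; the HASH_QUOTA_* constants and quota_hash_ops below are illustrative placeholders, not the actual patch:

            /* per-client hash, one per quota type; per-bucket locking in
             * cfs_hash removes the global qinfo_list_lock bottleneck
             * (constants and ops are illustrative placeholders) */
            cli->cl_quota_hash[type] = cfs_hash_create("QUOTA_HASH",
                                                       HASH_QUOTA_CUR_BITS,
                                                       HASH_QUOTA_MAX_BITS,
                                                       HASH_QUOTA_BKT_BITS, 0,
                                                       CFS_HASH_MIN_THETA,
                                                       CFS_HASH_MAX_THETA,
                                                       &quota_hash_ops,
                                                       CFS_HASH_DEFAULT);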
          pjones Peter Jones made changes -
          Assignee Original: Jian Yu [ yujian ] New: Hongchao Zhang [ hongchao.zhang ]
          pjones Peter Jones added a comment -

          Reassign to Hongchao

          pjones Peter Jones made changes -
          Assignee Original: WC Triage [ wc-triage ] New: Jian Yu [ yujian ]

          People

            Assignee: hongchao.zhang Hongchao Zhang
            Reporter: pichong Gregoire Pichon
            Votes: 0
            Watchers: 8
