  1. Lustre
  2. LU-4507

Server hangs and terrible performance - ZFS IOR

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Blocker
    • None
    • Lustre 2.6.0
    • None
    • Hyperion/LLNL
    • 3
    • 12333

    Description

      For some time now we have been observing terrible read performance when running ZFS IOR file-per-process. The system sees ~7 GB/s reading with ldiskfs; at higher client counts, ZFS read performance on this test drops to ~400 MB/s, which is roughly the level of a single client.
      Observing the OSTs, we typically see one or two of the 12 OSTs with a very high load while the rest are idle. The busy OST will then time out, frequently evict several clients, and move forward. Stack dumps and errors from two servers are attached. These tests are ongoing; please advise what further data needs to be collected.

      Attachments

        1. h-agb15.errors.txt
          9 kB
          Cliff White
        2. h-agb15.log.dump.txt
          1.66 MB
          Cliff White
        3. h-agb21.zfs.read.txt
          1.41 MB
          Cliff White
        4. MDTEST performance.xlsx
          35 kB
          Cliff White

            People

              isaac Isaac Huang (Inactive)
              cliffw Cliff White (Inactive)