  1. Lustre
  2. LU-2032

small random read i/o performance regression

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.1.3
    • Labels: None
    • Environment: 2.1.3 server with 2.1.3 and 1.8.8 clients. centos6, x86_64, QDR IB
    • Severity: 3
    • Rank (Obsolete): 4173

    Description

      I've been doing some 2.1.3 pre-rollout testing, and there seems to be a client problem with small random reads. Performance is considerably worse on 2.1.3 clients than on 1.8.8 clients: it's about a 35x slowdown for 4k random read i/o.

      Tests use the same files on a rhel6 x86_64 2.1.3 server (stock 2.1.3 rpm), QDR IB fabric, with a single disk or an md8+2 lun as the OST. All client and server VFS caches are dropped between trials.

      Checksums on or off, and 8 or 32 client RPCs in flight, make little difference. I've also tried umount'ing the fs from the 1.8.8 client between tests to make sure there's no hidden caching, but that didn't change anything.
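      The per-trial settings described above could be scripted roughly as follows. This is only a sketch, not the reporter's actual script: the helper names `run`, `set_client_tunables` and `drop_caches` are hypothetical, and it assumes the standard `osc.*.checksums` and `osc.*.max_rpcs_in_flight` client parameters and the Linux `/proc/sys/vm/drop_caches` interface. By default it only prints the commands (DRY_RUN=1); running them for real needs root on a Lustre client.

```shell
#!/bin/sh
# Sketch only: DRY_RUN=1 (the default) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: $1 = checksums (0 or 1), $2 = RPCs in flight (e.g. 8 or 32).
set_client_tunables() {
    run lctl set_param "osc.*.checksums=$1"
    run lctl set_param "osc.*.max_rpcs_in_flight=$2"
}

# Hypothetical helper: flush dirty data, then drop the VFS page/dentry/inode caches.
drop_caches() {
    run sync
    run sh -c 'echo 3 > /proc/sys/vm/drop_caches'
}

# Example trial setup: checksums off, 32 RPCs in flight, cold caches.
set_client_tunables 0 32
drop_caches
```

      The cache drop would need to be run on the server as well as the client to match the "all client & server VFS caches dropped" condition.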

      random read ->
      IOR -a POSIX -C -r -F -k -e -t $i -z -b 1024m -o /mnt/yo96/rjh/blah
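      As a sketch of how the command above is swept over transfer sizes, the loop below substitutes `$i` with the two sizes reported in the results. The function name `ior_random_read` and the `TESTFILE` and `DRY_RUN` variables are made up for illustration. Because the command uses `-r` (read only), it assumes the files were written and kept by an earlier run. It prints the commands by default rather than executing them.

```shell
#!/bin/sh
# Sketch only: DRY_RUN=1 (the default) prints the IOR command lines.
DRY_RUN="${DRY_RUN:-1}"
TESTFILE="${TESTFILE:-/mnt/yo96/rjh/blah}"

# Hypothetical helper: $1 = transfer size passed to -t (e.g. 4k or 1M).
ior_random_read() {
    # Same flags as the reported command; -z selects random offsets.
    set -- IOR -a POSIX -C -r -F -k -e -t "$1" -z -b 1024m -o "$TESTFILE"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# The two transfer sizes reported in the results.
for i in 4k 1M; do
    ior_random_read "$i"
done
```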

      i/o size   client version   single disk lun   md8+2 lun
      i=4k       1.8.8            20.70 MB/s        22.0 MB/s
      i=4k       2.1.3             0.55 MB/s         0.6 MB/s
      i=1M       1.8.8            87 MB/s           137 MB/s
      i=1M       2.1.3            63 MB/s            83 MB/s
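      For reference, the slowdown factors implied by these figures can be worked out as the 1.8.8 throughput divided by the 2.1.3 throughput. The `slowdown` helper below is just an illustrative name. The 4k cases come out at roughly 37x, in line with the "about 35x" quoted above, while the 1M cases are under 2x.

```shell
#!/bin/sh
# Hypothetical helper: $1 = 1.8.8 throughput, $2 = 2.1.3 throughput (both MB/s).
slowdown() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", old / new }'
}

slowdown 20.70 0.55   # 4k, single disk lun
slowdown 22.0 0.6     # 4k, md8+2 lun
slowdown 87 63        # 1M, single disk lun
slowdown 137 83       # 1M, md8+2 lun
```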

      Although these numbers are for a single process, the same trend applies when the IOR is scaled up to 8 processes/node and to multiple nodes.

            People

              Assignee: wc-triage WC Triage
              Reporter: rjh Robin Humble (Inactive)