  1. Lustre
  2. LU-4257

parallel dds are slower than serial dds

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.9.0
    • Affects Version/s: Lustre 2.5.0
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 11618

    Description

      Sanger has an interesting test in which 20 processes read the same file. They first run the readers in parallel and then, after flushing the cache, run them serially. They expect the serial and parallel runs to take about the same amount of time. What they see, however, is that the parallel reads take roughly 40 to 50% longer than the serial reads (3m36s against 2m31s in the runs below):

      client1# cat readfile.sh
      #!/bin/sh
      
      dd if=/lustre/scratch110/sanger/jb23/test/delete bs=4M of=/dev/null
      
      client1# for i in $(seq -w 1 20)
      do
        # start each reader in the background so that all 20 run concurrently
        (time "$LOC/readfile.sh") > "$LOC/results/${i}_out" 2>&1 &
      done
      

      In parallel

      01_out:real 3m36.228s
      02_out:real 3m36.227s
      03_out:real 3m36.226s
      04_out:real 3m36.224s
      05_out:real 3m36.224s
      06_out:real 3m36.224s
      07_out:real 3m36.222s
      08_out:real 3m36.221s
      09_out:real 3m36.228s
      10_out:real 3m36.222s
      11_out:real 3m36.220s
      12_out:real 3m36.220s
      13_out:real 3m36.228s
      14_out:real 3m36.219s
      15_out:real 3m36.217s
      16_out:real 3m36.218s
      17_out:real 3m36.214s
      18_out:real 3m36.214s
      19_out:real 3m36.211s
      20_out:real 3m36.212s

      A serial read (I expect all of the time to be spent in the first read):

      grep -i real *_serial
      01_out_serial:real 2m31.372s
      02_out_serial:real 0m1.190s
      03_out_serial:real 0m0.654s
      04_out_serial:real 0m0.562s
      05_out_serial:real 0m0.574s
      06_out_serial:real 0m0.570s
      07_out_serial:real 0m0.574s
      08_out_serial:real 0m0.461s
      09_out_serial:real 0m0.456s
      10_out_serial:real 0m0.462s
      11_out_serial:real 0m0.475s
      12_out_serial:real 0m0.473s
      13_out_serial:real 0m0.582s
      14_out_serial:real 0m0.580s
      15_out_serial:real 0m0.569s
      16_out_serial:real 0m0.679s
      17_out_serial:real 0m0.565s
      18_out_serial:real 0m0.573s
      19_out_serial:real 0m0.579s
      20_out_serial:real 0m0.472s
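The report does not include the loop used for the serial runs. A minimal sketch of what it might look like follows, assuming the same `$LOC` layout as the parallel loop and the `_out_serial` file naming seen above. The function names `flush_client_cache` and `run_serial` are hypothetical, and the cache-flush step is an assumption about how "flushing cache" was done, not something the report states.

```sh
#!/bin/bash
# Hypothetical serial variant of the test; not taken from the report.

flush_client_cache() {
    # Drop the Linux page cache on the client (needs root).
    sync
    echo 3 > /proc/sys/vm/drop_caches
    # On a Lustre client, also cancel cached DLM locks, which releases
    # the pages those locks cover. Skipped when lctl is not installed.
    if command -v lctl >/dev/null 2>&1; then
        lctl set_param 'ldlm.namespaces.*.lru_size=clear'
    fi
}

run_serial() {
    # $1: reader script, $2: results directory, $3: number of runs
    for i in $(seq -w 1 "$3"); do
        # No trailing '&': each reader finishes before the next one starts.
        ( time "$1" ) > "$2/${i}_out_serial" 2>&1
    done
}

# Example invocation (commented out because the flush step needs root):
# flush_client_cache
# run_serial "$LOC/readfile.sh" "$LOC/results" 20
```

Flushing once before the loop, rather than before every reader, matches the expectation that only the first read should be slow and the rest should be served from the client cache.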

      The same experiment on NFS:

      Serial access.

      root@farm3-head4:~/tmp/test/results# grep -i real *
      results/01_out_serial:real 0m19.923s
      results/02_out_serial:real 0m1.373s
      results/03_out_serial:real 0m1.237s
      results/04_out_serial:real 0m1.276s
      results/05_out_serial:real 0m1.289s
      results/06_out_serial:real 0m1.297s
      results/07_out_serial:real 0m1.265s
      results/08_out_serial:real 0m1.278s
      results/09_out_serial:real 0m1.224s
      results/10_out_serial:real 0m1.225s
      results/11_out_serial:real 0m1.221s
      ...

      So the question is:
      Why is access slower when the file is read in parallel and is not in the cache?

      Is there some lock contention going on with multiple readers? Or is the Lustre client sending multiple RPCs for the same data, even though there is already an outstanding request? They have tried this on 1.8.x clients as well as 2.5.0.
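To put a number on the gap, the timings above can be converted to seconds and divided. The sketch below compares the parallel wall time (3m36.228s, taken from `01_out`) with the first, uncached serial read (2m31.372s). The helper names `to_seconds` and `slowdown` are made up for this illustration.

```sh
#!/bin/sh
# Convert a `time` value such as "3m36.228s" to seconds.
to_seconds() {
    printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Ratio of two durations given in the same "XmY.YYYs" form.
slowdown() {
    awk -v p="$(to_seconds "$1")" -v s="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", p / s }'
}

# Parallel wall time vs. the first (cold) serial read; prints 1.43
slowdown 3m36.228s 2m31.372s
```

By this measure each parallel reader takes about 1.43 times as long as the single cold serial read, which is roughly 43% longer.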

      Thanks.

      Attachments

        1. lu-4257.tar.gz
          0.2 kB
        2. debug_file.out.gz
          0.2 kB
        3. io.png
          75 kB
        4. lustre_1.8.9
          850 kB
        5. lustre_2.5
          798 kB
        6. readfile.sh
          0.4 kB
        7. test.sh
          2 kB

            People

              Assignee: jay Jinshan Xiong (Inactive)
              Reporter: ihara Shuichi Ihara (Inactive)