
LU-2139 may cause the performance regression

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.6.0
    • Affects Version/s: Lustre 2.4.0
    • Environment: RHEL6.3 and current master
    • 2
    • 8114

    Description

      There is a performance regression on current master (c864582b5d4541c7830d628457e55cd859aee005) when multiple IOR threads run per client. As far as I can tell from testing, LU-2576 might be the cause of this regression. Here are quick test results for each commit.

      client : commit ac37e7b4d101761bbff401ed12fcf671d6b68f9c

      # mpirun -np 8 /lustre/IOR -w -b 8g -t 1m -e -C -F -vv -o /lustre/ior.out/file
      IOR-2.10.3: MPI Coordinated Test of Parallel I/O
      
      Run began: Sun May  5 12:24:09 2013
      Command line used: /lustre/IOR -w -b 8g -t 1m -e -C -F -vv -o /lustre/ior.out/file
      Machine: Linux s08 2.6.32-279.19.1.el6_lustre.x86_64 #1 SMP Sat Feb 9 21:55:32 PST 2013 x86_64
      Using synchronized MPI timer
      Start time skew across all tasks: 0.00 sec
      Path: /lustre/ior.out
      FS: 683.5 TiB   Used FS: 0.0%   Inodes: 5.0 Mi   Used Inodes: 0.0%
      Participating tasks: 8
      Using reorderTasks '-C' (expecting block, not cyclic, task assignment)
      task 0 on s08
      task 1 on s08
      task 2 on s08
      task 3 on s08
      task 4 on s08
      task 5 on s08
      task 6 on s08
      task 7 on s08
      
      Summary:
      	api                = POSIX
      	test filename      = /lustre/ior.out/file
      	access             = file-per-process
      	pattern            = segmented (1 segment)
      	ordering in a file = sequential offsets
      	ordering inter file=constant task offsets = 1
      	clients            = 8 (8 per node)
      	repetitions        = 1
      	xfersize           = 1 MiB
      	blocksize          = 8 GiB
      	aggregate filesize = 64 GiB
      
      Using Time Stamp 1367781849 (0x5186b1d9) for Data Signature
      Commencing write performance test.
      Sun May  5 12:24:09 2013
      
      access    bw(MiB/s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s) total(s)  iter
      ------    ---------  ---------- ---------  --------   --------   --------  --------   ----
      write     3228.38    8388608    1024.00    0.001871   20.30      1.34       20.30      0    XXCEL
      Operation  Max (MiB)  Min (MiB)  Mean (MiB)   Std Dev  Max (OPs)  Min (OPs)  Mean (OPs)   Std Dev  Mean (s)  Op grep #Tasks tPN reps  fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize
      
      ---------  ---------  ---------  ----------   -------  ---------  ---------  ----------   -------  --------
      write        3228.38    3228.38     3228.38      0.00    3228.38    3228.38     3228.38      0.00  20.29996   8 8 1 1 1 1 0 0 1 8589934592 1048576 68719476736 -1 POSIX EXCEL
      
      Max Write: 3228.38 MiB/sec (3385.20 MB/sec)
      
      Run finished: Sun May  5 12:24:30 2013
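
As a sanity check on the figures in this run, the reported bandwidth can be recomputed from the `aggsize` and `Mean (s)` columns of the summary line. This is only a sketch: the two input values are copied from the log above, and the variable names are illustrative. It should reproduce the reported MiB/s and MB/s values to within rounding.

```python
# Values copied from the IOR summary line of the run above.
MIB = 1024 ** 2                  # bytes per MiB (binary)
MB = 1000 ** 2                   # bytes per MB (decimal)

aggregate_bytes = 68719476736    # "aggsize" column: 8 tasks x 8 GiB
mean_seconds = 20.29996          # "Mean (s)" column

# Bandwidth is simply total bytes written divided by elapsed time.
bw_mib = aggregate_bytes / MIB / mean_seconds
bw_mb = aggregate_bytes / MB / mean_seconds

print(f"{bw_mib:.2f} MiB/sec ({bw_mb:.2f} MB/sec)")
```

The same arithmetic applies to any other IOR run, provided `aggsize` and `Mean (s)` are taken from that run's summary line.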
      

      client : commit 5661651b2cc6414686e7da581589c2ea0e1f1969

      # mpirun -np 8 /lustre/IOR -w -b 8g -t 1m -e -C -F -vv -o /lustre/ior.out/file
      IOR-2.10.3: MPI Coordinated Test of Parallel I/O
      
      Run began: Sun May  5 12:16:35 2013
      Command line used: /lustre/IOR -w -b 8g -t 1m -e -C -F -vv -o /lustre/ior.out/file
      Machine: Linux s08 2.6.32-279.19.1.el6_lustre.x86_64 #1 SMP Sat Feb 9 21:55:32 PST 2013 x86_64
      Using synchronized MPI timer
      Start time skew across all tasks: 0.00 sec
      Path: /lustre/ior.out
      FS: 683.5 TiB   Used FS: 0.0%   Inodes: 5.0 Mi   Used Inodes: 0.0%
      Participating tasks: 8
      Using reorderTasks '-C' (expecting block, not cyclic, task assignment)
      task 0 on s08
      task 1 on s08
      task 2 on s08
      task 3 on s08
      task 4 on s08
      task 5 on s08
      task 6 on s08
      task 7 on s08
      
      Summary:
      	api                = POSIX
      	test filename      = /lustre/ior.out/file
      	access             = file-per-process
      	pattern            = segmented (1 segment)
      	ordering in a file = sequential offsets
      	ordering inter file=constant task offsets = 1
      	clients            = 8 (8 per node)
      	repetitions        = 1
      	xfersize           = 1 MiB
      	blocksize          = 8 GiB
      	aggregate filesize = 64 GiB
      
      Using Time Stamp 1367781395 (0x5186b013) for Data Signature
      Commencing write performance test.
      Sun May  5 12:16:35 2013
      
      access    bw(MiB/s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s) total(s)  iter
      ------    ---------  ---------- ---------  --------   --------   --------  --------   ----
      write     550.28     8388608    1024.00    0.001730   119.10     2.76       119.10     0    XXCEL
      Operation  Max (MiB)  Min (MiB)  Mean (MiB)   Std Dev  Max (OPs)  Min (OPs)  Mean (OPs)   Std Dev  Mean (s)  Op grep #Tasks tPN reps  fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize
      
      ---------  ---------  ---------  ----------   -------  ---------  ---------  ----------   -------  --------
      write         550.28     550.28      550.28      0.00     550.28     550.28      550.28      0.00 119.09623   8 8 1 1 1 1 0 0 1 8589934592 1048576 68719476736 -1 POSIX EXCEL
      
      Max Write: 550.28 MiB/sec (577.01 MB/sec)
      
      Run finished: Sun May  5 12:18:34 2013
      

      In both tests, the servers were running current master (c864582b5d4541c7830d628457e55cd859aee005).
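
To put a number on the gap between the two client commits, the `Max Write` lines from the two runs can be compared directly. The following is a minimal sketch, not part of the original report: the helper name `max_write_mib` and the `RUNS` table are hypothetical, and the two log lines are copied verbatim from the output above.

```python
import re

# "Max Write" lines copied verbatim from the two IOR runs, keyed by client commit.
RUNS = {
    "ac37e7b4d101761bbff401ed12fcf671d6b68f9c": "Max Write: 3228.38 MiB/sec (3385.20 MB/sec)",
    "5661651b2cc6414686e7da581589c2ea0e1f1969": "Max Write: 550.28 MiB/sec (577.01 MB/sec)",
}

MAX_WRITE = re.compile(r"Max Write:\s+([\d.]+)\s+MiB/sec")


def max_write_mib(line: str) -> float:
    """Extract the MiB/sec figure from an IOR 'Max Write' line (hypothetical helper)."""
    match = MAX_WRITE.search(line)
    if match is None:
        raise ValueError(f"not a 'Max Write' line: {line!r}")
    return float(match.group(1))


# Dict order is insertion order, so the faster run comes first.
fast, slow = (max_write_mib(line) for line in RUNS.values())

slowdown = fast / slow                # how many times slower the second commit is
drop_pct = (1 - slow / fast) * 100    # bandwidth lost, as a percentage

print(f"slowdown: {slowdown:.2f}x, bandwidth drop: {drop_pct:.1f}%")
```

A single run per commit says nothing about run-to-run variance, so the ratio is best read as an indication of the size of the regression rather than a precise measurement.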

      Attachments

        1. collectl.log
          13 kB
        2. stat.log
          16 kB

          Activity

            [LU-3277] LU-2139 may cause the performance regression
            prakash Prakash Surya (Inactive) made changes -
            Resolution New: Fixed [ 1 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            adilger Andreas Dilger made changes -
            Assignee Original: Niu Yawei [ niu ] New: Prakash Surya [ prakash ]
            adilger Andreas Dilger made changes -
            Link New: This issue is blocking LU-2139 [ LU-2139 ]
            adilger Andreas Dilger made changes -
            Fix Version/s New: Lustre 2.6.0 [ 10595 ]
            jlevi Jodi Levi (Inactive) made changes -
            Priority Original: Blocker [ 1 ] New: Major [ 3 ]
            ihara Shuichi Ihara (Inactive) made changes -
            Attachment New: collectl.log [ 12639 ]
            Attachment New: stat.log [ 12640 ]
            jlevi Jodi Levi (Inactive) made changes -
            Labels Original: MB New: HB
            Summary Original: LU-2576/LU-2139 may cause the performance regression New: LU-2139 may cause the performance regression
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-2576 [ LU-2576 ]
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-2139 [ LU-2139 ]
            adilger Andreas Dilger made changes -
            Summary Original: LU-2139 may cause the performance regression New: LU-2576/LU-2139 may cause the performance regression

            People

              Assignee: prakash Prakash Surya (Inactive)
              Reporter: ihara Shuichi Ihara (Inactive)