Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-2481

performance issue of parallel read

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.1.5
    • Lustre 2.1.3
    • lustre 2.1.3
      kernel 2.6.32-220
      ofed 1.5.4
      bullxlinux 6.2 (based on redhat 6.2)

    Description

      One of our customer (TGCC - lustre 2.1.3) is facing a performance issue while reading in the same file from many tasks on one client at the same time. Performance drops while increasing the number of tasks.

      I have reproduced the issue and obtained these results:

          tasks     Mean    Total (MB/s)
      dd      1      339      339
      dd      2      107      214
      dd      4       78      312
      dd      8       64      512
      dd     16       26      428
      

      Applying the patch http://review.whamcloud.com/#change,3627 on lustre 2.1.3 gives no real improvement.

          tasks     Mean    Total (MB/s)
      dd      1      542      542
      dd      2      118      236
      dd      4       74      298
      dd      8       73      591
      dd     16       52      833
      

      The profiling shows contention in CLIO page routines.
      (I suppose the libpython entry comes from the mpirun synchronization of the 16 tasks)

      samples  %        linenr info                 app name                 symbol name
      1149893  34.8489  (no location information)   libpython2.6.so.1.0      /usr/lib64/libpython2.6.so.1.0
      338676   10.2640  dec_and_lock.c:21           vmlinux                  _atomic_dec_and_lock
      335241   10.1599  cl_page.c:404               obdclass.ko              cl_page_find0
      293448    8.8933  cl_page.c:729               obdclass.ko              cl_vmpage_page
      235856    7.1479  filemap.c:963               vmlinux                  grab_cache_page_nowait
      142686    4.3243  cl_page.c:661               obdclass.ko              cl_page_put
      80065     2.4265  filemap.c:667               vmlinux                  find_get_page
      46623     1.4130  copy_user_64.S:240          vmlinux                  copy_user_generic_string
      34127     1.0343  (no location information)   libpthread-2.12.so       pthread_rwlock_rdlock
      33685     1.0209  (no location information)   libc-2.12.so             getenv
      31725     0.9615  (no location information)   libpthread-2.12.so       pthread_rwlock_unlock
      30492     0.9241  swap.c:288                  vmlinux                  activate_page
      28528     0.8646  (no location information)   libc-2.12.so             __dcigettext
      23703     0.7183  (no location information)   libc-2.12.so             __strlen_sse42
      22345     0.6772  vvp_page.c:120              lustre.ko                vvp_page_unassume
      20646     0.6257  swap.c:183                  vmlinux                  put_page
      18525     0.5614  intel_idle.c:215            vmlinux                  intel_idle
      12774     0.3871  open.c:961                  vmlinux                  sys_close
      12702     0.3849  filemap.c:579               vmlinux                  unlock_page
      12685     0.3844  (no location information)   libc-2.12.so             __memcpy_ssse3
      12134     0.3677  cl_page.c:874               obdclass.ko              cl_page_invoid
      11734     0.3556  (no location information)   libpthread-2.12.so       __close_nocancel
      10573     0.3204  cl_page.c:884               obdclass.ko              cl_page_owner_clear
      10510     0.3185  (no location information)   libc-2.12.so             strerror_r
      9783      0.2965  cl_page.c:1068              obdclass.ko              cl_page_unassume
      9472      0.2871  (no location information)   oprofiled                /usr/bin/oprofiled
      9438      0.2860  (no location information)   libc-2.12.so             __strncmp_sse2
      8826      0.2675  rw.c:724                    lustre.ko                ll_readahead
      8769      0.2658  filemap.c:527               vmlinux                  page_waitqueue
      8095      0.2453  entry_64.S:470              vmlinux                  system_call_after_swapgs
      7756      0.2351  cl_page.c:561               obdclass.ko              cl_page_state_set0
      7615      0.2308  (no location information)   libc-2.12.so             __stpcpy_sse2
      7366      0.2232  entry_64.S:462              vmlinux                  system_call
      7097      0.2151  sched.c:4430                vmlinux                  find_busiest_group
      6898      0.2091  (no location information)   libperl.so               /usr/lib64/perl5/CORE/libperl.so
      6887      0.2087  (no location information)   libc-2.12.so             __ctype_b_loc
      6780      0.2055  wait.c:251                  vmlinux                  __wake_up_bit
      6631      0.2010  cl_page.c:139               obdclass.ko              cl_page_at_trusted
      6588      0.1997  (no location information)   libc-2.12.so             __strlen_sse2
      6336      0.1920  lov_io.c:239                lov.ko                   lov_page_stripe
      5289      0.1603  lov_io.c:208                lov.ko                   lov_sub_get
      5263      0.1595  ring_buffer.c:2834          vmlinux                  rb_get_reader_page
      4595      0.1393  lvfs_lib.c:76               lvfs.ko                  lprocfs_counter_add
      4373      0.1325  (no location information)   libc-2.12.so             strerror
      4347      0.1317  lov_io.c:252                lov.ko                   lov_page_subio
      4337      0.1314  cl_page.c:1036              obdclass.ko              cl_page_assume
      4310      0.1306  (no location information)   libc-2.12.so             __mempcpy_sse2
      3903      0.1183  lov_page.c:92               lov.ko                   lov_page_own
      3866      0.1172  mutex.c:409                 vmlinux                  __mutex_lock_slowpath
      3841      0.1164  lcommon_cl.c:1097           lustre.ko                cl2vm_page
      3698      0.1121  radix-tree.c:414            vmlinux                  radix_tree_lookup_slot
      3506      0.1063  vvp_page.c:109              lustre.ko                vvp_page_assume
      3491      0.1058  lu_object.c:1115            obdclass.ko              lu_object_locate
      3490      0.1058  (no location information)   ipmi_si.ko               port_inb
      3457      0.1048  ring_buffer.c:3221          vmlinux                  ring_buffer_consume
      3431      0.1040  cl_page.c:898               obdclass.ko              cl_page_owner_set
      ...
      

      I would have expected the per-task performance to maintain at the level (or slightly lower) of the one-reader case. It's what I observe on ext4 for instance.

      I can easily reproduce the issue and can provide additional information (lustre traces, system traces, system dump, profiling, ...) if needed.

      Please, note that this issue is serious for this customer because its cluster has lustre-client nodes with many cores (up to 128 cores) and this impacts dramatically applications run time.

      Attachments

        1. oprofile.report.lu1666.16tasks.txt
          220 kB
          Gregoire Pichon
        2. run.stats.gz
          43 kB
          Gregoire Pichon
        3. testcase.sh
          0.8 kB
          Gregoire Pichon

        Issue Links

          Activity

            People

              jay Jinshan Xiong (Inactive)
              pichong Gregoire Pichon
              Votes:
              0 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: