Lustre / LU-16365

cached 'ls -l' is slow

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor

    Description

  While testing LU-14139, I observed an unexpected performance behavior.
  Here is the test workload:

      # echo 3 > /proc/sys/vm/drop_caches
      # time ls -l /exafs/testdir/mdtest.out/test-dir.0-0/mdtest_tree.0/ 
      # time ls -l /exafs/testdir/mdtest.out/test-dir.0-0/mdtest_tree.0/ 
      

  In theory, when the 1st 'ls -l' finishes, the client keeps the data, metadata, and locks in its cache, so the output of the 2nd 'ls -l' should be served entirely from that cache.
  One would expect the 2nd 'ls -l' to be significantly faster than the 1st, but it is not much faster.

  Here are the 'ls -l' results for 1M files in a single directory.

      [root@ec01 ~]# clush -w ec01,ai400x2-1-vm[1-4] "echo 3 > /proc/sys/vm/drop_caches"
      [sihara@ec01 ~]$ time ls -l /exafs/testdir/mdtest.out/test-dir.0-0/mdtest_tree.0/ > /dev/null
      
      real	0m27.385s
      user	0m8.994s
      sys	0m13.131s
      
      [sihara@ec01 ~]$ time ls -l /exafs/testdir/mdtest.out/test-dir.0-0/mdtest_tree.0/ > /dev/null
      
      real	0m25.309s
      user	0m8.937s
      sys	0m16.327s
      

  No RPCs go out during the 2nd 'ls -l' below. I saw only 16 LNET messages during the 2nd 'ls -l', versus 1.1M LNET messages during the 1st, yet the elapsed time is almost the same. Most of the time is spent in 'ls' itself and on the Lustre client side.

      [root@ec01 ~]# clush -w ai400x2-1-vm[1-4],ec01 " echo 3 > /proc/sys/vm/drop_caches "
      [root@ec01 ~]# lnetctl net show -v| grep _count; time ls -l /exafs/testdir/mdtest.out/test-dir.0-0/mdtest_tree.0/ > /dev/null; lnetctl net show -v | grep _count
                    send_count: 0
                    recv_count: 0
                    drop_count: 0
                    send_count: 65363661
                    recv_count: 62095891
                    drop_count: 1
      
      real	0m26.145s
      user	0m9.070s
      sys	0m13.552s
                    send_count: 0
                    recv_count: 0
                    drop_count: 0
                    send_count: 66482277
                    recv_count: 63233245
                    drop_count: 1
      [root@ec01 ~]# lnetctl net show -v| grep _count; time ls -l /exafs/testdir/mdtest.out/test-dir.0-0/mdtest_tree.0/ > /dev/null; lnetctl net show -v | grep _count
                    send_count: 0
                    recv_count: 0
                    drop_count: 0
                    send_count: 66482277
                    recv_count: 63233245
                    drop_count: 1
      
      real	0m25.569s
      user	0m8.987s
      sys	0m16.537s
                    send_count: 0
                    recv_count: 0
                    drop_count: 0
                    send_count: 66482293
                    recv_count: 63233261
                    drop_count: 1
      
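  The 16 LNET messages on the cached run can be read directly off the counters above; a minimal sketch of the delta arithmetic, using the counter values captured in the transcript:

      # Delta of LNET send/recv counters across the 2nd (cached) 'ls -l',
      # taken from the 'lnetctl net show -v' output above
      send_before=66482277; send_after=66482293
      recv_before=63233245; recv_after=63233261
      echo "sent: $((send_after - send_before)), received: $((recv_after - recv_before))"
      # → sent: 16, received: 16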

  Here is the same test with 1M files in ext4 on a local disk, and in /dev/shm, on the client.

      [root@ec01 ~]# echo 3 > /proc/sys/vm/drop_caches
      [sihara@ec01 ~]$ time ls -l /tmp/testdir/mdtest.out/test-dir.0-0/mdtest_tree.0/  > /dev/null
      
      real	0m16.999s
      user	0m8.956s
      sys	0m5.855s
      [sihara@ec01 ~]$ time ls -l /tmp/testdir/mdtest.out/test-dir.0-0/mdtest_tree.0/  > /dev/null
      
      real	0m11.832s
      user	0m8.765s
      sys	0m3.051s
      
      [root@ec01 ~]# echo 3 > /proc/sys/vm/drop_caches
      [sihara@ec01 ~]$ time ls -l /dev/shm/testdir/test-dir.0-0/mdtest_tree.0/ > /dev/null
      
      real	0m8.296s
      user	0m5.465s
      sys	0m2.813s
      [sihara@ec01 ~]$ time ls -l /dev/shm/testdir/test-dir.0-0/mdtest_tree.0/ > /dev/null
      
      real	0m8.273s
      user	0m5.414s
      sys	0m2.847s
      
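  To put the numbers in perspective, a rough per-file listing rate computed from the measured wall times (1M entries; awk is used only for the arithmetic):

      # Approximate listing rates from the measured wall times above
      awk 'BEGIN { printf "cached Lustre: %.0f files/s\n", 1000000/25.3 }'
      awk 'BEGIN { printf "tmpfs:         %.0f files/s\n", 1000000/8.3 }'
      # → cached Lustre: 39526 files/s
      # → tmpfs:         120482 files/s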

  Shouldn't Lustre be able to match the performance of ext4 and the in-memory filesystem when everything is already in the cache?

      Attachments

        1. ls.svg
          141 kB
          Andreas Dilger


People

    Assignee: wc-triage (WC Triage)
    Reporter: sihara (Shuichi Ihara)
    Votes: 0
    Watchers: 7
