  Lustre / LU-16365

cached 'ls -l' is slow


Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor

    Description

      While testing LU-14139, an unexpected performance behavior was observed.
      Here is the test workload:

      # echo 3 > /proc/sys/vm/drop_caches
      # time ls -l /exafs/testdir/mdtest.out/test-dir.0-0/mdtest_tree.0/ 
      # time ls -l /exafs/testdir/mdtest.out/test-dir.0-0/mdtest_tree.0/ 
      

      In theory, once the 1st 'ls -l' finishes, the client keeps the data, metadata, and locks in its cache, so the 2nd 'ls -l' output should be served from that cache.
      One would expect the 2nd 'ls -l' to be significantly faster than the 1st, but it is not by much.
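      For illustration, 'ls -l' does two kinds of work per directory: one readdir sweep plus one lstat() per entry. These can be timed separately with a small script (a generic sketch, not Lustre-specific; the path argument is whatever directory is under test):

```python
#!/usr/bin/env python3
# Split the cost of an 'ls -l'-equivalent listing into its two phases:
# the readdir pass and the per-entry lstat() pass. Running it twice on
# the same directory compares the cold-cache and warm-cache cost.
import os
import sys
import time


def scan(path):
    t0 = time.monotonic()
    names = [e.name for e in os.scandir(path)]  # readdir pass
    t1 = time.monotonic()
    for name in names:                          # per-entry stat pass
        os.lstat(os.path.join(path, name))
    t2 = time.monotonic()
    return len(names), t1 - t0, t2 - t1


if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "."
    n, readdir_s, lstat_s = scan(path)
    print(f"{n} entries: readdir {readdir_s:.3f}s, lstat {lstat_s:.3f}s")
```

      If the lstat phase dominates even on the warm run, the per-inode client-side path (dcache/inode revalidation, lock matching) is where the time goes rather than the wire.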

      Here are the 'ls -l' results for 1M files in a single directory.

      [root@ec01 ~]# clush -w ec01,ai400x2-1-vm[1-4] "echo 3 > /proc/sys/vm/drop_caches"
      [sihara@ec01 ~]$ time ls -l /exafs/testdir/mdtest.out/test-dir.0-0/mdtest_tree.0/ > /dev/null
      
      real	0m27.385s
      user	0m8.994s
      sys	0m13.131s
      
      [sihara@ec01 ~]$ time ls -l /exafs/testdir/mdtest.out/test-dir.0-0/mdtest_tree.0/ > /dev/null
      
      real	0m25.309s
      user	0m8.937s
      sys	0m16.327s
      

      No RPCs go out during the 2nd 'ls -l' below. I saw only 16 LNET messages on the 2nd 'ls -l', against 1.1M LNET messages on the 1st, yet the elapsed time is almost the same. Most of the time is spent in 'ls' itself and on the Lustre client side.

      [root@ec01 ~]# clush -w ai400x2-1-vm[1-4],ec01 " echo 3 > /proc/sys/vm/drop_caches "
      [root@ec01 ~]# lnetctl net show -v| grep _count; time ls -l /exafs/testdir/mdtest.out/test-dir.0-0/mdtest_tree.0/ > /dev/null; lnetctl net show -v | grep _count
                    send_count: 0
                    recv_count: 0
                    drop_count: 0
                    send_count: 65363661
                    recv_count: 62095891
                    drop_count: 1
      
      real	0m26.145s
      user	0m9.070s
      sys	0m13.552s
                    send_count: 0
                    recv_count: 0
                    drop_count: 0
                    send_count: 66482277
                    recv_count: 63233245
                    drop_count: 1
      [root@ec01 ~]# lnetctl net show -v| grep _count; time ls -l /exafs/testdir/mdtest.out/test-dir.0-0/mdtest_tree.0/ > /dev/null; lnetctl net show -v | grep _count
                    send_count: 0
                    recv_count: 0
                    drop_count: 0
                    send_count: 66482277
                    recv_count: 63233245
                    drop_count: 1
      
      real	0m25.569s
      user	0m8.987s
      sys	0m16.537s
                    send_count: 0
                    recv_count: 0
                    drop_count: 0
                    send_count: 66482293
                    recv_count: 63233261
                    drop_count: 1
      

      This is the same test for 1M files on ext4 on a local disk, and on /dev/shm on the client.

      [root@ec01 ~]# echo 3 > /proc/sys/vm/drop_caches
      [sihara@ec01 ~]$ time ls -l /tmp/testdir/mdtest.out/test-dir.0-0/mdtest_tree.0/  > /dev/null
      
      real	0m16.999s
      user	0m8.956s
      sys	0m5.855s
      [sihara@ec01 ~]$ time ls -l /tmp/testdir/mdtest.out/test-dir.0-0/mdtest_tree.0/  > /dev/null
      
      real	0m11.832s
      user	0m8.765s
      sys	0m3.051s
      
      [root@ec01 ~]# echo 3 > /proc/sys/vm/drop_caches
      [sihara@ec01 ~]$ time ls -l /dev/shm/testdir/test-dir.0-0/mdtest_tree.0/ > /dev/null
      
      real	0m8.296s
      user	0m5.465s
      sys	0m2.813s
      [sihara@ec01 ~]$ time ls -l /dev/shm/testdir/test-dir.0-0/mdtest_tree.0/ > /dev/null
      
      real	0m8.273s
      user	0m5.414s
      sys	0m2.847s
      

      Lustre should be able to reach performance similar to ext4 and tmpfs when everything is in the cache, shouldn't it?
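      The user/sys split reported by time(1) above can also be captured directly around the listing work itself, which helps attribute warm-pass cost to client-side CPU rather than RPC latency. A minimal sketch using getrusage(), again not Lustre-specific:

```python
#!/usr/bin/env python3
# Measure user and system CPU time consumed by an 'ls -l'-equivalent
# pass (readdir plus one stat per entry) over a directory.
import os
import resource
import sys


def list_long(path):
    # Same per-entry work as 'ls -l': enumerate and stat every entry.
    for entry in os.scandir(path):
        entry.stat(follow_symlinks=False)


def measure(path):
    r0 = resource.getrusage(resource.RUSAGE_SELF)
    list_long(path)
    r1 = resource.getrusage(resource.RUSAGE_SELF)
    return r1.ru_utime - r0.ru_utime, r1.ru_stime - r0.ru_stime


if __name__ == "__main__":
    u, s = measure(sys.argv[1] if len(sys.argv) > 1 else ".")
    print(f"user {u:.3f}s  sys {s:.3f}s")
```

      If user+sys accounts for nearly all of the elapsed time on the warm pass, as the numbers above suggest, the fix has to come from cheaper client-side per-inode processing, not from saving RPCs.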

            People

              Assignee: wc-triage (WC Triage)
              Reporter: sihara (Shuichi Ihara)
