LU-8118: very slow metadata performance with shared striped directory


Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.8.0
    • Severity: 3

    Description

      We are intermittently experiencing a severe metadata performance issue. So far it has occurred when a striped directory is altered by many threads on multiple nodes.

      Setup:
      Multiple MDTs, one MDT per MDS
      A directory striped across those MDTs, with lfs setdirstripe -D set (see the sketch after this list)
      Several processes on each of several nodes making metadata changes in that striped directory
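      For reference, a directory along these lines can be set up with lfs setdirstripe; the path and stripe count below are illustrative, not the exact values from this report, and option details may vary by Lustre version:

      # create a directory striped across 10 MDTs, starting at MDT index 0
      lfs setdirstripe -c 10 -i 0 /p/lustre/testdir
      # -D sets the default layout so new subdirectories inherit the striping
      lfs setdirstripe -D -c 10 /p/lustre/testdir
      # verify the layout
      lfs getdirstripe /p/lustre/testdir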

      Symptoms:
      Create rates in the tens of creates/second total across all the MDTs hosting the shards
      getdents() times of >1000 seconds per getdents() call

      In one case:
      The directory was striped across 10 MDTs and 16 OSTs
      mdtest had been run as follows:

      srun -N 10 -n 80 mdtest -d /p/lustre/faaland1/mdtest -n 128000 -F
      

      The create rate started out at about 30,000 creates/second total across all 10 MDTs. After some time it dropped to 10-20 creates/second. On a separate node, which mounted the same filesystem but was not running any of the mdtest processes and was otherwise entirely idle, I ran ls and observed the very slow getdents() calls.

      On yet another idle node, I created another directory, striped across the same MDTs, and created 10,000 files within it. The create rate was good, and listing that directory produced getdents() times of about 0.003 seconds.

      There were no indications of network problems between the nodes at the time, nor before or after our test (this is on catalyst, and the nodes are normally used for compute jobs and monitored 24x7).

      In the other case:
      This filesystem has 4 MDTs, each on its own MDS, and 2 OSTs.
      The directory is striped across all 4 MDTs and has -D set.

      The workload involved 10 clients, each running 96 threads. Shell scripts were randomly invoking mkdir, touch, rmdir, or rm (the latter two having first chosen a file or directory to remove).
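      The actual scripts aren't attached; a minimal sketch of that kind of workload, with a hypothetical path and iteration count, might look like:

      #!/bin/bash
      # Illustrative sketch only: random metadata churn in a shared striped directory.
      # Many copies of this run per client; DIR and the loop count are hypothetical.
      DIR=/p/lustre/shared-striped-dir
      i=0
      while [ "$i" -lt 100000 ]; do
          case $((RANDOM % 4)) in
              0) mkdir "$DIR/d.$$.$i" 2>/dev/null ;;
              1) touch "$DIR/f.$$.$i" ;;
              2) # rmdir: pick an existing subdirectory to remove
                 victim=$(ls -U "$DIR" | grep '^d\.' | head -n 1)
                 [ -n "$victim" ] && rmdir "$DIR/$victim" 2>/dev/null ;;
              3) # rm: pick an existing file to remove
                 victim=$(ls -U "$DIR" | grep '^f\.' | head -n 1)
                 [ -n "$victim" ] && rm -f "$DIR/$victim" ;;
          esac
          i=$((i + 1))
      done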

      The create rate started at about 10,000/second, concurrent with thousands of unlinks, mkdirs, rmdirs, and stats per second. All those operations slowed to single-digit per-second rates. An ls in that common directory, on a node not running the job, also produced >1000-second getdents() calls.

      strace -T output:

      getdents(3, /* 1061 entries */, 32768)  = 32760 <1887.749809>
      getdents(3, /* 1102 entries */, 32768)  = 32760 <1990.174707>
      getdents(3, /* 1087 entries */, 32768)  = 32752 <1994.781547>
      getdents(3, /* 1056 entries */, 32768)  = 32768 <1907.404333>
      brk(0xcf0000)                           = 0xcf0000 <0.000030>
      getdents(3, /* 1091 entries */, 32768)  = 32752 <1860.720958>
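      The exact invocation isn't recorded in this ticket, but output like the above can be captured with something along these lines (stdout redirected so the trace on stderr stays readable; the path is from the first case):

      strace -T ls /p/lustre/faaland1/mdtest > /dev/null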
      


          People

            Assignee: Lai Siyao (laisiyao)
            Reporter: Olaf Faaland (ofaaland)
            Votes: 0
            Watchers: 7
