  1. Lustre
  2. LU-5228

HSM: posix copytool can (and does) run out of file descriptors


Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.5.1
    • None
    • CentOS 6.5
      Lustre 2.5.56
    • 3
    • 14566

    Description

      When archiving a lot of files at once, the posix copytool can run out of file descriptors.

      ...
      lhsmtool_posix[21880]: cannot open '/vsm/tasfs1/32d7/0000/0400/0000/0002/0000/0x200000400:0x32d7:0x0_tmp' for write: Too many open files (24)
      lhsmtool_posix[21878]: cannot open '/vsm/tasfs1/32c2/0000/0400/0000/0002/0000/0x200000400:0x32c2:0x0_tmp' for write: Too many open files (24)
      lhsmtool_posix[21894]: cannot open '/vsm/tasfs1/32d2/0000/0400/0000/0002/0000/0x200000400:0x32d2:0x0_tmp.lov': Too many open files (24)
      lhsmtool_posix[21894]: cannot save file striping info of '/mnt/tas01/.lustre/fid/0x200000400:0x32d2:0x0' in '/vsm/tasfs1/32d2/0000/0400/0000/0002/0000/0x200000400:0x32d2:0x0_tmp': Too many open files (24)
      ...
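As a stopgap while the copytool itself is unfixed, the per-process descriptor limit can be inspected and raised. The sketch below is not part of lhsmtool_posix; `raise_nofile_limit` is a hypothetical helper that uses the standard `getrlimit`/`setrlimit` calls to lift the soft `RLIMIT_NOFILE` up to the hard limit. It only postpones the failure, since the number of threads is still unbounded.

```c
#include <assert.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: raise the soft open-file limit to the hard limit.
 * Returns the soft limit in effect afterwards, or 0 if it cannot be read.
 */
rlim_t raise_nofile_limit(void)
{
    struct rlimit rl;
    rlim_t old_soft;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return 0;

    old_soft = rl.rlim_cur;
    if (rl.rlim_cur < rl.rlim_max) {
        rl.rlim_cur = rl.rlim_max;
        /* If the kernel refuses, keep reporting the old soft limit. */
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
            return old_soft;
    }
    return rl.rlim_cur;
}
```

Calling `raise_nofile_limit()` once at startup, before any worker thread exists, would be the natural place for it. The same effect can be had from the shell with `ulimit -n` before launching the copytool.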
      

      The root cause is that there is no limit on the number of threads created to process requests. Each thread opens its own files, so a large enough batch pushes the process past its open-file limit (errno 24, EMFILE).
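One possible way to bound the thread count is a fixed-size worker pool. The following is a minimal sketch, not the copytool's code: `MAX_WORKERS`, `struct pool`, `pool_worker` and `process_requests` are all hypothetical names, and a simple counter stands in for the stream of HSM requests the real tool receives.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical cap on concurrent workers; not an existing copytool option. */
#define MAX_WORKERS 8

struct pool {
    pthread_mutex_t lock;
    int next;   /* index of the next request to hand out */
    int total;  /* number of requests queued */
    int done;   /* number of requests processed */
};

/* Each worker pulls requests until none are left. */
void *pool_worker(void *arg)
{
    struct pool *p = arg;

    for (;;) {
        int idx;

        pthread_mutex_lock(&p->lock);
        idx = (p->next < p->total) ? p->next++ : -1;
        pthread_mutex_unlock(&p->lock);
        if (idx < 0)
            break;

        /* The real work (open, copy, close) for request idx goes here. */

        pthread_mutex_lock(&p->lock);
        p->done++;
        pthread_mutex_unlock(&p->lock);
    }
    return NULL;
}

/* Process `total` requests with at most MAX_WORKERS threads.
 * Returns the number of requests processed. */
int process_requests(int total)
{
    struct pool p = { .next = 0, .total = total, .done = 0 };
    pthread_t tids[MAX_WORKERS];
    int started = 0;
    int i;

    assert(total >= 0);
    pthread_mutex_init(&p.lock, NULL);

    for (i = 0; i < MAX_WORKERS; i++) {
        if (pthread_create(&tids[started], NULL, pool_worker, &p) != 0)
            break;  /* carry on with the workers that did start */
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    pthread_mutex_destroy(&p.lock);
    return p.done;
}
```

With this shape, the number of descriptors in use is bounded by `MAX_WORKERS` times the files each worker holds open, regardless of how many requests arrive. The cap would need to be chosen well below the process's `RLIMIT_NOFILE`.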

      The files that hit the error are also not retried, and their archive requests are dropped.
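Descriptor exhaustion is usually transient, so one option is to retry the open rather than drop the request. The sketch below assumes a hypothetical helper, `open_with_retry`, with made-up retry count and delay; it is an illustration of the idea, not how lhsmtool_posix behaves.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical tuning values for the sketch. */
#define OPEN_RETRIES   5
#define RETRY_DELAY_MS 100

/*
 * Open a file, retrying when the process (EMFILE) or the system (ENFILE)
 * is out of file descriptors. Returns the descriptor on success or a
 * negative errno value on failure.
 */
int open_with_retry(const char *path, int flags, mode_t mode)
{
    struct timespec delay = { 0, RETRY_DELAY_MS * 1000000L };
    int saved = EMFILE;
    int attempt;

    for (attempt = 0; attempt < OPEN_RETRIES; attempt++) {
        int fd = open(path, flags, mode);

        if (fd >= 0)
            return fd;

        saved = errno;
        if (saved != EMFILE && saved != ENFILE)
            return -saved;  /* not a descriptor shortage: fail at once */

        /* Give other threads time to close their files. */
        nanosleep(&delay, NULL);
    }
    return -saved;
}
```

For example, `open_with_retry("/dev/null", O_RDONLY, 0)` returns a valid descriptor immediately, while a missing path fails straight away with `-ENOENT`. If every attempt fails, the caller still has to report the request as failed rather than silently discard it.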

      In my test, out of 11418 archive requests, only 11159 were actually archived. The other 259 requests were dropped.

          People

            wc-triage WC Triage
            fzago Frank Zago (Inactive)
