"lfs find" to scan with multiple threads

Details

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Minor
    • None
    • None
    • 3

    Description

      It would be useful for "lfs find" to scan directories with multiple threads in parallel. It could, for example, start a new thread for each subdirectory (or queue a work item to a thread pool) so that subdirectories are scanned concurrently.
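
      The work-item-per-subdirectory idea can be sketched as below. This is a minimal illustration under stated assumptions, not code from "lfs find": the names work_item, scan_pool, pool_push, scan_dir, worker and parallel_scan are all hypothetical, the "work" done per entry is only counting it, and the sketch uses plain POSIX opendir/readdir/lstat rather than any Lustre-specific interface. It also assumes that readdir() is safe when each thread uses its own DIR stream, which holds on glibc. Workers exit once the queue is empty and no worker is still scanning, since a scanning worker may still queue more subdirectories.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

struct work_item {              /* one directory waiting to be scanned */
    struct work_item *next;
    char path[PATH_MAX];
};

struct scan_pool {              /* shared state; all names are illustrative */
    pthread_mutex_t lock;
    pthread_cond_t wake;
    struct work_item *head;     /* LIFO list of pending directories */
    int busy;                   /* workers currently scanning a directory */
    long entries;               /* entries seen, excluding "." and ".." */
};

static void pool_push(struct scan_pool *p, const char *path)
{
    struct work_item *w = malloc(sizeof(*w));
    if (!w) return;             /* sketch only: skip this directory on OOM */
    snprintf(w->path, sizeof(w->path), "%s", path);
    pthread_mutex_lock(&p->lock);
    w->next = p->head;
    p->head = w;
    pthread_cond_signal(&p->wake);
    pthread_mutex_unlock(&p->lock);
}

/* Scan one directory, queue its subdirectories, return the entry count. */
static long scan_dir(struct scan_pool *p, const char *dir)
{
    DIR *d = opendir(dir);      /* each worker reads its own DIR stream */
    struct dirent *de;
    char child[PATH_MAX];
    struct stat st;
    long seen = 0;
    if (!d) return 0;
    while ((de = readdir(d)) != NULL) {
        if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
            continue;
        seen++;
        if (snprintf(child, sizeof(child), "%s/%s", dir, de->d_name) <
            (int)sizeof(child) && !lstat(child, &st) && S_ISDIR(st.st_mode))
            pool_push(p, child);
    }
    closedir(d);
    return seen;
}

static void *worker(void *arg)
{
    struct scan_pool *p = arg;
    struct work_item *w;
    long seen;
    pthread_mutex_lock(&p->lock);
    for (;;) {
        while (!p->head && p->busy > 0)     /* more work may still arrive */
            pthread_cond_wait(&p->wake, &p->lock);
        if (!p->head) break;                /* queue empty, nobody scanning */
        w = p->head;
        p->head = w->next;
        p->busy++;
        pthread_mutex_unlock(&p->lock);
        seen = scan_dir(p, w->path);
        free(w);
        pthread_mutex_lock(&p->lock);
        p->entries += seen;
        assert(p->busy > 0);
        if (--p->busy == 0 && !p->head)
            pthread_cond_broadcast(&p->wake);   /* release idle waiters */
    }
    pthread_mutex_unlock(&p->lock);
    return NULL;
}

long parallel_scan(const char *root, int nthreads)
{
    struct scan_pool p = { .lock = PTHREAD_MUTEX_INITIALIZER,
                           .wake = PTHREAD_COND_INITIALIZER };
    pthread_t tid[64];
    int i, started = 0;
    pool_push(&p, root);
    for (i = 0; i < nthreads && i < 64; i++)
        started += !pthread_create(&tid[started], NULL, worker, &p);
    if (started == 0)
        worker(&p);             /* no thread could be started: scan inline */
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return p.entries;
}
```

      A caller would invoke something like parallel_scan("/mnt/lustre/dir", 8) and receive the number of entries found below that directory. A pool with a shared queue is used here instead of one thread per subdirectory so that the thread count stays bounded on very wide trees.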

      The "libcircle" and "libpcircle" libraries can perform workload sharing to speed up directory traversal. The pfind code in IO500 (https://github.com/VI4IO/pfind.git) also does efficient parallel directory traversal, including splitting up large directories by hash index so that a single directory can be traversed in parallel.
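
      One way to picture splitting a single large directory by hash is sketched below. It is not taken from pfind, libcircle, or "lfs find"; it assumes a simple scheme in which each worker owns the entries whose name hash maps to its shard index, and the names name_hash, shard_of, shard_arg, shard_worker, sharded_scan and MAX_SHARDS are hypothetical. In this simplified form every worker still lists the whole directory, and only the per-entry work (an lstat() here, as a stand-in) is divided, so each entry is handled exactly once.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

#define MAX_SHARDS 16

/* 32-bit FNV-1a hash of an entry name (any stable hash would do). */
static uint32_t name_hash(const char *name)
{
    uint32_t h = 2166136261u;
    while (*name) {
        h ^= (unsigned char)*name++;
        h *= 16777619u;
    }
    return h;
}

/* Index of the shard (worker) that owns a given entry name. */
static unsigned shard_of(const char *name, unsigned nshards)
{
    assert(nshards > 0);
    return name_hash(name) % nshards;
}

struct shard_arg {
    const char *dir;
    unsigned shard, nshards;
    long handled;               /* entries this worker processed */
};

static void *shard_worker(void *arg)
{
    struct shard_arg *a = arg;
    DIR *d = opendir(a->dir);   /* every worker lists the whole directory */
    struct dirent *de;
    char path[PATH_MAX];
    struct stat st;

    if (!d)
        return NULL;
    while ((de = readdir(d)) != NULL) {
        if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
            continue;
        if (shard_of(de->d_name, a->nshards) != a->shard)
            continue;           /* another worker owns this entry */
        if (snprintf(path, sizeof(path), "%s/%s", a->dir, de->d_name) <
            (int)sizeof(path) && lstat(path, &st) == 0)
            a->handled++;       /* stand-in for the real per-entry work */
    }
    closedir(d);
    return NULL;
}

/* Process one directory with nshards workers; returns entries handled. */
long sharded_scan(const char *dir, unsigned nshards)
{
    struct shard_arg args[MAX_SHARDS];
    pthread_t tid[MAX_SHARDS];
    int started[MAX_SHARDS];
    long total = 0;
    unsigned i;

    if (nshards < 1)
        nshards = 1;
    if (nshards > MAX_SHARDS)
        nshards = MAX_SHARDS;
    for (i = 0; i < nshards; i++) {
        args[i] = (struct shard_arg){ dir, i, nshards, 0 };
        started[i] = !pthread_create(&tid[i], NULL, shard_worker, &args[i]);
        if (!started[i])
            shard_worker(&args[i]);     /* fall back to running inline */
    }
    for (i = 0; i < nshards; i++) {
        if (started[i])
            pthread_join(tid[i], NULL);
        total += args[i].handled;
    }
    return total;
}
```

      Hashing the name keeps the assignment stateless: workers need no coordination to agree on who owns an entry. Whether the listing itself can also be divided depends on the filesystem interface, which this sketch does not address.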

      Integrating one of these algorithms into "lfs find" with pthreads would allow a many-fold improvement in directory scanning performance.

          Activity

            [LU-17814] "lfs find" to scan with multiple threads
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-19052 [ LU-19052 ]
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-17699 [ LU-17699 ]
            mrasobarnett Matt Rásó-Barnett made changes -
            Remote Link New: This issue links to "Page (Whamcloud Community Wiki)" [ 40836 ]
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-18586 [ LU-18586 ]
            adilger Andreas Dilger made changes -
            Assignee Original: WC Triage [ wc-triage ] New: Patrick Farrell [ paf0186 ]
            adilger Andreas Dilger made changes -
            Labels Original: medium utils New: medium performance utils
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-5170 [ LU-5170 ]
            adilger Andreas Dilger made changes -
            Labels Original: medium New: medium utils
            adilger Andreas Dilger made changes -
            Description: Added a link to the pfind repository (https://github.com/VI4IO/pfind.git); the text is otherwise unchanged.
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-14610 [ LU-14610 ]
            adilger Andreas Dilger created issue -

            People

              Assignee: paf0186 Patrick Farrell
              Reporter: adilger Andreas Dilger
              Votes: 0
              Watchers: 7

              Dates

                Created:
                Updated: