Details

    • Technical task
    • Resolution: Fixed
    • Major
    • Lustre 2.8.0
    • Lustre 2.7.0
    • None
    • 17278

    Description

      Currently, for a routine namespace LFSCK check with no inconsistencies repaired, the best aggregate performance is achieved with a 4-MDT configuration. As more MDTs are added, performance decreases. This is contrary to our expectations and should be resolved.

        Activity

          [LU-6177] LFSCK 4: namespace LFSCK scalability

          yong.fan nasf (Inactive) added a comment - Related patches have landed on master.

          gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14014/
          Subject: LU-6177 lfsck: calculate the phase2 time correctly
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 0f4875343e22bcdfe18708806e172aa234da23a6

          yong.fan nasf (Inactive) added a comment - The above patch (http://review.whamcloud.com/14014) fixed a serious issue that caused the reported phase2 time to be longer than the time actually used by the second-stage scanning.

          gerrit Gerrit Updater added a comment - Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/14014
          Subject: LU-6177 lfsck: calculate the phase2 time correctly
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: edf9f948ad9f5c86ddf1a891dae8ce0cdde07593

          yong.fan nasf (Inactive) added a comment - The striped directories were created under each sub-directory. The master MDT-object of a striped directory should reside on the same MDT as its parent directory, but because of a test script issue, it was always created on MDT0. In addition, the test scripts did not handle remote sub-directories properly, so the remote sub-directories were also unbalanced among the MDTs. I have fixed the test scripts so that both are now balanced across the MDTs.
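
The placement problem described in this comment can be illustrated with a small model. This is only a sketch under stated assumptions, not the actual test script: the function `place_striped_dirs`, its parameters, and the round-robin spreading of sub-directories are all hypothetical. It counts how many master MDT-objects land on each MDT when the master always goes to MDT0 versus when it follows the parent directory's MDT.

```python
def place_striped_dirs(num_mdts, subdirs_per_mdt, striped_per_subdir, follow_parent):
    """Return a dict mapping MDT index -> number of master MDT-objects.

    Hypothetical model: sub-directories are spread round-robin over the
    MDTs, and each one holds `striped_per_subdir` striped directories.
    """
    masters = {mdt: 0 for mdt in range(num_mdts)}
    for subdir in range(num_mdts * subdirs_per_mdt):
        parent_mdt = subdir % num_mdts
        # follow_parent=False models the script issue (master always on MDT0);
        # follow_parent=True models the intended placement on the parent's MDT.
        master_mdt = parent_mdt if follow_parent else 0
        masters[master_mdt] += striped_per_subdir
    return masters


# 4 MDTs, 2 sub-directories per MDT, 10 striped directories in each.
buggy = place_striped_dirs(4, 2, 10, follow_parent=False)
fixed = place_striped_dirs(4, 2, 10, follow_parent=True)
print("always MDT0:  ", buggy)
print("follow parent:", fixed)
```

In this toy configuration all 80 master objects end up on MDT0 in the first case, while the second case gives each MDT 20.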

          adilger Andreas Dilger added a comment

          "It should be, but unfortunately, because of a test script issue, the master MDT-object of each striped directory was always created on MDT0, so the object counts on the MDTs were unexpectedly unbalanced."

          Is that because all of the striped directories are created at the top level directory (on MDT0)? Otherwise, I would think that the master MDT object should be on the same MDT as the parent directory. If not, I think that is a bug in the DNE code.

          Secondly, even if the master MDT object of each striped directory is on MDT0, this should only be a few thousand more objects, but the actual files created inside the striped directories should be balanced evenly across all MDTs, or again this would be a bug in the DNE code.

          yong.fan nasf (Inactive) added a comment - As the number of MDTs increases, the waiting time (as described above) also increases, so the aggregate performance does not scale as expected.

          bzzz Alex Zhuravlev added a comment - Even so, that should give us performance multiplied by (#MDTs-1). It shouldn't stop scaling?
          yong.fan nasf (Inactive) added a comment - edited

          It should be, but unfortunately, because of a test script issue, the master MDT-object of each striped directory was always created on MDT0, so the object counts on the MDTs were unexpectedly unbalanced.

          On the other hand, we should not assume that every MDT has the same processing capability. We still need to adjust the method used to calculate performance.
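
To make the effect of imbalance on the measured numbers concrete, here is a toy model. It is a sketch under stated assumptions and not how LFSCK computes its statistics: the functions `scan_times` and `aggregate_rate`, the object counts, and the per-MDT rates are all made up for illustration. It assumes each MDT scans only its own objects and that the run ends when the slowest MDT finishes, so the other MDTs spend the difference waiting.

```python
def scan_times(objects_per_mdt, rate_per_mdt):
    """Seconds each MDT needs to scan its own objects (objects / rate)."""
    return [objs / rate for objs, rate in zip(objects_per_mdt, rate_per_mdt)]


def aggregate_rate(objects_per_mdt, rate_per_mdt):
    """Total objects divided by wall-clock time, i.e. the slowest MDT's time."""
    wall = max(scan_times(objects_per_mdt, rate_per_mdt))
    return sum(objects_per_mdt) / wall


# Illustrative numbers only: 4000 objects in total, 100 objects/s per MDT.
rates = [100.0] * 4
balanced = aggregate_rate([1000, 1000, 1000, 1000], rates)
skewed = aggregate_rate([2500, 500, 500, 500], rates)

# Same balanced layout, but one MDT is half as fast as the others.
uneven_hw = aggregate_rate([1000, 1000, 1000, 1000], [100.0, 100.0, 100.0, 50.0])

print(balanced, skewed, uneven_hw)
```

In this model the same total work reports a much lower aggregate rate once one MDT holds most of the objects or is slower than the rest, which is why the calculation cannot simply assume identical MDTs.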

          adilger Andreas Dilger added a comment - Shouldn't the number of files per MDT be about the same? Should the test config create balanced file creation? I thought the top-level directories are spread across all MDTs and then all the files are created in those directories?

          People

            yong.fan nasf (Inactive)
            yong.fan nasf (Inactive)
            Votes: 0
            Watchers: 6
