Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • Lustre 2.5.0, Lustre 2.4.2
    • Lustre 2.4.0
    • None
    • 3
    • 10408

    Description

      Under Lustre 2.4.0, the lctl subcommand lfsck_start ignores the -n/--dryrun option. For instance:

      lctl lfsck_start --dryrun on -M <MDT name>

      That currently results in a real run of the OI scrub, and real modification to the filesystem, directly in contradiction to the documentation.

      I can understand if implementing that is going to take more work than we want to spend right now, but until the functionality is implemented the command must return an error. It should not go ahead and make changes.
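
The fail-fast behavior requested above can be sketched roughly as follows. This is only an illustration in Python, not the lctl or LFSCK source: the `DRYRUN_SUPPORTED` table, the component names, and `start_scan` are all hypothetical.

```python
# Hypothetical table of which scan components honor dry-run mode.
DRYRUN_SUPPORTED = {"namespace": True, "scrub": False}


class DryRunNotSupported(Exception):
    """Raised instead of silently falling back to a modifying run."""


def start_scan(component: str, dry_run: bool) -> str:
    """Refuse to start when dry-run is requested but not implemented."""
    if dry_run and not DRYRUN_SUPPORTED.get(component, False):
        raise DryRunNotSupported(
            f"{component}: dry-run is not implemented; "
            "refusing to modify the filesystem"
        )
    mode = "dry-run" if dry_run else "repair"
    return f"{component} started in {mode} mode"
```

The point of the sketch is that an unsupported dry-run request ends in an error, never in a real run.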

      Further, I am not particularly fond of the "--dryrun on" syntax. Unless there is a really, really good reason that -n and --dryrun need to take on/off options, the command line interface should not be designed this way. I think that -n/--dryrun should be optionless. (If they are present on the command line that means dryrun mode must be enabled.)

      That would match most sysadmins' expected behavior. Every other command I think that I have ever seen makes the dryrun command optionless. For instance:

      • fsck -N
      • rsync -n/--dry-run
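
An argument-less flag of the kind described above might look like this. It is a minimal sketch using Python's argparse, not the real lctl parser: the program name `lfsck_start_sketch`, the long option `--device`, and the device name `lustre-MDT0000` are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="lfsck_start_sketch")
    parser.add_argument("-M", "--device", required=True, help="MDT device name")
    # store_true: the flag takes no argument; its presence alone enables dry-run.
    parser.add_argument("-n", "--dryrun", action="store_true",
                        help="perform a trial run with no changes made")
    return parser


# Hypothetical invocation: dry-run is on simply because -n is present.
args = build_parser().parse_args(["-M", "lustre-MDT0000", "-n"])
print(args.dryrun)
```

With this shape there is no "on"/"off" value to forget or mistype; omitting the flag means a normal run.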

          Activity

            [LU-3935] lfsck_start ignores -n/--dryrun

            jlevi Jodi Levi (Inactive) added a comment -

            Patch landed to Master so closing ticket. If more work is needed let me know and I will reopen.
            yong.fan nasf (Inactive) added a comment - edited

            The patch for dryrun mode OI scrub on master:

            http://review.whamcloud.com/#/c/7720/


            adilger Andreas Dilger added a comment -

            I agree that "--dry-run" should not make any changes to the filesystem, so that should be fixed.

            The tricky part is that OI Scrub is supposed to start automatically at mount time to fix a broken/missing OI file in order to repair it promptly.

            yong.fan nasf (Inactive) added a comment -

            My current idea for that is:

            1) If it is a pure OI scrub, without other LFSCK components scanning, then we should support "--dryrun".

            2) If it is a combined run (OI scrub plus others) in "--dryrun" mode, then we should notify the upper-layer LFSCK components to stop/pause when inconsistent OI mappings are found, and rescan after OI scrub has repaired the inconsistent OI mappings.

            Andreas, what's your suggestion?
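
One possible reading of the two cases proposed above, sketched as a small planner. This is an illustration only and says nothing about how the landed patch behaves: the function `plan_dryrun`, the component name `scrub`, and the action strings are hypothetical.

```python
from typing import List


def plan_dryrun(components: List[str], oi_inconsistent: bool) -> List[str]:
    """Return the ordered actions for a dry-run request (illustrative only)."""
    others = [c for c in components if c != "scrub"]
    if not others:
        # Case 1: pure OI scrub, so dry-run only reports and never repairs.
        return ["scrub:report-only"]
    if not oi_inconsistent:
        # Case 2 with consistent OI mappings: everything stays report-only.
        return ["scrub:report-only"] + [f"{c}:report-only" for c in others]
    # Case 2 with inconsistent OI mappings: pause the upper layers,
    # let OI scrub repair, then rescan the upper layers in report-only mode.
    actions = [f"{c}:pause" for c in others]
    actions.append("scrub:repair")
    actions += [f"{c}:rescan-report-only" for c in others]
    return actions
```

Note that under this reading the combined case still lets OI scrub repair mappings during a dry-run, which is exactly the behavior the rest of the ticket debates.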

            morrone Christopher Morrone (Inactive) added a comment -

            I don't care how beneficial OI scrub is.

            The man page clearly states:

                     -n, --dryrun <on|off>
                          Perform a trial run with no changes made.

            When I used it, there were changes made. That is not acceptable.

            So maybe you rename the option to "--dont-do-as-much" and clearly document what it does and does not alter about the scan's behavior.

            yong.fan nasf (Inactive) added a comment -

            The reason is not that dryrun mode would take more time than expected. The key reason is that inode-table-based iteration is basic infrastructure for the other upper-layer LFSCK components. To improve efficiency, we allow a single LFSCK scan to verify several kinds of system inconsistency (such as OI mapping inconsistency, namespace inconsistency, MDT-OST layout inconsistency, and MDT-MDT DNE inconsistency). That is important for a very large system. If we do not fix the inconsistent OI mappings found during OI scrub, then the upper-layer LFSCK components will get invalid objects via the wrong OI mappings. The upper-layer components may not notice such invalid objects in time, which can cause strange behaviour and even incorrect repairs (or incorrect to-be-repaired reports).

            So it is NOT that "--dryrun" does not work: it works for the upper-layer LFSCK components, such as "lctl lfsck_start -M <MDT-device> -t namespace".
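
The dependency described above, where upper-layer checks resolve objects through the OI and so inherit its errors, can be illustrated with a toy model. Everything here is a simplification for illustration: the dictionaries, the FID strings, and the functions `scrub` and `namespace_lookup` are hypothetical and do not correspond to Lustre data structures or code.

```python
from typing import Dict

# Toy model: objects keyed by inode number, each recording its own FID.
inodes: Dict[int, str] = {101: "fid-A", 102: "fid-B"}

# Object index (OI): FID -> inode number. The entry for fid-A is stale.
oi: Dict[str, int] = {"fid-A": 102, "fid-B": 102}


def scrub(oi_map: Dict[str, int], inode_table: Dict[int, str], repair: bool) -> int:
    """Walk the inode table once; count (and optionally fix) bad OI entries."""
    bad = 0
    for ino, fid in inode_table.items():
        if oi_map.get(fid) != ino:
            bad += 1
            if repair:
                oi_map[fid] = ino
    return bad


def namespace_lookup(oi_map: Dict[str, int], inode_table: Dict[int, str], fid: str) -> str:
    """An upper-layer check resolves a FID through the OI and trusts the result."""
    return inode_table[oi_map[fid]]
```

In this model, looking up `fid-A` before any repair lands on the object that belongs to `fid-B`, and a report-only scrub pass leaves that wrong answer in place; only a repairing pass makes the lookup return the right object.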
            pjones Peter Jones added a comment -

            Fan Yong

            Could you please comment on this one?

            thanks

            Peter


            People

              yong.fan nasf (Inactive)
              morrone Christopher Morrone (Inactive)
              Votes: 1
              Watchers: 7
