
conf-sanity test_66 - replace nids failed

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Major
    • None
    • Lustre 2.4.0
    • 3
    • 7450

    Description

      This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com>

      This issue relates to the following test suite runs:
      https://maloo.whamcloud.com/test_sets/95fcddea-97b0-11e2-a652-52540035b04c
      https://maloo.whamcloud.com/test_sets/810da798-9760-11e2-9ec7-52540035b04c

      The sub-test test_66 failed with the following error:

      replace nids failed

      Info required for matching: conf-sanity 66

      All subsequent ZFS test suites (recovery-small, etc) fail with the following error:

      Starting mds1: -o user_xattr,acl  lustre-mdt1/mdt1 /mnt/mds1
      CMD: wtm-16vm3 mkdir -p /mnt/mds1; mount -t lustre -o user_xattr,acl lustre-mdt1/mdt1 /mnt/mds1
      wtm-16vm3: mount.lustre: according to /etc/mtab lustre-mdt1/mdt1 is already mounted on /mnt/mds1
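
      The mount.lustre message above only says that /etc/mtab lists the target. It does not say whether the kernel still has it mounted. A minimal sketch of how the two cases could be told apart on the MDS node, assuming /etc/mtab is a regular file there and /proc/mounts reflects the kernel's view; the helper names are hypothetical and not part of the Lustre test framework:

```shell
#!/bin/sh
# Sketch only: is_listed and stale_mtab_entry are hypothetical helpers,
# not functions from the Lustre test framework.

# Succeeds if TABLE ($1) lists DEVICE ($2) as mounted on MOUNTPOINT ($3).
is_listed() {
    awk -v d="$2" -v m="$3" \
        '$1 == d && $2 == m { found = 1 } END { exit !found }' "$1"
}

# Succeeds if DEVICE ($1) on MOUNTPOINT ($2) appears in the mtab file
# but not in the kernel mount table. The table paths are overridable
# through $3 and $4 (assumed defaults: /etc/mtab and /proc/mounts).
stale_mtab_entry() {
    mtab=${3:-/etc/mtab}
    kernel=${4:-/proc/mounts}
    is_listed "$mtab" "$1" "$2" && ! is_listed "$kernel" "$1" "$2"
}

if stale_mtab_entry lustre-mdt1/mdt1 /mnt/mds1; then
    echo "stale /etc/mtab entry for /mnt/mds1"
fi
```

      If the check prints nothing, either the target is still genuinely mounted (an earlier unmount did not complete) or /etc/mtab does not list it at all. On systems where /etc/mtab is a symlink to the kernel mount table, the two views are identical and the check never fires.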
      

          Activity

            [LU-3056] conf-sanity test_66 - replace nids failed
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-3793 [ LU-3793 ]
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-5137 [ LU-5137 ]
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-11990 [ LU-11990 ]
            jlevi Jodi Levi (Inactive) made changes -
            Fix Version/s Original: Lustre 2.5.0 [ 10295 ]
            keith Keith Mannthey (Inactive) made changes -
            Fix Version/s New: Lustre 2.5.0 [ 10295 ]
            Resolution New: Cannot Reproduce [ 5 ]
            Status Original: Reopened [ 4 ] New: Resolved [ 5 ]

            keith Keith Mannthey (Inactive) added a comment - We no longer see this issue. Please reopen if this starts to trigger again. There is no sense landing a debug patch for a problem that does not happen.

            keith Keith Mannthey (Inactive) added a comment - Still no sign of the "replace nids failed" errors.

            keith Keith Mannthey (Inactive) added a comment - Quick update: conf_sanity test_66 has not failed in a few weeks. The "replace nids failed" error really dropped off after 2013-04-29. Perhaps some code path has changed.

            keith Keith Mannthey (Inactive) added a comment - http://review.whamcloud.com/5940 has been resubmitted for testing in an effort to land it after the 2.4 split. We still see the issue on Master a few times a week and it will be good to know more about what is causing the issue.

            adilger Andreas Dilger added a comment - I definitely don't think this needs to be a 2.4.0 blocker, since replace_nids is a very rarely used code path. The only potential reason for increased priority might be the frequency of other patches failing due to this bug, but I don't see very many failures due to this specific bug (several other conf-sanity failures are increasing the test failure rates).

            People

              keith Keith Mannthey (Inactive)
              maloo Maloo