[LU-6200] Failover recovery-mds-scale test_failover_ost: test_failover_ost returned 1

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • None
    • Lustre 2.7.0, Lustre 2.8.0, Lustre 2.10.0, Lustre 2.11.0, Lustre 2.10.4
    • client and server: lustre-master build # 2835 RHEL6
    • 3
    • 17329

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/be3ebe76-a817-11e4-93dd-5254006e85c2.

      The sub-test test_failover_ost failed with the following error:

      test_failover_ost returned 1
      

      Client 3 shows:

      tar: etc/sysconfig/quota_nld: Cannot write: No such file or directory
      tar: etc/sysconfig/quota_nld: Cannot utime: No such file or directory
      tar: etc/sysconfig/sandbox: Cannot write: No such file or directory
      tar: etc/sysconfig/nfs: Cannot write: No such file or directory
      tar: Exiting with failure status due to previous errors
      

          Activity


            hongchao.zhang Hongchao Zhang added a comment - Resolved as duplicate of LU-11765

            hongchao.zhang Hongchao Zhang added a comment -

            Hi Sergey,

            Thanks!
            This should be the same issue as LU-11765; only the fixes differ. The patch in this ticket recreates the object
            if it doesn't exist, whereas the patch in LU-11765 returns -EAGAIN to tell the caller to retry.
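
            The two approaches contrasted above can be sketched roughly as follows. This is only a toy illustration, not code from either patch: the object table, the `sketch_*` names, and the `recover` callback are all hypothetical, and the real fixes operate on Lustre objects rather than a boolean array.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy object table; not a Lustre data structure. */
#define SKETCH_MAX_OBJECTS 8

struct sketch_store {
	bool exists[SKETCH_MAX_OBJECTS];
};

/* Approach A (assumed shape): recreate the missing object, then proceed. */
static int sketch_write_recreate(struct sketch_store *s, size_t id)
{
	if (id >= SKETCH_MAX_OBJECTS)
		return -EINVAL;
	if (!s->exists[id])
		s->exists[id] = true;	/* recreate on demand */
	return 0;
}

/* Approach B (assumed shape): report -EAGAIN so the caller retries. */
static int sketch_write_eagain(const struct sketch_store *s, size_t id)
{
	if (id >= SKETCH_MAX_OBJECTS)
		return -EINVAL;
	if (!s->exists[id])
		return -EAGAIN;
	return 0;
}

/* Hypothetical stand-in for whatever makes the object appear between tries. */
static void sketch_recover(struct sketch_store *s, size_t id)
{
	s->exists[id] = true;
}

/* Caller-side retry loop for approach B; stops on success or a hard error. */
static int sketch_write_with_retry(struct sketch_store *s, size_t id,
				   int max_tries,
				   void (*recover)(struct sketch_store *, size_t))
{
	int rc = -EAGAIN;

	for (int i = 0; i < max_tries && rc == -EAGAIN; i++) {
		rc = sketch_write_eagain(s, id);
		if (rc == -EAGAIN && recover)
			recover(s, id);
	}
	return rc;
}
```

            In this sketch, approach A hides the missing object from the caller, while approach B leaves the retry policy (how many attempts, what happens in between) to the caller.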

            scherementsev Sergey Cheremencev added a comment -

            Hi,

            The problem description looks similar to one I've already fixed in https://review.whamcloud.com/#/c/33836/.
            Please take a careful look; if I am right, this could be resolved as a duplicate of LU-11765.

            hongchao.zhang Hongchao Zhang added a comment - the patch https://review.whamcloud.com/#/c/13668/ has been updated
            jcasper James Casper (Inactive) added a comment - 2.11.0: https://testing.hpdd.intel.com/test_sessions/e6578085-2eed-486d-8601-e5214bac4bb0
            hongchao.zhang Hongchao Zhang added a comment (edited) - the patch https://review.whamcloud.com/#/c/13668/ has been updated.

            hongchao.zhang Hongchao Zhang added a comment - the patch has been updated as per the review feedback.
            shadow Alexey Lyashkov added a comment (edited) -

            Hongchao,

            I don't have access to Gerrit right now, but your patch
            https://git.hpdd.intel.com/?p=fs/lustre-release.git;a=commitdiff;h=daa98c46817c98d6fbf70dafa9fbdde678f8b9ba;hp=32d1a1c5d610d054ad4609c1cf332172e8310805
            is bad.

            It looks like you can't use

            +       /* Do sync create if the seq is about to used up */
            +       if (fid_seq_is_idif(seq) || fid_seq_is_mdt0(seq)) {
            +               if (unlikely(oid >= IDIF_MAX_OID - 1))
            +                       sync = 1;

            because the ost id in this case needs to account for the lower 16 bits of seq; please look at the ost id macros.
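
            The point about the lower 16 bits of seq can be illustrated with a small sketch. It assumes the layout the comment describes, with the lower 16 bits of seq forming the high part of the id and oid forming the low 32 bits. The 48-bit limit and all `sketch_*` / `SKETCH_*` names are assumptions made for illustration, not definitions copied from the Lustre headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: ids in this scheme are 48 bits wide. */
#define SKETCH_IDIF_OID_BITS 48
#define SKETCH_IDIF_MAX_OID  (UINT64_C(1) << SKETCH_IDIF_OID_BITS)

/*
 * Build the full id: lower 16 bits of seq become bits 32..47,
 * and the 32-bit oid supplies bits 0..31.
 */
static uint64_t sketch_idif_id(uint64_t seq, uint32_t oid)
{
	return ((seq & UINT64_C(0xffff)) << 32) | oid;
}

/* "About to be used up" check performed on the combined id, not on oid alone. */
static bool sketch_near_limit(uint64_t seq, uint32_t oid)
{
	return sketch_idif_id(seq, oid) >= SKETCH_IDIF_MAX_OID - 1;
}
```

            Under these assumptions, comparing only the low part against the limit would ignore the high 16 bits carried in seq, which is the concern raised in the comment.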

            hongchao.zhang Hongchao Zhang added a comment - the patch https://review.whamcloud.com/#/c/13668/ has been updated.
            standan Saurabh Tandan (Inactive) added a comment (edited) -

            Another instance found on b2_8 for failover testing, build #6:
            https://testing.hpdd.intel.com/test_sessions/0aed3028-da39-11e5-a8a6-5254006e85c2
            https://testing.hpdd.intel.com/test_sessions/eaf85780-d65e-11e5-afe8-5254006e85c2
            https://testing.hpdd.intel.com/test_sessions/54ec62da-d99d-11e5-9ebe-5254006e85c2
            https://testing.hpdd.intel.com/test_sessions/eb9f29ec-d8da-11e5-83e2-5254006e85c2
            https://testing.hpdd.intel.com/test_sessions/2f0aa9f6-d5a5-11e5-9cc2-5254006e85c2
            https://testing.hpdd.intel.com/test_sessions/c5a8e44c-d9c7-11e5-85dd-5254006e85c2

            People

              Assignee: hongchao.zhang Hongchao Zhang
              Reporter: maloo Maloo
              Votes: 0
              Watchers: 13
