[LU-7428] conf-sanity test_84, replay-dual 0a: /dev/lvm-Role_MDS/P1 failed to initialize!

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.9.0
    • Affects Version/s: Lustre 2.8.0
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      This issue was created by maloo for Andreas Dilger <andreas.dilger@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/5d42a610-8187-11e5-a41e-5254006e85c2.

      The sub-test test_84 failed with the following error:

      CMD: shadow-10vm4 e2label /dev/lvm-Role_MDS/P1 				2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
      CMD: shadow-10vm4 e2label /dev/lvm-Role_MDS/P1 				2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
      CMD: shadow-10vm4 e2label /dev/lvm-Role_MDS/P1 				2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
      Update not seen after 90s: wanted '' got 'lustre:MDT0000'
       conf-sanity test_84: @@@@@@ FAIL: /dev/lvm-Role_MDS/P1 failed to initialize! 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:4843:error()
        = /usr/lib64/lustre/tests/test-framework.sh:1270:mount_facet()
        = /usr/lib64/lustre/tests/test-framework.sh:1188:mount_facets()
        = /usr/lib64/lustre/tests/test-framework.sh:2513:facet_failover()
        = /usr/lib64/lustre/tests/conf-sanity.sh:5594:test_84()
        = /usr/lib64/lustre/tests/test-framework.sh:5090:run_one()
        = /usr/lib64/lustre/tests/test-framework.sh:5127:run_one_logged()
        = /usr/lib64/lustre/tests/test-framework.sh:4992:run_test()
      

      Please provide additional information about the failure here.

      Info required for matching: conf-sanity 84
      Info required for matching: replay-dual 0a
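
The check repeated in the failure log above can be read as a polling loop: the framework reruns `e2label ... | grep -E ':[a-zA-Z]{3}[0-9]{4}'` until the output is empty and gives up after 90s. A minimal shell sketch of that pattern follows. The function names `label_pending` and `wait_label_cleared` are hypothetical and are not taken from test-framework.sh; only the regular expression and the message format come from the log. It assumes, based on the log alone, that a label still containing a colon (here `lustre:MDT0000`) is what keeps the check failing.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not from test-framework.sh.

# Succeeds while the label still matches the pattern used in the log.
label_pending() {
	printf '%s\n' "$1" | grep -qE ':[a-zA-Z]{3}[0-9]{4}'
}

# wait_label_cleared TIMEOUT GET_LABEL
# Calls GET_LABEL (a command or function that prints the current label)
# once per second until the label no longer matches, or until TIMEOUT
# seconds have elapsed.
wait_label_cleared() {
	local timeout=$1 get_label=$2 waited=0 label
	while :; do
		label=$($get_label 2>/dev/null)
		label_pending "$label" || return 0
		(( waited >= timeout )) && break
		sleep 1
		waited=$((waited + 1))
	done
	echo "Update not seen after ${timeout}s: wanted '' got '$label'"
	return 1
}
```

A caller might run `wait_label_cleared 90 read_label`, where `read_label` is a caller-supplied function that prints the current label, for instance by running `e2label` on the device.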

          Activity

            adilger Andreas Dilger added a comment - Reopen until patch enabling test_84 actually lands.

            hongchao.zhang Hongchao Zhang added a comment - The patch http://review.whamcloud.com/#/c/20194/ to remove test 84 has been refreshed and has now passed the tests in Maloo.

            jhammond John Hammond added a comment - Did the landing of http://review.whamcloud.com/20586/ resolve this issue? I see that we still have 84 in ALWAYS_EXCEPT.

            pjones Peter Jones added a comment - Landed for 2.9

            gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/20586/
            Subject: LU-7428 osd: set device read-only correctly
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: a079ade7913b923b795ea5c01df4e69bf1a87691

            adilger Andreas Dilger added a comment - No, this won't replace LU-684.

            This patch is to (hopefully) fix a problem where the device is sync'd and set read-only, but loses some recent writes, for an unknown reason. This shows up with a variety of different symptoms, and may be a result of bad interactions with LVM and VM virtual block devices, or it may be caused by the dev readonly patches.

            simmonsja James A Simmons added a comment - Does this patch mean we don't need LU-684 anymore?
            hongchao.zhang Hongchao Zhang added a comment - The patch ported from MRP-2135 ( https://github.com/Xyratex/lustre-stable/commit/6197a27f174e683d3c66137db8976bddc7ef179b ) is tracked at http://review.whamcloud.com/20586

            gerrit Gerrit Updater added a comment - Hongchao Zhang (hongchao.zhang@intel.com) uploaded a new patch: http://review.whamcloud.com/20586
            Subject: LU-7428 osd: set rdonly correctly
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 521b8290fbf0b47d4ad03272a206d038f648db2d

            gerrit Gerrit Updater added a comment - Gu Zheng (gzheng@ddn.com) uploaded a new patch: http://review.whamcloud.com/20535
            Subject: LU-7428 osd: freeze fs before set device readonly
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: cb2a762f99cef9ff86dc76445248f91b40f0199b

            adilger Andreas Dilger added a comment - It might be worthwhile to test the patch from https://github.com/Xyratex/lustre-stable/commit/6197a27f174e683d3c66137db8976bddc7ef179b to see whether it fixes the problem. I think that patch could be simplified to just call sb->s_op->s_freeze() before marking the device read-only.
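
The ordering suggested in the comment above (freeze the filesystem, then mark the device read-only) can be illustrated with a userspace analogue. This is only a sketch under stated assumptions: it uses the util-linux tools `fsfreeze` and `blockdev` rather than the in-kernel `sb->s_op` path that the Lustre osd patches touch, and the helper names `run` and `freeze_then_setro` as well as the example paths are hypothetical. It is not the content of either Gerrit patch.

```shell
#!/bin/bash
# Sketch only: a userspace analogue of "freeze before read-only".
# Helper names are hypothetical; this is not the Lustre kernel code.

# Print the command unless DRY_RUN=0 is set explicitly, so that the
# sketch does nothing destructive by default.
run() {
	if [ "${DRY_RUN:-1}" = 1 ]; then
		echo "$*"
	else
		"$@"
	fi
}

# freeze_then_setro MOUNTPOINT DEVICE
# Flush dirty data, freeze the filesystem so that no new writes are
# started, and only then mark the block device read-only.
freeze_then_setro() {
	local mnt=$1 dev=$2
	run sync || return 1
	run fsfreeze --freeze "$mnt" || return 1
	run blockdev --setro "$dev"
}
```

With the default dry-run setting, `freeze_then_setro /mnt/example /dev/example` only prints the three commands in order. The point of the ordering is that the freeze step quiesces the filesystem before the device stops accepting writes.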

            People

              Assignee: simmonsja James A Simmons
              Reporter: maloo Maloo
              Votes: 0
              Watchers: 18
