    Description

      With the LU-8367 deadlock between osp_precreate_reserve() and osp_precreate_cleanup_orphans() present, I found a problem with MDT failover.

      00000020:02000400:31.0:1644539398.776433:0:454249:0:(obd_config.c:854:class_cleanup()) Failing over kjcf05-MDT0001
      ...
      00010000:02020000:20.0:1644539461.204784:0:454249:0:(ldlm_resource.c:1188:__ldlm_namespace_free()) 0-0: Forced cleanup waiting for mdt-kjcf05-MDT0001_UUID namespace with 46 resources in use, (rc=-110)
      00010000:02020000:8.0:1644539699.332763:0:454249:0:(ldlm_resource.c:1188:__ldlm_namespace_free()) 0-0: Forced cleanup waiting for mdt-kjcf05-MDT0001_UUID namespace with 46 resources in use, (rc=-110)
      

      So the situation is: MDT failover does not produce a disconnect event, so osp_precreate_cleanup_orphans() cannot be woken up. It also does not clear opd_pre_recovering, so the wait in osp_precreate_reserve() ignores the wakeup signal. The hang ends only after ~obd_timeout.
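
The missed wakeup described above can be illustrated with a small user-space sketch. This is Python rather than Lustre kernel code and is not the merged patch. `PrecreateState`, `reserve`, `cleanup_buggy`, `cleanup_fixed`, `run_scenario` and the `stopping` flag are all hypothetical names; only `recovering` is meant as an analogue of opd_pre_recovering. It assumes the waiter re-checks a predicate after every signal, so a signal that changes no state leaves it asleep until the timeout.

```python
import threading


class PrecreateState:
    """Toy stand-in for the precreate state; hypothetical, not Lustre code."""

    def __init__(self):
        self.cond = threading.Condition()
        self.recovering = True   # analogue of opd_pre_recovering
        self.stopping = False    # hypothetical "device is going away" flag

    def reserve(self, timeout):
        """Wait until recovery ends or the device is stopping."""
        with self.cond:
            woke = self.cond.wait_for(
                lambda: not self.recovering or self.stopping, timeout)
            if not woke:
                return "timeout"
            return "stopping" if self.stopping else "ready"

    def cleanup_buggy(self):
        # Signals the waiters but changes no state: the waiter re-checks
        # its predicate, finds it still false, and goes back to sleep.
        with self.cond:
            self.cond.notify_all()

    def cleanup_fixed(self):
        # Changes the state the predicate tests, then wakes every waiter.
        with self.cond:
            self.stopping = True
            self.cond.notify_all()


def run_scenario(fixed, timeout):
    """Start one waiter, run the chosen cleanup, return the waiter's result."""
    state = PrecreateState()
    result = []
    waiter = threading.Thread(
        target=lambda: result.append(state.reserve(timeout)))
    waiter.start()
    if fixed:
        state.cleanup_fixed()
    else:
        state.cleanup_buggy()
    waiter.join()
    return result[0]


if __name__ == "__main__":
    print("buggy cleanup:", run_scenario(fixed=False, timeout=0.2))
    print("fixed cleanup:", run_scenario(fixed=True, timeout=5.0))
```

In the buggy variant the waiter only returns when its own timeout expires, which mirrors the ~obd_timeout delay reported here. In the fixed variant it returns as soon as the cleanup runs, regardless of the timeout length.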

          Activity

            [LU-15724] MDT failover hang

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/48549/
            Subject: LU-15724 tests: MDT failover hang reproducer
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set:
            Commit: 9d1805c8b9cc1067b9b3ba186e5e3531112e08a3

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/48548/
            Subject: LU-15724 osp: wakeup all precreate threads
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set:
            Commit: 4eede4aab35296ed9417b77b955cf43a83827fdb

            "Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/48549
            Subject: LU-15724 tests: MDT failover hang reproducer
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set: 1
            Commit: 2844bbe7a5c9915c5f2bf376a6b4554e5683081c

            "Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/48548
            Subject: LU-15724 osp: wakeup all precreate threads
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set: 1
            Commit: c274853358c793555fb1a20741f72d9254a0147d

            pjones Peter Jones added a comment -

            Landed for 2.16

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/47006/
            Subject: LU-15724 tests: MDT failover hang reproducer
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: aa6250b7412e7baf6760fe4010a81f4f22187127

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/47005/
            Subject: LU-15724 osp: wakeup all precreate threads
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: e55fc043679cdfadfff6874ef78e2e0128ec37ac

            "Alexander Boyko <alexander.boyko@hpe.com>" uploaded a new patch: https://review.whamcloud.com/47006
            Subject: LU-15724 tests: MDT failover hang reproducer
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 300cd635638acf195124a12c4a5228dbdc85c116

            "Alexander Boyko <alexander.boyko@hpe.com>" uploaded a new patch: https://review.whamcloud.com/47005
            Subject: LU-15724 osp: wakeup all precreate threads
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 6080ef0513b4832c76b0dfa04efc185987f2e61b

            People

              aboyko Alexander Boyko
              aboyko Alexander Boyko