Lustre / LU-15724

MDT failover hang

    Description

      With the LU-8367 deadlock between osp_precreate_reserve() and osp_precreate_cleanup_orphans(), I've found a problem with MDT failover.

      00000020:02000400:31.0:1644539398.776433:0:454249:0:(obd_config.c:854:class_cleanup()) Failing over kjcf05-MDT0001
      ...
      00010000:02020000:20.0:1644539461.204784:0:454249:0:(ldlm_resource.c:1188:__ldlm_namespace_free()) 0-0: Forced cleanup waiting for mdt-kjcf05-MDT0001_UUID namespace with 46 resources in use, (rc=-110)
      00010000:02020000:8.0:1644539699.332763:0:454249:0:(ldlm_resource.c:1188:__ldlm_namespace_free()) 0-0: Forced cleanup waiting for mdt-kjcf05-MDT0001_UUID namespace with 46 resources in use, (rc=-110)
      

      The situation is as follows: MDT failover does not produce a disconnect event, so osp_precreate_cleanup_orphans() cannot be woken up. Failover also does not clear opd_pre_recovering, so the wait in osp_precreate_reserve() skips the wakeup signal. The hang ends only after roughly obd_timeout.
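
      The second half of the problem can be illustrated with a toy model. This is only a sketch, not Lustre code: struct toy_osp, toy_reserve_ready() and toy_reserve_wait() are hypothetical names, and the waiter's condition is assumed to depend only on a flag standing in for opd_pre_recovering. It shows that delivering wakeups without clearing the flag leaves the waiter sleeping until the timeout.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the OSP device state; not the real struct osp_device. */
struct toy_osp {
	bool pre_recovering;	/* models opd_pre_recovering */
};

/* Assumed waiter condition: proceed only once recovery is no longer flagged. */
static bool toy_reserve_ready(const struct toy_osp *d)
{
	return !d->pre_recovering;
}

/*
 * Model a waiter that receives one wakeup per tick and re-checks its
 * condition each time. Returns the number of ticks spent waiting, or
 * timeout_ticks if the wait ran all the way to the timeout.
 */
static int toy_reserve_wait(struct toy_osp *d, int timeout_ticks,
			    bool failover_clears_flag)
{
	int tick;

	if (failover_clears_flag)
		d->pre_recovering = false;	/* hypothetical cleanup step */

	for (tick = 0; tick < timeout_ticks; tick++) {
		if (toy_reserve_ready(d))
			return tick;	/* the wakeup was effective */
		/* wakeup delivered, condition still false: sleep again */
	}
	return timeout_ticks;
}
```

      In this model, calling toy_reserve_wait() with failover_clears_flag set to false always runs to timeout_ticks, which corresponds to the ~obd_timeout hang described above; with the flag cleared first, the waiter proceeds on the first wakeup.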

            People

              aboyko Alexander Boyko