Lustre / LU-15453

MDT shutdown hangs on mutex_lock, possibly cld_lock

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Environment: lustre-2.12.7_2.llnl-2.ch6.x86_64
      zfs-0.7.11-9.8llnl.ch6.x86_64
      3.10.0-1160.45.1.1chaos.ch6.x86_64
    • Severity: 3

    Description

      LNet issues (see LU-15234 and LU-14026) cause clients and Lustre servers to report, via console logs, that they have lost their connection to the MGS.

      We are working on solving the LNet issues, but this may also be revealing error-path issues that should be fixed.

      MDT0, which usually runs on the same server as the MGS, is one of the targets that reports a lost connection (the MDT and MGS are separate devices, stored in distinct datasets and started/stopped separately):

      MGC172.19.3.98@o2ib600: Connection to MGS (at 0@lo) was lost 

      Attempting to shut down the MDT hangs, with this stack reported by the watchdog:

       schedule_preempt_disabled+0x39/0x90
       __mutex_lock_slowpath+0x10f/0x250
       mutex_lock+0x32/0x42
       mgc_process_config+0x21a/0x1420 [mgc]
       obd_process_config.constprop.14+0x75/0x210 [obdclass]
       ? lprocfs_counter_add+0xf9/0x160 [obdclass]
       lustre_end_log+0x1ff/0x550 [obdclass]
       server_put_super+0x82e/0xd00 [obdclass]
       generic_shutdown_super+0x6d/0x110
       kill_anon_super+0x12/0x20
       lustre_kill_super+0x32/0x50 [obdclass]
       deactivate_locked_super+0x4e/0x70
       deactivate_super+0x46/0x60
       cleanup_mnt+0x3f/0x80
       __cleanup_mnt+0x12/0x20
       task_work_run+0xbb/0xf0
       do_notify_resume+0xa5/0xc0
       int_signal+0x12/0x17
      

      The server was crashed and a dump was collected.  The stacks for the umount process and the ll_cfg_requeue process both contain pointers to the "ls1-mdtir" config_llog_data structure; I believe cld->cld_lock is held by ll_cfg_requeue and umount is waiting on it.

      PID: 4504   TASK: ffff8e8c9edc8000  CPU: 24  COMMAND: "ll_cfg_requeue"
       #0 [ffff8e8ac474f970] __schedule at ffffffff9d3b6788
       #1 [ffff8e8ac474f9d8] schedule at ffffffff9d3b6ce9
       #2 [ffff8e8ac474f9e8] schedule_timeout at ffffffff9d3b4528
       #3 [ffff8e8ac474fa98] ldlm_completion_ast at ffffffffc14ac650 [ptlrpc]
       #4 [ffff8e8ac474fb40] ldlm_cli_enqueue_fini at ffffffffc14ae83f [ptlrpc]
       #5 [ffff8e8ac474fbf0] ldlm_cli_enqueue at ffffffffc14b10d1 [ptlrpc]
       #6 [ffff8e8ac474fca8] mgc_enqueue at ffffffffc0fb94cf [mgc]
       #7 [ffff8e8ac474fd70] mgc_process_log at ffffffffc0fbf393 [mgc]
       #8 [ffff8e8ac474fe30] mgc_requeue_thread at ffffffffc0fc1b10 [mgc]
       #9 [ffff8e8ac474fec8] kthread at ffffffff9cccb221
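
      To make the suspected interaction concrete, below is a minimal userspace sketch of the pattern I read from the two stacks. This is illustrative pthread code, not Lustre code; the thread roles and the assumption that ll_cfg_requeue holds cld->cld_lock across the blocked enqueue are inferred from the traces above.

       /* Hypothetical illustration of the suspected hang (NOT Lustre code). */
       #include <pthread.h>
       #include <stdio.h>
       #include <unistd.h>

       static pthread_mutex_t cld_lock   = PTHREAD_MUTEX_INITIALIZER; /* stands in for cld->cld_lock */
       static pthread_mutex_t reply_lock = PTHREAD_MUTEX_INITIALIZER;
       static pthread_cond_t  mgs_reply  = PTHREAD_COND_INITIALIZER;  /* a reply that never arrives */

       static void *requeue_thread(void *arg)  /* plays the role of ll_cfg_requeue */
       {
               (void)arg;
               pthread_mutex_lock(&cld_lock);          /* takes the per-log lock */
               printf("requeue: holding cld_lock, waiting on the MGS...\n");

               pthread_mutex_lock(&reply_lock);
               pthread_cond_wait(&mgs_reply, &reply_lock); /* the enqueue never completes */
               pthread_mutex_unlock(&reply_lock);

               pthread_mutex_unlock(&cld_lock);        /* never reached while the MGS is unreachable */
               return NULL;
       }

       static void *umount_thread(void *arg)   /* plays the role of the stuck umount */
       {
               (void)arg;
               sleep(1);                               /* let the requeue thread get the lock first */
               printf("umount: calling mutex_lock(cld_lock)...\n");
               pthread_mutex_lock(&cld_lock);          /* hangs here, like the watchdog stack */
               printf("umount: got cld_lock\n");       /* not reached in the hung case */
               pthread_mutex_unlock(&cld_lock);
               return NULL;
       }

       int main(void)
       {
               pthread_t requeue, umount;

               pthread_create(&requeue, NULL, requeue_thread, NULL);
               pthread_create(&umount,  NULL, umount_thread,  NULL);

               pthread_join(umount, NULL);             /* blocks forever, like the real umount */
               pthread_join(requeue, NULL);
               return 0;
       }

      In this sketch the "umount" thread blocks indefinitely in pthread_mutex_lock(), mirroring how the real umount blocks in mutex_lock() inside mgc_process_config() while ll_cfg_requeue never releases cld->cld_lock.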
      

      I can provide console logs and the crash dump.  I do not have lustre debug logs.

      Attachments

        1. bt.a.txt
          51 kB
        2. foreach.bt.txt
          571 kB

        Issue Links

          Activity

            [LU-15453] MDT shutdown hangs on mutex_lock, possibly cld_lock

            sthiell Stephane Thiell added a comment -

            Hi Olaf,

            I don't think so. Our servers are running 2.12.7 with:

            • LU-13356 client: don't use OBD_CONNECT_MNE_SWAB (41309)
            • LU-14688 mdt: changelog purge deletes plain llog (43990)

            Our clients are now slowly moving to 2.12.8 + LU-13356.

            ofaaland Olaf Faaland added a comment -

            Stephane,

            Do you have any other patches in your stack related to recovery?

            thanks

            sthiell Stephane Thiell added a comment -

            OK! Thanks Andreas!

            adilger Andreas Dilger added a comment -

            sthiell, I don't think anyone is against landing 41309 on b2_12 because of 2.2 interop, just that it hasn't landed yet.

            sthiell Stephane Thiell added a comment -

            Honestly, it is a bit ridiculous not to land change 41309 to b2_12 at this time because of a compat issue with old Lustre 2.2. Without this patch, the MGS on 2.12.x is not stable, even in a full 2.12 environment. We have patched all our clients and servers with it (we're running 2.12.x everywhere now, mostly 2.12.7, and are now deploying 2.12.8, which also requires patching). Just saying.
            ofaaland Olaf Faaland added a comment -

            Mikhail,

            > As for question #2 - do you mean will there be an alternative solution in b2_12?

            Yes, that was my question.

            thanks!

            tappro Mikhail Pershin added a comment -

            Olaf, the patch can be added to your stack if there is no need for 2.2 interop. As for question #2 - do you mean will there be an alternative solution in b2_12?

            As for other information to collect, it seems we can only rely on symptoms here, since the related code has no debug messages directly connected with this situation.
            ofaaland Olaf Faaland added a comment -

            Hi Mikhail,

            Yes, it does look a lot like LU-13356.  I see Etienne's comment about change #41309 removing interop support with v2.2 clients and servers, and that the patch therefore cannot be landed to b2_12.

            1. At our site, we have only Lustre 2.10.8 routers and Lustre {2.12.8, 2.14} clients/servers/routers.  We do not have v2.2 running anywhere.  Can we safely add that patch to our stack?  It would be useful to hear back about this today, if possible.
            2. If change #41309 cannot be landed to b2_12, what are some other options?  This question is not as urgent.
            3. If we see this symptom again before we have any patches landed to address it, is there other information I can gather that would help confirm this theory?

            thanks
            tappro Mikhail Pershin added a comment - edited

            The symptoms remind me of ticket LU-13356; the related patch is not yet landed in b2_12: https://review.whamcloud.com/41309

            Another thought is LU-15020, which is about waiting for OST_DISCONNECT, but the first one looks closer to what we have here.

            ofaaland Olaf Faaland added a comment -

            Hi, sorry for the delay.  I've attached:
            "bt -a" output in bt.a.txt (stack traces of the active task on each CPU)
            "foreach bt" output in foreach.bt.txt (stack traces of all processes)

            People

              Assignee: tappro Mikhail Pershin
              Reporter: ofaaland Olaf Faaland
              Votes: 0
              Watchers: 7
