LU-2683

Client deadlock in cl_lock_mutex_get

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Affects Version/s: Lustre 2.4.0
    • Fix Version/s: Lustre 2.4.0
    • Environment: Sequoia ppc64 client, 2.3.58-5chaos, servers are x86_64 running 2.3.58-6chaos
    • 3
    • 6270

    Description

      With 2.3.58-5chaos ppc64 clients, we are seeing Lustre hang with many threads waiting on a mutex under cl_lock_mutex_get(). In the attached file "RA0-ID-J03.log.txt", you can see the sysrq-t and sysrq-l output. The node is responsive, but Lustre is doing nothing because all ptlrpcd threads are stuck in the same path:

      2013-01-25 11:51:10.788870 {DefaultControlEventListener} [mmcs]{131}.0.0: ptlrpcd_0     D 0000000000000000     0  3268      2 0x00000000
      2013-01-25 11:51:10.788922 {DefaultControlEventListener} [mmcs]{131}.0.0: Call Trace:
      2013-01-25 11:51:10.788974 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621ee70] [c00000000068f010] svc_rdma_ops+0xda18/0x1a900 (unreliable)
      2013-01-25 11:51:10.789026 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f040] [c000000000009b2c] .__switch_to+0xc4/0x100
      2013-01-25 11:51:10.789078 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f0d0] [c00000000042a418] .schedule+0x7d4/0x944
      2013-01-25 11:51:10.789129 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f380] [c00000000042b4a8] .__mutex_lock_slowpath+0x208/0x390
      2013-01-25 11:51:10.789181 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f480] [c00000000042bf44] .mutex_lock+0x38/0x58
      2013-01-25 11:51:10.789234 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f500] [8000000002501cf8] .cl_lock_mutex_get+0xc8/0x110 [obdclass]
      2013-01-25 11:51:10.789285 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f590] [800000000514d124] .lovsub_parent_lock+0x94/0x260 [lov]
      2013-01-25 11:51:10.789337 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f640] [800000000514de70] .lovsub_lock_state+0xd0/0x300 [lov]
      2013-01-25 11:51:10.789389 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f710] [80000000024fe344] .cl_lock_state_signal+0xd4/0x2c0 [obdclass]
      2013-01-25 11:51:10.789441 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f7d0] [80000000024ff448] .cl_lock_signal+0xa8/0x260 [obdclass]
      2013-01-25 11:51:10.789492 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f890] [80000000046adb24] .osc_lock_upcall+0x194/0x810 [osc]
      2013-01-25 11:51:10.789544 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f970] [8000000004688a7c] .osc_enqueue_fini+0xfc/0x3f0 [osc]
      2013-01-25 11:51:10.789596 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621fa60] [8000000004697364] .osc_enqueue_interpret+0x104/0x240 [osc]
      2013-01-25 11:51:10.789647 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621fb40] [8000000003b7a308] .ptlrpc_check_set+0x3c8/0x4e50 [ptlrpc]
      2013-01-25 11:51:10.789700 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621fd20] [8000000003bcffec] .ptlrpcd_check+0x66c/0x870 [ptlrpc]
      2013-01-25 11:51:10.789751 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621fe40] [8000000003bd054c] .ptlrpcd+0x35c/0x510 [ptlrpc]
      2013-01-25 11:51:10.789802 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621ff90] [c00000000001b9a0] .kernel_thread+0x54/0x70
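
      For reference, task-state dumps like the one above are typically captured by writing to /proc/sysrq-trigger; a minimal sketch in Python, assuming root and a kernel built with CONFIG_MAGIC_SYSRQ:

      # Minimal sketch, assuming root and CONFIG_MAGIC_SYSRQ: ask the kernel to
      # dump all task states ('t') and per-CPU backtraces ('l'); the output goes
      # to the console/kernel log, which is where traces like the one above end up.
      for key in ("t", "l"):
          with open("/proc/sysrq-trigger", "w") as trigger:
              trigger.write(key)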
      

      Attachments

        1. RA0-ID-J03.log.txt
          1.22 MB
          Christopher Morrone

        Issue Links

          Activity

            [LU-2683] Client deadlock in cl_lock_mutex_get

            As part of my attempts to reproduce LU-4112, I triggered the same deadlock running with the stock 2.1.6 version from http://downloads.whamcloud.com/public/lustre/lustre-2.1.6/el6/server/RPMS/x86_64/ :

            PID: 22597  TASK: ffff8805969c2ae0  CPU: 4   COMMAND: "LU-4112-reprodu"
            #0 [ffff88059324d748] schedule at ffffffff8150de92
            #1 [ffff88059324d810] __mutex_lock_slowpath at ffffffff8150f53e    <<<--- cll_guard is owned by 22615
            #2 [ffff88059324d880] mutex_lock at ffffffff8150f3db
            #3 [ffff88059324d8a0] cl_lock_mutex_get at ffffffffa05a3a47 [obdclass]
            #4 [ffff88059324d8d0] osc_ldlm_completion_ast at ffffffffa09aa4e2 [osc]
            #5 [ffff88059324d930] ldlm_lock_match at ffffffffa06aca4c [ptlrpc]
            #6 [ffff88059324da00] osc_enqueue_base at ffffffffa098f2d9 [osc]
            #7 [ffff88059324dab0] osc_lock_enqueue at ffffffffa09a863e [osc]
            #8 [ffff88059324db40] cl_enqueue_try at ffffffffa05a54bc [obdclass]
            #9 [ffff88059324db90] lov_lock_enqueue at ffffffffa0a2529b [lov]
            #10 [ffff88059324dc20] cl_enqueue_try at ffffffffa05a54bc [obdclass]
            #11 [ffff88059324dc70] cl_enqueue_locked at ffffffffa05a68ad [obdclass]
            #12 [ffff88059324dcb0] cl_lock_request at ffffffffa05a753e [obdclass]
            #13 [ffff88059324dd10] cl_io_lock at ffffffffa05ac84c [obdclass]
            #14 [ffff88059324dd70] cl_io_loop at ffffffffa05aca9a [obdclass]
            #15 [ffff88059324ddb0] ll_file_io_generic at ffffffffa0ab8047 [lustre]
            #16 [ffff88059324de20] ll_file_aio_read at ffffffffa0ab8339 [lustre]
            #17 [ffff88059324de80] ll_file_read at ffffffffa0ab87f9 [lustre]
            #18 [ffff88059324def0] vfs_read at ffffffff81181805
            #19 [ffff88059324df30] sys_read at ffffffff81181941
            #20 [ffff88059324df80] system_call_fastpath at ffffffff8100b072
               RIP: 000000312d60e54d  RSP: 00007fffcf09d4a8  RFLAGS: 00010246
               RAX: 0000000000000000  RBX: ffffffff8100b072  RCX: 0000000000000000
               RDX: 000000000000c000  RSI: 0000000001135450  RDI: 0000000000000018
               RBP: 00007fffcf09dd70   R8: 0000000000000000   R9: 0000000000000000
               R10: 0000000000000000  R11: 0000000000000293  R12: 00007fffcf09ef70
               R13: 0000000000400a50  R14: 0000000000401843  R15: 0000000000400a50
               ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b
            
            PID: 22615  TASK: ffff880c33976040  CPU: 1   COMMAND: "LU-4112-reprodu"
            #0 [ffff880b9f2b9a08] schedule at ffffffff8150de92
            #1 [ffff880b9f2b9ad0] __mutex_lock_slowpath at ffffffff8150f53e    <<<--- cll_guard is owned by 22597
            #2 [ffff880b9f2b9b40] mutex_lock at ffffffff8150f3db
            #3 [ffff880b9f2b9b60] cl_lock_mutex_get at ffffffffa05a3a47 [obdclass]
            #4 [ffff880b9f2b9b90] lov_lock_enqueue at ffffffffa0a25370 [lov]
            #5 [ffff880b9f2b9c20] cl_enqueue_try at ffffffffa05a54bc [obdclass]
            #6 [ffff880b9f2b9c70] cl_enqueue_locked at ffffffffa05a68ad [obdclass]
            #7 [ffff880b9f2b9cb0] cl_lock_request at ffffffffa05a753e [obdclass]
            #8 [ffff880b9f2b9d10] cl_io_lock at ffffffffa05ac84c [obdclass]
            #9 [ffff880b9f2b9d70] cl_io_loop at ffffffffa05aca9a [obdclass]
            #10 [ffff880b9f2b9db0] ll_file_io_generic at ffffffffa0ab8047 [lustre]
            #11 [ffff880b9f2b9e20] ll_file_aio_read at ffffffffa0ab8339 [lustre]
            #12 [ffff880b9f2b9e80] ll_file_read at ffffffffa0ab87f9 [lustre]
            #13 [ffff880b9f2b9ef0] vfs_read at ffffffff81181805
            #14 [ffff880b9f2b9f30] sys_read at ffffffff81181941
            #15 [ffff880b9f2b9f80] system_call_fastpath at ffffffff8100b072
               RIP: 000000312d60e54d  RSP: 00007fff91890748  RFLAGS: 00010246
               RAX: 0000000000000000  RBX: ffffffff8100b072  RCX: 0000000000000000
               RDX: 000000000000c000  RSI: 0000000001a50210  RDI: 0000000000000017
               RBP: 00007fff91891010   R8: 0000000000000000   R9: 0000000000000000
               R10: 0000000000000000  R11: 0000000000000293  R12: 00007fff91892210
               R13: 0000000000400a50  R14: 0000000000401843  R15: 0000000000400a50
               ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b
            

            Thus I back-ported change #5208 patch set #2 to the b2_1 tree at http://review.whamcloud.com/8300.

            bfaccini Bruno Faccini (Inactive) added a comment
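
            The two annotated backtraces above show a circular wait: each thread holds the cll_guard mutex of one cl_lock and blocks in cl_lock_mutex_get() on the mutex the other thread already owns. A minimal sketch of the same AB-BA pattern (Python threading; lock_a and lock_b are hypothetical stand-ins, not Lustre code):

            import threading
            import time

            # Hypothetical stand-ins for the two cl_lock cll_guard mutexes involved.
            lock_a = threading.Lock()
            lock_b = threading.Lock()

            def path_one():
                # Analogous to one thread: takes A, then needs B.
                with lock_a:
                    time.sleep(0.1)          # widen the race window
                    with lock_b:
                        pass

            def path_two():
                # Analogous to the other thread: takes B, then needs A.
                with lock_b:
                    time.sleep(0.1)
                    with lock_a:
                        pass

            t1 = threading.Thread(target=path_one, daemon=True)
            t2 = threading.Thread(target=path_two, daemon=True)
            t1.start(); t2.start()
            t1.join(timeout=2); t2.join(timeout=2)
            if t1.is_alive() and t2.is_alive():
                # Each thread owns one mutex and waits on the other: the AB-BA
                # circular wait seen in the two backtraces above.
                print("deadlocked: each thread owns the mutex the other is waiting for")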
            pjones Peter Jones added a comment -

            Landed for 2.4


            With change 5208, we have seen no repeats of the problem for four days. It looks like the patch has fixed the problem. Thanks!

            morrone Christopher Morrone (Inactive) added a comment

            We have about a day of testing with patch set 2 of change 5208, and so far we have had no occurrences of the client deadlock.

            Probably too soon to be definitive, but it is promising.

            morrone Christopher Morrone (Inactive) added a comment

            Bruno has claimed that this patch can at least fix the original problem, so let's land this patch for now.

            jay Jinshan Xiong (Inactive) added a comment

            I see, thanks for the explanation.

            jay Jinshan Xiong (Inactive) added a comment
            prakash Prakash Surya (Inactive) added a comment - edited

            Well, in total I see that message 12 times for this particular occurrence (from Feb. 11th, not running with 5208). AFAIK, all of the OSTs are healthy, but when the IONs hit this issue some of the OSTs will report as "inactive" when running "lfs check servers" on the client. That is how the admins discover IONs in this state and then manually reboot them.

            With that said, I may have jumped the gun in my previous comments here and here. What I reported in those comments happened on a login node and wasn't actually a deadlock as far as I can tell. I think those were just "slow" threads (cause unknown) as they eventually cleared up (but then hit LU-2779).

            So unless you think differently, it probably isn't worth spending more time on this until it's seen on one of the IONs running with the patch applied. Sorry for the confusion.


            Why were there so many of them? Was one of the OSTs in recovery at that time?

            It would be helpful to dump the debug log, with DLMTRACE, CACHE and VFSTRACE enabled, when you first see this kind of message. That can easily be done with a script.

            jay Jinshan Xiong (Inactive) added a comment
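
            A minimal sketch of such a script (Python; it assumes lctl is in PATH, that the osc_extent_wait message reaches dmesg on the node, and uses a placeholder dump path; on the IONs the console log stream would be the thing to watch instead):

            #!/usr/bin/env python
            # Sketch only: enable the DLMTRACE/CACHE/VFSTRACE debug masks and dump
            # the Lustre kernel debug buffer the first time the osc_extent_wait
            # timeout message is seen.
            import subprocess
            import time

            TRIGGER = "osc_extent_wait"              # message text to watch for
            DUMP_FILE = "/tmp/lustre-debug.log"      # placeholder output path

            # Turn on the requested debug masks (additive, one flag at a time).
            for flag in ("+dlmtrace", "+cache", "+vfstrace"):
                subprocess.check_call(["lctl", "set_param", "debug=" + flag])

            while True:
                kernel_log = subprocess.check_output(["dmesg"]).decode(errors="replace")
                if TRIGGER in kernel_log:
                    # Dump the kernel debug log as soon as the message appears.
                    subprocess.check_call(["lctl", "dk", DUMP_FILE])
                    break
                time.sleep(5)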

            Like these messages?

            2013-02-11 13:29:13.788428 {DefaultControlEventListener} [mmcs]{613}.12.0: LustreError: 31967:0:(osc_cache.c:896:osc_extent_wait()) extent c0000001c1fc2fe8@{[26 -> 28/31], [3|0|+|rpc|wihY|c0000001b0756f60], [196608|3|+|-|c0000001cdfaa6e0|16|c00000037ab0f300]} ls1-OST005a-osc-c0000003ec0c5800: wait ext to 0 timedout, recovery in progress?
            2013-02-11 13:30:26.835353 {DefaultControlEventListener} [mmcs]{613}.0.3:  LustreError: 40667:0:(osc_cache.c:896:osc_extent_wait()) extent c0000001cdacee38@{[2 -> 4/15], [3|0|+|rpc|wiY|c0000001b0759480], [196608|3|+|-|c0000001cdfa9600|16|c0000003c54483a0]} ls1-OST007b-osc-c0000003ec0c5800: wait ext to 0 timedout, recovery in progress?
            2013-02-11 13:30:58.855528 {DefaultControlEventListener} [mmcs]{613}.3.0:  LustreError: 32002:0:(osc_cache.c:896:osc_extent_wait()) extent c0000001cdacfca8@{[2 -> 4/15], [3|0|+|rpc|wihY|c0000001b0759d80], [196608|3|+|-|c0000001cdfa99c0|16|c0000003742646e0]} ls1-OST0083-osc-c0000003ec0c5800: wait ext to 0 timedout, recovery in progress?
            2013-02-11 13:30:58.856226 {DefaultControlEventListener} [mmcs]{613}.3.2:  LustreError: 3958:0:(osc_cache.c:896:osc_extent_wait()) extent c0000001cedc8d18@{[42 -> 44/47], [3|0|+|rpc|wihY|c0000002e43ed020], [196608|3|+|-|c0000001c0c13c00|16|c0000003c87a1e60]} ls1-OST014d-osc-c0000003ec0c5800: wait ext to 0 timedout, recovery in progress?
            2013-02-11 13:31:44.273199 {DefaultControlEventListener} [mmcs]{613}.12.0: LustreError: 31983:0:(osc_cache.c:896:osc_extent_wait()) extent c0000001c9b17230@{[2 -> 4/15], [3|0|+|rpc|wihY|c0000001b075a680], [196608|3|+|-|c0000002e1471d10|16|c0000003c80ad4a0]} ls1-OST008b-osc-c0000003ec0c5800: wait ext to 0 timedout, recovery in progress?
            
            prakash Prakash Surya (Inactive) added a comment

            Did you find anything printed in osc_extent_wait()?

            jay Jinshan Xiong (Inactive) added a comment

            People

              Assignee: jay Jinshan Xiong (Inactive)
              Reporter: morrone Christopher Morrone (Inactive)
              Votes: 0
              Watchers: 11
