Lustre / LU-15117

ofd_read_lock vs transaction deadlock while allocating buffers

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version: Lustre 2.16.0

    Description

      PID: 154236  TASK: ffff9ab9b2f330c0  CPU: 9   COMMAND: "ll_ost_io01_002"
       #0 [ffff9ab9b2f6af58] __schedule at ffffffff9876ab17
       #1 [ffff9ab9b2f6afe0] schedule at ffffffff9876b019
       #2 [ffff9ab9b2f6aff0] wait_transaction_locked at ffffffffc0760085 [jbd2]
       #3 [ffff9ab9b2f6b048] add_transaction_credits at ffffffffc0760368 [jbd2]
       #4 [ffff9ab9b2f6b0a8] start_this_handle at ffffffffc07605e1 [jbd2]
       #5 [ffff9ab9b2f6b140] jbd2__journal_start at ffffffffc0760a93 [jbd2]
       #6 [ffff9ab9b2f6b188] __ldiskfs_journal_start_sb at ffffffffc19c1c79 [ldiskfs]
       #7 [ffff9ab9b2f6b1c8] ldiskfs_release_dquot at ffffffffc19b92ec [ldiskfs]
       #8 [ffff9ab9b2f6b1e8] dqput at ffffffff982aeb5d
       #9 [ffff9ab9b2f6b210] __dquot_drop at ffffffff982b0215
      #10 [ffff9ab9b2f6b248] dquot_drop at ffffffff982b0285
      #11 [ffff9ab9b2f6b258] ldiskfs_clear_inode at ffffffffc19bdcf2 [ldiskfs]
      #12 [ffff9ab9b2f6b270] ldiskfs_evict_inode at ffffffffc19dccdf [ldiskfs]
      #13 [ffff9ab9b2f6b2b0] evict at ffffffff9825ee14
      #14 [ffff9ab9b2f6b2d8] dispose_list at ffffffff9825ef1e
      #15 [ffff9ab9b2f6b300] prune_icache_sb at ffffffff9825ff2c
      #16 [ffff9ab9b2f6b368] prune_super at ffffffff98244323
      #17 [ffff9ab9b2f6b3a0] shrink_slab at ffffffff981ca105
      #18 [ffff9ab9b2f6b440] do_try_to_free_pages at ffffffff981cd3c2
      #19 [ffff9ab9b2f6b4b8] try_to_free_pages at ffffffff981cd5dc
      #20 [ffff9ab9b2f6b550] __alloc_pages_slowpath at ffffffff987601ef
      #21 [ffff9ab9b2f6b640] __alloc_pages_nodemask at ffffffff981c1465
      #22 [ffff9ab9b2f6b6f0] alloc_pages_current at ffffffff9820e2c8
      #23 [ffff9ab9b2f6b738] new_slab at ffffffff982192d5
      #24 [ffff9ab9b2f6b770] ___slab_alloc at ffffffff9821ad4c
      #25 [ffff9ab9b2f6b840] __slab_alloc at ffffffff9876160c
      #26 [ffff9ab9b2f6b880] kmem_cache_alloc at ffffffff9821c3eb
      #27 [ffff9ab9b2f6b8c0] __radix_tree_preload at ffffffff9837b7b9
      #28 [ffff9ab9b2f6b8f0] radix_tree_maybe_preload at ffffffff9837bd0e
      #29 [ffff9ab9b2f6b900] __add_to_page_cache_locked at ffffffff981b734a
      #30 [ffff9ab9b2f6b940] add_to_page_cache_lru at ffffffff981b74b7
      #31 [ffff9ab9b2f6b970] find_or_create_page at ffffffff981b783e
      #32 [ffff9ab9b2f6b9b0] osd_bufs_get at ffffffffc1a773c3 [osd_ldiskfs]
      #33 [ffff9ab9b2f6ba10] ofd_preprw_write at ffffffffc144f156 [ofd]
      #34 [ffff9ab9b2f6ba90] ofd_preprw at ffffffffc14500ce [ofd]
      #35 [ffff9ab9b2f6bb28] tgt_brw_write at ffffffffc0ece6e9 [ptlrpc]
      #36 [ffff9ab9b2f6bca0] tgt_request_handle at ffffffffc0eccd4a [ptlrpc]
      #37 [ffff9ab9b2f6bd30] ptlrpc_server_handle_request at ffffffffc0e72586 [ptlrpc]
      #38 [ffff9ab9b2f6bde8] ptlrpc_main at ffffffffc0e7625a [ptlrpc]
      #39 [ffff9ab9b2f6bec8] kthread at ffffffff980c1f81
      #40 [ffff9ab9b2f6bf50] ret_from_fork_nospec_begin at ffffffff98777c1d
      

    Attachments

    Issue Links

    Activity


            Gerrit Updater added a comment:

            "Stephane Thiell <sthiell@stanford.edu>" uploaded a new patch: https://review.whamcloud.com/47925
            Subject: LU-15117 ofd: don't take lock for dt_bufs_get()
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 2fdf51055b76d47e24464cc93ebeabdc050dbd3a

            Gerrit Updater added a comment:

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/47029/
            Subject: LU-15117 ofd: don't take lock for dt_bufs_get()
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 7c4a7c59ed9c6185da326d6df6223f4818b57769

            Stephane Thiell added a comment:

            Hello,

            We hit this problem on another 2.12 filesystem last night. One OSS had a load of 400+ and was still up, answering quota requests but not processing I/Os, so many jobs were hung. I see that Alex's patch is still in review for master. I just wanted to raise awareness that this is currently our most impactful Lustre issue. Thanks for your help!

            Alex Zhuravlev added a comment:

            sthiell, you're correct.

            Stephane Thiell added a comment:

            I believe we hit this same issue yesterday with 2.12.8 on Oak at Stanford. The OSS was deadlocked. Alex, do you think the patch can be backported to b2_12?

            thread #1: zone_reclaim -> start_this_handle

            PID: 64334  TASK: ffff98040eb26300  CPU: 10  COMMAND: "ll_ost_io00_105"
             #0 [ffff980ca7ff31d8] __schedule at ffffffffacb86d07
             #1 [ffff980ca7ff3260] schedule at ffffffffacb87229
             #2 [ffff980ca7ff3270] wait_transaction_locked at ffffffffc026a085 [jbd2]
             #3 [ffff980ca7ff32c8] add_transaction_credits at ffffffffc026a378 [jbd2]
             #4 [ffff980ca7ff3328] start_this_handle at ffffffffc026a601 [jbd2]
             #5 [ffff980ca7ff33c0] jbd2__journal_start at ffffffffc026aab3 [jbd2]
             #6 [ffff980ca7ff3408] __ldiskfs_journal_start_sb at ffffffffc12f02b9 [ldiskfs]
             #7 [ffff980ca7ff3448] ldiskfs_release_dquot at ffffffffc132839c [ldiskfs]
             #8 [ffff980ca7ff3468] dqput at ffffffffac6bd16d
             #9 [ffff980ca7ff3490] __dquot_drop at ffffffffac6be865
            #10 [ffff980ca7ff34c8] dquot_drop at ffffffffac6be8d5
            #11 [ffff980ca7ff34d8] ldiskfs_clear_inode at ffffffffc132cf02 [ldiskfs]
            #12 [ffff980ca7ff34f0] ldiskfs_evict_inode at ffffffffc131601f [ldiskfs]
            #13 [ffff980ca7ff3530] evict at ffffffffac66c194
            #14 [ffff980ca7ff3558] dispose_list at ffffffffac66c29e
            #15 [ffff980ca7ff3580] prune_icache_sb at ffffffffac66d38c
            #16 [ffff980ca7ff35e8] prune_super at ffffffffac65071b
            #17 [ffff980ca7ff3618] shrink_slab at ffffffffac5d18c5
            #18 [ffff980ca7ff36b8] zone_reclaim at ffffffffac5d46c9
            #19 [ffff980ca7ff3760] get_page_from_freelist at ffffffffac5c8788
            #20 [ffff980ca7ff3878] __alloc_pages_nodemask at ffffffffac5c8ae6
            #21 [ffff980ca7ff3920] alloc_pages_current at ffffffffac618a18
            #22 [ffff980ca7ff3968] __page_cache_alloc at ffffffffac5bdb87
            #23 [ffff980ca7ff39a0] find_or_create_page at ffffffffac5bed25
            #24 [ffff980ca7ff39e0] osd_bufs_get at ffffffffc1433523 [osd_ldiskfs]
            #25 [ffff980ca7ff3a40] ofd_preprw_write at ffffffffc1582346 [ofd]
            #26 [ffff980ca7ff3ab8] ofd_preprw at ffffffffc15831ff [ofd]
            #27 [ffff980ca7ff3b60] tgt_brw_write at ffffffffc0f7be89 [ptlrpc]
            #28 [ffff980ca7ff3cd0] tgt_request_handle at ffffffffc0f7df1a [ptlrpc]
            #29 [ffff980ca7ff3d58] ptlrpc_server_handle_request at ffffffffc0f22bfb [ptlrpc]
            #30 [ffff980ca7ff3df8] ptlrpc_main at ffffffffc0f26564 [ptlrpc]
            #31 [ffff980ca7ff3ec8] kthread at ffffffffac4c5c21
            #32 [ffff980ca7ff3f50] ret_from_fork_nospec_begin at ffffffffacb94ddd
            

             

            thread #2: ofd_attr_set

            PID: 233404  TASK: ffff97f3e2fee300  CPU: 13  COMMAND: "ll_ost01_009"
             #0 [ffff984ee284ba58] __schedule at ffffffffacb86d07
             #1 [ffff984ee284bae0] schedule at ffffffffacb87229
             #2 [ffff984ee284baf0] rwsem_down_write_failed at ffffffffacb88965
             #3 [ffff984ee284bb88] call_rwsem_down_write_failed at ffffffffac797767
             #4 [ffff984ee284bbd0] down_write at ffffffffacb8655d
             #5 [ffff984ee284bbe8] osd_write_lock at ffffffffc1409c9c [osd_ldiskfs]
             #6 [ffff984ee284bc10] ofd_attr_set at ffffffffc157c053 [ofd]
             #7 [ffff984ee284bc78] ofd_setattr_hdl at ffffffffc156b95d [ofd]
             #8 [ffff984ee284bcd0] tgt_request_handle at ffffffffc0f7df1a [ptlrpc]
             #9 [ffff984ee284bd58] ptlrpc_server_handle_request at ffffffffc0f22bfb [ptlrpc]
            #10 [ffff984ee284bdf8] ptlrpc_main at ffffffffc0f26564 [ptlrpc]
            #11 [ffff984ee284bec8] kthread at ffffffffac4c5c21
            #12 [ffff984ee284bf50] ret_from_fork_nospec_begin at ffffffffacb94ddd
            

             


            Alex Zhuravlev added a comment:

            mnishizawa, I'm done with the local testing for the patch above. Would you be able to test it at scale?

            Gerrit Updater added a comment:

            "Alex Zhuravlev <bzzz@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/47029
            Subject: LU-15117 ofd: don't take lock for dt_bufs_get()
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: c5db35bca5b42c40b4a5675a5d8b8230018d5138

            Alex Zhuravlev added a comment:

            > One of our customers was affected by this issue and is waiting for the fix. Do you have any progress on this?

            The patch is in local testing at the moment.

            Mitsuhiro Nishizawa added a comment:

            Hi bzzz,

            One of our customers was affected by this issue and is waiting for the fix. Do you have any progress on this?
            Andriy Skulysh added a comment:

            st_vmcore

            Alex Zhuravlev added a comment:

            askulysh, do you still have the full list of traces? Could you please attach it to the ticket?

            People

              Assignee: Andriy Skulysh
              Reporter: Andriy Skulysh
              Votes: 0
              Watchers: 12

              Dates

                Created:
                Updated:
                Resolved: