LU-15779

do not hold object's lock over read bulk

Details

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.16.0, Lustre 2.15.2

    Description

      Holding the object's lock across a read bulk transfer is unsafe: a stuck bulk can block OUT requests (e.g. out_tx_xattr_set_exec taking the object's lock exclusively), after which all shared locks on the object are blocked and, finally, all transactions are blocked:

      Call Trace:
      [<0>] call_rwsem_down_write_failed+0x17/0x30
      [<0>] osd_write_lock+0x5c/0xe0 [osd_ldiskfs]
      [<0>] out_tx_xattr_set_exec+0xdb/0x840 [ptlrpc]
      [<0>] out_tx_end+0xe1/0x5c0 [ptlrpc]
      [<0>] out_handle+0x1452/0x1bc0 [ptlrpc]
      [<0>] tgt_request_handle+0xaee/0x15f0 [ptlrpc]
      [<0>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
      
      Call Trace:
      [<0>] wait_transaction_locked+0x85/0xd0 [jbd2]
      [<0>] add_transaction_credits+0x278/0x310 [jbd2]
      [<0>] start_this_handle+0x1a1/0x430 [jbd2]
      [<0>] jbd2__journal_start+0xf3/0x1f0 [jbd2]
      [<0>] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
      [<0>] osd_trans_start+0x20e/0x4e0 [osd_ldiskfs]
      [<0>] ofd_commitrw_write+0x11dc/0x1da0 [ofd]
      [<0>] ofd_commitrw+0x53f/0xf70 [ofd]
      [<0>] tgt_brw_write+0xffb/0x1dc0 [ptlrpc]
      [<0>] tgt_request_handle+0xaee/0x15f0 [ptlrpc]
      [<0>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
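
      The locking pattern behind these traces can be modeled outside the kernel. The sketch below is a userspace Python analogy, not Lustre code: `RWLock`, `Obj`, `read_holding_lock`, `read_without_lock` and `probe_writer` are hypothetical names introduced for illustration, and copying the data merely stands in for whatever the server does to keep its buffers valid once the lock is dropped. It shows that a writer (the analogue of out_tx_xattr_set_exec wanting the exclusive lock) cannot take the lock while a reader holds it across the bulk callback, but can once the reader releases the lock before starting the bulk.

```python
import threading


class RWLock:
    """Tiny reader/writer lock: many readers or one writer (illustrative only)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def try_acquire_write(self):
        """Non-blocking exclusive acquire; returns True on success."""
        with self._cond:
            if self._writer or self._readers:
                return False
            self._writer = True
            return True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


class Obj:
    """Hypothetical object protected by a reader/writer lock."""

    def __init__(self, data):
        self.lock = RWLock()
        self.data = bytearray(data)


def read_holding_lock(obj, bulk):
    """Problematic pattern: the shared lock is held across the bulk transfer."""
    obj.lock.acquire_read()
    try:
        bulk(bytes(obj.data))
    finally:
        obj.lock.release_read()


def read_without_lock(obj, bulk):
    """Pattern from the ticket title: drop the lock before the bulk starts."""
    obj.lock.acquire_read()
    try:
        buf = bytes(obj.data)  # assumption: a copy stands in for pinned buffers
    finally:
        obj.lock.release_read()
    bulk(buf)  # a stall here no longer blocks writers


def probe_writer(obj, results):
    """Build a bulk callback that records whether a writer could lock obj."""

    def bulk(buf):
        ok = obj.lock.try_acquire_write()
        if ok:
            obj.lock.release_write()
        results.append(ok)

    return bulk


obj = Obj(b"payload")
results = []
read_holding_lock(obj, probe_writer(obj, results))
read_without_lock(obj, probe_writer(obj, results))
print(results)  # [False, True]
```

      The first entry is False because the exclusive lock is unavailable for as long as the bulk runs under the shared lock; the second is True because the lock was released before the bulk began.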
      

        Activity

          gerrit Gerrit Updater added a comment -

          "Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49468
          Subject: LU-15779 ofd: don't hold read lock over bulk
          Project: fs/lustre-release
          Branch: b2_12
          Current Patch Set: 1
          Commit: ad08375a6a5dccec2c7b70770b35695543ff6aae

          gerrit Gerrit Updater added a comment -

          "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/47825/
          Subject: LU-15779 ofd: don't hold read lock over bulk
          Project: fs/lustre-release
          Branch: b2_15
          Current Patch Set:
          Commit: 28875487ab3c94015fdd1c6b32c3ee63bdf81965

          pjones Peter Jones added a comment -

          Landed for 2.16


          gerrit Gerrit Updater added a comment -

          "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/47126/
          Subject: LU-15779 ofd: don't hold read lock over bulk
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 98ba50819024b908453b62fd095647442929a61f

          gerrit Gerrit Updater added a comment -

          "Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/47825
          Subject: LU-15779 ofd: don't hold read lock over bulk
          Project: fs/lustre-release
          Branch: b2_15
          Current Patch Set: 1
          Commit: a229943d51b7c876ce7108a2d7fab9b34e85d0ff

          bzzz Alex Zhuravlev added a comment -

          I constructed a test which basically: 1) grabs buffers using dt_bufs_get(), 2) declares a 0-copy write using these buffers, 3) removes the object in a separate thread, 4) tries to commit the buffers to the filesystem. It passed on both ldiskfs and ZFS.
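
          The sequence of steps in that test can be sketched as a toy model. Everything here is an assumption made for illustration: `ToyStore` and its `bufs_get`, `remove` and `commit` methods are hypothetical in-memory stand-ins, not the dt/OSD API, and the ticket does not say what the real test checked beyond the fact that it passed. The sketch only shows the ordering (grab buffers, remove the object from another thread, then commit) and that buffers handed out earlier remain usable after the removal.

```python
import threading


class ToyStore:
    """Hypothetical in-memory object store; not a Lustre/OSD API."""

    def __init__(self):
        self._lock = threading.Lock()
        self._objects = {"obj": [bytearray(8), bytearray(8)]}

    def bufs_get(self, name):
        # Step 1: hand out references to the object's buffers.
        with self._lock:
            return list(self._objects[name])

    def remove(self, name):
        # Step 3: unlink the object. Buffers already handed out stay alive
        # because the caller still holds references to them.
        with self._lock:
            self._objects.pop(name, None)

    def commit(self, name, bufs, payload):
        # Step 4: fill the buffers, then report whether the object still exists.
        for buf in bufs:
            buf[:] = payload
        with self._lock:
            return name in self._objects


store = ToyStore()
bufs = store.bufs_get("obj")          # 1) grab buffers
payload = b"12345678"                 # 2) data for the write into those buffers
remover = threading.Thread(target=store.remove, args=("obj",))
remover.start()                       # 3) remove the object in a separate thread
remover.join()
still_exists = store.commit("obj", bufs, payload)  # 4) commit after the removal
print(still_exists, [bytes(b) for b in bufs])
```

          Joining the remover thread before the commit makes the ordering deterministic in this model; in a real test the removal would race with the commit.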

          gerrit Gerrit Updater added a comment -

          "Alex Zhuravlev <bzzz@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/47126
          Subject: LU-15779 ofd: don't hold read lock over bulk
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 2faff9f82a263ae1b9ed3ab0ebe893aea5772ba1

          People

            Assignee: bzzz Alex Zhuravlev
            Reporter: bzzz Alex Zhuravlev