LU-11285: don't stop on the first blocked lock in ldlm_reprocess_queue()

    Description

      ldlm_reprocess_queue() stops at the first blocked lock in the waiting queue. For IBITS locks, however, there may be later waiting locks that could be granted immediately because they do not conflict with the blocked lock: they request different inode bits. For example, here is a resource dump from a racer run:

      [ 1192.782632] Lustre: 13366:0:(ldlm_resource.c:1728:ldlm_resource_dump()) Granted locks (in reverse order):
      [ 1192.782638] Lustre: 13366:0:(ldlm_resource.c:1731:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8800b58c7700/0x5c7c511d9b5b4287 lrc: 3/0,0 mode: PW/PW res: [0x200000401:0x395:0x0].0x0 bits 0x40/0x0 rrc: 18 type: IBT flags: 0x60200400000020 nid: 0@lo remote: 0x5c7c511d9b5b4279 expref: 342 pid: 16202 timeout: 1193 lvb_type: 0
      [ 1192.782643] Lustre: 13366:0:(ldlm_resource.c:1731:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8801e209c000/0x5c7c511d9b5b44da lrc: 2/0,0 mode: PR/PR res: [0x200000401:0x395:0x0].0x0 bits 0x1a/0x0 rrc: 18 type: IBT flags: 0x40200000000000 nid: 0@lo remote: 0x5c7c511d9b5b447f expref: 325 pid: 17956 timeout: 1193 lvb_type: 0
      [ 1192.782647] Lustre: 13366:0:(ldlm_resource.c:1731:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8801dda96300/0x5c7c511d9b5b2274 lrc: 2/0,0 mode: CR/CR res: [0x200000401:0x395:0x0].0x0 bits 0x8/0x0 rrc: 18 type: IBT flags: 0x40200000000000 nid: 0@lo remote: 0x5c7c511d9b5b2258 expref: 325 pid: 16202 timeout: 0 lvb_type: 0
      [ 1192.782651] Lustre: 13366:0:(ldlm_resource.c:1731:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8801e5b1e800/0x5c7c511d9b5b1ed8 lrc: 2/0,0 mode: CR/CR res: [0x200000401:0x395:0x0].0x0 bits 0x8/0x0 rrc: 18 type: IBT flags: 0x40000000000000 nid: 0@lo remote: 0x5c7c511d9b5b1ed1 expref: 342 pid: 18163 timeout: 0 lvb_type: 3
      [ 1192.782653] Lustre: 13366:0:(ldlm_resource.c:1742:ldlm_resource_dump()) Waiting locks:
      [ 1192.782657] Lustre: 13366:0:(ldlm_resource.c:1744:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8801dd895400/0x5c7c511d9b5b4504 lrc: 3/0,1 mode: --/PW res: [0x200000401:0x395:0x0].0x0 bits 0x40/0x0 rrc: 18 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 15795 timeout: 0 lvb_type: 0
      [ 1192.782661] Lustre: 13366:0:(ldlm_resource.c:1744:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8801e240af80/0x5c7c511d9b5b4519 lrc: 3/0,1 mode: --/PW res: [0x200000401:0x395:0x0].0x0 bits 0x40/0x0 rrc: 18 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 13377 timeout: 0 lvb_type: 0
      [ 1192.782665] Lustre: 13366:0:(ldlm_resource.c:1744:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8801e6e32300/0x5c7c511d9b5b4806 lrc: 3/0,1 mode: --/EX res: [0x200000401:0x395:0x0].0x0 bits 0x21/0x0 rrc: 18 type: IBT flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 15808 timeout: 0 lvb_type: 0
      [ 1192.782668] Lustre: 13366:0:(ldlm_resource.c:1744:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8800b769c280/0x5c7c511d9b5b488b lrc: 3/1,0 mode: --/PR res: [0x200000401:0x395:0x0].0x0 bits 0x13/0x8 rrc: 18 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 15827 timeout: 0 lvb_type: 0
      [ 1192.782672] Lustre: 13366:0:(ldlm_resource.c:1744:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8800b58c6800/0x5c7c511d9b5b48ca lrc: 3/1,0 mode: --/PR res: [0x200000401:0x395:0x0].0x0 bits 0x13/0x8 rrc: 18 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 15807 timeout: 0 lvb_type: 0
      [ 1192.782679] Lustre: 13366:0:(ldlm_resource.c:1744:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8800b58c6300/0x5c7c511d9b5b4909 lrc: 3/1,0 mode: --/PR res: [0x200000401:0x395:0x0].0x0 bits 0x13/0x8 rrc: 18 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 17956 timeout: 0 lvb_type: 0
      [ 1192.782683] Lustre: 13366:0:(ldlm_resource.c:1744:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8800b58c6080/0x5c7c511d9b5b4b71 lrc: 3/1,0 mode: --/PR res: [0x200000401:0x395:0x0].0x0 bits 0x13/0x8 rrc: 18 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 13373 timeout: 0 lvb_type: 0
      [ 1192.782686] Lustre: 13366:0:(ldlm_resource.c:1744:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8801e5aa8000/0x5c7c511d9b5b4b9b lrc: 3/1,0 mode: --/PR res: [0x200000401:0x395:0x0].0x0 bits 0x20/0x0 rrc: 18 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 13378 timeout: 0 lvb_type: 0
      

      The first lock in the waiting queue is a DOM lock (PW, bits 0x40) waiting for another DOM lock. The EX lock further down the queue (bits 0x21) shares no bits with either DOM lock or with any other granted lock, so it could be granted immediately. Instead, it has to wait for the DOM lock to be granted, because locks are not allowed to be granted out of order.
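
To make the bit values in the dump easier to read, here is a minimal sketch of a decoder. The helper name `ibits_decode` is hypothetical, and the bit-to-name table is an assumption based on the `MDS_INODELOCK_*` values as I understand them from Lustre's `lustre_idl.h`; check it against your own tree before relying on it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed inode-bit values; verify against lustre_idl.h in your tree. */
static const struct {
	uint64_t    bit;
	const char *name;
} ibit_names[] = {
	{ 0x01, "LOOKUP" },
	{ 0x02, "UPDATE" },
	{ 0x04, "OPEN"   },
	{ 0x08, "LAYOUT" },
	{ 0x10, "PERM"   },
	{ 0x20, "XATTR"  },
	{ 0x40, "DOM"    },
};

/* Hypothetical helper: write "A|B|C" for the bits set in 'bits' into 'buf'. */
char *ibits_decode(uint64_t bits, char *buf, size_t len)
{
	size_t i, used = 0;

	if (len == 0)
		return buf;
	buf[0] = '\0';
	for (i = 0; i < sizeof(ibit_names) / sizeof(ibit_names[0]); i++) {
		int n;

		if (!(bits & ibit_names[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", ibit_names[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* output truncated, stop here */
		used += (size_t)n;
	}
	return buf;
}
```

Under that table, the waiting EX lock's 0x21 decodes to LOOKUP|XATTR, while the granted locks hold 0x40 (DOM), 0x1a (UPDATE|LAYOUT|PERM) and 0x8 (LAYOUT), so the EX request overlaps none of them.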

      With IBITS locks it should be safe to grant a waiting lock that shares no inode bits with any granted lock or with any waiting lock ahead of it in the queue.
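
The rule above can be sketched as a toy model. This is not the LDLM code and not the landed patch: the function names are hypothetical, locks are reduced to bare bitmasks, and lock modes are ignored entirely, so any bit overlap is treated as a conflict (the real code also consults mode compatibility).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy model of in-order reprocessing: stop at the first waiting lock whose
 * bits overlap the granted bits. Modes are ignored (an assumption).
 * granted_out[i] is set to 1 if waiting[i] is granted, else 0.
 */
size_t reprocess_in_order(uint64_t granted_bits, const uint64_t *waiting,
			  size_t n, int *granted_out)
{
	size_t i, count = 0;
	int stopped = 0;

	assert(n == 0 || (waiting != NULL && granted_out != NULL));
	for (i = 0; i < n; i++) {
		if (stopped || (waiting[i] & granted_bits)) {
			stopped = 1;	/* everything behind it waits too */
			granted_out[i] = 0;
			continue;
		}
		granted_bits |= waiting[i];
		granted_out[i] = 1;
		count++;
	}
	return count;
}

/*
 * Toy model of out-of-order granting: a waiting lock is granted if its bits
 * overlap neither the granted bits nor the bits of any still-blocked lock
 * ahead of it in the queue.
 */
size_t reprocess_out_of_order(uint64_t granted_bits, const uint64_t *waiting,
			      size_t n, int *granted_out)
{
	uint64_t blocked_bits = 0;
	size_t i, count = 0;

	assert(n == 0 || (waiting != NULL && granted_out != NULL));
	for (i = 0; i < n; i++) {
		if (waiting[i] & (granted_bits | blocked_bits)) {
			blocked_bits |= waiting[i];	/* keep later overlaps behind it */
			granted_out[i] = 0;
			continue;
		}
		granted_bits |= waiting[i];
		granted_out[i] = 1;
		count++;
	}
	return count;
}
```

Fed the bits from the dump (granted 0x40, 0x1a, 0x8, 0x8; waiting 0x40, 0x40, 0x21, 0x13, 0x13, 0x13, 0x13, 0x20), the in-order model grants nothing, while the out-of-order model grants only the EX lock with bits 0x21. The trailing 0x20 request stays blocked because it now overlaps the newly granted EX lock.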

      This improvement is motivated mostly by DOM locks, because they may block for quite a long time. For example, CLIO may need to lock a whole file region, so it takes the DOM lock and all OST locks, with all their blocking ASTs. The DOM lock may therefore wait for a long time, which is not a big problem in itself thanks to the prolong mechanism. But when it waits in a common queue with other metadata locks, it may stall all lock processing for a while. With out-of-order lock granting this problem becomes much less severe.

          Activity

            pjones Peter Jones added a comment -

            As per Mike, all necessary fixes have now landed for 2.13

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35955/
            Subject: LU-11285 mdt: improve IBITS lock definitions
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: e71e845156becd1fc7efd676247bc85467881a38

            gerrit Gerrit Updater added a comment -

            Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/35955
            Subject: LU-11285 mdt: improve IBITS lock definitions
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 88bf2ec7953ff34a69f34f9ff86fafbb0454b01e

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35045/
            Subject: LU-11285 mdt: improve IBITS lock definitions
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 3611352b699ce479779c0ff92ca558d9321e58a2

            gerrit Gerrit Updater added a comment -

            Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/35045
            Subject: LU-11285 mdt: improve IBITS lock definitions
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 355db9abe915bd78319a7100777b252934eadb91

            askulysh Andriy Skulysh added a comment -

            I confirm that it resolves the LU-12017 deadlock.

            shadow Alexey Lyashkov added a comment -

            Understood. I was confused because you start taking a DoM lock without releasing the MD locks.
            That isn't needed, and the race can be resolved without changes in the ldlm code, which affect performance.

            tappro Mikhail Pershin added a comment -

            Yes, and that is why we cannot take the discard DOM lock when unlink starts, combining it with the other child bits; it has to be taken when the object is actually being deleted. That answers your proposal, a few comments above, to combine the DOM lock with the child bits on unlink.

            shadow Alexey Lyashkov added a comment -

            The last-unlink case doesn't need to hold any locks in the MD namespace.
            One operation removes the object from the MD namespace and links it to the orphan list; the second handles the orphan list, with destroy "data" locks.
            It's the same as for any data placement: for OST objects you put a FID in the unlink llog, and for DoM objects you can destroy the orphan object yourself. I don't see any difference in this case.

            tappro Mikhail Pershin added a comment -

            Unfortunately we can't do that with DOM: at the moment the lock is taken we don't know whether this is the last unlink or not. That becomes known only later, with the lock already taken. On an OST we always know that the object is being deleted, so there is no such problem.
            As I mentioned before, the LU-11359 patch solves that problem with the discard lock by using a non-blocking completion AST, so that type of deadlock should already be solved.

            People

              tappro Mikhail Pershin