Details

    • Improvement
    • Resolution: Fixed
    • Major
    • Lustre 2.5.0
    • Lustre 2.5.0
    • None
    • 7025

    Description

      ldlm_poold currently wakes every second and walks the list of all namespaces in the system, doing some bookkeeping.
      On systems with a lot of servers, the client namespace list gets quite large, and it makes no sense to visit every namespace in the list if some of them don't actually hold any locks.

      As such, I think it makes sense to have ldlm_poold iterate only over non-empty client namespaces.
      Estimates from Fujitsu indicate that on a system with 2000 OSTs, just the list iteration (i.e. over empty namespaces) takes 2ms, which is probably excessive.
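
The idea of visiting only non-empty namespaces can be sketched roughly as follows. This is a minimal illustration, not the Lustre implementation: it assumes namespaces are kept on two lists (active and inactive) and are moved between them when they gain their first lock or lose their last one. All names here (struct ns, ns_registry, ns_lock_added, poold_pass, and so on) are hypothetical, and locking around the lists is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified namespace; NOT the real struct ldlm_namespace. */
struct ns {
    int        lock_count;   /* number of locks currently in this namespace */
    struct ns *prev, *next;  /* links in whichever list the namespace is on */
};

/* Circular doubly linked list with a sentinel head node. */
struct ns_list {
    struct ns head;
    int       nr;            /* number of namespaces on the list */
};

static void ns_list_init(struct ns_list *l)
{
    l->head.prev = l->head.next = &l->head;
    l->head.lock_count = 0;
    l->nr = 0;
}

static void ns_list_add(struct ns_list *l, struct ns *n)
{
    /* Append at the tail, just before the sentinel. */
    n->prev = l->head.prev;
    n->next = &l->head;
    l->head.prev->next = n;
    l->head.prev = n;
    l->nr++;
}

static void ns_list_del(struct ns_list *l, struct ns *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = NULL;
    l->nr--;
}

/* Assumed layout: one list for namespaces with locks, one for those without. */
struct ns_registry {
    struct ns_list active;
    struct ns_list inactive;
};

static void registry_init(struct ns_registry *r)
{
    ns_list_init(&r->active);
    ns_list_init(&r->inactive);
}

/* A new namespace has no locks yet, so it starts on the inactive list. */
static void ns_register(struct ns_registry *r, struct ns *n)
{
    n->lock_count = 0;
    ns_list_add(&r->inactive, n);
}

/* First lock added: move the namespace to the active list. */
static void ns_lock_added(struct ns_registry *r, struct ns *n)
{
    if (n->lock_count++ == 0) {
        ns_list_del(&r->inactive, n);
        ns_list_add(&r->active, n);
    }
}

/* Last lock removed: move the namespace back to the inactive list. */
static void ns_lock_removed(struct ns_registry *r, struct ns *n)
{
    assert(n->lock_count > 0);
    if (--n->lock_count == 0) {
        ns_list_del(&r->active, n);
        ns_list_add(&r->inactive, n);
    }
}

/* One periodic pass: walks only active namespaces, returns how many it saw. */
static int poold_pass(struct ns_registry *r)
{
    int visited = 0;

    for (struct ns *n = r->active.head.next; n != &r->active.head; n = n->next)
        visited++;  /* per-namespace bookkeeping would go here */
    return visited;
}
```

With this layout, the cost of a periodic pass depends on the number of namespaces that hold locks rather than on the total number of namespaces, at the price of a list move on the first-lock and last-lock transitions.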

      Attachments

        1. nopatch-idle-data
          0.4 kB
        2. patch-idle-5624-data
          0.3 kB
        3. patch-idle-5793-data
          0.3 kB

        Issue Links

          Activity

            [LU-2924] shrink ldlm_poold workload
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-5415 [ LU-5415 ]
            haasken Ryan Haasken added a comment -

            Oleg, in http://review.whamcloud.com/5624, why did you change the type of ldlm_srv_namespace_nr and ldlm_cli_namespace_nr from cfs_atomic_t to int?

            Now when we call ldlm_namespace_nr_*, are we expected to be holding the ldlm_namespace_lock()?

            jlevi Jodi Levi (Inactive) made changes -
            Resolution New: Fixed [ 1 ]
            Status Original: In Progress [ 3 ] New: Resolved [ 5 ]

            jlevi Jodi Levi (Inactive) added a comment -

            Patches landed and test plan written.
            jlevi Jodi Levi (Inactive) made changes -
            Fix Version/s New: Lustre 2.5.0 [ 10295 ]
            green Oleg Drokin made changes -
            Status Original: Open [ 1 ] New: In Progress [ 3 ]
            jlevi Jodi Levi (Inactive) made changes -
            Affects Version/s New: Lustre 2.5.0 [ 10295 ]
            morrone Christopher Morrone (Inactive) made changes -
            Link New: This issue is related to LU-1376 [ LU-1376 ]
            niu Niu Yawei (Inactive) made changes -
            Link New: This issue is related to LU-1128 [ LU-1128 ]

            niu Niu Yawei (Inactive) added a comment -

            This reminds me of LU-1128: it looks like the pool shrinker on the server side can't work as expected (under memory pressure, the SLV isn't decreased and lock cancellation from the client is never triggered).

            This ticket, though, mainly addresses the client-side issue.

            People

              green Oleg Drokin
              green Oleg Drokin