LU-7802: set_param lru_size fails with 'error: set_param: setting /proc/fs/lustre/ldlm/namespaces/lustre-OST0000-osc-*/lru_size=clear: Invalid argument'

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.11.0, Lustre 2.10.2
    • Affects Version/s: Lustre 2.9.0, Lustre 2.10.0, Lustre 2.10.1, Lustre 2.11.0
    • Environment: autotest and manual testing
    • Severity: 3

    Description

      lctl set_param -n ldlm.namespaces.*$1*.lru_size=clear fails with the error message:

      error: set_param: setting /proc/fs/lustre/ldlm/namespaces/lustre-OST0000-osc-ffff880077f04000/lru_size=clear: Invalid argument
      

      I've seen this error message in the test_log for a few sanity tests. The error does not seem to make the test fail (should it?), and it is intermittent: a test can hit the error on one run and not on the next.

      Here are a few instances of this error I've come across:
      sanity test_127a at https://testing.hpdd.intel.com/test_sets/2f35cef8-d8c8-11e5-83e2-5254006e85c2
      sanity test_241 hits this a little more regularly: https://testing.hpdd.intel.com/sub_tests/79078936-d8e1-11e5-83e2-5254006e85c2

      The error comes from a call to 'cancel_lru_locks osc'. From tests/test-framework.sh, we see:

      cancel_lru_locks() {
              #$LCTL mark "cancel_lru_locks $1 start"
              $LCTL set_param -n ldlm.namespaces.*$1*.lru_size=clear
              $LCTL get_param ldlm.namespaces.*$1*.lock_unused_count | grep -v '=0'
              #$LCTL mark "cancel_lru_locks $1 stop"
      }

      It's not clear what is causing this error. Because the error does not cause the test to fail, it's hard to find other occurrences or to determine when it first started.
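
Since the error does not fail the test, one way to find other occurrences is to search saved test logs for the message text. The following is a minimal sketch under that assumption; the helper name count_lru_clear_errors is hypothetical and is not part of test-framework.sh, and the log file path is whatever log the caller supplies.

```shell
# Hypothetical helper (not part of test-framework.sh): count how many lines
# of a test log contain the lru_size=clear "Invalid argument" error.
count_lru_clear_errors() {
	local log=$1
	# grep -c prints the number of matching lines. It exits non-zero when
	# that number is 0, so "|| true" keeps callers running under "set -e".
	grep -c 'lru_size=clear: Invalid argument' "$log" || true
}
```

Calling the helper on a saved test_log prints the number of matching lines, so a loop over many logs can show which runs hit the error.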

          Activity

            gerrit Gerrit Updater added a comment -

            John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/28975/
            Subject: LU-7802 ldlm: No -EINVAL for canceled != unused
            Project: fs/lustre-release
            Branch: b2_10
            Current Patch Set:
            Commit: 9a38fcb07dadc6f6b4c55e24feae004175c906e9

            pjones Peter Jones added a comment -

            Landed for 2.11

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/28560/
            Subject: LU-7802 ldlm: No -EINVAL for canceled != unused
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: a5081b7362e44b8d38aee1112f9a7d3aae1642c0

            gerrit Gerrit Updater added a comment -

            Patrick Farrell (paf@cray.com) uploaded a new patch: https://review.whamcloud.com/28975
            Subject: LU-7802 ldlm: No -EINVAL for lock in use
            Project: fs/lustre-release
            Branch: b2_10
            Current Patch Set: 1
            Commit: 01add10e6eab833054fb232d9e12cb48b5a63301

            sbuisson Sebastien Buisson (Inactive) added a comment - another on master: https://testing.hpdd.intel.com/test_sets/40e74f6a-8cb2-11e7-b4ee-5254006e85c2
            bogl Bob Glossman (Inactive) added a comment - another on master: https://testing.hpdd.intel.com/test_sets/f5187be6-8878-11e7-b3ca-5254006e85c2

            gerrit Gerrit Updater added a comment -

            Patrick Farrell (paf@cray.com) uploaded a new patch: https://review.whamcloud.com/28560
            Subject: LU-7802 ldlm: No -EINVAL for lock in use
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 70b92f98c56a128894009aa608dcfa589836fe47

            paf Patrick Farrell (Inactive) added a comment -

            This isn't racy so much as just wrong. Sometimes locks are in use, so we don't cancel them. That's intended behavior.

            The fix for this is just not to return -EINVAL. This isn't a condition that should generate that sort of error.

            I'll push a patch.

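The change in behavior described in this comment can be modeled in a few lines. This is only an illustrative shell model of the return-status decision, not the kernel C code from the patch. The function names and the message text are hypothetical, and the "canceled != unused" condition is taken from the patch subject.

```shell
# Illustrative model only: the real change is in the kernel ldlm code (C).
# Arguments: $1 = number of locks actually canceled,
#            $2 = number of unused locks the clear request tried to cancel.
EINVAL=22	# numeric value of EINVAL on Linux

# Behavior that produced the error: any mismatch is reported as EINVAL.
lru_clear_status_old() {
	local canceled=$1 unused=$2
	if [ "$canceled" -ne "$unused" ]; then
		return $EINVAL
	fi
	return 0
}

# Behavior after the fix: locks that are in use are simply skipped, so a
# mismatch is noted on stderr (hypothetical message) and is not an error.
lru_clear_status_new() {
	local canceled=$1 unused=$2
	if [ "$canceled" -ne "$unused" ]; then
		echo "not all unused locks canceled: requested $unused, canceled $canceled" >&2
	fi
	return 0
}
```

With 3 of 5 locks canceled, the old model returns 22, which lctl reports as "Invalid argument", while the new model returns 0.
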
            sguminsx Steve Guminski (Inactive) added a comment - Another on master: https://testing.hpdd.intel.com/test_sessions/d7870a08-73b3-4f95-898b-f4f0908c9214

            simmonsja James A Simmons added a comment -

            Removed LU-8066 link since this is a race condition and not a sysfs issue. What I do see is a potential patch from LU-8276 that might fix this issue. I added a link to LU-8276 here.

            People

              Assignee: paf Patrick Farrell (Inactive)
              Reporter: jamesanunez James Nunez (Inactive)