sanity test_411: fail to trigger a memory allocation error

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.12.0
    • Lustre 2.12.0
    • None
    • 3
    • 9223372036854775807

    Description

      This issue was created by maloo for Bob Glossman <bob.glossman@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/32b0aa4c-9502-11e7-ba84-5254006e85c2.

      The sub-test test_411 failed with the following error:

      fail to trigger a memory allocation error
      

      test_411 is very new and has been failing since 9/1.
      Some (all?) of the failures have been seen on sles12sp2/sles12sp3.

      more:
      https://testing.hpdd.intel.com/test_sets/8c6725ca-8f6c-11e7-b5c2-5254006e85c2
      https://testing.hpdd.intel.com/test_sets/0a6acf7c-8f8f-11e7-b67f-5254006e85c2

      Info required for matching: sanity 411

          Activity

            [LU-9966] sanity test_411: fail to trigger a memory allocation error
            pjones Peter Jones added a comment -

            Landed for 2.11


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/28974/
            Subject: LU-9966 test: add a skip test to test_411
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: f6b0e358f304b006dd24524503bb16d649c5499d

            bogl Bob Glossman (Inactive) added a comment -

            I see that 'CONFIG_MEMCG_KMEM' is enabled in rhel7 by default. Totally explains why test 411 works on rhel7 and doesn't work on sles12.
            ys Yang Sheng added a comment -

            'CONFIG_MEMCG_KMEM' is disabled by default in sles12, so kmem.limit_in_bytes is absent. Skipping the test is therefore the right solution.
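
For readers who want to confirm the kernel configuration mentioned above, the following is a minimal sketch, not part of the landed patch. The helper name kernel_option_enabled is hypothetical, and the default path /boot/config-$(uname -r) is an assumption about where the distribution installs its kernel config file.

```shell
#!/bin/bash
# Hypothetical helper (not from sanity.sh): succeeds only if kernel option $1
# is set to "y" in the kernel config file $2.
# Assumption: the config file defaults to /boot/config-$(uname -r).
kernel_option_enabled() {
    local option=$1
    local config=${2:-/boot/config-$(uname -r)}

    # A disabled option normally appears as "# CONFIG_FOO is not set",
    # which this anchored pattern does not match.
    [ -r "$config" ] && grep -q "^${option}=y\$" "$config"
}
```

A call such as `kernel_option_enabled CONFIG_MEMCG_KMEM || echo "kmem accounting not configured"` would then report the difference between the two distributions described in this comment, on kernels where that option exists.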


            gerrit Gerrit Updater added a comment -

            Bob Glossman (bob.glossman@intel.com) uploaded a new patch: https://review.whamcloud.com/28974
            Subject: LU-9966 test: add a skip test to test_411
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 6c5dcd0caca8f08eb9874b533629f09af3cc1b7f

            bogl Bob Glossman (Inactive) added a comment - - edited

            It reports "Permission denied", but I'm pretty sure that's because the entry doesn't exist. It would be an easy fix to check for the entry and skip if it doesn't exist, but I'm not sure that's the right approach.

            I can push a mod that does that for inspection.
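
The check-and-skip idea described in this comment could look roughly like the sketch below. It is an illustration only, not the content of the patch that was later pushed; the function names kmem_limit_available and skip_reason_411 are hypothetical, and the cgroup directory is passed as a parameter so the check can be tried against a scratch directory.

```shell
#!/bin/bash
# Hypothetical helper (not from sanity.sh): succeeds only if the given
# cgroup directory contains a memory.kmem.limit_in_bytes entry.
kmem_limit_available() {
    local cgdir=${1:-/sys/fs/cgroup/memory}

    [ -e "$cgdir/memory.kmem.limit_in_bytes" ]
}

# Hypothetical wrapper: prints a skip reason and returns 0 when the entry
# is missing, returns 1 (prints nothing) when the test can proceed.
# A real test would hand the message to the framework's skip routine.
skip_reason_411() {
    local cgdir=${1:-/sys/fs/cgroup/memory}

    if ! kmem_limit_available "$cgdir"; then
        echo "no memory.kmem.limit_in_bytes under $cgdir"
        return 0
    fi
    return 1
}
```

Checking for the entry itself, rather than for a distribution name, keeps the skip tied to the feature the test actually depends on.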

            ys Yang Sheng added a comment -

            https://testing.hpdd.intel.com/test_sets/57f429a2-97f8-11e7-b9c6-5254006e85c2

            It looks like this test failed because of a permission issue.

            == sanity test 411: Slab allocation error with cgroup does not LBUG ================================== 01:03:59 (1505203439)
            100+0 records in
            100+0 records out
            104857600 bytes (105 MB, 100 MiB) copied, 1.78497 s, 58.7 MB/s
            /usr/lib64/lustre/tests/sanity.sh: line 16400: /sys/fs/cgroup/memory/osc_slab_alloc/memory.kmem.limit_in_bytes: Permission denied
            204800+0 records in
            204800+0 records out
            104857600 bytes (105 MB, 100 MiB) copied, 23.5257 s, 4.5 MB/s
             sanity test_411: @@@@@@ FAIL: fail to trigger a memory allocation error 
              Trace dump:
            

            'osc_slab_alloc/memory.kmem.limit_in_bytes' cannot be changed, so the trigger action fails. I'll try to find the cause.

            Thanks,
            YangSheng
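
In the log above, the write to memory.kmem.limit_in_bytes fails but the test carries on and only fails later with a less specific message. A minimal sketch of checking the write itself is shown below; the helper name write_cgroup_limit is hypothetical and this is not how sanity.sh is known to handle it.

```shell
#!/bin/bash
# Hypothetical helper (not from sanity.sh): write limit $2 into cgroup
# control file $1 and report failure instead of continuing silently.
write_cgroup_limit() {
    local file=$1
    local limit=$2

    # The braces make 2>/dev/null also cover errors raised by the ">"
    # redirection itself (for example "Permission denied").
    if ! { echo "$limit" > "$file"; } 2>/dev/null; then
        echo "cannot set $file to $limit" >&2
        return 1
    fi
}
```

A caller could then stop or skip as soon as `write_cgroup_limit` returns non-zero, which points at the missing or unwritable entry directly.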

            bogl Bob Glossman (Inactive) added a comment - - edited

            "Is there an equivalent function that could be used instead or should we just skip the test for sles12 (and presumably any other newer kernels)?"

            The author of the test needs to answer that question.
            As far as I can tell there is nothing equivalent in sles12.

            If the solution is to skip the test when the needed /sys entry isn't there I can push a patch for that. There is already some skip logic there, I would just need to extend it a bit.

            pjones Peter Jones added a comment -

            Is there an equivalent function that could be used instead or should we just skip the test for sles12 (and presumably any other newer kernels)?

            bogl Bob Glossman (Inactive) added a comment - - edited

            In sles12 there is no /sys/fs/cgroup/memory/memory.kmem.limit_in_bytes
            Since test_411 uses this it's no surprise the test doesn't work.


            bogl Bob Glossman (Inactive) added a comment -

            The patch that added test 411 was https://review.whamcloud.com/21745, "LU-8435 tests: slab alloc error does not LBUG"

            People

              ys Yang Sheng
              maloo Maloo