Lustre / LU-8738

sanity test_255b: FAIL: Ladvise willread should use more memory than 76800 KiB

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Affects Version/s: Lustre 2.9.0
    • Fix Version/s: Lustre 2.9.0
    • Labels: None
    • Severity: 3
    Description

      sanity test 255b failed as follows:

      == sanity test 255b: check 'lfs ladvise -a dontneed' ================================================= 17:21:04 (1476811264)
      100+0 records in
      100+0 records out
      104857600 bytes (105 MB) copied, 0.778124 s, 135 MB/s
      CMD: trevis-41vm4 cat /proc/meminfo | grep ^MemTotal:
      Total memory: 1923480 KiB
      CMD: trevis-41vm4 sync && echo 3 > /proc/sys/vm/drop_caches
      CMD: trevis-41vm4 cat /proc/meminfo | grep ^Cached:
      Cache used before read: 72592 KiB
      CMD: trevis-41vm4 cat /proc/meminfo | grep ^Cached:
      Cache used after read: 120972 KiB
      CMD: trevis-41vm4 cat /proc/meminfo | grep ^Cached:
      Cache used after dontneed ladvise: 18572 KiB
       sanity test_255b: @@@@@@ FAIL: Ladvise willread should use more memory than 76800 KiB
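
      The 76800 KiB threshold is three quarters of the 100 MiB test file (100 * 1024 * 3 / 4 = 76800), and in the run above the willread hint grew the OSS page cache by only 120972 - 72592 = 48380 KiB. As a minimal sketch of what the test checks (illustrative shell only, with a hypothetical $OSS variable for the ost1 node; not the verbatim sanity.sh code):

      #!/bin/bash
      OSS=trevis-41vm4                # node hosting ost1 (from the log above)
      TESTFILE=/mnt/lustre/f255b      # hypothetical test file path
      FILE_SIZE_KIB=$((100 * 1024))                 # 100 MiB test file
      WILLREAD_MIN_KIB=$((FILE_SIZE_KIB * 3 / 4))   # 76800 KiB

      cached_kib() {
              # "Cached:" field of /proc/meminfo on the OSS, in KiB
              ssh "$OSS" cat /proc/meminfo | awk '/^Cached:/ { print $2 }'
      }

      # flush and drop caches on the OSS, then measure the cache delta
      # caused by the willread advice
      ssh "$OSS" "sync && echo 3 > /proc/sys/vm/drop_caches"
      before=$(cached_kib)
      lfs ladvise -a willread -s 0 -e $((FILE_SIZE_KIB * 1024)) "$TESTFILE"
      after=$(cached_kib)
      if [ $((after - before)) -lt "$WILLREAD_MIN_KIB" ]; then
              echo "FAIL: Ladvise willread should use more memory than $WILLREAD_MIN_KIB KiB"
      fi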
      

      Maloo reports:
      https://testing.hpdd.intel.com/test_sets/c98e8654-9d15-11e6-9f24-5254006e85c2
      https://testing.hpdd.intel.com/test_sets/f1a59ea0-9605-11e6-a96f-5254006e85c2
      https://testing.hpdd.intel.com/test_sets/ba662914-9588-11e6-9722-5254006e85c2
      https://testing.hpdd.intel.com/test_sets/793fd440-95af-11e6-a96f-5254006e85c2
      https://testing.hpdd.intel.com/test_sets/14b006f8-956e-11e6-9fed-5254006e85c2
      https://testing.hpdd.intel.com/test_sets/7d75d880-8b9a-11e6-a8b7-5254006e85c2
      https://testing.hpdd.intel.com/test_sets/abe41d90-895f-11e6-a8b7-5254006e85c2

        Activity


          jamesanunez James Nunez (Inactive) added a comment - This test still fails with sync after the write.

          jamesanunez James Nunez (Inactive) added a comment - John - I have two OSSs with two OSTs each. Since there is already a sync on ost1 after the write, I'll try the sync on the client.
          bogl Bob Glossman (Inactive) added a comment - another on master: https://testing.hpdd.intel.com/test_sets/5d201c00-ad94-11e6-8144-5254006e85c2
          jhammond John Hammond added a comment -

          James, are the 4 OSTs on the same OSS?

          If you have a setup that reproduces this issue handy, could you try adding a sync on the line after the dd?

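          For illustration, the proposed change would look something like the following sketch (the dd line is inferred from the 100 MiB write in the log, using the $DIR/$tfile convention from sanity.sh; the exact script line may differ):

          # write the 100 MiB test file (approximate current line)
          dd if=/dev/zero of=$DIR/$tfile bs=1M count=100
          # proposed addition: flush client dirty pages to the OSTs
          # before sampling Cached: on the OSS
          sync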

          jamesanunez James Nunez (Inactive) added a comment -

          Li Xi - For sanity test_255b, we only measure the total amount of cache and the amount of cache used on ost1. Yet the first thing the test does is stripe the file across all OSTs: 'lfs setstripe -c -1 -i 0 ...'. To see the impact of caching on ost1 only, do we want to limit the file to a single OST, in particular ost1, i.e. 'lfs setstripe -c 1 -i 0 ...'? A file striped over multiple OSTs versus a file on a single OST could change the amount of cache the willread hint uses on ost1 when there is more than one OST.

          Could this explain why this test passes some of the time and fails some of the time?

          In my testing, using a single-stripe file for this test succeeds every time, and striping the file over 4 OSTs fails every time.
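
          As a sketch of that suggestion (using the $LFS and $DIR/$tfile conventions from sanity.sh):

          # current: stripe the file over all OSTs, starting at OST index 0
          $LFS setstripe -c -1 -i 0 $DIR/$tfile
          # suggested: a single stripe on ost1 (index 0), so the cache
          # measured on ost1 covers the whole file
          $LFS setstripe -c 1 -i 0 $DIR/$tfile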

          adilger Andreas Dilger added a comment - This is causing a lot of test failures. Li Xi, could you please take a look?
          sguminsx Steve Guminski (Inactive) added a comment - Another failure on master: https://testing.hpdd.intel.com/test_sets/8e908c38-ac1e-11e6-9116-5254006e85c2
          yujian Jian Yu added a comment - One more failure instance on master branch: https://testing.hpdd.intel.com/test_sets/7e484224-aa8b-11e6-a095-5254006e85c2
          sguminsx Steve Guminski (Inactive) added a comment - Again on master: https://testing.hpdd.intel.com/test_sets/a6324fec-ab84-11e6-a76e-5254006e85c2
          jhammond John Hammond added a comment -

          On master:

          https://testing.hpdd.intel.com/test_sets/37218620-ab04-11e6-a726-5254006e85c2

          Note that ladvise willread used almost 76800 KiB.

          bogl Bob Glossman (Inactive) added a comment - another on master: https://testing.hpdd.intel.com/test_sets/e7949930-a88d-11e6-882e-5254006e85c2

          People

            Assignee: jamesanunez James Nunez (Inactive)
            Reporter: yujian Jian Yu
            Votes: 0
            Watchers: 13
