[LU-5490] sanity test_133d: FAIL: samedir_rename_size error

    Description

      sanity test 133d failed as follows:

      == sanity test 133d: Verifying rename_stats ========================================== 01:32:10 (1408005130)
      CMD: onyx-47vm7 /usr/sbin/lctl list_param mdt.*.rename_stats
      mdt.lustre-MDT0000.rename_stats
      CMD: onyx-47vm7 /usr/sbin/lctl set_param mdt.*.rename_stats=clear
      mdt.lustre-MDT0000.rename_stats=clear
      total: 512 creates in 1.52 seconds: 336.70 creates/second
      source rename dir size: 64K
      target rename dir size: 8K
      CMD: onyx-47vm7 /usr/sbin/lctl get_param mdt.*.rename_stats
      mdt.lustre-MDT0000.rename_stats=
      rename_stats:
      - snapshot_time:  1408005133.29235
      - same_dir       
            512bytes: { sample:   1, pct: 100, cum_pct: 100 }
      CMD: onyx-47vm7 /usr/sbin/lctl get_param mdt.*.rename_stats
      CMD: onyx-47vm7 /usr/sbin/lctl get_param mdt.*.rename_stats
       sanity test_133d: @@@@@@ FAIL: samedir_rename_size error
      

      Maloo report: https://testing.hpdd.intel.com/test_sets/4b98ba1c-23f5-11e4-b2ba-5254006e85c2

          Activity

            gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/31585/
            Subject: LU-5490 tests: Sanity/133d ensure stats read is on correct MDT
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 1d80fd72bf42f653a5d6a4a31fc2c5df571f1afc

            gerrit Gerrit Updater added a comment - Nathaniel Clark (nathaniel.l.clark@intel.com) uploaded a new patch: https://review.whamcloud.com/31585
            Subject: LU-5490 tests: Sanity/133d ensure stats read is on correct MDT
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: fa38f33d74bcd91ecf8768dde3dece69d629c099

            utopiabound Nathaniel Clark added a comment - This test can fail under DNE because test_133d checks only one MDS.
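A minimal sketch of the idea behind reading the stats from the correct MDT, assuming the output of `lctl get_param mdt.*.rename_stats` on a multi-MDT system repeats the `mdt.<fsname>-MDTxxxx.rename_stats=` header once per target. The two-MDT output below is hypothetical (modelled on the single-MDT log in the description), the helper names `mdt_param` and `split_by_param` are invented for illustration, and the 4-digit hexadecimal index in the target name is an assumption. How the test obtains the MDT index of its directory is not shown here.

```python
# Hypothetical two-MDT output, modelled on the single-MDT log in the description.
HYPOTHETICAL_OUTPUT = """\
mdt.lustre-MDT0000.rename_stats=
rename_stats:
- snapshot_time:  1408005133.29235
mdt.lustre-MDT0001.rename_stats=
rename_stats:
- snapshot_time:  1408005133.29235
- same_dir
      512bytes: { sample:   1, pct: 100, cum_pct: 100 }
"""


def mdt_param(fsname, mdt_index):
    """Build the parameter name for one MDT (assumes a 4-digit hex index)."""
    return f"mdt.{fsname}-MDT{mdt_index:04x}.rename_stats"


def split_by_param(output):
    """Split combined get_param output into {param_name: body_text}."""
    sections = {}
    current = None
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("mdt.") and stripped.endswith("="):
            # A header line such as "mdt.lustre-MDT0000.rename_stats="
            current = stripped[:-1]
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body) for name, body in sections.items()}


if __name__ == "__main__":
    sections = split_by_param(HYPOTHETICAL_OUTPUT)
    # Only the MDT that served the rename carries a same_dir histogram.
    for index in (0, 1):
        name = mdt_param("lustre", index)
        print(name, "has same_dir:", "same_dir" in sections[name])
```

Reading only the first target in this hypothetical output would find no `same_dir` samples at all, which is the kind of miss the comment describes.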

            gerrit Gerrit Updater added a comment - Nathaniel Clark (nathaniel.l.clark@intel.com) uploaded a new patch: https://review.whamcloud.com/29053
            Subject: LU-5490 tests: FORTESTONLY
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 821baea4626beb70dbca822f3109d58180db3fd2
            paf Patrick Farrell (Inactive) added a comment - +1 on master: https://testing.hpdd.intel.com/test_sets/1d65eb6a-975b-11e7-b761-5254006e85c2
            bogl Bob Glossman (Inactive) added a comment - another on master: https://testing.hpdd.intel.com/test_sets/233fe796-0749-11e6-b5f1-5254006e85c2
            utopiabound Nathaniel Clark added a comment - This is still happening on master:
            https://testing.hpdd.intel.com/test_sets/242be33c-e0d2-11e5-97b5-5254006e85c2
            https://testing.hpdd.intel.com/test_sets/75019ae8-dfee-11e5-9400-5254006e85c2
            https://testing.hpdd.intel.com/test_sets/29b44c56-df1d-11e5-8471-5254006e85c2

            utopiabound Nathaniel Clark added a comment - The "cause" of the error seems to be that the directory size reported by ls/stat is one value ("64K" here) while the size recorded in mdt.lustre-MDT0000.rename_stats is another ("512bytes" here). I am not sure yet why they differ, but this discrepancy is why the test is failing.
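The mismatch can be reproduced offline from the log in the description. The following is a rough sketch, not the test's actual logic: it parses the `rename_stats` text shown above and converts both size labels to bytes. The function names are invented for illustration, and the handling of unit suffixes other than `bytes` and `K` is an assumption, since only those two appear in the log.

```python
import re

# rename_stats body copied from the failing log in the description.
SAMPLE = """\
rename_stats:
- snapshot_time:  1408005133.29235
- same_dir
      512bytes: { sample:   1, pct: 100, cum_pct: 100 }
"""


def parse_rename_stats(text):
    """Return {section: {bucket_label: sample_count}} for a rename_stats body."""
    stats = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        # Section headers look like "- same_dir" (no colon, no value).
        header = re.match(r"^-\s+(\w+)\s*$", line)
        if header:
            section = header.group(1)
            stats[section] = {}
            continue
        # Bucket lines look like "512bytes: { sample:   1, ... }".
        bucket = re.match(r"^(\d+\w+):\s*\{\s*sample:\s*(\d+)", line)
        if bucket and section is not None:
            stats[section][bucket.group(1)] = int(bucket.group(2))
    return stats


def label_to_bytes(label):
    """Convert labels such as "512bytes" or "64K" to a byte count.

    Suffixes beyond "bytes" and "K" are assumed to be powers of 1024.
    """
    m = re.fullmatch(r"(\d+)(bytes|[KMG]B?)?", label)
    if not m:
        raise ValueError(f"unrecognised size label: {label!r}")
    unit = m.group(2) or "bytes"
    factor = 1 if unit == "bytes" else 1024 ** ("KMG".index(unit[0]) + 1)
    return int(m.group(1)) * factor


if __name__ == "__main__":
    buckets = parse_rename_stats(SAMPLE)["same_dir"]
    reported = label_to_bytes("64K")  # "source rename dir size" from the log
    for label, samples in buckets.items():
        print(label, samples, "matches ls/stat:", label_to_bytes(label) == reported)
```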

            adilger Andreas Dilger added a comment - Nathaniel, could you take a crack at this if you have time? It is not critical, but it is a source of ongoing annoyance for patch landings.
            jamesanunez James Nunez (Inactive) added a comment (edited) - More failures on master:
            2015-07-10 03:43:38 - https://testing.hpdd.intel.com/test_sets/840b7844-26ec-11e5-b3d7-5254006e85c2
            2015-07-22 20:01:38 - https://testing.hpdd.intel.com/test_sets/c4fdb934-30f1-11e5-a788-5254006e85c2
            2015-08-24 15:36:45 - https://testing.hpdd.intel.com/test_sets/fb01ec64-4aad-11e5-88e8-5254006e85c2
            2016-01-04 01:56:27 - https://testing.hpdd.intel.com/test_sets/fafafd4a-b2b4-11e5-8114-5254006e85c2
            2016-02-02 00:40:04 - https://testing.hpdd.intel.com/test_sets/95eca8d4-c968-11e5-9e6a-5254006e85c2
            2016-02-04 07:09:54 - https://testing.hpdd.intel.com/test_sets/57eaf714-cb3b-11e5-b49e-5254006e85c2
            2016-02-04 09:47:15 - https://testing.hpdd.intel.com/test_sets/4b7d3746-cb4d-11e5-be8d-5254006e85c2

            adilger Andreas Dilger added a comment - Between this failure and LU-4203 there have been 11 failures in the past month. I suspect an inconsistency between how osd-zfs records the directory size in the rename stats (which I think is the number of entries) and how the test or /proc checks it (which is the number of bytes). I don't think we need to change osd-zfs to record the stats differently; rather, we should fix the test so that it handles this consistently. It isn't clear why the test fails only intermittently.
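The entries-versus-bytes hypothesis is consistent with the numbers in the failing log: the test made 512 creates, and the only histogram bucket is "512bytes". As an illustration only, assuming the histogram uses power-of-two buckets (the rounding rule below is an assumption, and `pow2_bucket` is a made-up helper, not a Lustre function), the two interpretations land in different buckets:

```python
def pow2_bucket(value):
    """Largest power of two not exceeding value (assumed bucketing rule)."""
    if value < 1:
        raise ValueError("value must be positive")
    return 1 << (value.bit_length() - 1)


if __name__ == "__main__":
    entries = 512              # "total: 512 creates" in the log
    size_bytes = 64 * 1024     # "source rename dir size: 64K" in the log

    # If the server records the entry count, the sample falls in bucket 512;
    # a test expecting the byte size would look for bucket 65536 instead.
    print("bucket by entries:", pow2_bucket(entries))
    print("bucket by bytes:  ", pow2_bucket(size_bytes))
```

Under this reading, a check keyed on the byte size would miss a sample recorded by entry count, though this does not by itself explain why the failure is intermittent.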

            People

              utopiabound Nathaniel Clark
              yujian Jian Yu