replay-ost-single test_6: space grew after dd (or didn't change)

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Major
    • None
    • Lustre 2.6.0, Lustre 2.5.3, Lustre 2.9.0, Lustre 2.12.0, Lustre 2.13.0, Lustre 2.12.1
    • 3
    • 11718

    Description

      This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com>

      This issue relates to the following test suite runs:
      http://maloo.whamcloud.com/test_sets/627ba06c-48b0-11e3-bdb5-52540035b04c
      http://maloo.whamcloud.com/test_sets/600e6472-4f68-11e3-84d3-52540035b04c

      The sub-test test_6 failed with the following error:

      space grew after dd: before:77312128 after_dd:77312128

      Info required for matching: replay-ost-single 6
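
The error text above reports identical values for before and after_dd. A minimal sketch of why equal readings count as a failure, assuming the test requires free space to shrink strictly after the dd write (the function name `check_space_shrank` is hypothetical and not taken from the test script):

```shell
#!/bin/bash
# Hypothetical stand-in for the comparison the test relies on.
# Succeeds only when free space after the write is strictly smaller.
check_space_shrank() {
    local before=$1 after_dd=$2
    if (( before > after_dd )); then
        echo "ok: free space shrank by $(( before - after_dd ))"
    else
        # Equal values land here too, hence "(or didn't change)".
        echo "space grew after dd: before:$before after_dd:$after_dd"
        return 1
    fi
}

# Values from the reported failure: the two readings are identical.
check_space_shrank 77312128 77312128 || echo "test would fail here"
```

Because the comparison is strict, a stale or unchanged free-space reading is indistinguishable from real growth in the error message.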

          Activity

            [LU-4265] replay-ost-single test_6: space grew after dd (or didn't change)
            jamesanunez James Nunez (Inactive) added a comment - Another failure on master: 2015-12-26 18:38:36 - https://testing.hpdd.intel.com/test_sets/cc75f8cc-ac0a-11e5-8114-5254006e85c2
            yujian Jian Yu added a comment - Another failure instance on master branch: https://testing.hpdd.intel.com/test_sets/94e59696-3640-11e5-84a9-5254006e85c2
            jamesanunez James Nunez (Inactive) added a comment - Another two failures on master in review-zfs-part-2: 2015-07-03 07:45:48 - https://testing.hpdd.intel.com/test_sets/ceba821c-2161-11e5-a979-5254006e85c2 2015-07-14 21:23:00 - https://testing.hpdd.intel.com/test_sets/234cf58c-2a7a-11e5-96c0-5254006e85c2
            bogl Bob Glossman (Inactive) added a comment - Another on master: https://testing.hpdd.intel.com/test_sets/92cbbd1a-16d1-11e5-8436-5254006e85c2
            johann Johann Lombardi (Inactive) added a comment - Another instance on master: https://testing.hpdd.intel.com/test_sets/b07629a6-86a6-11e4-b678-5254006e85c2
            yujian Jian Yu added a comment - Another failure instance on Lustre b2_5 branch: https://testing.hpdd.intel.com/test_sets/cdfb95ec-7dd7-11e4-aa98-5254006e85c2

            isaac Isaac Huang (Inactive) added a comment - Another one: https://testing.hpdd.intel.com/test_sets/197aed5c-592e-11e4-8f95-5254006e85c2

            This appears to be related to LU-3455. test_6 reads before=$(kbytesfree) after removing the file and waiting:

                rm -f $f
                sync && sleep 5 && sync  # wait for delete thread

                # wait till space is returned, following
                # (( $before > $after_dd)) test counting on that
                wait_mds_ost_sync || return 4
                wait_destroy_complete || return 5

                local before=$(kbytesfree)

            I doubt the wait can work reliably with ZFS, where frees can sometimes be delayed for a few transaction groups. Freeing space, waiting, and then reading the free space seems inherently unreliable on ZFS.
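
One way to make the "wait, then read free space" step less dependent on a fixed sleep is to poll until two consecutive readings agree. The following is only a sketch under that assumption: `wait_free_space_stable` is a hypothetical helper, not part of the Lustre test framework, and a settled reading does not prove that ZFS has finished all delayed frees.

```shell
#!/bin/bash
# Hypothetical helper: poll a free-space probe until two consecutive
# readings are equal, or give up after max_tries polls.
# usage: wait_free_space_stable <max_tries> <interval_sec> <probe command...>
wait_free_space_stable() {
    local max_tries=$1 interval=$2
    shift 2
    local prev cur i
    prev=$("$@") || return 1          # initial reading
    for (( i = 0; i < max_tries; i++ )); do
        sleep "$interval"
        cur=$("$@") || return 1
        if [[ "$cur" == "$prev" ]]; then
            echo "$cur"               # reading has settled; report it
            return 0
        fi
        prev=$cur
    done
    return 1                          # still changing after max_tries polls
}
```

In the snippet quoted above, this could stand in for the single read, for example `before=$(wait_free_space_stable 30 1 kbytesfree)`, where the retry count and interval are illustrative values rather than tested ones.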
            yong.fan nasf (Inactive) added a comment - Another failure instance: https://testing.hpdd.intel.com/test_sets/3e9f4124-45c3-11e4-9397-5254006e85c2
            yujian Jian Yu added a comment - One more instance on Lustre b2_5 branch: https://testing.hpdd.intel.com/test_sets/cd0bb8ee-44dc-11e4-bb5a-5254006e85c2
            yujian Jian Yu added a comment - While verifying patches http://review.whamcloud.com/11541, http://review.whamcloud.com/11411, and http://review.whamcloud.com/9318 on Lustre b2_5 branch with FSTYPE=zfs, the same failure occurred: https://testing.hpdd.intel.com/test_sets/160d2586-31e7-11e4-92e0-5254006e85c2 https://testing.hpdd.intel.com/test_sets/228a2e1e-27b8-11e4-893b-5254006e85c2 https://testing.hpdd.intel.com/test_sets/cb2bd86a-9a50-11e3-965c-52540035b04c

            People

              wc-triage WC Triage
              maloo Maloo
              Votes: 0
              Watchers: 11
