Lustre / LU-18646

ost-pools test_23b: FAIL: dd didn't fail with ENOSPC (26214400 > 22639616)

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.17.0
    • Lustre 2.17.0
    • 3

    Description

      This issue was created by maloo for jianyu <yujian@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/767a0590-688b-450d-9678-890811d62d10

      test_23b failed with the following error:

      [5 iteration] 5120+0 records in
      5120+0 records out
      5368709120 bytes (5.4 GB, 5.0 GiB) copied, 16.6809 s, 322 MB/s
      total written: 26214400
      stime=1736831289, etime=1736831374, elapsed=85
      ost-pools test_23b: @@@@@@ FAIL: dd didn't fail with ENOSPC (26214400 > 22639616) 
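The figures in the failure message can be cross-checked against the log itself. The sketch below assumes, and the ticket does not confirm, that `total written` and the limit are both expressed in KiB and that each of the five iterations wrote the same amount as the one shown. The function and constant names are illustrative only.

```python
# Cross-check of the figures in the test_23b log.
# Assumptions (not confirmed by the ticket): the sizes in the FAIL message
# are in KiB, and each of the 5 iterations wrote the same amount as the
# iteration whose dd output is shown.

BYTES_PER_ITERATION = 5368709120   # "5368709120 bytes ... copied" from dd
ITERATIONS = 5                     # "[5 iteration]"
LIMIT_KIB = 22639616               # right-hand side of the FAIL comparison
STIME, ETIME = 1736831289, 1736831374


def total_written_kib(bytes_per_iteration: int, iterations: int) -> int:
    """Total data written across all iterations, in KiB."""
    return bytes_per_iteration * iterations // 1024


def overshoot_kib(total_kib: int, limit_kib: int) -> int:
    """How far the writes went past the limit without hitting ENOSPC."""
    return total_kib - limit_kib


if __name__ == "__main__":
    total = total_written_kib(BYTES_PER_ITERATION, ITERATIONS)
    print("total written (KiB):", total)
    print("limit (GiB): %.2f" % (LIMIT_KIB / 1024 / 1024))
    print("overshoot (KiB):", overshoot_kib(total, LIMIT_KIB))
    print("elapsed (s):", ETIME - STIME)
```

Under those assumptions the computed total matches the `total written: 26214400` line, which would mean about 25 GiB was written against a limit of roughly 21.6 GiB without dd ever returning ENOSPC.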
      

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-reviews/110246 - 5.14.0-427.42.1.el9_4.x86_64
      servers: https://build.whamcloud.com/job/lustre-reviews/110246 - 5.14.0-427.42.1_lustre.el9.x86_64


      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      ost-pools test_23b - dd didn't fail with ENOSPC (26214400 > 22639616)

          Activity


            simmonsja James A Simmons added a comment

            Patch https://review.whamcloud.com/c/fs/lustre-release/+/57871 fixed the ost-pools failure. My debug patch really belongs to LU-18652. The links to this ticket can be found with labels = zfs-2.2.7.

            "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/58003
            Subject: LU-18646 tests: disable failing conf-sanity test
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 8a45af0a4efa4a39d9c073f1e611169558f1ac8a


            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/57871/
            Subject: LU-18646 tests: use /dev/urandom to consume space
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 052d622fa1938204f3cb4a9605f692cad3000587


            adilger Andreas Dilger added a comment

            Nikitas, I would prefer that the other tests be fixed in a separate patch, though that patch can use this same LU ticket number. That will allow the patches to land independently, at least fixing some of the testing.

            Right now my patch is blocked on the sanity-quota test_49 failure (LU-18676) and I haven't had any chance to debug it due to travel.

            nangelinas Nikitas Angelinas added a comment (edited)

            adilger, https://review.whamcloud.com/#/c/fs/lustre-release/+/49277, which is on top of current master, is failing some additional ZFS tests that might be due to the same issue, e.g. sanityn/36 in https://testing.whamcloud.com/test_sets/c9ff0867-312b-46e7-aaa0-68a100ab11ac, sanity-pfl/22c in https://testing.whamcloud.com/test_sets/ccf5b1e0-3de3-4a21-9dfc-93d7bc4323c1 and others in conf-sanity in https://testing.whamcloud.com/test_sets/9ad3e0e6-0df7-4176-a046-5ce1b2de09e9 (I don't know if I have missed some).

            Should https://review.whamcloud.com/#/c/fs/lustre-release/+/57871 and its testlist be updated to fix and run those tests as well, if the failures are caused by the same issue?


            adilger Andreas Dilger added a comment

            It looks like this problem was initially hit with the review testing of patch https://review.whamcloud.com/57237 ("LU-18387 kernel: RHEL 9.5 server support"), which updated contrib/lbuild/lbuild to use SPLZFSVER=2.2.6, so this may be related to LU-18646, which is addressing a large number of sanity-quota failures across many subtests, with what looks like "dd" not consuming enough space during writes.
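
The patch on this ticket switches the test data source to /dev/urandom. One possible explanation for why that helps, which is an assumption here and is not stated in the ticket, is that zero-filled data shrinks to almost nothing on a compressing backend, so writes of zeros consume far less space than their nominal size. The sketch below illustrates only that general effect, using Python's zlib as a stand-in. It does not exercise ZFS or Lustre, and `compressed_ratio` is a hypothetical helper name.

```python
import os
import zlib


def compressed_ratio(data: bytes) -> float:
    """Return compressed size divided by original size for one buffer."""
    return len(zlib.compress(data)) / len(data)


if __name__ == "__main__":
    size = 1 << 20  # 1 MiB sample buffer

    # Zero-filled buffer, comparable to reading from /dev/zero.
    zero_ratio = compressed_ratio(bytes(size))

    # Random buffer, comparable to reading from /dev/urandom.
    random_ratio = compressed_ratio(os.urandom(size))

    print("zeros  compress to %.4f of original size" % zero_ratio)
    print("random compress to %.4f of original size" % random_ratio)
```

If the backend behaves similarly, a test that expects ENOSPC after writing a fixed amount of data would only be reliable with incompressible input, which is consistent with the patch subject "use /dev/urandom to consume space".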

            "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/57871
            Subject: LU-18646 tests: use /dev/urandom for quota files
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 2911ffdefb07d1efa068dbfa40ad011e313f1fce


            People

              wc-triage WC Triage
              maloo Maloo
              Votes: 0
              Watchers: 6
