
interop: sanity test_130g: filefrag printed 175 < 700 extents

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.14.0, Lustre 2.15.4
    • None
    • 3
    • 9223372036854775807

    Description

      This issue was created by maloo for Alena Nikitenko <anikitenko@ddn.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/5907cf45-fd1e-4f8b-a959-32150206c06d

      test_130g failed with the following error:

      filefrag printed 175 < 700 extents
      

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-b_2_14/8 - 4.18.0-348.7.1.el8_5.x86_64
      servers: https://build.whamcloud.com/job/lustre-master/4518 - 5.14.0-284.30.1_lustre.el9.x86_64

      trevis-27vm1 - Client 1 (2.14.0.21, x86_64)
      trevis-27vm2 - Client 2 (2.14.0.21, x86_64)
      trevis-27vm3 - OST 1, OST 2, OST 3, OST 4, OST 5, OST 6, OST 7 (2.15.62, x86_64)
      trevis-27vm6 - MDS 1 (2.15.62, x86_64)

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity test_130g - filefrag printed 175 < 700 extents

          Activity

            [LU-17747] interop: sanity test_130g: filefrag printed 175 < 700 extents
            dongyang Dongyang Li added a comment - https://review.whamcloud.com/c/fs/lustre-release/+/54851
            adilger Andreas Dilger added a comment - edited

            Looking at one of the test failures:
            https://testing.whamcloud.com/test_logs/d9eaa5a7-f091-4711-8427-08d3fb0cd0bb/show_text

              sanity test_130g: @@@@@@ FAIL: filefrag printed 175 < 700 extents 
            

            It shows the file was created with 700 stripes, but filefrag returned only 175 extents, so I don't think the problem is on the object creation side, especially since the stripe counts are always the same:

             filefrag list 175 extents in file with stripecount 700
             /mnt/lustre/f130g.sanity
             lmm_stripe_count:  700
             lmm_stripe_size:   4194304
             lmm_pattern:       raid0,overstriped 
            

            However, I do see that lmm_stripe_size = 4MB, which means the "dd" is not writing enough data to the file to put 1MB on each stripe, and this is likely causing the test to fail. It looks like "-S1M" was added on master in commit v2_15_58-43-gea18d7da59, but it isn't on the b2_14 or b2_15 clients where the test is running:

                onyx-51vm4 - Client 1 (2.14.0.21, x86_64)
                onyx-51vm5 - Client 2 (2.14.0.21, x86_64)
                onyx-55vm4 - OST 1-7 (2.15.61.194, x86_64)
                onyx-55vm7 - MDS 1 (2.15.61.194, x86_64)
            

            It looks like this (and other) subtests need to be fixed on b2_15 to handle the change in default stripe size, and this subtest needs to be excluded by autotest from running with b2_14 clients, by filing an ATM ticket to request this for clients < 2.15.5 (assuming this patch will land on b2_15 for that release). This needs to be handled at the Autotest level because we can't retroactively fix the 2.14.0 release to work with the 4MB default stripe size.
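
            The 175-versus-700 mismatch is consistent with simple striping arithmetic. The sketch below is illustrative only and is not taken from the test script: it assumes the test writes roughly 1 MiB per stripe (about 700 MiB in total, sequentially from offset 0) and that each object holding data shows up as one extent in filefrag. The function name and the write size are hypothetical.

```python
MIB = 1 << 20

def stripes_with_data(bytes_written, stripe_size, stripe_count):
    """Count the stripes that receive at least one byte when
    bytes_written bytes are written sequentially from offset 0
    to a RAID-0 style layout."""
    if bytes_written <= 0:
        return 0
    # Number of stripe_size chunks touched (ceiling division).
    chunks = -(-bytes_written // stripe_size)
    # Chunks wrap around the stripes, so the count is capped.
    return min(chunks, stripe_count)

# Assumed write size: 1 MiB per stripe for a 700-stripe file.
written = 700 * MIB

# 4 MiB stripe size (as seen in the failing run).
print(stripes_with_data(written, 4 * MIB, 700))

# 1 MiB stripe size (what "-S1M" would give).
print(stripes_with_data(written, 1 * MIB, 700))
```

            Under these assumptions, the 4 MiB case touches 175 stripes and the 1 MiB case touches all 700, which matches the failure message and the effect of adding "-S1M".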


            People

              dongyang Dongyang Li
              adilger Andreas Dilger