
sanity: test_27M Error: '(5) stripe count , should be 8 for append'

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.16.0

    Description

      This issue was created by maloo for Frank Sehr <fsehr@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/46d1b101-8e4f-4415-bb49-ee39963275fe

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-reviews/95319 - 4.18.0-425.10.1.el8_7.x86_64
      servers: https://build.whamcloud.com/job/lustre-reviews/95319 - 4.18.0-425.10.1.el8_lustre.x86_64

      == sanity test 27M: test O_APPEND striping =============== 03:40:09 (1685936409)
      CMD: trevis-129vm4 /usr/sbin/lctl get_param -n version 2>/dev/null
      striped dir -i3 -c2 -H crush2 /mnt/lustre/d27M.sanity
      CMD: trevis-129vm4 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.append_pool
      CMD: trevis-129vm4 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.append_stripe_count
      CMD: trevis-129vm4,trevis-129vm5 /usr/sbin/lctl set_param mdd.*.append_stripe_count=0
      mdd.lustre-MDT0000.append_stripe_count=0
      mdd.lustre-MDT0002.append_stripe_count=0
      mdd.lustre-MDT0001.append_stripe_count=0
      mdd.lustre-MDT0003.append_stripe_count=0
      CMD: trevis-129vm4,trevis-129vm5 /usr/sbin/lctl set_param mdd.*.append_stripe_count=2
      mdd.lustre-MDT0000.append_stripe_count=2
      mdd.lustre-MDT0002.append_stripe_count=2
      mdd.lustre-MDT0001.append_stripe_count=2
      mdd.lustre-MDT0003.append_stripe_count=2
      CMD: trevis-129vm4,trevis-129vm5 /usr/sbin/lctl set_param mdd.*.append_stripe_count=-1
      mdd.lustre-MDT0000.append_stripe_count=-1
      mdd.lustre-MDT0002.append_stripe_count=-1
      mdd.lustre-MDT0001.append_stripe_count=-1
      mdd.lustre-MDT0003.append_stripe_count=-1
      /usr/lib64/lustre/tests/sanity.sh: line 3101: /mnt/lustre/d27M.sanity/f27M.sanity.5: Invalid argument
      lfs: getstripe for '/mnt/lustre/d27M.sanity/f27M.sanity.5' failed: No such file or directory
      /usr/lib64/lustre/tests/sanity.sh: line 3103: [: -eq: unary operator expected
      sanity test_27M: @@@@@@ FAIL: (5) stripe count , should be 8 for append
      Trace dump:
      = /usr/lib64/lustre/tests/test-framework.sh:6585:error()
      = /usr/lib64/lustre/tests/sanity.sh:3104:test_27M()
      = /usr/lib64/lustre/tests/test-framework.sh:6925:run_one()
      = /usr/lib64/lustre/tests/test-framework.sh:6974:run_one_logged()
      = /usr/lib64/lustre/tests/test-framework.sh:6811:run_test()
      = /usr/lib64/lustre/tests/sanity.sh:3181:main()
      Dumping lctl log to /autotest/autotest-2/2023-06-05/lustre-reviews_review-ldiskfs-dne_95319_27_26b27a74-2421-4453-9c33-cd237feca413//sanity.test_27M.*.1685936415.log
      CMD: trevis-129vm1.trevis.whamcloud.com,trevis-129vm2,trevis-129vm3,trevis-129vm4,trevis-129vm5 /usr/sbin/lctl dk > /autotest/autotest-2/2023-06-05/lustre-reviews_review-ldiskfs-dne_95319_27_26b27a74-2421-4453-9c33-cd237feca413//sanity.test_27M.debug_log.$(hostname -s).1685936415.log;
      dmesg > /autotest/autotest-2/2023-06-05/lustre-reviews_review-ldiskfs-dne_95319_27_26b27a74-2421-4453-9c33-cd237feca413//sanity.test_27M.dmesg.$(hostname -s).1685936415.log
      CMD: trevis-129vm4,trevis-129vm5 /usr/sbin/lctl set_param mdd.*.append_stripe_count=1
      mdd.lustre-MDT0000.append_stripe_count=1
      mdd.lustre-MDT0002.append_stripe_count=1
      mdd.lustre-MDT0001.append_stripe_count=1
      mdd.lustre-MDT0003.append_stripe_count=1
      CMD: trevis-129vm4,trevis-129vm5 /usr/sbin/lctl set_param mdd.*.append_pool=none
      mdd.lustre-MDT0000.append_pool=none
      mdd.lustre-MDT0002.append_pool=none
      mdd.lustre-MDT0001.append_pool=none
      mdd.lustre-MDT0003.append_pool=none
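
The `[: -eq: unary operator expected` message at sanity.sh line 3103 and the blank value in "(5) stripe count , should be 8" are what the shell prints when an unquoted, empty variable is used in a numeric test: the append at line 3101 failed, so `lfs getstripe` had no file to report on. A minimal sketch of a comparison that reports this case explicitly is shown here; `stripe_count_matches` is a hypothetical helper for illustration, not a function from sanity.sh or test-framework.sh.

```shell
# Hypothetical helper: compare a stripe count against an expected value.
# Returns 0 on a match, 1 on a mismatch, and 2 if the actual value is
# empty or not an integer (for example because lfs getstripe failed).
stripe_count_matches() {
	actual=$1
	expected=$2
	# Strip one optional leading "-" so that values such as -1 are accepted.
	case ${actual#-} in
		''|*[!0-9]*)
			echo "could not read stripe count (got '$actual')" >&2
			return 2
			;;
	esac
	[ "$actual" -eq "$expected" ]
}

# Assumption: an empty string stands in for the output of a failed lookup.
count=""
stripe_count_matches "$count" 8 || echo "stripe count check did not pass"
```

Quoting the operands and validating them first separates "the file was never created" from "the file has the wrong stripe count", which the original failure message blurs together.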


          Activity

            [LU-16872] sanity: test_27M Error: '(5) stripe count , should be 8 for append'
            pjones Peter Jones added a comment -

            Landed for 2.16


            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51602/
            Subject: LU-16872 tests: exercise sanity test_27M more fully
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 7bb1685048bf999df03ceadab39faa09b8a5560d

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51559/
            Subject: LU-16872 lod: reset llc_ostlist when using O_APPEND stripes
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 766b35a9700f36aa08b652fa9d18b890d34bf4a5

            gerrit Gerrit Updater added a comment -

            "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51602
            Subject: LU-16872 tests: exercise sanity test_27M more fully
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 221a2d05d5d4ec2b39c88c6a5d84df2ba3f177dc

            bertschinger Thomas Bertschinger added a comment -

            How to reproduce:

            # setup:
            $ lctl set_param mdd.*.append_stripe_count=-1
            $ lfs setstripe -o 1,3 /mnt/lustre

            # create enough files with the default striping that every MDT kernel thread has probably cached the defaults in memory
            $ for i in {0..100}; do touch /mnt/lustre/x$i; done

            # an append should now return EINVAL whenever it is handled by a kernel thread that previously did a create with the default stripes
            $ echo 1 >> /mnt/lustre/f
            -bash: /mnt/lustre/f: Invalid argument

            A closely related problem occurs when an append_pool is set. In that case the create succeeds, but the appended file is created with the default stripes rather than in the pool.

            I haven't yet identified which patch caused (or uncovered) the issue; I didn't see anything obvious in the patches merged shortly before the first test failure occurred. I'll attempt a git bisect to find the cause and will update here if I get an answer.
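
Because the failure depends on which MDT kernel thread handles the create, a single append may happen to succeed. A small sketch that repeats the append against fresh files and counts the failures can make the reproduction less hit-or-miss. The helper name `count_append_failures`, the `f_append` file prefix, and the arguments are assumptions for illustration, and the setup steps from the comment above must already have been applied.

```shell
# Hypothetical helper: attempt N creates with O_APPEND under DIR and print
# how many of them failed. The target files must not exist beforehand,
# since the problem described here occurs at create time.
count_append_failures() {
	dir=$1
	n=$2
	failures=0
	i=0
	while [ "$i" -lt "$n" ]; do
		# ">>" opens the target with O_APPEND (and O_CREAT); the subshell
		# lets the shell's own redirection error message be discarded.
		if ! (echo 1 >> "$dir/f_append.$i") 2>/dev/null; then
			failures=$((failures + 1))
		fi
		i=$((i + 1))
	done
	echo "$failures"
}
```

On an affected system, something like `count_append_failures /mnt/lustre 50` would be expected to print a non-zero count; on a fixed or unaffected filesystem it should print 0.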

            gerrit Gerrit Updater added a comment -

            "Thomas Bertschinger <bertschinger@lanl.gov>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51559
            Subject: LU-16872 lod: do not stripe O_APPEND files on specific OSTs
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 81e2439da705dd35dc2c8c687be21cf7dc952eba

            bertschinger Thomas Bertschinger added a comment -

            Oops, I wasn't thinking straight this morning but overflowing llc_pool wouldn't affect op_array and op_count since the buffer wouldn't be adjacent to these fields. So I'm still looking for what could cause op_array and op_count to have bad values. (I still think that's the most likely explanation for the issue.)

            People

              Assignee: bertschinger Thomas Bertschinger
              Reporter: maloo Maloo
              Votes: 0
              Watchers: 8
