[LU-9096] sanity test_253: File creation failed after rm

Details

    • Bug
    • Resolution: Unresolved
    • Critical
    • None
    • Lustre 2.9.0, Lustre 2.10.0, Lustre 2.14.0
    • 3

    Description

      This issue was created by maloo for Andreas Dilger <andreas.dilger@intel.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/b06b5a90-e8f0-11e6-935d-5254006e85c2.

      The sub-test test_253 failed with the following error:

      CMD: trevis-43vm3 /usr/sbin/lctl get_param -n osp.lustre-OST0000-osc-MDT0000.prealloc_status
      prealloc_status -28
      dd: failed to open '/mnt/lustre/d253.sanity/2': No space left on device
      5+0 records in
      5+0 records out
      5242880 bytes (5.2 MB) copied, 0.0534031 s, 98.2 MB/s
      CMD: trevis-43vm3 /usr/sbin/lctl set_param -n osd*.*MD*.force_sync 1
      CMD: trevis-43vm3 /usr/sbin/lctl get_param -n osc.*MDT*.sync_*
      :
      :
      CMD: trevis-43vm3 /usr/sbin/lctl get_param -n osc.*MDT*.sync_*
      Delete is not completed in 28 seconds
      Waiting for local destroys to complete
      CMD: trevis-43vm3 lctl get_param -n lov.*.qos_maxage
      sanity test_253: @@@@@@ FAIL: File creation failed after rm

      The MDS console log shows that there are no precreated objects for this OST:

      20:44:10:[11571.982460] LustreError: 22503:0:(lod_qos.c:1273:lod_alloc_specific()) can't lstripe objid [0x20000234c:0x13b8:0x0]: have 0 want 1
      

      Info required for matching: sanity 253

          Activity

            adilger Andreas Dilger added a comment -

            It looks like this test was re-enabled by patch https://review.whamcloud.com/33778 "LU-10070 tests: New test-framework functionality". Since it currently is not failing very often, I'm not in a hurry to turn it off again, especially if it is adding coverage for SEL.

            adilger Andreas Dilger added a comment -

            For some reason this test is now running again and failing intermittently (about 4x per month).

            pjones Peter Jones added a comment -

            The test is no longer being run, so I will remove the fixversion from this ticket. It can be reintroduced when the test has been given a clean bill of health.

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/27013/
            Subject: LU-9096 test: add sanity 253 to ALWAYS_EXCEPT
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: cdc7b3bb16537fc513a64b4f20824b9832efcdf9

            jamesanunez James Nunez (Inactive) added a comment -

            Since sanity 253 is failing intermittently and the correctness of the test is being questioned, I've uploaded a patch that adds the test to the ALWAYS_EXCEPT list so that it is not run.

            Patch https://review.whamcloud.com/27013

            gerrit Gerrit Updater added a comment -

            James Nunez (james.a.nunez@intel.com) uploaded a new patch: https://review.whamcloud.com/27013
            Subject: LU-9096 test: add sanity 253 to ALWAYS_EXCEPT
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: d62f40a8af0fa7d4041517e81a0183bb051c1785

            utopiabound Nathaniel Clark added a comment -

            bzzz,

            Instead of hard coding a constant, would it make sense to use test_framework.sh::fs_log_size() to ensure there's enough room for the fs log?

            jhammond John Hammond added a comment -

            In the interim it would be good to dump all the osp params when this fails (rather than just prealloc_status).

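The suggestion above could look roughly like the following. This is only a sketch: `dump_osp_params` is a hypothetical helper name that does not exist in test-framework.sh, and it assumes the `osp.<device>.*` glob is accepted by `lctl get_param`. In sanity.sh the call would presumably be wrapped in `do_facet $SINGLEMDS` so that it runs on the MDS.

```bash
# Hypothetical helper (not part of test-framework.sh): print every osp
# parameter for one MDT->OST device instead of only prealloc_status.
# LCTL defaults to "lctl" when the test framework has not set it.
dump_osp_params() {
	local dev=$1

	# The glob is quoted so the shell passes it to lctl unexpanded.
	${LCTL:-lctl} get_param "osp.$dev.*"
}
```

A failure path could then call, for example, `dump_osp_params lustre-OST0000-osc-MDT0000` just before reporting the error, using the device name seen in the test log.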
            jhammond John Hammond added a comment -

            > solves the problem.. I'm wondering where 10 comes from.

            I think it matches the 10 used a few lines before:
            dd if=/dev/zero of=$DIR/$tdir/0 bs=1M count=10

            bzzz Alex Zhuravlev added a comment -

            @@ -14318,7 +14318,7 @@ test_253() {
             local blocks=$($LFS df $MOUNT | grep $ost_name | awk '{ print $4 }')
             echo "OST still has $((blocks/1024)) mbytes free"
            -local new_lwm=$((blocks/1024-10))
            +local new_lwm=$((blocks/1024-20))
             do_facet $SINGLEMDS $LCTL set_param \
             osp.$mdtosc_proc1.reserved_mb_high=$((new_lwm+5))
             do_facet $SINGLEMDS $LCTL set_param \

            solves the problem. I'm wondering where 10 comes from.

            bzzz Alex Zhuravlev added a comment -

            updated statfs: rc = 0, state, ffree 57923, bavail 52864, avail 206, low 1
            updated statfs: rc = 0, state, ffree 57923, bavail 52864, avail 206, low 1
            updated statfs: rc = 0, state enospc, ffree 57923, bavail 39552, avail 154, low 209
            updated statfs: rc = 0, state enospc, ffree 18345, bavail 44512, avail 173, low 209
            updated statfs: rc = 0, state enospc, ffree 18345, bavail 44512, avail 173, low 209
            updated statfs: rc = 0, state enospc, ffree 18345, bavail 44512, avail 173, low 209
            updated statfs: rc = 0, state enospc, ffree 18345, bavail 44512, avail 173, low 209
            updated statfs: rc = 0, state enospc, ffree 18345, bavail 42848, avail 167, low 209
            updated statfs: rc = 0, state enospc, ffree 18345, bavail 42848, avail 167, low 209
            updated statfs: rc = 0, state enospc, ffree 57908, bavail 52864, avail 206, low 209
            updated statfs: rc = 0, state enospc, ffree 57908, bavail 52864, avail 206, low 209

            sanity/253 sets low to 209:
            local blocks=$($LFS df $MOUNT | grep $ost_name | awk '{ print $4 }')
            local new_lwm=$((blocks/1024-10))

            osp.lustre-OST0000-osc-MDT0000.reserved_mb_low=209
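The arithmetic behind the log above can be checked with a small sketch. It rests on assumptions that are not stated in the ticket: that `avail` and `low` are both in MB, that `lfs df` reported about 219 MB free (inferred from `low 209` = blocks/1024 - 10), and that the enospc state is only cleared once `avail` rises above `reserved_mb_high` (set to `new_lwm+5` by the test). The function names `watermarks` and `recovers` are hypothetical.

```bash
# Hypothetical sketch, not part of sanity.sh.

# watermarks FREE_MB MARGIN_MB -> prints "low high", as test_253 derives them
watermarks() {
	local free_mb=$1 margin=$2
	local low=$((free_mb - margin))

	echo "$low $((low + 5))"
}

# Assumed hysteresis: enospc is cleared only when avail exceeds the high mark.
recovers() {
	local avail=$1 high=$2

	[ "$avail" -gt "$high" ]
}

free_mb=219      # inferred: "low 209" implies blocks/1024 was 219
avail_after=206  # "avail 206" in the last statfs lines, after the rm

for margin in 10 20; do
	set -- $(watermarks "$free_mb" "$margin")
	if recovers "$avail_after" "$2"; then
		state="recovers"
	else
		state="stays enospc"
	fi
	echo "margin=$margin low=$1 high=$2 -> $state"
done
```

Under these assumptions, a margin of 10 gives low=209 and high=214, both above the 206 MB the OSP sees after the rm, so the OST would stay in enospc. A margin of 20 gives low=199 and high=204, which 206 clears. That would be consistent with the observation that changing 10 to 20 makes the test pass.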

            People

              wc-triage WC Triage
              maloo Maloo
              Votes: 0
              Watchers: 14