ost-pools test 23b fails with 'dd did not fail with ENOSPC'

Details

    • Bug
    • Resolution: Duplicate
    • Minor
    • None
    • Lustre 2.14.0
    • RHEL8.3 servers/clients

    Description

      ost-pools test_23b fails for RHEL 8.3 server/clients.

      The suite_log for the failure at https://testing.whamcloud.com/test_sets/8445d81a-2e35-493c-bcf7-311524a97aa5 shows a dd error that we have not seen before in ost-pools test_23b:

      [4 iteration] dd: closing output file '/mnt/lustre/d23b.ost-pools/dir/f23b.ost-pools-quota4': Input/output error
      total written: 20971520
      stime=1607360253, etime=1607360637, elapsed=384
       ost-pools test_23b: @@@@@@ FAIL: dd did not fail with ENOSPC 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:6257:error()
        = /usr/lib64/lustre/tests/ost-pools.sh:1360:test_23b()
      

      We have seen this test fail with the same error message before (see LU-10396), and with “write error ... Input/output error”, but this is the first time we have seen dd fail with “close ... Input/output error”. Judging from the value after “total written”, dd did fail, just not with “No space left on device”, which is the error the test looks for.
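
To make the distinction concrete, here is a minimal sketch of telling the dd failure modes apart by the errno text in the message. The helper name `classify_dd_error` is hypothetical and is not part of test-framework.sh or ost-pools.sh; it only illustrates that the message above carries EIO rather than the ENOSPC the test expects.

```shell
# Hypothetical helper (not from the Lustre test suite): map the errno text
# found in a dd error message to a short symbolic name.
classify_dd_error() {
    local msg="$1"
    case "$msg" in
        *"No space left on device"*) echo ENOSPC ;;
        *"Input/output error"*)      echo EIO ;;
        *"Disk quota exceeded"*)     echo EDQUOT ;;
        *)                           echo UNKNOWN ;;
    esac
}

# The message reported in this ticket:
msg="dd: closing output file '/mnt/lustre/d23b.ost-pools/dir/f23b.ost-pools-quota4': Input/output error"
classify_dd_error "$msg"    # prints EIO
```

Matching on the message text assumes dd runs in the C locale, since the errno strings are translated in other locales.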

      In the client1 (vm) dmesg log, we see

      [ 5322.937853] Lustre: DEBUG MARKER: lctl get_param -n lov.lustre-*.pools.testpool | sort -u | tr '\n' ' ' 
      [ 5342.709668] Lustre: lustre-OST0001-osc-ffff913ecdbb6000: disconnect after 20s idle
      [ 5665.004748] LustreError: 7828:0:(osc_request.c:1947:osc_brw_fini_request()) lustre-OST0000-osc-ffff913ecdbb6000: unexpected positive size 1
      [ 5713.926869] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  ost-pools test_23b: @@@@@@ FAIL: dd did not fail with ENOSPC 
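
When comparing several failed runs, it can help to pull out just the source locations of the LustreError messages from a saved dmesg log. The following is a rough sketch under the assumption that the log uses the `pid:0:(file:line:function())` layout shown above; `lustre_error_sites` is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: print the file:line:function() site of every
# LustreError line in a saved dmesg log (path given as $1).
lustre_error_sites() {
    grep 'LustreError: ' "$1" | sed -E 's/.*:\(([^)]*\(\))\).*/\1/'
}

# Sample input built from the dmesg lines quoted in this ticket.
log=$(mktemp)
cat > "$log" <<'EOF'
[ 5342.709668] Lustre: lustre-OST0001-osc-ffff913ecdbb6000: disconnect after 20s idle
[ 5665.004748] LustreError: 7828:0:(osc_request.c:1947:osc_brw_fini_request()) lustre-OST0000-osc-ffff913ecdbb6000: unexpected positive size 1
EOF

lustre_error_sites "$log"    # prints osc_request.c:1947:osc_brw_fini_request()
```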
      

          Activity

            [LU-14200] ost-pools test 23b fails with 'dd did not fail with ENOSPC'
            jamesanunez James Nunez (Inactive) added a comment - - edited

            It looks like we are seeing the same error in sanity-quota test 9 in RHEL8.3 client/server testing:
            https://testing.whamcloud.com/test_sets/79e432c6-29f8-4449-a43f-372fffb66c4d
            https://testing.whamcloud.com/test_sets/7bd7e9d6-c722-41f4-8e4d-9619692b5da4

            We also see sanity-flr tests 204e and 204f fail in a similar way: https://testing.whamcloud.com/test_sets/fbde9358-e264-4548-95fb-236490a7135b


            People

              wc-triage WC Triage
              jamesanunez James Nunez (Inactive)