[LU-12087] sanity-scrub test 10a fails with “Fail to cleanup the env!”

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.12.0, Lustre 2.13.0, Lustre 2.10.6, Lustre 2.10.7
    • ppc64 clients
    • 3
    • 9223372036854775807

    Description

      sanity-scrub test_10a fails for ppc64 with “Fail to cleanup the env!”

      Looking at a recent failure, https://testing.whamcloud.com/test_sets/2b0db972-4859-11e9-b98a-52540065bddc, we see that directories left on the Lustre file system by a previous sanity-scrub test cannot be removed. From the suite_log, we see:

      rm: cannot remove '/mnt/lustre/d9.sanity-scrub/mds1': Directory not empty
       sanity-scrub test_10a: @@@@@@ FAIL: Fail to cleanup the env! 
      

      Looking at the OSS (vm1) console log, we see a Lustre error during test 9

      [ 1095.831214] Lustre: DEBUG MARKER: trevis-26vm2.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 1104.386561] LustreError: 11824:0:(ldlm_resource.c:1146:ldlm_resource_complain()) lustre-MDT0000-lwp-OST0000: namespace resource [0x200000006:0x1020000:0x0].0x0 (ffff8a8765dfe600) refcount nonzero (1) after lock cleanup; forcing cleanup.
      [ 1104.388585] LustreError: 11824:0:(ldlm_resource.c:1146:ldlm_resource_complain()) Skipped 1 previous similar message
      [ 1131.369609] Lustre: DEBUG MARKER: lctl set_param -n fail_loc=0 	    fail_val=0 2>/dev/null
      

      In the console log for client 2 (vm9), we see some messages:

      [ 1168.406302] Lustre: DEBUG MARKER: lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
      [ 1174.948324] Lustre: 3119:0:(mdc_request.c:1504:mdc_read_page()) Page-wide hash collision: 0xfeffffffffffffff
      [ 1174.948439] Lustre: 3119:0:(mdc_request.c:1504:mdc_read_page()) Skipped 54 previous similar messages
      [ 1176.178907] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanity-scrub test_10a: @@@@@@ FAIL: Fail to cleanup the env! 
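
      When comparing several test sets, it can help to pull the 'Page-wide hash collision' messages out of the client console logs automatically. The following is a minimal sketch, not part of the Lustre test framework: the function name `find_hash_collisions` is hypothetical, and the regular expression assumes the message layout seen in the two sample lines above.

```python
import re

# Layout inferred from the console lines quoted in this ticket:
# [ <timestamp>] Lustre: <pid>:<n>:(<file>:<line>:<func>()) Page-wide hash collision: <hex>
COLLISION_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+Lustre:\s+(?P<pid>\d+):\d+:"
    r"\((?P<src>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s+"
    r"Page-wide hash collision:\s+(?P<hash>0x[0-9a-fA-F]+)"
)


def find_hash_collisions(lines):
    """Return one dict per 'Page-wide hash collision' console line."""
    hits = []
    for text in lines:
        m = COLLISION_RE.search(text)
        if m is None:
            continue  # not a hash collision message
        hits.append({
            "timestamp": float(m["ts"]),
            "pid": int(m["pid"]),
            "source": f'{m["src"]}:{m["line"]}',
            "hash": int(m["hash"], 16),
        })
    return hits


if __name__ == "__main__":
    sample = [
        "[ 1168.406302] Lustre: DEBUG MARKER: lctl dl | grep ' IN osc ' 2>/dev/null | wc -l",
        "[ 1174.948324] Lustre: 3119:0:(mdc_request.c:1504:mdc_read_page()) "
        "Page-wide hash collision: 0xfeffffffffffffff",
    ]
    for hit in find_hash_collisions(sample):
        print(hit["timestamp"], hit["source"], hex(hit["hash"]))
```

      The "Skipped N previous similar messages" lines are deliberately not matched, so the count returned is a lower bound on how often the message was actually emitted.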
      

      We see this issue only in ppc64 client testing. Note: this test has also failed with the same message for non-ppc64 clients, but in those cases several or most of the tests before 10a also fail because they cannot clean up the environment.

      In some cases, we don’t see any of the above error messages. For example, in a recent 2.10.7 RC1 failure at https://testing.whamcloud.com/test_sets/3f1ccaa6-4332-11e9-92fe-52540065bddc, none of these error messages appear in test 9 or test 10.

      Other failures for sanity-scrub test 10a are at
      https://testing.whamcloud.com/test_sets/4e833ba2-b72c-11e8-a7de-52540065bddc
      https://testing.whamcloud.com/test_sets/d22cb2d0-e288-11e8-bfe1-52540065bddc
      https://testing.whamcloud.com/test_sets/660c8ec2-2734-11e9-b97f-52540065bddc

          Activity

            We see sanityn test 37 fail with

            == sanityn test 37: check i_size is not updated for directory on close (bug 18695) =================== 04:35:07 (1555994107)
            multiop /mnt/lustre/d37.sanityn vD_c
            TMPPIPE=/tmp/multiop_open_wait_pipe.9642
            total: 10000 create in 6.51 seconds: 1536.03 ops/second
             sanityn test_37: @@@@@@ FAIL: 3523 != 10000 truncated directory? 
            

            The 'hash collision' message appears in the client 1 console log:

            [11634.540885] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == sanityn test 37: check i_size is not updated for directory on close \(bug 18695\) =================== 04:35:07 \(1555994107\)
            [11634.743158] Lustre: DEBUG MARKER: == sanityn test 37: check i_size is not updated for directory on close (bug 18695) =================== 04:35:07 (1555994107)
            [11641.370854] Lustre: 31253:0:(mdc_request.c:1519:mdc_read_page()) Page-wide hash collision: 0xfeffffffffffffff
            [11641.370970] Lustre: 31253:0:(mdc_request.c:1519:mdc_read_page()) Skipped 1 previous similar message
            [11641.536138] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanityn test_37: @@@@@@ FAIL: 3523 != 10000 truncated directory? 
            [11641.718318] Lustre: DEBUG MARKER: sanityn test_37: @@@@@@ FAIL: 3523 != 10000 truncated directory?
            

            See the following for logs:
            https://testing.whamcloud.com/test_sets/7d68fa34-668f-11e9-8bb1-52540065bddc
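
            Both the sanity-scrub and sanityn failures log the same hash value, 0xfeffffffffffffff. The sketch below is an illustrative arithmetic check only, not a diagnosis from this ticket. It assumes that the Lustre end-of-directory marker MDS_DIR_END_OFF is 0xfffffffffffffffe; that constant is taken from the Lustre headers, not from anything stated in this report, and should be verified against the source tree under test. Under that assumption, the logged value is the byte-reversed form of the marker, which may be worth checking given that only ppc64 clients are affected.

```python
def byteswap64(value):
    """Reverse the byte order of an unsigned 64-bit integer."""
    return int.from_bytes(value.to_bytes(8, "big"), "little")


# Value reported in the 'Page-wide hash collision' console messages.
LOGGED_HASH = 0xfeffffffffffffff

# ASSUMPTION: end-of-directory marker from the Lustre headers
# (MDS_DIR_END_OFF); not stated anywhere in this ticket.
ASSUMED_DIR_END_OFF = 0xfffffffffffffffe

swapped = byteswap64(LOGGED_HASH)
print(hex(swapped))
print(swapped == ASSUMED_DIR_END_OFF)
```

            If the comparison holds, it only shows that the two values differ by byte order; it does not show where, or whether, a conversion is missing in the client code.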

            Comment added by jamesanunez James Nunez (Inactive)

            People

              wc-triage WC Triage
              jamesanunez James Nunez (Inactive)