Lustre / LU-12087

sanity-scrub test 10a fails with “Fail to cleanup the env!”

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.12.0, Lustre 2.13.0, Lustre 2.10.6, Lustre 2.10.7
    • ppc64 clients
    • 3

    Description

      sanity-scrub test_10a fails for ppc64 with “Fail to cleanup the env!”

      Looking at a recent failure, https://testing.whamcloud.com/test_sets/2b0db972-4859-11e9-b98a-52540065bddc, we see that we can’t remove directories that a previous sanity-scrub test left on the Lustre file system. The suite_log shows:

      rm: cannot remove '/mnt/lustre/d9.sanity-scrub/mds1': Directory not empty
       sanity-scrub test_10a: @@@@@@ FAIL: Fail to cleanup the env! 
      
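      To see what is keeping a directory such as /mnt/lustre/d9.sanity-scrub/mds1 from being removed, it may help to list whatever is still visible under it from the client. The following is a minimal sketch, not part of the test suite; the function name list_leftovers is hypothetical, and it only reports entries that the client's directory listing returns.

```python
import os


def list_leftovers(root):
    """Return every path below root, deepest entries first.

    Hypothetical helper for triage: it only reports what a directory
    walk on this client can see.
    """
    leftovers = []
    # topdown=False visits the deepest directories first, which is the
    # order in which entries would have to be removed.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            leftovers.append(os.path.join(dirpath, name))
    return leftovers


if __name__ == "__main__":
    # Path taken from the failure message above; adjust as needed.
    for path in list_leftovers("/mnt/lustre/d9.sanity-scrub"):
        print(path)
```

      An empty listing for a directory that rm still reports as not empty would suggest comparing against what the MDS sees for the same directory.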

      Looking at the OSS (vm1) console log, we see a Lustre error during test 9:

      [ 1095.831214] Lustre: DEBUG MARKER: trevis-26vm2.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 1104.386561] LustreError: 11824:0:(ldlm_resource.c:1146:ldlm_resource_complain()) lustre-MDT0000-lwp-OST0000: namespace resource [0x200000006:0x1020000:0x0].0x0 (ffff8a8765dfe600) refcount nonzero (1) after lock cleanup; forcing cleanup.
      [ 1104.388585] LustreError: 11824:0:(ldlm_resource.c:1146:ldlm_resource_complain()) Skipped 1 previous similar message
      [ 1131.369609] Lustre: DEBUG MARKER: lctl set_param -n fail_loc=0 	    fail_val=0 2>/dev/null
      

      In the console log for client 2 (vm9), we see some messages:

      [ 1168.406302] Lustre: DEBUG MARKER: lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
      [ 1174.948324] Lustre: 3119:0:(mdc_request.c:1504:mdc_read_page()) Page-wide hash collision: 0xfeffffffffffffff
      [ 1174.948439] Lustre: 3119:0:(mdc_request.c:1504:mdc_read_page()) Skipped 54 previous similar messages
      [ 1176.178907] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanity-scrub test_10a: @@@@@@ FAIL: Fail to cleanup the env! 
      
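      Since not every failure shows these messages, it can be useful to check each downloaded console log for the two signatures quoted above. This is a rough sketch under stated assumptions: the function name scan_console_log and the signature labels are hypothetical, and the patterns are derived only from the log lines in this ticket, so other Lustre versions may word them differently.

```python
import re

# Patterns built from the console lines quoted in this ticket.
# The labels on the left are illustrative names, not Lustre terms.
SIGNATURES = {
    "ldlm_refcount": re.compile(
        r"ldlm_resource_complain\(\)\).*refcount nonzero"
    ),
    "hash_collision": re.compile(
        r"mdc_read_page\(\)\) Page-wide hash collision"
    ),
}


def scan_console_log(text):
    """Return a dict mapping each signature label to its matching lines."""
    hits = {name: [] for name in SIGNATURES}
    for line in text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits[name].append(line.strip())
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python scan_logs.py console1.log console2.log ...
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            result = scan_console_log(handle.read())
        counts = {name: len(lines) for name, lines in result.items()}
        print(path, counts)
```

      The "Skipped N previous similar messages" lines are deliberately not counted, so the totals are a lower bound on how often each message fired.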

      We see this issue only in ppc64 client testing. Note: this test has also failed with the same message for non-ppc64 clients, but in those cases most of the tests before 10a also fail because they cannot clean up the environment.

      In some cases, we don’t see any of the above error messages. For example, for a recent 2.10.7 RC1 failure at https://testing.whamcloud.com/test_sets/3f1ccaa6-4332-11e9-92fe-52540065bddc, none of these error messages appear in either test 9 or test 10.

      Other failures of sanity-scrub test 10a are at:
      https://testing.whamcloud.com/test_sets/4e833ba2-b72c-11e8-a7de-52540065bddc
      https://testing.whamcloud.com/test_sets/d22cb2d0-e288-11e8-bfe1-52540065bddc
      https://testing.whamcloud.com/test_sets/660c8ec2-2734-11e9-b97f-52540065bddc

People

    • Assignee: WC Triage (wc-triage)
    • Reporter: James Nunez (Inactive) (jamesanunez)