[LU-14017] Interop: sanityn tests 43a and 45a fail with 'mkdir must succeed'

Details

    • Bug
    • Resolution: Done
    • Minor
    • None
    • Lustre 2.14.0, Lustre 2.15.0
    • 2.13.52.52 or later servers with earlier (< 2.13.52.52 ) clients

    Description

      sanityn tests 43a and 45a fail with 'mkdir must succeed' after the patch https://review.whamcloud.com/30880 for LU-10235 landed on master in February 2020.

      Starting on 23 February 2020, in interop testing between master servers and older clients, sanityn test_43a and test_45a fail with:

      == sanityn test 43a: pdirops: unlink vs mkdir ======================================================== 12:50:41 (1582462241)
      CMD: trevis-14vm4 /usr/sbin/lctl set_param -n ldlm.namespaces.*mdt*.lru_size=clear
      CMD: trevis-14vm4 /usr/sbin/lctl get_param ldlm.namespaces.*mdt*.lock_unused_count ldlm.namespaces.*mdt*.lock_count
      ldlm.namespaces.mdt-lustre-MDT0000_UUID.lock_count=35
      CMD: trevis-14vm4 lctl set_param fail_loc=0x80000145
      fail_loc=0x80000145
      mkdir: cannot create directory '/mnt/lustre2/f43a.sanityn': File exists
       sanityn test_43a: @@@@@@ FAIL: mkdir must succeed 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:5900:error()
        = /usr/lib64/lustre/tests/sanityn.sh:1887:test_43a()
      

      and

      == sanityn test 45a: pdirops: rename src vs mkdir ==================================================== 12:55:13 (1582462513)
      CMD: trevis-14vm4 /usr/sbin/lctl set_param -n ldlm.namespaces.*mdt*.lru_size=clear
      CMD: trevis-14vm4 /usr/sbin/lctl get_param ldlm.namespaces.*mdt*.lock_unused_count ldlm.namespaces.*mdt*.lock_count
      ldlm.namespaces.mdt-lustre-MDT0000_UUID.lock_count=35
      CMD: trevis-14vm4 lctl set_param fail_loc=0x80000145
      fail_loc=0x80000145
      mkdir: cannot create directory '/mnt/lustre2/f45a.sanityn': File exists
       sanityn test_45a: @@@@@@ FAIL: mkdir must succeed 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:5900:error()
        = /usr/lib64/lustre/tests/sanityn.sh:2170:test_45a()
      

      We may just need a Lustre version check to skip or run different tests in this situation.
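
      A minimal sketch of what such a version check could look like, written as a standalone shell snippet rather than as the change that actually landed. The names version_code and needs_interop_skip are defined locally for illustration (version_code is only loosely modeled on the helper of the same name in test-framework.sh), and the 2.13.52.52 threshold is taken from the environment field of this ticket:

```shell
#!/bin/sh
# Hypothetical sketch, not the landed fix. Both helpers below are local
# stand-ins defined for illustration only.

# Pack a dotted version "a.b.c.d" into one integer so that versions can be
# compared numerically. Missing fields count as 0. The body runs in a
# subshell so the IFS change does not leak to the caller.
version_code() (
	IFS=.
	set -- $1
	echo $(( (${1:-0} << 24) | (${2:-0} << 16) | (${3:-0} << 8) | ${4:-0} ))
)

# Return 0 (true) when the old expectation "mkdir must succeed" cannot hold:
# the MDS is at or past the threshold while the client (and therefore its
# copy of sanityn.sh) is older than the threshold.
needs_interop_skip() {
	mds_version=$1
	client_version=$2
	threshold=$(version_code 2.13.52.52)
	[ "$(version_code "$mds_version")" -ge "$threshold" ] &&
		[ "$(version_code "$client_version")" -lt "$threshold" ]
}

# Example with the versions from one of the failing sessions in this ticket.
if needs_interop_skip 2.13.56.6 2.12.5; then
	echo "skip: old client test expectations do not match this MDS"
fi
```

      A single threshold is the simplest form of the check. A maintenance branch that later picks up the updated tests would need its own cutoff in addition to the one shown here.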

      Here are a few links to logs from the past 138 failures:
      2.13.52.52 servers/2.12.4 clients - https://testing.whamcloud.com/test_sets/048a979e-1a4a-4178-8bdf-568314985c82
      2.13.52.52 servers/2.13.0 clients - https://testing.whamcloud.com/test_sets/c390a7a8-c0d8-456b-9708-e24b07508f2c
      2.13.56.6 servers/2.12.5 clients - https://testing.whamcloud.com/test_sets/95c71443-8c73-41c7-8b9e-c1ada7bc7b48
      2.13.56.6 servers/2.13.0 clients - https://testing.whamcloud.com/test_sets/4e2ff8da-0590-4175-818f-658a777b3aea

          Activity

            pjones Peter Jones made changes -
            Fix Version/s Original: Lustre 2.12.9 [ 15490 ]
            adilger Andreas Dilger made changes -
            Fix Version/s New: Lustre 2.12.9 [ 15490 ]
            Resolution New: Done [ 10000 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            adilger Andreas Dilger added a comment (edited):

            The patch https://review.whamcloud.com/41674 "LU-10235 mdt: mdt_create: check EEXIST without lock" was landed to b2_12 and changes these tests to match master, so they will no longer fail in interop testing with 2.15/master. However, they will presumably start failing with 2.10 clients, but I don't think we care to fix that.

            NB: the most recent interop test failure was 2.12.8.6, but the patch landed as 2.12.8.21.

            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-10235 [ LU-10235 ]
            sarah Sarah Liu made changes -
            Remote Link New: This issue links to "Page (Whamcloud Community Wiki)" [ 27853 ]
            sarah Sarah Liu made changes -
            Remote Link New: This issue links to "Page (Whamcloud Community Wiki)" [ 27047 ]
            sarah Sarah Liu made changes -
            Affects Version/s New: Lustre 2.15.0 [ 14791 ]

            adilger Andreas Dilger added a comment:

            Both 43a and 45a are modified by that patch, so I suspect that we just need to skip these subtests in interop testing before 2.13.53.

            jamesanunez James Nunez (Inactive) made changes -
            Remote Link New: This issue links to "Page (Whamcloud Community Wiki)" [ 24767 ]
            jamesanunez James Nunez (Inactive) made changes -
            Affects Version/s New: Lustre 2.14.0 [ 14490 ]

            People

              wc-triage WC Triage
              jamesanunez James Nunez (Inactive)