
Interop conf-sanity test_132: Can not take the layout lock

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.16.0, Lustre 2.15.3
    • Affects Version/s: Lustre 2.15.2
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/ddbab4a8-cf2a-4e76-a840-43789b47ce46

      test_132 failed with the following error:

      conf-sanity test 132: hsm_actions processed after failover
      :
      Can not take the layout lock
      
      [27701.164931] Lustre: DEBUG MARKER: dmesg
      [27701.945320] Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
      [27702.659163] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
      [27709.401721] Lustre: 1441099:0:(client.c:2282:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1672867911/real 1672867911]  req@000000007bf97720 x1754124222843648/t0(0) o251->MGC10.240.29.71@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1672867917 ref 2 fl Rpc:XNQr/0/ffffffff rc 0/-1 job:'umount.0'
      [27709.407170] Lustre: 1441099:0:(client.c:2282:ptlrpc_expire_one_request()) Skipped 9 previous similar messages
      [27709.834788] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
      [27709.834788] lctl dl | grep ' ST ' || true
      [27710.549943] Lustre: DEBUG MARKER: modprobe dm-flakey;
      [27710.549943] 			 dmsetup targets | grep -q flakey
      [27711.259838] Lustre: DEBUG MARKER: tunefs.lustre --param mdt.hsm_control=enabled /dev/mapper/mds1_flakey
      [27711.633812] LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: errors=remount-ro
      [27712.085320] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1
      [27712.808080] Lustre: DEBUG MARKER: modprobe dm-flakey;
      [27712.808080] 			 dmsetup targets | grep -q flakey
      [27713.574469] Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
      [27714.291452] Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1
      [27715.016281] Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey
      [27715.729223] Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey
      [27716.803220] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov  /dev/mapper/mds1_flakey /mnt/lustre-mds1
      [27717.198084] LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: errors=remount-ro
      [27717.281247] LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
      [27717.336762] Lustre: Found index 0 for lustre-MDT0000, updating log
      [27737.944800] Lustre: 1442465:0:(mdt_coordinator.c:1114:mdt_hsm_cdt_start()) lustre-MDT0000: trying to init HSM before MDD
      [27737.947032] LustreError: 1442465:0:(mdt_coordinator.c:1125:mdt_hsm_cdt_start()) lustre-MDT0000: cannot take the layout locks needed for registered restore: -2
      [27738.455316] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
      [27739.172322] Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us
      [27740.057501] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
      [27740.646869] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
      [27741.696006] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
      [27742.140823] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-113vm4.onyx.whamcloud.com: executing set_default_debug -1 all 4
      [27742.141829] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-113vm4.onyx.whamcloud.com: executing set_default_debug -1 all 4
      [27742.593286] Lustre: DEBUG MARKER: onyx-113vm4.onyx.whamcloud.com: executing set_default_debug -1 all 4
      [27742.594102] Lustre: DEBUG MARKER: onyx-113vm4.onyx.whamcloud.com: executing set_default_debug -1 all 4
      [27743.025929] Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 				2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
      [27743.741089] Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null
      [27744.505150] Lustre: DEBUG MARKER: dmesg
      [27745.394611] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  conf-sanity test_132: @@@@@@ FAIL: Can not take the layout lock 
      [27745.799850] Lustre: DEBUG MARKER: conf-sanity test_132: @@@@@@ FAIL: Can not take the layout lock
      [27746.276110] Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /autotest/autotest-1/2023-01-04/lustre-b2_15_full-part-3_47_63_cbc12095-043f-4981-9a9a-632861982003//conf-sanity.test_132.debug_log.$(hostname -s).1672867954.log;
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      conf-sanity test_132 - Can not take the layout lock
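
When triaging a log like the one above, it can help to pull out only the `LustreError` lines and their return codes. The following is a minimal sketch, assuming the `(file.c:line:function())` prefix and trailing `: -N` layout seen in this log; `lustre_errors` and `errno_name` are hypothetical helper names, not part of any Lustre tool. On Linux, a return code of -2 corresponds to ENOENT.

```shell
# Sketch only: lustre_errors and errno_name are hypothetical helper names.

# Print "location rc=N" for each LustreError line on stdin that ends in ": -N".
lustre_errors() {
    sed -n 's/.*LustreError: [^(]*(\([^)]*)\)).*: \(-[0-9][0-9]*\)$/\1 rc=\2/p'
}

# Map a few negative return codes to their Linux errno names.
errno_name() {
    case "$1" in
        -2)  echo ENOENT ;;
        -5)  echo EIO ;;
        -11) echo EAGAIN ;;
        -22) echo EINVAL ;;
        *)   echo "unknown ($1)" ;;
    esac
}

# Usage with the failing line from the log above.
line='[27737.947032] LustreError: 1442465:0:(mdt_coordinator.c:1125:mdt_hsm_cdt_start()) lustre-MDT0000: cannot take the layout locks needed for registered restore: -2'
printf '%s\n' "$line" | lustre_errors
errno_name -2
```

Lines that do not match the assumed layout, such as the `DEBUG MARKER` entries, produce no output.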

          Activity

            [LU-16456] Interop conf-sanity test_132: Can not take the layout lock

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49602/
            Subject: LU-16456 tests: skip conf-sanity test_129/132 in interop
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set:
            Commit: 151afb445080d9a3f81fa617371b20e56afb9759

            Comment added by Gerrit Updater (gerrit).

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49601/
            Subject: LU-16456 tests: skip conf-sanity test_129/132 in interop
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 7e566c6a1f9d5324718ebc7149153f3272363b9c

            Comment added by Gerrit Updater (gerrit).

            The test_132 failure is also because the test was added in 2.14.57 and exercises new functionality that doesn't exist in 2.14.0. The same applies to test_129 (no bug was filed for that).

            I've pushed a patch that will skip both tests.

            Comment added by Andreas Dilger (adilger).
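
The merged patch itself is not reproduced in this ticket. As a rough illustration of the idea in the comment above (skip a test when the server predates the version that introduced the functionality), here is a minimal sketch. `version_code` and `should_skip` are simplified stand-ins written for this example, not the Lustre test-framework helpers, and the version numbers are taken from the comment as example inputs.

```shell
# Sketch only: version_code and should_skip are simplified, hypothetical
# stand-ins, not the helpers used by the Lustre test framework.

# Pack a dotted version such as 2.14.57 into one comparable integer.
# Assumes at most three numeric components with no leading zeros.
version_code() {
    # Split the dotted string on "." into positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ${1:-0} * 65536 + ${2:-0} * 256 + ${3:-0} ))
}

# Return 0 (skip) when the server version is older than the version
# that introduced the functionality under test.
should_skip() {
    [ "$(version_code "$1")" -lt "$(version_code "$2")" ]
}

# Assumed example values: a 2.14.0 server and a test added in 2.14.57.
if should_skip 2.14.0 2.14.57; then
    echo "SKIP: server 2.14.0 is older than 2.14.57"
fi
```

Comparing packed integers rather than strings avoids ordering mistakes such as treating 2.9 as newer than 2.14.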

            "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49602
            Subject: LU-16456 tests: skip conf-sanity test_129/132 in interop
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set: 1
            Commit: 17f36bcc56255da9290d21ec71575eb56ea66f7c

            Comment added by Gerrit Updater (gerrit).

            "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49601
            Subject: LU-16456 tests: skip conf-sanity test_129/132 in interop
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: ae3c44aa38dd9ea77a2a501aa5086760ce62534e

            Comment added by Gerrit Updater (gerrit).

            This has been failing since 2022-09-27. It looks like it may be related to test_122b (LU-14598) failing first, along with test_129: all three have failed on exactly the same days and the same number of times in the past 3 months. Or possibly they are all just interop test bugs, and these are the only days that interop testing was run?

            I've pushed a patch, https://review.whamcloud.com/49583 "LU-14598 tests: skip conf-sanity test_122b in interop", to fix that issue. Possibly it will fix the other fallout as well?

            Comment added by Andreas Dilger (adilger).

            People

              Assignee: Andreas Dilger (adilger)
              Reporter: Maloo (maloo)