Lustre / LU-15762

Interop: sanity-hsm test_500: mdt_coordinator.c:1629:mdt_hsm_update_request_state()) lustre-MDT0000: Cannot find running request for cookie 0x624a1c9f on fid=[0x2000088da:0xe2:0x0]

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.15.0, Lustre 2.15.3, Lustre 2.15.5, Lustre 2.15.6
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      This issue was created by maloo for Cliff White <cwhite@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/9d46a14e-d93b-4614-b0b0-dd5eb9c8aa31

      test_500 failed with the following error:

      One llapi HSM test failed
      

      Very little information in test logs:

      [17004.240938] Lustre: DEBUG MARKER: == sanity-hsm test 500: various LLAPI HSM tests ========== 22:16:14 (1649024174)
      [17021.355953] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanity-hsm test_500: @@@@@@ FAIL: One llapi HSM test failed 
      [17021.748438] Lustre: DEBUG MARKER: sanity-hsm test_500: @@@@@@ FAIL: One llapi HSM test failed
      

      MDS 1 logs show apparent failure:

      [16305.514145] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param mdt.lustre-MDT0000.hsm.actions
      [16306.785240] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | grep '0x2000088d2:0x2d9:0x0' | egrep 'WAITING|STARTED'
      [16311.214616] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == sanity-hsm test 500: various LLAPI HSM tests ========== 22:16:14 \(1649024174\)
      [16311.630214] Lustre: DEBUG MARKER: == sanity-hsm test 500: various LLAPI HSM tests ========== 22:16:14 (1649024174)
      [16314.422451] Lustre: HSM agent f54da81c-0850-4a81-8f9a-54b4367ad171 already registered
      [16321.985881] LustreError: 26958:0:(mdt_coordinator.c:1629:mdt_hsm_update_request_state()) lustre-MDT0000: Cannot find running request for cookie 0x624a1c9f on fid=[0x2000088da:0xe2:0x0]
      [16328.627565] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanity-hsm test_500: @@@@@@ FAIL: One llapi HSM test failed 
      [16329.093187] Lustre: DEBUG MARKER: sanity-hsm test_500: @@@@@@ FAIL: One llapi HSM test failed
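
For triage across many runs, the LustreError line above can be picked apart mechanically. The following is a minimal sketch in Python: the regular expression and the helper name parse_missing_request are assumptions derived only from the single sample line in this ticket, not from any Lustre tooling, so the layout may not hold for other console messages.

```python
import re

# Pattern inferred from the one sample line in this ticket (an assumption,
# not an official description of the Lustre console log format).
PATTERN = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+LustreError:\s+"
    r"(?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s+"
    r"(?P<target>\S+): Cannot find running request for cookie "
    r"(?P<cookie>0x[0-9a-fA-F]+) on fid=(?P<fid>\[[^\]]+\])"
)


def parse_missing_request(line):
    """Return a dict of the matched fields, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


# The sample is the MDS console line quoted in this ticket.
sample = (
    "[16321.985881] LustreError: 26958:0:(mdt_coordinator.c:1629:"
    "mdt_hsm_update_request_state()) lustre-MDT0000: Cannot find running "
    "request for cookie 0x624a1c9f on fid=[0x2000088da:0xe2:0x0]"
)
fields = parse_missing_request(sample)
print(fields["target"], fields["cookie"], fields["fid"])
```

Extracting the cookie and FID this way makes it easier to compare the failing request against the hsm.actions output captured in the same log.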
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity-hsm test_500 - One llapi HSM test failed

          Activity

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/58786/
            Subject: LU-15762 tests: skip llapi_hsm_test113 on old server
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 066ead4d1eecd0e3586582b4866f12b1a1a8f628

            gerrit Gerrit Updater added a comment -

            "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/58786
            Subject: LU-15762 tests: skip llapi_hsm_test113 on old server
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: b9e73cb051f15fd7d8ebadcccaa000f05151a05c

            adilger Andreas Dilger added a comment -

            Sorry, my mistake. The "test113_progress: assertion 'hca.hca_location.length == (i+2)*length/10' failed:" message is the stdout from the failed subtest, while the mdt_hsm_update_request_state() message is printed to the console, as far back as I checked in old failures. It looks like these are the same longstanding bug.

            yujian Jian Yu added a comment -

            This looks like a different error and should get a new LU ticket

            LU-18628

            adilger Andreas Dilger added a comment -

            This looks like a different error and should get a new LU ticket:

            llapi_hsm_test: llapi_hsm_test.c:1048: test113_progress: assertion 'hca.hca_location.length == (i+2)*length/10' failed: i=1, length=400

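
To make the failed assertion concrete, the right-hand side can be evaluated for the reported values i=1 and length=400. This is a small sketch in Python, assuming the C expression uses integer division; the function name is hypothetical, and the value actually observed in hca.hca_location.length is not in the log, so it is not shown.

```python
def expected_progress_length(i, length):
    """Right-hand side of the assertion, (i+2)*length/10, with integer division."""
    return (i + 2) * length // 10


# Values taken from the assertion message: i=1, length=400.
print(expected_progress_length(1, 400))  # -> 120
```

So at the failing step the test expected the reported progress length to be 120 of the 400 total.
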
            yujian Jian Yu added a comment -

            Test session details:
            clients: https://build.whamcloud.com/job/lustre-b2_15/99 - 4.18.0-513.24.1.el8_9.x86_64
            servers: https://build.whamcloud.com/job/lustre-b2_14/2 - 4.18.0-240.1.1.el8_lustre.x86_64
            The same failure occurred: https://testing.whamcloud.com/test_sets/60a0c589-54c0-4178-a290-74df6ed4c695

            People

              Assignee: wc-triage WC Triage
              Reporter: maloo Maloo
              Votes: 0
              Watchers: 4
