Lustre / LU-17430

interop sanity-hsm test_114: request on <fid> is not SUCCEED on mds1

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.16.0
    • Affects Version/s: None
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      This issue was created by maloo for eaujames <eaujames@ddn.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/bb8dce7a-a62a-4093-9cfb-8324978304a1

      test_114 failed with the following error:

      request on 0x200000402:0x253:0x0 is not SUCCEED on mds1
      

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-reviews/101373 - 4.18.0-477.27.1.el8_8.x86_64
      servers: https://build.whamcloud.com/job/lustre-b2_15/81 - 4.18.0-513.9.1.el8_lustre.x86_64

      The test output shows the ARCHIVE request staying in STARTED until the 200s wait for SUCCEED times out:

      == sanity-hsm test 114: Incompatible request does not set other requests as STARTED ========================================================== 15:42:52 (1705333372)
      0+0 records in
      0+0 records out
      0 bytes copied, 0.000421867 s, 0.0 kB/s
      0+0 records in
      0+0 records out
      0 bytes copied, 0.000465835 s, 0.0 kB/s
      CMD: trevis-121vm9 mkdir -p /tmp/arc1/sanity-hsm.test_114/
      Starting copytool 'agt1' on 'trevis-121vm9' with cmdline 'lhsmtool_posix --archive-format=v2 --hsm-root=/tmp/arc1/sanity-hsm.test_114/ --daemon --pid-file=/var/run/lhsmtool_posix.pid  "/mnt/lustre2"'
      CMD: trevis-121vm9 lhsmtool_posix --archive-format=v2 --hsm-root=/tmp/arc1/sanity-hsm.test_114/ --daemon --pid-file=/var/run/lhsmtool_posix.pid  "/mnt/lustre2" < /dev/null > "/autotest/autotest-2/2024-01-15/lustre-reviews_custom_101373_1003_c53f4cf7-0b84-488f-b7c2-5d058bec8e18//sanity-hsm.test_114.copytool_log.trevis-121vm9.log" 2>&1
      CMD: trevis-102vm7 /usr/sbin/lctl set_param mdt.lustre-MDT0000.hsm_control='disabled'
      mdt.lustre-MDT0000.hsm_control=disabled
      CMD: trevis-102vm7 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm_control
      CMD: trevis-102vm7 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000402:0x253:0x0'.*action='ARCHIVE'/ {print \$13}' | cut -f2 -d=
      CMD: trevis-102vm7 /usr/sbin/lctl set_param mdt.lustre-MDT0000.hsm_control='enabled'
      mdt.lustre-MDT0000.hsm_control=enabled
      CMD: trevis-102vm7 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm_control
      CMD: trevis-102vm7 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000402:0x253:0x0'.*action='ARCHIVE'/ {print \$13}' | cut -f2 -d=
      Waiting 200s for 'SUCCEED'
      ...
      Waiting 10s for 'SUCCEED'
      ...
      CMD: trevis-102vm7 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000402:0x253:0x0'.*action='ARCHIVE'/ {print \$13}' | cut -f2 -d=
      Waiting 0s for 'SUCCEED'
      Update not seen after 200s: want 'SUCCEED' got 'STARTED'
       sanity-hsm test_114: @@@@@@ FAIL: request on 0x200000402:0x253:0x0 is not SUCCEED on mds1 
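
The status check in the log above selects field 13 of the matching hsm.actions line and keeps the value after `=`. Below is a minimal sketch of the same lookup, keyed on a `status=` token instead of a field position. It assumes each action line carries the FID, an `action=` token and a `status=` token, as the logged awk filter suggests; `hsm_request_status` is a hypothetical helper name, not a function from the test framework.

```shell
# Hypothetical helper (not part of sanity-hsm): read hsm.actions-style
# lines on stdin and print the status of the ARCHIVE request for one FID.
# Assumption: the state appears as a "status=<STATE>" token on the line.
hsm_request_status() {
    fid=$1
    awk -v fid="$fid" '
        # Keep only ARCHIVE lines that mention the requested FID.
        index($0, fid) && /action=ARCHIVE/ {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^status=/) {
                    sub(/^status=/, "", $i)
                    print $i
                }
            }
        }'
}
```

On the MDS it could be fed from the command shown in the log, for example `lctl get_param -n mdt.lustre-MDT0000.hsm.actions | hsm_request_status 0x200000402:0x253:0x0`. Matching on the token name rather than on `$13` keeps the lookup working if the number of fields on the line differs between versions, which may matter in an interop run.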
      

      The copytool log shows that the request never reaches the copytool, which is still waiting for a message from the kernel when it is terminated:

      lhsmtool_posix: 1705333373.574471 lhsmtool_posix[144023]: action=0 src=(null) dst=(null) mount_point=/mnt/lustre2
      lhsmtool_posix: 1705333373.578036 lhsmtool_posix[144024]: waiting for message from kernel
      exiting: Terminated
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity-hsm test_114 - request on 0x200000402:0x253:0x0 is not SUCCEED on mds1

          People

            eaujames Etienne Aujames
            maloo Maloo
