LU-14087

sanity-hsm test 254b fails with 'Expected 0 (!= '60') active restore requests'

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.14.0
    • Labels: None
    • Environment: RHEL8.2
    • Severity: 3

    Description

      sanity-hsm test_254b fails on el8.2 with "Expected 0 (!= '60') active restore requests".

      Looking at the failure at https://testing.whamcloud.com/test_sets/39cda0dc-b495-4af9-b0ca-757042d6fd3a, we see the following in the suite_log:

      == sanity-hsm test 254b: Request counters are correctly incremented and decremented ================== 01:46:54 (1603849614)
      Will launch 60 requests of each type
      CMD: trevis-4vm6 mkdir -p /tmp/arc1/sanity-hsm.test_254b/
      Starting copytool agt1 on trevis-4vm6
      CMD: trevis-4vm6 lhsmtool_posix  --daemon --hsm-root "/tmp/arc1/sanity-hsm.test_254b/" "/mnt/lustre2" < /dev/null > "/autotest/autotest-1/2020-10-27/lustre-reviews_review-dne-part-2_77301_1_104_721734b9-fe3f-443d-bbbf-9f1c00a88e0e/sanity-hsm.test_254b.copytool_log.trevis-4vm6.log" 2>&1
      CMD: trevis-4vm8 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.max_requests
      CMD: trevis-4vm8 /usr/sbin/lctl set_param -n mdt.lustre-MDT0000.hsm.max_requests=60
      CMD: trevis-4vm9 /usr/sbin/lctl set_param -n mdt.lustre-MDT0001.hsm.max_requests=60
      CMD: trevis-4vm8 /usr/sbin/lctl set_param -n mdt.lustre-MDT0002.hsm.max_requests=60
      CMD: trevis-4vm9 /usr/sbin/lctl set_param -n mdt.lustre-MDT0003.hsm.max_requests=60
      Checking archive requests
      CMD: trevis-4vm6 libtool execute pkill -STOP -x lhsmtool_posix
      Copytool is suspended on trevis-4vm6
      CMD: trevis-4vm8 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions
      CMD: trevis-4vm8 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.archive_count
      CMD: trevis-4vm6 libtool execute pkill -CONT -x lhsmtool_posix
      Copytool is continued on trevis-4vm6
      CMD: trevis-4vm8 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions
      CMD: trevis-4vm8 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.archive_count
      Checking restore requests
      CMD: trevis-4vm6 libtool execute pkill -STOP -x lhsmtool_posix
      Copytool is suspended on trevis-4vm6
      CMD: trevis-4vm8 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions
      CMD: trevis-4vm8 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.restore_count
      CMD: trevis-4vm6 libtool execute pkill -CONT -x lhsmtool_posix
      Copytool is continued on trevis-4vm6
      CMD: trevis-4vm8 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions
      CMD: trevis-4vm8 /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.restore_count
       sanity-hsm test_254b: @@@@@@ FAIL: Expected 0 (!= '60')  active restore requests 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:6254:error()
        = /usr/lib64/lustre/tests/sanity-hsm.sh:4363:test_254b()
      

      There is nothing obviously wrong in either the copytool log or the console logs.
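
      To check by hand whether restore_count eventually drains to 0 after the copytool is continued, or stays at 60, a polling loop along the following lines may help. This is a minimal sketch, not code from sanity-hsm.sh: the helper name wait_for_counter and the 30-second timeout in the usage comment are assumptions made for illustration. Only the lctl parameter name is taken from the log above.

```shell
#!/bin/bash
# Hypothetical helper (not part of sanity-hsm.sh or test-framework.sh):
# run a command once per second until its output equals the expected
# value, or give up once the timeout (in seconds) has elapsed.
# usage: wait_for_counter EXPECTED TIMEOUT_SECONDS COMMAND [ARGS...]
wait_for_counter() {
    local expected=$1 timeout=$2
    shift 2
    local elapsed=0 value

    while :; do
        # Re-read the counter on every iteration.
        value=$("$@")
        if [ "$value" = "$expected" ]; then
            return 0
        fi
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "counter still '$value' after ${timeout}s (expected $expected)" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
}

# Example invocation on the MDS (the timeout value is an assumption):
#   wait_for_counter 0 30 lctl get_param -n mdt.lustre-MDT0000.hsm.restore_count
```

      If the counter reaches 0 within the timeout, the failure would point to the test reading the counter too early. If it stays at 60, the requests are likely not completing at all. Neither outcome is established by the logs in this ticket.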

      We’ve seen this at least once before:
      https://testing.whamcloud.com/test_sets/8d8131a1-8b0e-4260-ae5a-ebc98157ca2d

          Activity

            hornc Chris Horn added a comment - +1 on master https://testing.whamcloud.com/test_sets/3b8ce249-8345-4481-9c0d-302cb554e971

            adilger Andreas Dilger added a comment - This has been hit about 3x/week for the past 6 months.
            arshad512 Arshad Hussain added a comment - +1 on master ( https://testing.whamcloud.com/sub_tests/45a59b3f-d5c3-4288-a4e5-4e497eb75e47 )
            nangelinas Nikitas Angelinas added a comment - +1 on master with archive requests: https://testing.whamcloud.com/test_sets/2b45cc81-2e99-45e9-a150-b97c2aa266a4

            People

              wc-triage WC Triage
              jamesanunez James Nunez (Inactive)
              Votes: 0
              Watchers: 6
