Lustre / LU-11833

wait_request_state returns failure when hsm_archive and hsm_remove are called multiple times


Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • None
    • None
    • 3
    • 9223372036854775807

    Description

      The following test script results in a wait_request_state failure:

       

      test_8() {
              local file=$DIR/$tdir/$tfile
              local hsm_root=$(hsm_root)
              local mdtidx=${4:-0}
              local fid

              copytool setup -m "$MOUNT" -a "$HSM_ARCHIVE_NUMBER"
              mkdir -p $DIR/$tdir
              do_facet $SINGLEAGT "echo -n new_data > $file"
              fid=$(path2fid $file)

              # First archive/remove cycle: both waits succeed.
              $LFS hsm_state $file
              $LFS hsm_archive --archive $HSM_ARCHIVE_NUMBER $file ||
                      error "Archive $file failed"
              $LCTL get_param -n ${MDT_PREFIX}${mdtidx}.hsm.actions
              wait_request_state $fid ARCHIVE SUCCEED
              $LFS hsm_remove $file
              wait_request_state $fid REMOVE SUCCEED

              # Second archive/remove cycle on the same FID:
              # the ARCHIVE wait below is the one that times out.
              $LFS hsm_state $file
              $LFS hsm_archive --archive $HSM_ARCHIVE_NUMBER $file ||
                      error "Archive $file failed"
              $LCTL get_param -n ${MDT_PREFIX}${mdtidx}.hsm.actions
              wait_request_state $fid ARCHIVE SUCCEED
              $LFS hsm_remove $file
              wait_request_state $fid REMOVE SUCCEED
      }
      run_test 8 "HSM problem..."
      

      The failure report is as follows:

      == sanity-pcc test 8: Problem Finding... ============================================================= 15:58:50 (1546415930)
      Starting copytool agt1 on qian
      /mnt/lustre/d8.sanity-pcc/f8.sanity-pcc: (0x00000000)
      lrh=[type=10680000 len=136 idx=1/1] fid=[0x200000401:0x4:0x0] dfid=[0x200000401:0x4:0x0] compound/cookie=0x0/0x5c2c6f2c action=ARCHIVE archive#=2 flags=0x0 extent=0x0-0xffffffffffffffff gid=0x0 datalen=0 status=WAITING data=[]
      Waiting 200 secs for update
      Changed after 1s: from 'WAITING' to 'STARTED'
      Updated after 2s: wanted 'SUCCEED' got 'SUCCEED'
      /mnt/lustre/d8.sanity-pcc/f8.sanity-pcc: (0x00000000), archive_id:2
      lrh=[type=10680000 len=136 idx=1/1] fid=[0x200000401:0x4:0x0] dfid=[0x200000401:0x4:0x0] compound/cookie=0x0/0x5c2c6f2c action=ARCHIVE archive#=2 flags=0x0 extent=0x0-0xffffffffffffffff gid=0x0 datalen=0 status=SUCCEED data=[]
      lrh=[type=10680000 len=136 idx=1/2] fid=[0x200000401:0x4:0x0] dfid=[0x200000401:0x4:0x0] compound/cookie=0x0/0x5c2c6f2d action=REMOVE archive#=2 flags=0x0 extent=0x0-0xffffffffffffffff gid=0x0 datalen=0 status=SUCCEED data=[]
      lrh=[type=10680000 len=136 idx=1/3] fid=[0x200000401:0x4:0x0] dfid=[0x200000401:0x4:0x0] compound/cookie=0x0/0x5c2c6f2e action=ARCHIVE archive#=2 flags=0x0 extent=0x0-0xffffffffffffffff gid=0x0 datalen=0 status=WAITING data=[]
      Waiting 200 secs for update
      Changed after 1s: from 'SUCCEED
      WAITING' to 'SUCCEED
      SUCCEED'
      Waiting 190 secs for update
      Changed after 14s: from 'SUCCEED
      SUCCEED' to ''
      Waiting 180 secs for update
      Waiting 170 secs for update
      Waiting 160 secs for update
      Waiting 150 secs for update
      Waiting 140 secs for update
      Waiting 130 secs for update
      Waiting 120 secs for update
      Waiting 110 secs for update
      Waiting 100 secs for update
      Waiting 90 secs for update
      Waiting 80 secs for update
      Waiting 70 secs for update
      Waiting 60 secs for update
      Waiting 50 secs for update
      Waiting 40 secs for update
      Waiting 30 secs for update
      Waiting 20 secs for update
      Waiting 10 secs for update
      Update not seen after 200s: wanted 'SUCCEED' got ''
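
      The two-line states in the report ('SUCCEED' followed by 'WAITING') suggest that the state lookup matched both ARCHIVE entries recorded for the same FID. The following is a minimal sketch of that effect, assuming the lookup filters the action log by FID and action name. The helpers all_states and latest_state are hypothetical and are not the test framework's functions; the sample lines are copied from the report above. Taking only the last matching entry is one possible way to disambiguate, not a confirmed fix.

```shell
#!/bin/sh
# Sample hsm.actions content, copied from the failure report.
actions_log() {
	cat <<'EOF'
lrh=[type=10680000 len=136 idx=1/1] fid=[0x200000401:0x4:0x0] dfid=[0x200000401:0x4:0x0] compound/cookie=0x0/0x5c2c6f2c action=ARCHIVE archive#=2 flags=0x0 extent=0x0-0xffffffffffffffff gid=0x0 datalen=0 status=SUCCEED data=[]
lrh=[type=10680000 len=136 idx=1/2] fid=[0x200000401:0x4:0x0] dfid=[0x200000401:0x4:0x0] compound/cookie=0x0/0x5c2c6f2d action=REMOVE archive#=2 flags=0x0 extent=0x0-0xffffffffffffffff gid=0x0 datalen=0 status=SUCCEED data=[]
lrh=[type=10680000 len=136 idx=1/3] fid=[0x200000401:0x4:0x0] dfid=[0x200000401:0x4:0x0] compound/cookie=0x0/0x5c2c6f2e action=ARCHIVE archive#=2 flags=0x0 extent=0x0-0xffffffffffffffff gid=0x0 datalen=0 status=WAITING data=[]
EOF
}

# Hypothetical helper: print the status of EVERY entry matching
# the given FID ($1) and action ($2), one status per line.
all_states() {
	actions_log |
		grep -F "fid=[$1]" |
		grep -F "action=$2 " |
		sed -n 's/.* status=\([A-Z]*\) .*/\1/p'
}

# Hypothetical helper: keep only the most recent matching entry.
latest_state() {
	all_states "$1" "$2" | tail -n 1
}

fid=0x200000401:0x4:0x0

echo "All ARCHIVE states for $fid:"
all_states "$fid" ARCHIVE

echo "Latest ARCHIVE state for $fid:"
latest_state "$fid" ARCHIVE
```

      With two ARCHIVE entries in the log, the first helper returns a multi-line value that can never compare equal to the single word 'SUCCEED', while the second returns one state for the newest request.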
      

       

          People

            Assignee: wc-triage WC Triage
            Reporter: qian_wc Qian Yingjin