sanity-hsm test_35: mv f35.sanity-hsm-1 f35.sanity-hsm failed

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.16.0
    • None
    • 3
    • 9223372036854775807

    Description

      This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>

      This issue relates to the following test suite run:
      https://testing.whamcloud.com/test_sets/3feceae7-7c21-4445-ac43-f396a18e5395 (most recent)
      https://testing.whamcloud.com/test_sets/6f033853-88db-43aa-8411-86ebb4d9d376 (oldest)

      test_35 failed with the following error:

      /usr/lib64/lustre/tests/sanity-hsm.sh: line 2714: 421072 Killed                  timeout --signal=KILL 1 mv "$f1" "$f"
      mv /mnt/lustre/d35.sanity-hsm/f35.sanity-hsm-1 /mnt/lustre/d35.sanity-hsm/f35.sanity-hsm failed
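The "Killed" notice and the failure path can be reproduced outside Lustre with any command that outlives its limit. The following is a minimal sketch assuming GNU coreutils `timeout`; `sleep 5` is a hypothetical stand-in for the blocked `mv` and is not part of the test.

```shell
#!/bin/bash
# Hypothetical stand-in: sleep 5 plays the role of an mv that blocks
# for longer than the 1s limit used in test_35.
status=0
timeout --signal=KILL 1 sleep 5 || status=$?

# With --signal=KILL, GNU timeout reports 128+9 rather than 124,
# so the "|| error ..." branch in the test is taken.
echo "exit status: $status"   # prints "exit status: 137"
```

Any non-zero status triggers the `error` call in the test, so a single slow `mv` is enough to fail the run.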
      

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-reviews/95676 - 4.18.0-425.10.1.el8_7.x86_64
      servers: https://build.whamcloud.com/job/lustre-reviews/95676 - 4.18.0-425.10.1.el8_lustre.x86_64

      This has been failing very intermittently since at least 2021-07-25, maybe 1/1300 runs per month. The failing check (sanity-hsm.sh line 2714) is:

              # mv must not block during restore
              timeout --signal=KILL 1 mv "$f1" "$f" || error "mv $f1 $f failed"
      

      Increasing the timeout from 1s to 2s would probably avoid this rare failure (very likely due to occasional slow VM behavior) and not invalidate the test functionality.
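The suggested change could look like the sketch below. It is an illustration only: the `error` function and the scratch files are hypothetical stand-ins for the test framework helper and the Lustre files used by test_35, and the patch actually merged in Gerrit may differ.

```shell
#!/bin/bash
# Hypothetical stand-in for the test framework's error() helper.
error() { echo "FAIL: $*" >&2; return 1; }

# Hypothetical scratch files standing in for $f1 and $f in test_35.
dir=$(mktemp -d)
f1="$dir/f35-1"
f="$dir/f35"
echo data > "$f1"

# mv must not block during restore.
# Limit raised from 1s to 2s to tolerate occasional slow VMs.
timeout --signal=KILL 2 mv "$f1" "$f" || error "mv $f1 $f failed"
```

A 2s limit still catches an `mv` that genuinely blocks for the duration of a restore, so the test keeps checking the same behavior.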

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity-hsm test_35 - mv /mnt/lustre/d35.sanity-hsm/f35.sanity-hsm-1 /mnt/lustre/d35.sanity-hsm/f35.sanity-hsm failed

        Activity

          [LU-17968] sanity-hsm test_35: mv f35.sanity-hsm-1 f35.sanity-hsm failed
          pjones Peter Jones added a comment -

          Merged for 2.16

          gerrit Gerrit Updater added a comment -

          "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/55516/
          Subject: LU-17968 tests: sanity-hsm 35 slow VM timeout fix
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: e105680411850eefd008b226eb31e5354b967808

          gerrit Gerrit Updater added a comment -

          "Frederick Dilger <fdilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/55516
          Subject: LU-17968 tests: sanity-hsm 35 slow VM timeout fix
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: d7404026120dfd9c0f2bbd29e532c683de658364

          adilger Andreas Dilger added a comment -

          Trivial fix needed for an intermittent failure on your patch https://review.whamcloud.com/55488 though it is unrelated to that patch and should be fixed separately.

          People

            fdilger Fred Dilger
            maloo Maloo
            Votes: 0
            Watchers: 4

            Dates

              Created:
              Updated:
              Resolved: