Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.14.0, Lustre 2.12.5
    • Affects Version/s: Lustre 2.13.0, Lustre 2.12.4
    • Environment: RHEL 8.1 client + RHEL 7.7 server
    • Severity: 3
    • 9223372036854775807

    Description

      This issue was created by maloo for jianyu <yujian@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/f4707d44-3ba7-11ea-bb75-52540065bddc

      test_70 failed with the following error:

      == sanity-hsm test 70: Copytool logs JSON register/unregister events to FIFO ========================= 06:37:48 (1579415868)
      CMD: trevis-12vm7 mktemp --tmpdir=/tmp -d sanity-hsm.test_70.XXXX
      CMD: trevis-12vm7 mkfifo -m 0644 /tmp/sanity-hsm.test_70.r3C7/fifo
      CMD: trevis-12vm7 cat /tmp/sanity-hsm.test_70.r3C7/fifo > /tmp/sanity-hsm.test_70.r3C7/events & echo \$! > /tmp/sanity-hsm.test_70.r3C7/monitor_pid
      

      Timeout occurred after 238 mins, last suite running was sanity-hsm, restarting cluster to continue tests

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity-hsm test_70 - Timeout occurred after 238 mins, last suite running was sanity-hsm, restarting cluster to continue tests
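The monitor setup in the log backgrounds a `cat` that reads from a FIFO on the remote node. The following is a minimal local sketch of that pattern, with the reader's stdio explicitly detached so the invoking shell can return (one plausible cause of a remote-shell hang, since a backgrounded reader that inherits the session's stdio can keep ssh/pdsh waiting); the temp-dir name and JSON payload are illustrative, not taken from the test:

```shell
#!/bin/sh
# Local sketch of the sanity-hsm test_70 monitor setup pattern.
# Opening a FIFO for reading blocks until a writer opens it, so the
# reader must be backgrounded; detaching its stdio keeps it from
# holding a remote shell session open.
tmpdir=$(mktemp --tmpdir=/tmp -d sanity-hsm.sketch.XXXX)
mkfifo -m 0644 "$tmpdir/fifo"

# Backgrounded reader with stdin/stderr detached (hedged sketch of a fix,
# not the actual patch).
cat "$tmpdir/fifo" > "$tmpdir/events" < /dev/null 2>/dev/null &
echo $! > "$tmpdir/monitor_pid"

# A writer opening the FIFO unblocks the reader's open(2); closing the
# write end gives the reader EOF, so it exits and wait returns.
echo '{"event": "REGISTER"}' > "$tmpdir/fifo"
wait

events_line=$(cat "$tmpdir/events")
monitor_pid=$(cat "$tmpdir/monitor_pid")
echo "$events_line"     # prints {"event": "REGISTER"}
rm -rf "$tmpdir"
```

Note the ordering guarantee that makes this race-free: the `echo` into the FIFO itself blocks until the backgrounded `cat` has opened the read end, so no event can be lost between the two steps.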

      Attachments

        Activity

          [LU-13160] sanity-hsm test 70 timeout
          dongyang Dongyang Li added a comment -

          Yes, I'm closing the ticket.


          simmonsja James A Simmons added a comment -

          Is this complete?

          gerrit Gerrit Updater added a comment -

          Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37773/
          Subject: LU-13160 tests: fix sanity-hsm monitor setup
          Project: fs/lustre-release
          Branch: b2_12
          Current Patch Set:
          Commit: c0a877ab3b049266042299a438d8d010ce3ce605

          gerrit Gerrit Updater added a comment -

          Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37773
          Subject: LU-13160 tests: fix sanity-hsm monitor setup
          Project: fs/lustre-release
          Branch: b2_12
          Current Patch Set: 1
          Commit: 2f639102be1ded4504027a735affa041dde3552a

          gerrit Gerrit Updater added a comment -

          Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37595/
          Subject: LU-13160 tests: fix sanity-hsm monitor setup
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 6724d8ca58e9b8474a180b013a4723cbdd8900d9

          gerrit Gerrit Updater added a comment -

          Li Dongyang (dongyangli@ddn.com) uploaded a new patch: https://review.whamcloud.com/37595
          Subject: LU-13160 tests: fix sanity-hsm monitor setup
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 093d9f2fe4d80060db46475bf32d4b985edebf97

          yujian Jian Yu added a comment -

          Sure, Dongyang.
          I usually directly specify variable values to run auster. Here is the command I ran on onyx-22vm3 under /usr/lib64/lustre/tests:

          [root@onyx-22vm3 tests]# PDSH="pdsh -t 120 -S -R ssh -w" NAME=ncli RCLIENTS=onyx-22vm5 mds_HOST=onyx-22vm1 MDSDEV=/dev/vda5 MDSSIZE=2097152 ost_HOST=onyx-22vm4 OSTCOUNT=2 OSTSIZE=2097152 OSTDEV1=/dev/vda5 OSTDEV2=/dev/vda6 SHARED_DIRECTORY="/home/jianyu/test_logs" VERBOSE=true bash auster -d /home/jianyu/test_logs -r -s -v -k sanity-hsm --only 70
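For repeated runs, the inline variable assignments above can be collected into a small wrapper script. This is a sketch, not part of the ticket: the `DRYRUN` guard is invented here so the command can be inspected without a configured Lustre test cluster, and all hostnames, devices, and paths are the ones quoted in the comment above.

```shell
#!/bin/sh
# Sketch of a wrapper around the auster invocation quoted above.
# Hostnames/devices/paths come from the comment; adjust for your cluster.
# DRYRUN is an invented guard: with DRYRUN=1 (the default here) the
# command is only printed, since running auster needs a real test setup.
export PDSH="pdsh -t 120 -S -R ssh -w"
export NAME=ncli RCLIENTS=onyx-22vm5
export mds_HOST=onyx-22vm1 MDSDEV=/dev/vda5 MDSSIZE=2097152
export ost_HOST=onyx-22vm4 OSTCOUNT=2 OSTSIZE=2097152
export OSTDEV1=/dev/vda5 OSTDEV2=/dev/vda6
export SHARED_DIRECTORY=/home/jianyu/test_logs VERBOSE=true

cmd="bash auster -d /home/jianyu/test_logs -r -s -v -k sanity-hsm --only 70"
if [ "${DRYRUN:-1}" = "1" ]; then
    echo "$cmd"                              # inspect the command only
else
    cd /usr/lib64/lustre/tests && $cmd       # run against the cluster
fi
```

Exporting the variables (rather than prefixing them on one line) keeps them visible to any helper scripts auster itself spawns via the test framework.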
          
          dongyang Dongyang Li added a comment -

          Many thanks, Jian.

          It looks like the session on onyx-22vm3 has already finished. I was trying to start a new session with something like --only 70, but I noticed that cfg/local.sh was not set up.

          Can I just use auster under /usr/lib64/lustre/tests on 22vm3? If not, can you provide a command for it?

          yujian Jian Yu added a comment -

          Sure, Dongyang.

          Here are the test nodes:

          2 Clients: onyx-22vm3 (local), onyx-22vm5 (remote)
          1 MGS/MDS: onyx-22vm1 (1 MDT)
          1 OSS: onyx-22vm4 (2 OSTs)
          

          Now sanity-hsm test 70 is hanging on onyx-22vm3:

          == sanity-hsm test 70: Copytool logs JSON register/unregister events to FIFO ========================= 06:18:09 (1581574689)
          CMD: onyx-22vm5 mktemp --tmpdir=/tmp -d sanity-hsm.test_70.XXXX
          CMD: onyx-22vm5 mkfifo -m 0644 /tmp/sanity-hsm.test_70.Whg8/fifo
          CMD: onyx-22vm5 cat /tmp/sanity-hsm.test_70.Whg8/fifo > /tmp/sanity-hsm.test_70.Whg8/events & echo \$! > /tmp/sanity-hsm.test_70.Whg8/monitor_pid
          
          dongyang Dongyang Li added a comment -

          Yes, I was using 2 clients.

          Are you reproducing it with the lab VMs? If so, can you create a setup and make sure the test case fails? Then I can get into the VM and have a look.

          Thanks a lot.


          People

            dongyang Dongyang Li
            maloo Maloo
            Votes: 0
            Watchers: 6
