  1. Lustre
  2. LU-16275

replay-dual test 25 fails when run without RPM install


Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.16.0

    Description

      == replay-dual test 25: replay|resend ==================== 02:48:05 (1663728485)
      1+0 records in
      1+0 records out
      512 bytes copied, 0.0019952 s, 257 kB/s
      fail_loc=0x304
      fail_loc=0x304
      fail_loc=0x304
      CMD: lustre-xwcomty2-01 multiop /mnt/lustre2/f25.replay-dual Ow512
      CMD: lustre-xwcomty2-05 lctl set_param fail_loc=0x80000325
      fail_loc=0x80000325
      Failing ost1 on lustre-xwcomty2-05
      CMD: lustre-xwcomty2-05 grep -c /mnt/lustre-ost1' ' /proc/mounts || true
      Stopping /mnt/lustre-ost1 (opts:) on lustre-xwcomty2-05
      CMD: lustre-xwcomty2-05 umount -d /mnt/lustre-ost1
      CMD: lustre-xwcomty2-05 lsmod | grep lnet > /dev/null &&
      lctl dl | grep ' ST ' || true
      reboot facets: ost1
      Failover ost1 to lustre-xwcomty2-05
      mount facets: ost1
      CMD: lustre-xwcomty2-05 dmsetup status /dev/mapper/ost1_flakey >/dev/null 2>&1
      CMD: lustre-xwcomty2-05 dmsetup status /dev/mapper/ost1_flakey 2>&1
      CMD: lustre-xwcomty2-05 test -b /dev/mapper/ost1_flakey
      CMD: lustre-xwcomty2-05 e2label /dev/mapper/ost1_flakey
      Starting ost1: -o localrecov  /dev/mapper/ost1_flakey /mnt/lustre-ost1
      CMD: lustre-xwcomty2-05 mkdir -p /mnt/lustre-ost1; mount -t lustre -o localrecov  /dev/mapper/ost1_flakey /mnt/lustre-ost1
      CMD: lustre-xwcomty2-05 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n health_check
      lustre-xwcomty2-05: CMD: lustre-xwcomty2-03 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
      lustre-xwcomty2-05: CMD: lustre-xwcomty2-03 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
      lustre-xwcomty2-05: CMD: lustre-xwcomty2-05 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
      lustre-xwcomty2-05: CMD: lustre-xwcomty2-05 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
      lustre-xwcomty2-05: lustre-xwcomty2-05: executing set_default_debug -1 all 16
      CMD: lustre-xwcomty2-05 e2label /dev/mapper/ost1_flakey                                 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
      pdsh@lustre-xwcomty2-01: lustre-xwcomty2-05: ssh exited with exit code 1
      CMD: lustre-xwcomty2-05 e2label /dev/mapper/ost1_flakey 2>/dev/null
      
      Started lustre-OST0000
      lustre-xwcomty2-01: CMD: lustre-xwcomty2-03 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
      lustre-xwcomty2-02: CMD: lustre-xwcomty2-03 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
      lustre-xwcomty2-02: CMD: lustre-xwcomty2-03 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
      lustre-xwcomty2-01: CMD: lustre-xwcomty2-03 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
      lustre-xwcomty2-02: CMD: lustre-xwcomty2-05 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
      lustre-xwcomty2-01: CMD: lustre-xwcomty2-05 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
      lustre-xwcomty2-02: CMD: lustre-xwcomty2-02 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
      lustre-xwcomty2-01: CMD: lustre-xwcomty2-01 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
      lustre-xwcomty2-01: lustre-xwcomty2-01: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
      lustre-xwcomty2-02: lustre-xwcomty2-02: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
      lustre-xwcomty2-01: CMD: lustre-xwcomty2-01 lctl get_param -n at_max
      lustre-xwcomty2-01: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
      lustre-xwcomty2-02: CMD: lustre-xwcomty2-02 lctl get_param -n at_max
      lustre-xwcomty2-02: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL
      lustre-xwcomty2-02: IDLE state after 0 sec
      multiop: no process found   

      The error is caused by `killall`.

      It cannot find a process named multiop, so it reports "no process found" and the test hangs. When the test is run from a source build instead of an installed RPM, the multiop process is named "lt-multiop" (the "lt-" prefix is what libtool typically gives to uninstalled binaries), so killall does not match it.
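One possible workaround can be sketched as follows: try both the installed name and the "lt-" prefixed name when signalling the process. This is only an illustration under stated assumptions, not necessarily the fix that landed. The helper names `proc_names` and `signal_by_name` are hypothetical, and the default signal `USR1` is an assumption, since the log does not show which signal the test sends.

```shell
# Hypothetical helper: print the names a test binary may run under,
# one per line: the installed name and the libtool "lt-" variant.
proc_names() {
    pn_base=${1:-multiop}
    printf '%s\n' "$pn_base" "lt-$pn_base"
}

# Hypothetical helper: send a signal to every process matching either name.
# Returns 0 if at least one name matched a running process, 1 otherwise.
signal_by_name() {
    sb_sig=${1:-USR1}        # assumption: default signal is USR1
    sb_base=${2:-multiop}
    sb_rc=1
    for sb_name in $(proc_names "$sb_base"); do
        # -q keeps killall quiet when no process has this name
        if killall -q -"$sb_sig" "$sb_name" 2>/dev/null; then
            sb_rc=0
        fi
    done
    return $sb_rc
}
```

A call such as `signal_by_name USR1 multiop` would then reach the process whether it was started from an RPM install or from the build tree.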

       

       


          People

            kevin.zhao Kevin Zhao