Details
- Bug
- Resolution: Fixed
- Minor
Description
== replay-dual test 25: replay|resend ==================== 02:48:05 (1663728485)
1+0 records in
1+0 records out
512 bytes copied, 0.0019952 s, 257 kB/s
fail_loc=0x304
fail_loc=0x304
fail_loc=0x304
CMD: lustre-xwcomty2-01 multiop /mnt/lustre2/f25.replay-dual Ow512
CMD: lustre-xwcomty2-05 lctl set_param fail_loc=0x80000325
fail_loc=0x80000325
Failing ost1 on lustre-xwcomty2-05
CMD: lustre-xwcomty2-05 grep -c /mnt/lustre-ost1' ' /proc/mounts || true
Stopping /mnt/lustre-ost1 (opts:) on lustre-xwcomty2-05
CMD: lustre-xwcomty2-05 umount -d /mnt/lustre-ost1
CMD: lustre-xwcomty2-05 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' || true
reboot facets: ost1
Failover ost1 to lustre-xwcomty2-05
mount facets: ost1
CMD: lustre-xwcomty2-05 dmsetup status /dev/mapper/ost1_flakey >/dev/null 2>&1
CMD: lustre-xwcomty2-05 dmsetup status /dev/mapper/ost1_flakey 2>&1
CMD: lustre-xwcomty2-05 test -b /dev/mapper/ost1_flakey
CMD: lustre-xwcomty2-05 e2label /dev/mapper/ost1_flakey
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
CMD: lustre-xwcomty2-05 mkdir -p /mnt/lustre-ost1; mount -t lustre -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
CMD: lustre-xwcomty2-05 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n health_check
lustre-xwcomty2-05: CMD: lustre-xwcomty2-03 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
lustre-xwcomty2-05: CMD: lustre-xwcomty2-03 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
lustre-xwcomty2-05: CMD: lustre-xwcomty2-05 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
lustre-xwcomty2-05: CMD: lustre-xwcomty2-05 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
lustre-xwcomty2-05: lustre-xwcomty2-05: executing set_default_debug -1 all 16
CMD: lustre-xwcomty2-05 e2label /dev/mapper/ost1_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
pdsh@lustre-xwcomty2-01: lustre-xwcomty2-05: ssh exited with exit code 1
CMD: lustre-xwcomty2-05 e2label /dev/mapper/ost1_flakey 2>/dev/null
Started lustre-OST0000
lustre-xwcomty2-01: CMD: lustre-xwcomty2-03 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
lustre-xwcomty2-02: CMD: lustre-xwcomty2-03 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
lustre-xwcomty2-02: CMD: lustre-xwcomty2-03 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
lustre-xwcomty2-01: CMD: lustre-xwcomty2-03 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
lustre-xwcomty2-02: CMD: lustre-xwcomty2-05 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
lustre-xwcomty2-01: CMD: lustre-xwcomty2-05 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
lustre-xwcomty2-02: CMD: lustre-xwcomty2-02 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
lustre-xwcomty2-01: CMD: lustre-xwcomty2-01 /root/test/build-06244k/lustre/lustre/utils/lctl get_param -n version 2>/dev/null
lustre-xwcomty2-01: lustre-xwcomty2-01: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
lustre-xwcomty2-02: lustre-xwcomty2-02: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
lustre-xwcomty2-01: CMD: lustre-xwcomty2-01 lctl get_param -n at_max
lustre-xwcomty2-01: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
lustre-xwcomty2-02: CMD: lustre-xwcomty2-02 lctl get_param -n at_max
lustre-xwcomty2-02: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL lustre-xwcomty2-02: IDLE state after 0 sec
multiop: no process found
The error is due to `killall`.
It cannot find a process named multiop, so it reports "no process found" and the test hangs. When the test is run from a source build rather than from installed RPMs, the multiop process is named "lt-multiop" (it carries an "lt-" prefix), so killall does not match it.
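One possible way to make the cleanup tolerate both names is sketched below. This is only an illustration, not the change that was made for this ticket: `match_multiop` and `kill_multiop` are hypothetical helper names, and the default signal USR1 is an assumption. The sketch relies on `ps -e -o comm=` and on the `-s` option of `killall` from psmisc.

```sh
# Hypothetical helpers, not part of the Lustre test framework.

# Read process names on stdin and print only those that are exactly
# "multiop" or the source-build variant "lt-multiop".
match_multiop() {
    grep -E -x '(lt-)?multiop'
}

# Send a signal (default USR1, an assumption) to every running process
# whose name matches. Returns non-zero if nothing matched; otherwise
# returns the status of the last killall call.
kill_multiop() {
    sig=${1:-USR1}
    names=$(ps -e -o comm= | match_multiop | sort -u)
    [ -n "$names" ] || return 1
    for name in $names; do
        killall -s "$sig" "$name"
    done
}
```

With helpers like these, the test would call `kill_multiop` in place of a bare `killall multiop`, so it behaves the same for RPM installs and for source builds.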