[LU-2876] Test zfs timeout failure on test suite conf-sanity test_56 Created: 26/Feb/13 Updated: 19/Mar/13 Resolved: 19/Mar/13 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Maloo | Assignee: | Yang Sheng |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | LB, zfs | ||
| Severity: | 3 |
| Rank (Obsolete): | 6950 |
| Description |
|
This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/a043a2ea-807f-11e2-b777-52540035b04c.

The sub-test test_56 failed with the following error:

Info required for matching: conf-sanity 56 |
| Comments |
| Comment by Yang Sheng [ 15/Mar/13 ] |
|
From the logs:

Feb 26 11:08:15 wtm-19vm7 mrshd[7748]: root@wtm-19vm2.rosso.whamcloud.com as root: cmd='(PATH=$PATH:/usr/lib64/lustre/utils:/usr/lib64/lustre/tests:/sbin:/usr/sbin; cd /usr/lib64/lustre/tests; LUSTRE="/usr/lib64/lustre" sh -c "umount -d -f /mnt/mds1");echo XXRETCODE:$?'
Feb 26 11:08:17 wtm-19vm7 kernel: LustreError: 137-5: lustre-MDT0000: Not available for connect from 10.10.16.202@tcp (stopping)
Feb 26 11:08:18 wtm-19vm7 kernel: LustreError: 7388:0:(client.c:1048:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff8800439a3400 x1428061600874724/t0(0) o13->lustre-OST03e8-osc-MDT0000@10.10.16.202@tcp:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
Feb 26 11:08:19 wtm-19vm7 /usr/sbin/gmond[1866]: Error 1 sending the modular data for cpu_user#012
Feb 26 11:10:31 wtm-19vm7 kernel: imklog 5.8.10, log source = /proc/kmsg started.
Feb 26 11:10:31 wtm-19vm7 rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="1638" x-info="http://www.rsyslog.com"] start
Feb 26 11:10:31 wtm-19vm7 kernel: Initializing cgroup subsys cpuset
Feb 26 11:10:31 wtm-19vm7 kernel: Initializing cgroup subsys cpu
Feb 26 11:10:31 wtm-19vm7 kernel: Linux version 2.6.32-279.19.1.el6_lustre.ge7838ff.x86_64 (jenkins@builder-1-sde1-el6-x8664.lab.whamcloud.com) (gcc version 4.4.6 20120305 (Red Hat 4.4.6-4) (GCC) ) #1 SMP Mon Feb 11 00:35:42 PST 2013
Feb 26 11:10:31 wtm-19vm7 kernel: Command line: ro root=UUID=05eefdd8-3ae0-4f76-9518-7a22325a78c6 rd_NO_LUKS rd_NO_LVM LANG=en_US.UTF-8 rd_NO_MD console=ttyS0,115200 crashkernel=auto SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM
Feb 26 11:10:31 wtm-19vm7 kernel: KERNEL supported cpus:
Feb 26 11:10:31 wtm-19vm7 kernel: Intel GenuineIntel
Feb 26 11:10:31 wtm-19vm7 kernel: AMD AuthenticAMD
Feb 26 11:10:31 wtm-19vm7 kernel: Centaur CentaurHauls
Feb 26 11:10:31 wtm-19vm7 kernel: Disabled fast string operations
Feb 26 11:10:31 wtm-19vm7 kernel: BIOS-provided physical RAM map:

The MDS host was rebooted after the umount was issued. The client just kept waiting on the pdsh, which finally timed out and triggered the Sysrq. From the Maloo data alone I can't tell why the host rebooted. Maybe a kernel crash? But I remember we always hold crashed nodes for inspection instead of rebooting them. So I think this issue is not related to Lustre itself. |
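The mrshd log line shows the remote-command wrapper pattern the test harness uses: the remote shell command is followed by `echo XXRETCODE:$?`, so the caller can recover the remote exit status from stdout even when the transport itself reports success. A minimal local sketch of that pattern (run here with a deliberately failing command instead of the real `umount -d -f /mnt/mds1`; the variable names are illustrative, not from the harness):

```shell
#!/bin/sh
# Run a command the way the harness does: append XXRETCODE:$? so the
# remote exit status travels back over stdout. Here we use `false`
# locally as a stand-in for the remote umount.
out=$(sh -c 'false; echo XXRETCODE:$?')

# Strip everything up to and including the XXRETCODE: marker to
# recover the exit status of the wrapped command.
rc=${out##*XXRETCODE:}
echo "$rc"
```

When the MDS host reboots mid-command, as in this failure, no `XXRETCODE:` line ever comes back, so the caller blocks on the pdsh until its own timeout fires.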
| Comment by Jodi Levi (Inactive) [ 19/Mar/13 ] |
|
Duplicate of |