Details
Type: Bug
Resolution: Fixed
Priority: Blocker
Affects Version/s: Lustre 2.6.0, Lustre 2.7.0, Lustre 2.8.0
Environment: client and server: lustre-master build # 2856, zfs
Severity: 3
Rank (Obsolete): 17590
Description
This issue was created by maloo for sarah <sarah@whamcloud.com>
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/0429703c-ba58-11e4-8053-5254006e85c2.
The sub-test test_17 failed with the following error:
test failed to respond and timed out
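For reference, individual sub-tests in the Lustre test framework can normally be re-run in isolation through the ONLY environment variable; a minimal sketch, assuming the affected suite's script name (not recorded in this excerpt) is substituted for the placeholder:

    # hypothetical reproduction: run only sub-test 17 of the affected suite
    ONLY=17 sh ./<suite-script>.sh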
OST dmesg:
Lustre: 3053:0:(client.c:1939:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
LustreError: 137-5: lustre-OST0002_UUID: not available for connect from 10.2.4.161@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
LustreError: Skipped 120 previous similar messages
INFO: task mount.lustre:3630 blocked for more than 120 seconds.
      Tainted: P           --------------- 2.6.32-431.29.2.el6_lustre.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mount.lustre  D 0000000000000000     0  3630   3629 0x00000080
 ffff88006edf9718 0000000000000082 0000000000000000 ffff88006ef82040
 ffff88006edf9698 ffffffff81055783 ffff88007e4c2ad8 ffff880002216880
 ffff88006ef825f8 ffff88006edf9fd8 000000000000fbc8 ffff88006ef825f8
Call Trace:
 [<ffffffff81055783>] ? set_next_buddy+0x43/0x50
 [<ffffffff8152a595>] schedule_timeout+0x215/0x2e0
 [<ffffffff81069f15>] ? enqueue_entity+0x125/0x450
 [<ffffffff8152a213>] wait_for_common+0x123/0x180
 [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
 [<ffffffffa090cd00>] ? client_lwp_config_process+0x0/0x1948 [obdclass]
 [<ffffffff8152a32d>] wait_for_completion+0x1d/0x20
 [<ffffffffa0898e14>] llog_process_or_fork+0x354/0x540 [obdclass]
 [<ffffffffa0899014>] llog_process+0x14/0x30 [obdclass]
 [<ffffffffa08c81d4>] class_config_parse_llog+0x1e4/0x330 [obdclass]
 [<ffffffffa10314f2>] mgc_process_log+0xeb2/0x1970 [mgc]
 [<ffffffffa102b1f0>] ? mgc_blocking_ast+0x0/0x810 [mgc]
 [<ffffffffa0ad0860>] ? ldlm_completion_ast+0x0/0x9b0 [ptlrpc]
 [<ffffffffa1032ef8>] mgc_process_config+0x658/0x1210 [mgc]
 [<ffffffffa08d9383>] lustre_process_log+0x7e3/0x1130 [obdclass]
 [<ffffffffa07891c1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
 [<ffffffffa08d514f>] ? server_name2fsname+0x6f/0x90 [obdclass]
 [<ffffffffa0907496>] server_start_targets+0x12b6/0x1af0 [obdclass]
 [<ffffffffa0783818>] ? libcfs_log_return+0x28/0x40 [libcfs]
 [<ffffffffa08dbfe6>] ? lustre_start_mgc+0x4b6/0x1e00 [obdclass]
 [<ffffffffa07891c1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
 [<ffffffffa08d3390>] ? class_config_llog_handler+0x0/0x1a70 [obdclass]
 [<ffffffffa090c255>] server_fill_super+0xbe5/0x1690 [obdclass]
 [<ffffffffa0783818>] ? libcfs_log_return+0x28/0x40 [libcfs]
 [<ffffffffa08dde90>] lustre_fill_super+0x560/0xa80 [obdclass]
 [<ffffffffa08dd930>] ? lustre_fill_super+0x0/0xa80 [obdclass]
 [<ffffffff8118c56f>] get_sb_nodev+0x5f/0xa0
 [<ffffffffa08d4ee5>] lustre_get_sb+0x25/0x30 [obdclass]
 [<ffffffff8118bbcb>] vfs_kern_mount+0x7b/0x1b0
 [<ffffffff8118bd72>] do_kern_mount+0x52/0x130
 [<ffffffff8119e972>] ? vfs_ioctl+0x22/0xa0
 [<ffffffff811ad74b>] do_mount+0x2fb/0x930
 [<ffffffff811ade10>] sys_mount+0x90/0xe0
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Lustre: 3053:0:(client.c:1939:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1424565647/real 1424565647] req@ffff880070449080 x1493765169611180/t0(0) o38->lustre-MDT0000-lwp-OST0001@10.2.4.158@tcp:12/10 lens 400/544 e 0 to 1 dl 1424565672 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 3053:0:(client.c:1939:ptlrpc_expire_one_request()) Skipped 6 previous similar messages
Lustre: 3053:0:(client.c:1939:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1424565712/real 1424565712] req@ffff880070449680 x1493765169611316/t0(0) o38->lustre-MDT0000-lwp-OST0001@10.2.4.158@tcp:12/10 lens 400/544 e 0 to 1 dl 1424565737 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 3053:0:(client.c:1939:ptlrpc_expire_one_request()) Skipped 7 previous similar messages
INFO: task mount.lustre:3630 blocked for more than 120 seconds.
      Tainted: P           --------------- 2.6.32-431.29.2.el6_lustre.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mount.lustre  D 0000000000000000     0  3630   3629 0x00000080
 ffff88006edf9718 0000000000000082 0000000000000000 ffff88006ef82040
 ffff88006edf9698 ffffffff81055783 ffff88007e4c2ad8 ffff880002216880
 ffff88006ef825f8 ffff88006edf9fd8 000000000000fbc8 ffff88006ef825f8
Call Trace:
 [<ffffffff81055783>] ? set_next_buddy+0x43/0x50
 [<ffffffff8152a595>] schedule_timeout+0x215/0x2e0
 [<ffffffff81069f15>] ? enqueue_entity+0x125/0x450
 [<ffffffff8152a213>] wait_for_common+0x123/0x180
 [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
 [<ffffffffa090cd00>] ? client_lwp_config_process+0x0/0x1948 [obdclass]
 [<ffffffff8152a32d>] wait_for_completion+0x1d/0x20
 [<ffffffffa0898e14>] llog_process_or_fork+0x354/0x540 [obdclass]
 [<ffffffffa0899014>] llog_process+0x14/0x30 [obdclass]
 [<ffffffffa08c81d4>] class_config_parse_llog+0x1e4/0x330 [obdclass]
 [<ffffffffa10314f2>] mgc_process_log+0xeb2/0x1970 [mgc]
 [<ffffffffa102b1f0>] ? mgc_blocking_ast+0x0/0x810 [mgc]
 [<ffffffffa0ad0860>] ? ldlm_completion_ast+0x0/0x9b0 [ptlrpc]
 [<ffffffffa1032ef8>] mgc_process_config+0x658/0x1210 [mgc]
 [<ffffffffa08d9383>] lustre_process_log+0x7e3/0x1130 [obdclass]
 [<ffffffffa07891c1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
 [<ffffffffa08d514f>] ? server_name2fsname+0x6f/0x90 [obdclass]
 [<ffffffffa0907496>] server_start_targets+0x12b6/0x1af0 [obdclass]
 [<ffffffffa0783818>] ? libcfs_log_return+0x28/0x40 [libcfs]
 [<ffffffffa08dbfe6>] ? lustre_start_mgc+0x4b6/0x1e00 [obdclass]
 [<ffffffffa07891c1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
 [<ffffffffa08d3390>] ? class_config_llog_handler+0x0/0x1a70 [obdclass]
 [<ffffffffa090c255>] server_fill_super+0xbe5/0x1690 [obdclass]
 [<ffffffffa0783818>] ? libcfs_log_return+0x28/0x40 [libcfs]
 [<ffffffffa08dde90>] lustre_fill_super+0x560/0xa80 [obdclass]
 [<ffffffffa08dd930>] ? lustre_fill_super+0x0/0xa80 [obdclass]
 [<ffffffff8118c56f>] get_sb_nodev+0x5f/0xa0
 [<ffffffffa08d4ee5>] lustre_get_sb+0x25/0x30 [obdclass]
 [<ffffffff8118bbcb>] vfs_kern_mount+0x7b/0x1b0
 [<ffffffff8118bd72>] do_kern_mount+0x52/0x130
 [<ffffffff8119e972>] ? vfs_ioctl+0x22/0xa0
 [<ffffffff811ad74b>] do_mount+0x2fb/0x930
 [<ffffffff811ade10>] sys_mount+0x90/0xe0
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
LustreError: 137-5: lustre-OST0002_UUID: not available for connect from 10.2.4.156@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
LustreError: Skipped 304 previous similar messages
Lustre: 3053:0:(client.c:1939:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1424565842/real 1424565842] req@ffff880070449c80 x1493765169611592/t0(0) o38->lustre-MDT0000-lwp-OST0001@10.2.4.158@tcp:12/10 lens 400/544 e 0 to 1 dl 1424565867 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 3053:0:(client.c:1939:ptlrpc_expire_one_request()) Skipped 16 previous similar messages
INFO: task mount.lustre:3630 blocked for more than 120 seconds.
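The 137-5 errors above suggest the standard first check from the message itself: confirm whether the missing target (lustre-OST0002 here) is actually mounted on either server of the HA pair. A minimal sketch using stock Lustre/Linux commands, to be run on each OSS:

    # list Lustre targets currently mounted on this server
    mount -t lustre
    # list local Lustre devices and their setup state
    lctl dl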