|
When running the parallel-scale subtest mdtestssf, the MDS node restarts. This issue is reproducible.
Lustre: DEBUG MARKER: ----============= acceptance-small: parallel-scale ============---- Mon Jan 16 20:21:23 PST 2012
Lustre: DEBUG MARKER: only running test mdtestssf
Lustre: DEBUG MARKER: excepting tests: parallel_grouplock
Lustre: DEBUG MARKER: Using TIMEOUT=20
Lustre: 2377:0:(debug.c:326:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release.
Lustre: 2377:0:(debug.c:326:libcfs_debug_str2mask()) Skipped 3 previous similar messages
Lustre: DEBUG MARKER: == parallel-scale test mdtestssf: mdtestssf == 20:21:30 (1326774090)
Lustre: 2718:0:(client.c:1789:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1326774129/real 1326774129] req@ffff8802b8217c00 x1391223393779475/t0(0) o36->lustre-MDT0000-mdc-ffff88030e2f0400@192.168.4.2@o2ib:12/10 lens 472/536 e 0 to 1 dl 1326774136 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
Lustre: lustre-MDT0000-mdc-ffff88030e2f0400: Connection to service lustre-MDT0000 via nid 192.168.4.2@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Lustre: 31589:0:(client.c:1789:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1326774129/real 1326774129] req@ffff8802b8217400 x1391223393779476/t0(0) o400->MGC192.168.4.2@o2ib@192.168.4.2@o2ib:26/25 lens 192/192 e 0 to 1 dl 1326774136 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
LustreError: 166-1: MGC192.168.4.2@o2ib: Connection to service MGS via nid 192.168.4.2@o2ib was lost; in progress operations using this service will fail.
Lustre: 31587:0:(client.c:1789:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1326774134/real 0] req@ffff880308bd5000 x1391223393779484/t0(0) o400->lustre-MDT0000-mdc-ffff88030e2f0400@192.168.4.2@o2ib:12/10 lens 192/192 e 0 to 1 dl 1326774141 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 31585:0:(client.c:1789:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1326774136/real 0] req@ffff8802d305c800 x1391223393779492/t0(0) o250->MGC192.168.4.2@o2ib@192.168.4.2@o2ib:26/25 lens 368/512 e 0 to 1 dl 1326774142 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 31590:0:(import.c:525:import_select_connection()) MGC192.168.4.2@o2ib: tried all connections, increasing latency to 6s
Lustre: 31590:0:(import.c:525:import_select_connection()) lustre-MDT0000-mdc-ffff88030e2f0400: tried all connections, increasing latency to 6s
Lustre: 31585:0:(client.c:1789:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1326774146/real 0] req@ffff88030c944c00 x1391223393779500/t0(0) o38->lustre-MDT0000-mdc-ffff88030e2f0400@192.168.4.2@o2ib:12/10 lens 368/512 e 0 to 1 dl 1326774157 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 31585:0:(client.c:1789:ptlrpc_expire_one_request()) Skipped 1 previous similar message
Lustre: 31590:0:(import.c:525:import_select_connection()) MGC192.168.4.2@o2ib: tried all connections, increasing latency to 11s
Lustre: 31590:0:(import.c:525:import_select_connection()) MGC192.168.4.2@o2ib: tried all connections, increasing latency to 16s
Lustre: 31590:0:(import.c:525:import_select_connection()) Skipped 1 previous similar message
Lustre: 31585:0:(client.c:1789:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1326774171/real 1326774171] req@ffff88030e818000 x1391223393779521/t0(0) o250->MGC192.168.4.2@o2ib@192.168.4.2@o2ib:26/25 lens 368/512 e 0 to 1 dl 1326774192 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 31585:0:(client.c:1789:ptlrpc_expire_one_request()) Skipped 3 previous similar messages
Lustre: 31590:0:(import.c:525:import_select_connection()) MGC192.168.4.2@o2ib: tried all connections, increasing latency to 21s
Lustre: 31590:0:(import.c:525:import_select_connection()) Skipped 1 previous similar message
Lustre: 31585:0:(client.c:1789:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1326774181/real 1326774181] req@ffff88031bec0000 x1391223393779529/t0(0) o250->MGC192.168.4.2@o2ib@192.168.4.2@o2ib:26/25 lens 368/512 e 0 to 1 dl 1326774207 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 31585:0:(client.c:1789:ptlrpc_expire_one_request()) Skipped 1 previous similar message
Lustre: 31590:0:(import.c:525:import_select_connection()) MGC192.168.4.2@o2ib: tried all connections, increasing latency to 21s
Lustre: 31590:0:(import.c:525:import_select_connection()) Skipped 1 previous similar message
Lustre: 31590:0:(import.c:525:import_select_connection()) MGC192.168.4.2@o2ib: tried all connections, increasing latency to 21s
Lustre: 31590:0:(import.c:525:import_select_connection()) Skipped 1 previous similar message
Lustre: 31585:0:(client.c:1789:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1326774201/real 1326774201] req@ffff880319193c00 x1391223393779545/t0(0) o250->MGC192.168.4.2@o2ib@192.168.4.2@o2ib:26/25 lens 368/512 e 0 to 1 dl 1326774227 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 31585:0:(client.c:1789:ptlrpc_expire_one_request()) Skipped 3 previous similar messages
Lustre: 31590:0:(import.c:525:import_select_connection()) MGC192.168.4.2@o2ib: tried all connections, increasing latency to 21s
Lustre: 31590:0:(import.c:525:import_select_connection()) Skipped 5 previous similar messages
Lustre: 31585:0:(client.c:1789:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1326774236/real 1326774236] req@ffff88031e354400 x1391223393779585/t0(0) o250->MGC192.168.4.2@o2ib@192.168.4.2@o2ib:26/25 lens 368/512 e 0 to 1 dl 1326774262 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 31585:0:(client.c:1789:ptlrpc_expire_one_request()) Skipped 9 previous similar messages
Lustre: 31590:0:(import.c:525:import_select_connection()) MGC192.168.4.2@o2ib: tried all connections, increasing latency to 21s
Lustre: 31590:0:(import.c:525:import_select_connection()) Skipped 10 previous similar messages
Lustre: 31585:0:(client.c:1789:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1326774301/real 1326774301] req@ffff8802f6c6cc00 x1391223393779680/t0(0) o250->MGC192.168.4.2@o2ib@192.168.4.2@o2ib:26/25 lens 368/512 e 0 to 1 dl 1326774327 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 31585:0:(client.c:1789:ptlrpc_expire_one_request()) Skipped 22 previous similar messages
Lustre: 31590:0:(import.c:525:import_select_connection()) MGC192.168.4.2@o2ib: tried all connections, increasing latency to 21s
Lustre: 31590:0:(import.c:525:import_select_connection()) Skipped 22 previous similar messages
INFO: task mdtest:2717 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdtest D 0000000000000002 0 2717 2714 0x00000080
ffff880318ddfde8 0000000000000086 0000000000000000 ffff880318ddfdac
00000010a06391b0 ffff88033fc24700 ffff880032e35f80 00000001012de486
ffff8802ec327038 ffff880318ddffd8 000000000000f598 ffff8802ec327038
Call Trace:
[<ffffffff814dc4be>] __mutex_lock_slowpath+0x13e/0x180
[<ffffffff814dc35b>] mutex_lock+0x2b/0x50
[<ffffffff81181d80>] lookup_create+0x30/0xd0
[<ffffffff81181e85>] sys_mkdirat+0x65/0x120
[<ffffffff810d1b52>] ? audit_syscall_entry+0x272/0x2a0
[<ffffffff81181f58>] sys_mkdir+0x18/0x20
[<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
|