Details
-
Bug
-
Resolution: Not a Bug
-
Major
-
None
-
Lustre 2.1.3
-
None
-
3
-
10076
Description
Newly deployed filesystem. Running initial test rsync from client to filesystem when rsync process hung.
Also see other weirdness. Poor performance. Network errors (example below).
Lustre: 16691:0:(client.c:1780:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1347583367/real 1347583367] req@ffff880664530c00 x1413043208599797/t0(0) o8->alicefs-OST0006-osc-ffff880832436000@172.20.2.125@o2ib:28/4 lens 368/512 e 0 to 1 dl 1347583422 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 16691:0:(client.c:1780:ptlrpc_expire_one_request()) Skipped 5 previous similar messages
Lustre: 16691:0:(client.c:1780:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1347583517/real 1347583517] req@ffff8808329b1800 x1413043208681668/t0(0) o8->alicefs-OST0006-osc-ffff880832436000@172.20.2.125@o2ib:28/4 lens 368/512 e 0 to 1 dl 1347583572 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 16691:0:(client.c:1780:ptlrpc_expire_one_request()) Skipped 11 previous similar messages
INFO: task rsync:16826 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rsync D 0000000000000002 0 16826 1 0x00000080
ffff8803df615978 0000000000000046 0000000000000000 ffff88043343bca0
ffff8808340ecb58 ffff880833270b40 ffff88043343bd38 ffff8803df615938
ffff88043405b058 ffff8803df615fd8 000000000000fb88 ffff88043405b058
Call Trace:
[<ffffffffa04c067e>] cfs_waitq_wait+0xe/0x10 [libcfs]
[<ffffffffa0a6c314>] lov_delete_raid0+0x2e4/0x400 [lov]
[<ffffffffa0a6b816>] lov_object_delete+0x66/0x190 [lov]
[<ffffffffa05e2d29>] lu_object_free+0x89/0x1b0 [obdclass]
[<ffffffffa04ce894>] ? cfs_hash_dual_bd_unlock+0x34/0x60 [libcfs]
[<ffffffffa04cf052>] ? cfs_hash_bd_from_key+0x42/0xe0 [libcfs]
[<ffffffffa05e300f>] lu_object_put+0x1bf/0x290 [obdclass]
[<ffffffffa0b4c312>] ? cl_inode_fini+0x52/0x270 [lustre]
[<ffffffffa05e9a1e>] cl_object_put+0xe/0x10 [obdclass]
[<ffffffffa0b4c34a>] cl_inode_fini+0x8a/0x270 [lustre]
[<ffffffffa0c1cbec>] ? lmv_change_cbdata+0x2cc/0x580 [lmv]
[<ffffffffa0b1c42f>] ll_clear_inode+0x2ff/0xdb0 [lustre]
[<ffffffff81195c0c>] clear_inode+0xac/0x140
[<ffffffffa0b1747b>] ll_delete_inode+0x18b/0x650 [lustre]
[<ffffffffa0b172f0>] ? ll_delete_inode+0x0/0x650 [lustre]
[<ffffffff8119630e>] generic_delete_inode+0xde/0x1d0
[<ffffffff81196465>] generic_drop_inode+0x65/0x80
[<ffffffff811952b2>] iput+0x62/0x70
[<ffffffffa0af38d1>] ll_d_iput+0x31/0x80 [lustre]
[<ffffffff81191dfc>] dentry_iput+0x7c/0x100
[<ffffffff81191f71>] d_kill+0x31/0x60
[<ffffffff8119399c>] dput+0x7c/0x150
[<ffffffff8117caf9>] __fput+0x189/0x210
[<ffffffff8117cba5>] fput+0x25/0x30
[<ffffffff811785cd>] filp_close+0x5d/0x90
[<ffffffff8106e3bf>] put_files_struct+0x7f/0xf0
[<ffffffff8106e483>] exit_files+0x53/0x70
[<ffffffff810704fd>] do_exit+0x18d/0x870
[<ffffffff810d6b12>] ? audit_syscall_entry+0x272/0x2a0
[<ffffffff81070c38>] do_group_exit+0x58/0xd0
[<ffffffff81070cc7>] sys_exit_group+0x17/0x20
[<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
INFO: task rsync:16826 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rsync D 0000000000000002 0 16826 1 0x00000080
ffff8803df615978 0000000000000046 0000000000000000 ffff88043343bca0
ffff8808340ecb58 ffff880833270b40 ffff88043343bd38 ffff8803df615938
ffff88043405b058 ffff8803df615fd8 000000000000fb88 ffff88043405b058
Call Trace:
[<ffffffffa04c067e>] cfs_waitq_wait+0xe/0x10 [libcfs]
[<ffffffffa0a6c314>] lov_delete_raid0+0x2e4/0x400 [lov]
[<ffffffffa0a6b816>] lov_object_delete+0x66/0x190 [lov]
[<ffffffffa05e2d29>] lu_object_free+0x89/0x1b0 [obdclass]
[<ffffffffa04ce894>] ? cfs_hash_dual_bd_unlock+0x34/0x60 [libcfs]
[<ffffffffa04cf052>] ? cfs_hash_bd_from_key+0x42/0xe0 [libcfs]
[<ffffffffa05e300f>] lu_object_put+0x1bf/0x290 [obdclass]
[<ffffffffa0b4c312>] ? cl_inode_fini+0x52/0x270 [lustre]
[<ffffffffa05e9a1e>] cl_object_put+0xe/0x10 [obdclass]
[<ffffffffa0b4c34a>] cl_inode_fini+0x8a/0x270 [lustre]
[<ffffffffa0c1cbec>] ? lmv_change_cbdata+0x2cc/0x580 [lmv]
[<ffffffffa0b1c42f>] ll_clear_inode+0x2ff/0xdb0 [lustre]
[<ffffffff81195c0c>] clear_inode+0xac/0x140
[<ffffffffa0b1747b>] ll_delete_inode+0x18b/0x650 [lustre]
[<ffffffffa0b172f0>] ? ll_delete_inode+0x0/0x650 [lustre]
[<ffffffff8119630e>] generic_delete_inode+0xde/0x1d0
[<ffffffff81196465>] generic_drop_inode+0x65/0x80
[<ffffffff811952b2>] iput+0x62/0x70
[<ffffffffa0af38d1>] ll_d_iput+0x31/0x80 [lustre]
[<ffffffff81191dfc>] dentry_iput+0x7c/0x100
[<ffffffff81191f71>] d_kill+0x31/0x60
[<ffffffff8119399c>] dput+0x7c/0x150
[<ffffffff8117caf9>] __fput+0x189/0x210
[<ffffffff8117cba5>] fput+0x25/0x30
[<ffffffff811785cd>] filp_close+0x5d/0x90
[<ffffffff8106e3bf>] put_files_struct+0x7f/0xf0
[<ffffffff8106e483>] exit_files+0x53/0x70
[<ffffffff810704fd>] do_exit+0x18d/0x870
[<ffffffff810d6b12>] ? audit_syscall_entry+0x272/0x2a0
[<ffffffff81070c38>] do_group_exit+0x58/0xd0
[<ffffffff81070cc7>] sys_exit_group+0x17/0x20
[<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b