[LU-8849] Client hangs on 'lfs ladvise' with large input values Created: 18/Nov/16 Updated: 22/Jul/17 Resolved: 22/Jul/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.9.0 |
| Fix Version/s: | Lustre 2.11.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | James Nunez (Inactive) | Assignee: | Li Xi (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: | eagle cluster with lustre-master 2.8.60 |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
I was playing around with input values for 'lfs ladvise' to see what maximums I could send as inputs. The following commands either completed with no obvious errors or returned an error:
# lfs ladvise -a willread /lustre/scratch/tfile --length 5000000000
# lfs ladvise -a willread /lustre/scratch/tfile --length 50000000009999999999999
# lfs ladvise -a willread /lustre/scratch/space_file2 --length 500000000099M
# lfs ladvise -a willread /lustre/scratch/space_file2 --length 50000000009999M
ladvise: bad length '50000000009999M'
# lfs ladvise -a willread /lustre/scratch/space_file2 --length 500000000099G
ladvise: bad length '500000000099G'
# lfs ladvise -a willread /lustre/scratch/space_file2 --length 500000000099k -s 0
# lfs ladvise -a willread /lustre/scratch/space_file2 --length 500000000099k -s 1M
Then I tried one command too many and hung the client with:
# lfs ladvise -a willread /lustre/scratch/space_file2 --length 500000000099k -s 500000000099k
The file resides on OST0:
# lfs getstripe /lustre/scratch/space_file2
/lustre/scratch/space_file2
lmm_stripe_count:   1
lmm_stripe_size:    1048576
lmm_pattern:        1
lmm_layout_gen:     0
lmm_stripe_offset:  0
        obdidx           objid           objid           group
             0              15             0xf               0
ls -l /lustre/scratch/space_file2
-rw-r--r-- 1 root root 104857600 Nov 17 21:55 /lustre/scratch/space_file2
dmesg from the client:
INFO: task lfs:26850 blocked for more than 120 seconds.
Not tainted 2.6.32-642.6.1.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
lfs D 0000000000000000 0 26850 15016 0x00000084
ffff880079cbbaf8 0000000000000086 0000000000000000 ffff88007851a040
ffff88007851a078 ffff880078503bc0 00001e14781a34aa 0000000000000000
ffff88007c01f558 0000000101f41458 ffff88007c01fad8 ffff880079cbbfd8
Call Trace:
[<ffffffff815498b5>] schedule_timeout+0x215/0x2e0
[<ffffffffa07765d3>] ? ptlrpc_set_add_new_req+0xe3/0x160 [ptlrpc]
[<ffffffff81549513>] wait_for_common+0x123/0x180
[<ffffffff8106c500>] ? default_wake_function+0x0/0x20
[<ffffffffa09bb3e0>] ? lov_io_end_wrapper+0x0/0x100 [lov]
[<ffffffff8154964d>] wait_for_completion+0x1d/0x20
[<ffffffffa0b336f0>] osc_io_ladvise_end+0x40/0x60 [osc]
[<ffffffffa05e7a30>] cl_io_end+0x60/0x150 [obdclass]
[<ffffffffa05e8410>] ? cl_io_start+0x0/0x140 [obdclass]
[<ffffffffa09bb4d1>] lov_io_end_wrapper+0xf1/0x100 [lov]
[<ffffffffa09bb228>] lov_io_call+0x88/0x120 [lov]
[<ffffffffa09bccac>] lov_io_end+0x4c/0xf0 [lov]
[<ffffffffa05e7a30>] cl_io_end+0x60/0x150 [obdclass]
[<ffffffffa05eb362>] cl_io_loop+0xc2/0x1b0 [obdclass]
[<ffffffffa0a33ade>] ll_file_ioctl+0x2ece/0x3eb0 [lustre]
[<ffffffff8115a409>] ? handle_mm_fault+0x299/0x3d0
[<ffffffff81052204>] ? __do_page_fault+0x1f4/0x500
[<ffffffff811af522>] vfs_ioctl+0x22/0xa0
[<ffffffff811af9ea>] do_vfs_ioctl+0x3aa/0x580
[<ffffffff811afc41>] sys_ioctl+0x81/0xa0
[<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
On OST0 dmesg:
LNet: Service thread pid 7920 was inactive for 40.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Pid: 7920, comm: ll_ost_io00_012
Call Trace:
[<ffffffff8113e758>] ? __alloc_pages_nodemask+0x548/0x950
[<ffffffff81177d9a>] ? alloc_pages_current+0xaa/0x110
[<ffffffff8112eca7>] ? __page_cache_alloc+0x87/0x90
[<ffffffff8112fa3f>] ? find_or_create_page+0x4f/0xb0
[<ffffffffa0d3a412>] ? osd_bufs_get+0x2b2/0x4b0 [osd_ldiskfs]
[<ffffffffa0e96fd0>] ? ofd_ladvise_hdl+0x8f0/0xdc0 [ofd]
[<ffffffffa0840201>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[<ffffffffa08a449c>] ? tgt_request_handle+0x8ec/0x1440 [ptlrpc]
[<ffffffffa0850ba1>] ? ptlrpc_main+0xd31/0x1800 [ptlrpc]
[<ffffffffa084fe70>] ? ptlrpc_main+0x0/0x1800 [ptlrpc]
[<ffffffff810a640e>] ? kthread+0x9e/0xc0
[<ffffffff8100c28a>] ? child_rip+0xa/0x20
[<ffffffff810a6370>] ? kthread+0x0/0xc0
[<ffffffff8100c280>] ? child_rip+0x0/0x20
LustreError: dumping log to /tmp/lustre-log.1479431579.7920
After running some tests on a different client to see if the file system was responsive, dmesg on OST0 has:
Lustre: 7912:0:(service.c:1331:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff88007ace3450 x1551266786862416/t0(0) o21->5959cf4b-6c57-f313-be15-ccf7ea98876e@10.100.4.158@tcp:-1/-1 lens 504/432 e 1 to 0 dl 1479432506 ref 2 fl Interpret:/0/0 rc 0/0
Lustre: scratch-OST0000: Client 5959cf4b-6c57-f313-be15-ccf7ea98876e (at 10.100.4.158@tcp) reconnecting
Lustre: scratch-OST0000: Connection restored to 5959cf4b-6c57-f313-be15-ccf7ea98876e (at 10.100.4.158@tcp)
LNet: Service thread pid 31614 was inactive for 762.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Pid: 31614, comm: ll_ost_io00_019
Call Trace:
[<ffffffff8107405a>] __cond_resched+0x2a/0x40
[<ffffffff8113e339>] ? __alloc_pages_nodemask+0x129/0x950
[<ffffffff8117f10c>] ? transfer_objects+0x5c/0x80
[<ffffffff8118d2a5>] ? mem_cgroup_charge_common+0xa5/0xd0
[<ffffffff8118e0b0>] ? mem_cgroup_cache_charge+0xc0/0xd0
[<ffffffff81177d9a>] ? alloc_pages_current+0xaa/0x110
[<ffffffff8112eca7>] ? __page_cache_alloc+0x87/0x90
[<ffffffff8112fa3f>] ? find_or_create_page+0x4f/0xb0
[<ffffffffa0d3a412>] ? osd_bufs_get+0x2b2/0x4b0 [osd_ldiskfs]
[<ffffffffa0e96fd0>] ? ofd_ladvise_hdl+0x8f0/0xdc0 [ofd]
[<ffffffffa0840201>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[<ffffffffa08a449c>] ? tgt_request_handle+0x8ec/0x1440 [ptlrpc]
[<ffffffffa0850ba1>] ? ptlrpc_main+0xd31/0x1800 [ptlrpc]
[<ffffffffa084fe70>] ? ptlrpc_main+0x0/0x1800 [ptlrpc]
[<ffffffff810a640e>] ? kthread+0x9e/0xc0
[<ffffffff8100c28a>] ? child_rip+0xa/0x20
[<ffffffff810a6370>] ? kthread+0x0/0xc0
[<ffffffff8100c280>] ? child_rip+0x0/0x20
LustreError: dumping log to /tmp/lustre-log.1479432668.31614
Lustre: scratch-OST0000: Client 28158de5-48d2-2404-815b-3bb36c5b7839 (at 10.100.4.157@tcp) reconnecting
Lustre: scratch-OST0000: Connection restored to 28158de5-48d2-2404-815b-3bb36c5b7839 (at 10.100.4.157@tcp)
I do not know how repeatable this is. I tried another 'lfs ladvise' call on a different node and it hung as well.
On this node I ran the following commands until the last one hung:
# lfs ladvise -a willread /lustre/scratch/space_file2 --length 500000000099k -s 0
[root@eagle-48vm5 ~]# lfs ladvise -a willread /lustre/scratch/space_file2 --length 500000000099k -s 0k
[root@eagle-48vm5 ~]# lfs ladvise -a willread /lustre/scratch/space_file2 --length 500000000099k -s 0M
[root@eagle-48vm5 ~]# lfs ladvise -a willread /lustre/scratch/space_file2 --length 500000000099k -s 0T
[root@eagle-48vm5 ~]# lfs ladvise -a willread /lustre/scratch/space_file2 --length 500000000099k -s 0G
[root@eagle-48vm5 ~]# lfs ladvise -a willread /lustre/scratch/space_file2 --length 500000000099k -s 1G
With dmesg on this client:
Lustre: 7929:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1479431906/real 1479431906] req@ffff88007bd859c0 x1551266786862416/t0(0) o21->scratch-OST0000-osc-ffff880078ae4800@10.100.4.155@tcp:6/4 lens 504/432 e 1 to 1 dl 1479432507 ref 1 fl Rpc:X/0/ffffffff rc 0/-1
Lustre: scratch-OST0000-osc-ffff880078ae4800: Connection to scratch-OST0000 (at 10.100.4.155@tcp) was lost; in progress operations using this service will wait for recovery to complete
Lustre: scratch-OST0000-osc-ffff880078ae4800: Connection restored to 10.100.4.155@tcp (at 10.100.4.155@tcp)
INFO: task lfs:26783 blocked for more than 120 seconds.
Not tainted 2.6.32-642.6.1.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
lfs D 0000000000000001 0 26783 15019 0x00000084
ffff88007b5a7af8 0000000000000086 ffff88007b5a7ac0 ffff88007b5a7abc
0000000000000020 ffff88007f824380 00001e649680e0ec ffff880002216ec0
0000000000000400 0000000101f954bb ffff880037abe5f8 ffff88007b5a7fd8
Call Trace:
[<ffffffff815498b5>] schedule_timeout+0x215/0x2e0
[<ffffffffa07765d3>] ? ptlrpc_set_add_new_req+0xe3/0x160 [ptlrpc]
[<ffffffffa07ac493>] ? ptlrpcd_add_req+0x1e3/0x2a0 [ptlrpc]
[<ffffffffa078e0c0>] ? lustre_swab_ladvise+0x0/0x50 [ptlrpc]
[<ffffffff81549513>] wait_for_common+0x123/0x180
[<ffffffff8106c500>] ? default_wake_function+0x0/0x20
[<ffffffffa09bb3e0>] ? lov_io_end_wrapper+0x0/0x100 [lov]
[<ffffffff8154964d>] wait_for_completion+0x1d/0x20
[<ffffffffa0b336f0>] osc_io_ladvise_end+0x40/0x60 [osc]
[<ffffffffa05e7a30>] cl_io_end+0x60/0x150 [obdclass]
[<ffffffffa05e8410>] ? cl_io_start+0x0/0x140 [obdclass]
[<ffffffffa09bb4d1>] lov_io_end_wrapper+0xf1/0x100 [lov]
[<ffffffffa09bb228>] lov_io_call+0x88/0x120 [lov]
[<ffffffffa09bccac>] lov_io_end+0x4c/0xf0 [lov]
[<ffffffffa05e7a30>] cl_io_end+0x60/0x150 [obdclass]
[<ffffffffa05eb362>] cl_io_loop+0xc2/0x1b0 [obdclass]
[<ffffffffa0a33ade>] ll_file_ioctl+0x2ece/0x3eb0 [lustre]
[<ffffffff8115a409>] ? handle_mm_fault+0x299/0x3d0
[<ffffffff81052204>] ? __do_page_fault+0x1f4/0x500
[<ffffffff811af522>] vfs_ioctl+0x22/0xa0
[<ffffffff811af9ea>] do_vfs_ioctl+0x3aa/0x580
[<ffffffff811afc41>] sys_ioctl+0x81/0xa0
[<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
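Both hangs show the same pattern: a --length/-s value with a size suffix is either rejected by lfs as a bad length, or accepted and sent to the OST, which then sits in ofd_ladvise_hdl allocating pages for a range vastly larger than the 100 MB file. The accept/reject behaviour above is consistent with a suffix parser that refuses values whose scaled result overflows 64 bits. A minimal sketch of that idea, for illustration only (this is not the actual lfs parser; parse_size_suffix is a hypothetical helper):
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
/* Hypothetical helper: parse "<number>[kMGT]" into a byte count,
 * rejecting anything whose scaled value does not fit in 64 bits. */
static int parse_size_suffix(const char *arg, uint64_t *bytes)
{
        char *end = NULL;
        uint64_t value;
        unsigned int shift = 0;
        errno = 0;
        value = strtoull(arg, &end, 0);
        if (errno != 0 || end == arg)
                return -EINVAL;
        switch (*end) {
        case '\0':          break;
        case 'k': case 'K': shift = 10; break;
        case 'm': case 'M': shift = 20; break;
        case 'g': case 'G': shift = 30; break;
        case 't': case 'T': shift = 40; break;
        default:            return -EINVAL;
        }
        /* Refuse values whose shifted result would overflow 64 bits;
         * this rejects '50000000009999M' and '500000000099G' but still
         * accepts '500000000099k' (~466 TiB), which fits in 64 bits. */
        if (shift && value > (UINT64_MAX >> shift))
                return -EINVAL;
        *bytes = value << shift;
        return 0;
}
Note that 500000000099k passes such a check even though it is far beyond the end of the file, so the advice range still has to be bounded somewhere on the server side.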
|
| Comments |
| Comment by Andreas Dilger [ 18/Nov/16 ] |
|
James, is this ldiskfs or ZFS? Not sure that it matters, but just in case. Li Xi, can you please make a patch that limits the ladvise start and end to the object's size, so that it doesn't try filling pages beyond the end of the object? Ideally, we would also use FIEMAP or SEEK_HOLE/SEEK_DATA to skip holes in the object. |
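As a rough illustration of the clamp being asked for here (hypothetical names, a sketch only; the actual change is the Gerrit patch referenced below and may differ), the server-side handler could bound the advice range by the object's size before touching any pages:
#include <linux/kernel.h>   /* min_t() */
#include <linux/types.h>    /* loff_t, __u64 */
/* Hypothetical sketch: bound a client-supplied [start, end] advice
 * range by the object's current size before walking pages, so a
 * willread hint far past EOF degenerates to a no-op instead of
 * allocating page-cache pages indefinitely. */
static void ladvise_clamp_range(loff_t isize, __u64 *start, __u64 *end)
{
        if (*start >= (__u64)isize) {
                /* Whole range lies beyond EOF: nothing to prefetch. */
                *start = *end = (__u64)isize;
                return;
        }
        *end = min_t(__u64, *end, (__u64)isize);
}
The FIEMAP or SEEK_HOLE/SEEK_DATA refinement suggested above would go a step further and skip unallocated extents inside the clamped range. |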
| Comment by James Nunez (Inactive) [ 18/Nov/16 ] |
|
All testing reported here is with ldiskfs. |
| Comment by Andreas Dilger [ 25/Feb/17 ] |
|
Li Xi, any update on this? |
| Comment by Li Xi (Inactive) [ 19/May/17 ] |
|
Sorry for the delay. We will work on this. |
| Comment by Gerrit Updater [ 19/May/17 ] |
|
Yingjin Qian (qian@ddn.com) uploaded a new patch: https://review.whamcloud.com/27209 |
| Comment by Gerrit Updater [ 22/Jul/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/27209/ |
| Comment by Minh Diep [ 22/Jul/17 ] |
|
Fix landed in Lustre 2.11.0. 