LU-3109: ZFS - very slow reads, OST watchdogs

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Blocker
    • None
    • Lustre 2.4.0
    • Hyperion/LLNL
    • 3
    • 7557

    Description

      Running IOR file-per-process, we observe that one or two of the OSTs have an excessive load compared to the others (load average of 112 vs. 0.1 on the rest).
      The system log shows a large number of watchdog timeouts. I/O is not failing, but rates are very slow.
      First watchdog (log attached):

      2013-04-04 11:26:05 LNet: Service thread pid 8074 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
      2013-04-04 11:26:05 Pid: 8074, comm: ll_ost_io00_018
      2013-04-04 11:26:05
      2013-04-04 11:26:05 Call Trace:
      2013-04-04 11:26:05  [<ffffffffa056cd40>] ? arc_read_nolock+0x530/0x810 [zfs]
      2013-04-04 11:26:05  [<ffffffffa04e45ac>] cv_wait_common+0x9c/0x1a0 [spl]
      2013-04-04 11:26:05  [<ffffffff81090990>] ? autoremove_wake_function+0x0/0x40
      2013-04-04 11:26:05  [<ffffffffa04e46e3>] __cv_wait+0x13/0x20 [spl]
      2013-04-04 11:26:05  [<ffffffffa060633b>] zio_wait+0xeb/0x160 [zfs]
      2013-04-04 11:26:05  [<ffffffffa057106d>] dbuf_read+0x3fd/0x720 [zfs]
      2013-04-04 11:26:06  [<ffffffffa0572c1b>] dbuf_prefetch+0x10b/0x2b0 [zfs]
      2013-04-04 11:26:06  [<ffffffffa0586381>] dmu_zfetch_dofetch+0xf1/0x160 [zfs]
      2013-04-04 11:26:06  [<ffffffffa0570280>] ? dbuf_read_done+0x0/0x110 [zfs]
      2013-04-04 11:26:06  [<ffffffffa0587211>] dmu_zfetch+0xaa1/0xe40 [zfs]
      2013-04-04 11:26:06  [<ffffffffa05710fa>] dbuf_read+0x48a/0x720 [zfs]
      2013-04-04 11:26:06  [<ffffffffa0578bc9>] dmu_buf_hold_array_by_dnode+0x179/0x570 [zfs]
      2013-04-04 11:26:06  [<ffffffffa0579b28>] dmu_buf_hold_array_by_bonus+0x68/0x90 [zfs]
      2013-04-04 11:26:06  [<ffffffffa0d4c95d>] osd_bufs_get+0x49d/0x9a0 [osd_zfs]
      2013-04-04 11:26:06  [<ffffffff81270f7c>] ? put_dec+0x10c/0x110
      2013-04-04 11:26:06  [<ffffffffa0723736>] ? lu_object_find+0x16/0x20 [obdclass]
      2013-04-04 11:26:06  [<ffffffffa0ded49f>] ofd_preprw_read+0x13f/0x7e0 [ofd]
      2013-04-04 11:26:06  [<ffffffffa0dedec5>] ofd_preprw+0x385/0x1190 [ofd]
      2013-04-04 11:26:06  [<ffffffffa0da739c>] obd_preprw+0x12c/0x3d0 [ost]
      2013-04-04 11:26:06  [<ffffffffa0dace80>] ost_brw_read+0xd00/0x12e0 [ost]
      2013-04-04 11:26:06  [<ffffffff812739b6>] ? vsnprintf+0x2b6/0x5f0
      2013-04-04 11:26:06  [<ffffffffa035127b>] ? cfs_set_ptldebug_header+0x2b/0xc0 [libcfs]
      2013-04-04 11:26:06  [<ffffffffa0361bdb>] ? libcfs_debug_vmsg2+0x50b/0xbb0 [libcfs]
      2013-04-04 11:26:06  [<ffffffffa08a2f4c>] ? lustre_msg_get_version+0x8c/0x100 [ptlrpc]
      2013-04-04 11:26:06  [<ffffffffa08a30a8>] ? lustre_msg_check_version+0xe8/0x100 [ptlrpc]
      2013-04-04 11:26:06  [<ffffffffa0db3a63>] ost_handle+0x2b53/0x46f0 [ost]
      2013-04-04 11:26:06  [<ffffffffa035e0e4>] ? libcfs_id2str+0x74/0xb0 [libcfs]
      2013-04-04 11:26:06  [<ffffffffa08b21ac>] ptlrpc_server_handle_request+0x41c/0xdf0 [ptlrpc]
      2013-04-04 11:26:06  [<ffffffffa03525de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
      2013-04-04 11:26:06  [<ffffffffa08a97e9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
      2013-04-04 11:26:06  [<ffffffff81052223>] ? __wake_up+0x53/0x70
      2013-04-04 11:26:06  [<ffffffffa08b36f5>] ptlrpc_main+0xb75/0x1870 [ptlrpc]
      2013-04-04 11:26:06  [<ffffffffa08b2b80>] ? ptlrpc_main+0x0/0x1870 [ptlrpc]
      2013-04-04 11:26:06  [<ffffffff8100c0ca>] child_rip+0xa/0x20
      2013-04-04 11:26:06  [<ffffffffa08b2b80>] ? ptlrpc_main+0x0/0x1870 [ptlrpc]
      2013-04-04 11:26:06  [<ffffffffa08b2b80>] ? ptlrpc_main+0x0/0x1870 [ptlrpc]
      2013-04-04 11:26:06  [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
      2013-04-04 11:26:06
      2013-04-04 11:26:06 LustreError: dumping log to /tmp/lustre-log.1365099965.8074
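
      The trace shows the ll_ost_io thread blocked in zio_wait() beneath dmu_zfetch()/dbuf_prefetch(), i.e. the ZFS file-level prefetcher waiting on synchronous reads in the bulk-read path. A quick way to check whether prefetch is implicated is to disable it on one affected OSS and repeat the read phase (a sketch, assuming the standard zfs_prefetch_disable module parameter is present in this ZFS build):

          # Current setting: 0 = prefetch enabled, 1 = disabled
          cat /sys/module/zfs/parameters/zfs_prefetch_disable

          # Temporarily disable file-level prefetch, re-run the IOR read phase, then restore
          echo 1 > /sys/module/zfs/parameters/zfs_prefetch_disable

          # Prefetch and ARC activity during the run
          cat /proc/spl/kstat/zfs/zfetchstats
          head -30 /proc/spl/kstat/zfs/arcstats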
      

      Attachments

      Issue Links

      Activity


            adilger Andreas Dilger added a comment -

            This is probably the same as LU-5278, which has patches that improve ZFS read performance significantly.

            morrone Christopher Morrone (Inactive) added a comment -

            rc10 is old; you should definitely upgrade to the latest rc of 0.6.0. 0.6.1 is a little TOO new, because packaging has changed there, and Lustre will need a little tweaking to find the new paths and things automatically. You can build by hand by giving the spl and zfs paths, but the latest 0.6.0 rc will just be easier.
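
            For reference, a minimal sketch of the "build by hand" option, pointing Lustre's configure at already-built SPL and ZFS source trees (the paths below are illustrative, not taken from this ticket):

                # Hypothetical locations of the SPL/ZFS source trees built on this node
                ./configure --with-spl=/usr/src/spl-0.6.0 \
                            --with-zfs=/usr/src/zfs-0.6.0
                make rpms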
            keith Keith Mannthey (Inactive) added a comment (edited) -

            oprofile didn't just work. LU-3111 filed.


            keith Keith Mannthey (Inactive) added a comment -

            It seems our zfs is 0.6.0-rc10, and there is a 0.6.1.
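
            To confirm which ZFS/SPL builds are actually loaded on the servers, something like the following should work (a sketch; exact package names may differ on this system):

                # Version of the loaded kernel modules
                modinfo zfs | grep -iw version
                modinfo spl | grep -iw version

                # Installed packages, if the modules came from RPMs
                rpm -qa | grep -E '^(zfs|spl)'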

            "normal"

            /proc/fs/lustre/obdfilter/lustre-OST0001/stats @ 1365124587.752293
            Name                      Cur.Count  Cur.Rate   #Events   Unit           last        min          avg        max
            read_bytes                0          0          1625295   [bytes]           0       4096    393794.95    1048576
            write_bytes               6568       437        652523    [bytes]  6880755712       4096   1047962.14    1048576
            get_info                  73         4          1538      
            set_info_async            0          0          3         
            connect                   0          0          125       
            statfs                    3          0          1226      
            create                    0          0          12        
            destroy                   34         2          768       
            sync                      39         2          768       
            preprw                    6568       437        2277818   
            commitrw                  6568       437        2277818   
            ping                      108        7          23612 
            

            "bad"

            /proc/fs/lustre/obdfilter/lustre-OST0003/stats @ 1365124582.616412
            Name                      Cur.Count  Cur.Rate   #Events   Unit           last        min          avg        max
            read_bytes                0          0          1226305   [bytes]           0       4096    521965.14    1048576
            write_bytes               5602       373        653713    [bytes]  5861138432       4096   1042516.96    1048576
            get_info                  127        8          1426      
            set_info_async            0          0          2         
            connect                   0          0          125       
            statfs                    3          0          1223      
            create                    0          0          12        
            destroy                   63         4          712       
            sync                      64         4          712       
            preprw                    5602       373        1880018   
            commitrw                  5602       373        1880018   
            ping                      95         6          23601   
            

            Looks like the "normal" server is doing more work (more total read_bytes and preprw/commitrw events over the same period).
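
            One way to track this divergence while IOR runs is to sample the obdfilter stats on each OSS at a fixed interval and compare the counters across OSTs (a sketch; the target name pattern and interval are illustrative):

                # Sample the read/write and preprw/commitrw counters for every OST on this OSS
                while true; do
                    date +%s.%N
                    for f in /proc/fs/lustre/obdfilter/lustre-OST*/stats; do
                        echo "== $f"
                        grep -E '^(read_bytes|write_bytes|preprw|commitrw)' "$f"
                    done
                    sleep 10
                done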


            cliffw Cliff White (Inactive) added a comment -

            We are using zfs-0.6.0-rc10.

            cliffw Cliff White (Inactive) added a comment -

            Alex, this is consistently reproducible on Hyperion - what data should we be looking at?

            People

              Assignee: wc-triage WC Triage
              Reporter: cliffw Cliff White (Inactive)
              Votes: 0
              Watchers: 7
