Restore missing proc information for LMT (LU-3300)

[LU-4259] No brw_stats on ZFS Created: 15/Nov/13  Updated: 18/Mar/15  Resolved: 05/Jan/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0, Lustre 2.7.0
Fix Version/s: Lustre 2.7.0

Type: Technical task Priority: Blocker
Reporter: Cliff White (Inactive) Assignee: Nathaniel Clark
Resolution: Fixed Votes: 0
Labels: prz, zfs

Issue Links:
Related
is related to LU-2261 Add cache stats to zfs osd Resolved
is related to LU-3300 Restore missing proc information for LMT Open
Rank (Obsolete): 11623

 Description   

ZFS mounts do not report the same statistics as reported by ldiskfs. It is impossible to determine IO size, IO rates or other values needed for troubleshooting and performance analysis.



 Comments   
Comment by Keith Mannthey (Inactive) [ 15/Nov/13 ]

You may be able to use a block-level tool like "iostat" to get at some of the information easily.

Comment by Cliff White (Inactive) [ 15/Nov/13 ]

No, iostat does not give the timing information offered by brw_stats, nor is it aware of Lustre IO.

Comment by Keith Mannthey (Inactive) [ 15/Nov/13 ]

iostat will tell you all sorts of timing information.

Comment by Andreas Dilger [ 28/Nov/13 ]

Many of the disk brw_stats might be available on an aggregate basis, if the plumbing is available in ZFS. The per-client and per-job brw_stats are much trickier because the ZFS IO is not allocated or submitted to disk until long after the service thread has completed processing the request.

The RPC information like "pages per bulk r/w" and "discontiguous pages" could be available independent of the OSD type. These should really be OFD statistics, maybe in a new "rpc_stats" file, possibly in YAML format? These could also be available on a per-client or per-request basis. This might be enough for your debugging purposes?

Information like "disk I/Os in flight", "I/O time", and "disk I/O size" might be obtainable on an aggregate basis from ZFS. The "discontiguous blocks" and "disk fragmented I/Os" would be much harder to collect for writes, without deep hooks into the ZFS IO scheduler. Some of this information could be extracted for reads, by hacking into the ZFS block pointers to get the physical disk blocks.
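
As a rough illustration of the OSD-independent RPC statistics suggested above, the following Python sketch tallies "pages per bulk r/w" into power-of-two buckets and renders them with percent and cumulative-percent columns in a YAML-like layout. It is not Lustre code. The names RpcHistogram, bucket, and to_yaml, the output field names, and the choice to round each value up to the next power of two are all assumptions made for this example.

```python
from collections import Counter


def bucket(value: int) -> int:
    """Round value (>= 1) up to the next power of two (assumed bucketing rule)."""
    if value < 1:
        raise ValueError("value must be >= 1")
    return 1 << (value - 1).bit_length()


class RpcHistogram:
    """Hypothetical per-target histogram of one RPC property."""

    def __init__(self) -> None:
        self.counts = Counter()

    def record(self, pages: int) -> None:
        # Tally one bulk RPC that carried `pages` pages.
        self.counts[bucket(pages)] += 1

    def rows(self):
        """Return (bucket, count, percent, cumulative percent) tuples."""
        total = sum(self.counts.values())
        out = []
        running = 0
        for b in sorted(self.counts):
            running += self.counts[b]
            out.append((b, self.counts[b],
                        100 * self.counts[b] // total,
                        100 * running // total))
        return out

    def to_yaml(self, name: str) -> str:
        # Field names here are illustrative, not an existing file format.
        lines = [f"{name}:"]
        for b, n, pct, cum in self.rows():
            lines.append(
                f"  - {{ size: {b}, rpcs: {n}, pct: {pct}, cum_pct: {cum} }}")
        return "\n".join(lines)


if __name__ == "__main__":
    hist = RpcHistogram()
    for pages in (1, 3, 4, 256):
        hist.record(pages)
    print(hist.to_yaml("pages_per_bulk_rw"))
```

Because the tally depends only on the incoming request, a counter of this kind could be kept per client or per job without any knowledge of when the backing filesystem eventually issues the disk I/O.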

Comment by Peter Jones [ 05/Aug/14 ]

Nathaniel is looking into this.

Comment by Nathaniel Clark [ 15/Aug/14 ]

http://review.whamcloud.com/11467

Comment by James A Simmons [ 23/Dec/14 ]

Patch has been rebased to latest master. Please review.

Comment by Gerrit Updater [ 03/Jan/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/11467/
Subject: LU-4259 osd-zfs: Add brw_stats collection and display
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: efdacb72afa55fc8b5ad155c479aa34ef0af118a
