[LU-12631] Report latency of client operations Created: 06/Aug/19  Updated: 28/Oct/22  Resolved: 12/Nov/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.14.0

Type: Improvement Priority: Minor
Reporter: Andreas Dilger Assignee: Andreas Dilger
Resolution: Fixed Votes: 0
Labels: LTS12

Issue Links:
Related
is related to LU-13733 report client stats sumsq Resolved
is related to LU-13597 add processing time/latency, IO sizes... Resolved

Description

Add statistics on the client to measure and report latency of operations.

The current llite.*.stats file reports only operation counts, not the time taken by each operation. It should be simple to add min/max/sum/sumsq latency statistics for these metrics, including new read and write entries for latency; the existing read_bytes and write_bytes entries should be kept. It might make sense to report sync writes separately as write_sync (when file->f_flags has O_DIRECT or O_SYNC set), since their latency profile will be quite different from that of cached writes.
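
As an illustration of the min/max/sum/sumsq bookkeeping described above, here is a minimal userspace sketch. The names lat_stats, lat_stats_add() and is_sync_write() are hypothetical and are not taken from the Lustre source; the real counters live in the lprocfs stats code and may be laid out differently.

```c
#define _GNU_SOURCE		/* exposes O_DIRECT on Linux/glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#ifndef O_DIRECT
#define O_DIRECT 0		/* assumption: platforms without O_DIRECT */
#endif

/* Hypothetical per-operation latency counter (not the lprocfs struct). */
struct lat_stats {
	uint64_t count;		/* number of samples recorded */
	uint64_t min;		/* smallest sample; valid only if count > 0 */
	uint64_t max;		/* largest sample */
	uint64_t sum;		/* sum of samples, for the mean */
	uint64_t sumsq;		/* sum of squared samples, for the stddev */
};

/* Fold one latency sample (in usec) into the counter. */
static void lat_stats_add(struct lat_stats *s, uint64_t usec)
{
	assert(s != NULL);
	if (s->count == 0 || usec < s->min)
		s->min = usec;
	if (usec > s->max)
		s->max = usec;
	s->count++;
	s->sum += usec;
	s->sumsq += usec * usec;
}

/* True if a write with these open flags would be counted as write_sync. */
static bool is_sync_write(int f_flags)
{
	return (f_flags & (O_DIRECT | O_SYNC)) != 0;
}
```

Keeping sum and sumsq alongside count is enough to derive the mean and standard deviation later, so no per-sample history has to be stored.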



Comments
Comment by Andreas Dilger [ 07/Aug/19 ]

This should be done in llite_opcode_table by adding LPROCFS_CNTR_AVGMINMAX to all of the fields there, and maybe LPROCFS_CNTR_STDDEV to the main ones that correspond to actual userspace VFS operations, not necessarily to "internal" stats like LPROC_LL_ALLOC_INODE, LPROC_LL_GETXATTR_HITS, and LPROC_LL_INODE_PERM. The LPROCFS_TYPE_REGS type can be changed to LPROCFS_TYPE_USEC for the "usec" units, since it isn't used anywhere.

We need to record the start and end time of each operation using ktime_get(), and convert the times to usec units only when they are printed.
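
The "record in native units, convert on print" idea can be sketched in userspace as follows. This is only an analogy: clock_gettime(CLOCK_MONOTONIC) stands in for the kernel's ktime_get(), and op_timer, op_timer_record() and op_timer_format() are hypothetical names. The output line is illustrative and is not the actual llite.*.stats format.

```c
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Userspace stand-in for ktime_get(): monotonic time in nanoseconds. */
static uint64_t mono_now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Hypothetical accumulator that keeps time in nanoseconds. */
struct op_timer {
	uint64_t count;
	uint64_t total_ns;
};

/* Record one operation from its start and end timestamps. */
static void op_timer_record(struct op_timer *t, uint64_t start_ns,
			    uint64_t end_ns)
{
	assert(end_ns >= start_ns);
	t->count++;
	t->total_ns += end_ns - start_ns;
}

/* The only place where nanoseconds are converted to microseconds. */
static uint64_t ns_to_usec(uint64_t ns)
{
	return ns / 1000;
}

/* Format the counter for display, converting to usec at print time. */
static int op_timer_format(const struct op_timer *t, const char *name,
			   char *buf, size_t len)
{
	return snprintf(buf, len, "%s %llu samples [usec] sum %llu", name,
			(unsigned long long)t->count,
			(unsigned long long)ns_to_usec(t->total_ns));
}
```

A caller would take mono_now_ns() before and after the operation and pass both values to op_timer_record(). Deferring the division keeps the hot path cheap and avoids truncating each sample separately: samples of 2500 ns and 700 ns sum to 3200 ns, which prints as 3 usec, whereas converting each sample first would give 2 + 0 = 2 usec.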

Comment by Peter Jones [ 09/Aug/19 ]

Jian

Could you please assist with this?

Thanks

Peter

Comment by Gerrit Updater [ 06/Sep/19 ]

Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36078
Subject: LU-12631 llite: report latency for filesystem ops
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: c2acdf859997d91efbedfbfc80ece00323ee636e

Comment by Gerrit Updater [ 12/Nov/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36078/
Subject: LU-12631 llite: report latency for filesystem ops
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: ea58c4cfb0fc255befbbb7754bd4ed71704a2a2c

Comment by Peter Jones [ 12/Nov/19 ]

Landed for 2.14
