[LU-7880] add performance statistics to obd_statfs Created: 15/Mar/16 Updated: 04/Oct/23 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major |
| Reporter: | Andreas Dilger | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | medium |
| Issue Links: |
|
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
In order to facilitate transfer of OST and MDT performance statistics for userspace applications, such as global NRS scheduling, SCR checkpoint scheduling, QOS and allocation decisions on the MDS, etc., it is useful to transport them via obd_statfs to the clients. The statistics should include <peak, decaying average of current> <IOPS read, IOPS write, KiB/s read, KiB/s write>. The OSS and MDS already collect these statistics for presentation via /proc, and it should be possible to include them in struct obd_statfs in newly added fields at the end of the struct. The stats should be fetched and printed with the lfs df --stats command for all targets, but not necessarily for regular statfs() requests. With |
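The <peak, decaying average> by <IOPS read, IOPS write, KiB/s read, KiB/s write> matrix described above could be laid out roughly as follows. This is only a sketch: the names statfs_perf, statfs_reply_sketch, perf_metric and statfs_perf_get are hypothetical, the 64-bit field width is an assumption, and the stand-in reply struct does not reproduce the real struct obd_statfs layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical indices for the four tracked quantities. */
enum perf_metric {
	PERF_IOPS_READ = 0,
	PERF_IOPS_WRITE,
	PERF_KIBPS_READ,
	PERF_KIBPS_WRITE,
	PERF_METRIC_COUNT
};

/* Hypothetical block of new fields: one peak and one decaying average
 * per metric. The field width (64 bits) is an assumption. */
struct statfs_perf {
	uint64_t sp_peak[PERF_METRIC_COUNT];
	uint64_t sp_avg[PERF_METRIC_COUNT];
};

/* Stand-in for a statfs reply; NOT the real struct obd_statfs.
 * It only shows the new fields being appended at the end. */
struct statfs_reply_sketch {
	uint64_t sr_blocks;
	uint64_t sr_bfree;
	struct statfs_perf sr_perf;	/* newly added fields, last in struct */
};

/* Return the peak (want_peak != 0) or decaying average for one metric. */
uint64_t statfs_perf_get(const struct statfs_perf *sp, int want_peak,
			 enum perf_metric m)
{
	assert(sp != NULL && m < PERF_METRIC_COUNT);
	return want_peak ? sp->sp_peak[m] : sp->sp_avg[m];
}
```

Keeping the eight values in one trailing block means a consumer such as the proposed lfs df --stats could read them without touching the existing space and inode fields.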
| Comments |
| Comment by Alex Zhuravlev [ 12/Dec/18 ] |
|
Do I understand correctly that OFD should track performance on its own? Something like a separate thread (or timer-driven callback) collecting stats from OSD and maintaining a history of average/peak throughput and RPC rate? AFAIU, we don't track the average for the last few seconds, just the average since start or reset.
|
| Comment by Andreas Dilger [ 03/Apr/19 ] |
|
We already track stats on the OST and MDT for RPCs, read/write calls with min/max duration, and read_bytes/write_bytes with sums. It should be fairly straightforward to use the existing stats counters to generate peak performance and decaying average performance, either directly or by doing simple delta calculations when statfs is called (e.g. save the last time and last stats and do a simple rate calculation over the past minute or whatever). The MDS is already calling statfs in the background every 5s, so that is often enough to keep this updated. |
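The delta calculation described in this comment might look like the sketch below: save the last time and the last cumulative counter, compute a rate on each statfs call, and fold that rate into a peak and a decaying average. The struct and function names are hypothetical, and the 1/4 weight given to each new sample is an arbitrary assumption, not something specified in this ticket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-target state for one counter (e.g. the read_bytes sum). */
struct perf_tracker {
	uint64_t pt_last_time;	/* seconds, time of the previous sample */
	uint64_t pt_last_count;	/* cumulative counter at the previous sample */
	uint64_t pt_peak;	/* highest rate seen so far, units per second */
	uint64_t pt_avg;	/* decaying average rate, units per second */
	unsigned int pt_samples; /* 0: no baseline, 1: baseline, 2+: avg seeded */
};

/* Called whenever statfs is served, with the current cumulative counter. */
void perf_tracker_update(struct perf_tracker *pt, uint64_t now, uint64_t count)
{
	uint64_t rate;

	assert(pt != NULL);

	/* First call, or the counter was reset: just record a new baseline. */
	if (pt->pt_samples == 0 || count < pt->pt_last_count) {
		pt->pt_last_time = now;
		pt->pt_last_count = count;
		if (pt->pt_samples == 0)
			pt->pt_samples = 1;
		return;
	}

	/* No time has passed (or the clock went backwards): skip this sample. */
	if (now <= pt->pt_last_time)
		return;

	rate = (count - pt->pt_last_count) / (now - pt->pt_last_time);

	if (rate > pt->pt_peak)
		pt->pt_peak = rate;

	if (pt->pt_samples == 1) {
		pt->pt_avg = rate;	/* seed the average with the first rate */
		pt->pt_samples = 2;
	} else {
		/* Exponential decay: new sample gets weight 1/4 (assumed). */
		pt->pt_avg = (3 * pt->pt_avg + rate) / 4;
	}

	pt->pt_last_time = now;
	pt->pt_last_count = count;
}
```

With the MDS calling statfs every 5s, a counter that grows by 5000 in one interval and by 1000 in the next yields rates of 1000/s and 200/s, so the peak stays at 1000/s while the average decays from 1000/s to 800/s.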
| Comment by Nathan Rutman [ 04/Oct/23 ] |
|
I suppose you can get instantaneous rates for the last 5 seconds if you only record the stats when called by the MDT. I think 60-second averages are more useful so we don't have to poll statfs so often; I suppose we could record the stats only if the last record is more than 60 seconds old, so we would effectively have 60-second epochs. |
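A minimal sketch of the 60-second epoch idea from this comment follows. Every statfs call polls the tracker, but a new record is taken only once the previous one is at least 60 seconds old, so the frequent 5s polls are ignored in between. The names epoch_stats, epoch_stats_poll and EPOCH_SECONDS are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EPOCH_SECONDS 60	/* epoch length suggested in the comment */

/* Hypothetical state for one cumulative counter. */
struct epoch_stats {
	uint64_t es_start_time;	 /* seconds, when the current epoch began */
	uint64_t es_start_count; /* cumulative counter at the epoch start */
	uint64_t es_rate;	 /* average rate over the last closed epoch */
	int es_valid;		 /* nonzero once a baseline has been recorded */
};

/* Poll on every statfs call. Returns 1 if an epoch was closed and
 * es_rate was updated, 0 if the call was ignored or only set a baseline. */
int epoch_stats_poll(struct epoch_stats *es, uint64_t now, uint64_t count)
{
	uint64_t elapsed;

	assert(es != NULL);

	/* First call, clock step backwards, or counter reset: new baseline. */
	if (!es->es_valid || now < es->es_start_time ||
	    count < es->es_start_count) {
		es->es_start_time = now;
		es->es_start_count = count;
		es->es_valid = 1;
		return 0;
	}

	elapsed = now - es->es_start_time;
	if (elapsed < EPOCH_SECONDS)
		return 0;	/* last record is too recent, e.g. a 5s poll */

	/* Close the epoch: average rate since the last record. */
	es->es_rate = (count - es->es_start_count) / elapsed;
	es->es_start_time = now;
	es->es_start_count = count;
	return 1;
}
```

Dividing by the actual elapsed time rather than a fixed 60 keeps the rate correct when the poll that closes an epoch arrives late.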