[LU-13123] Add list of client NIDs to job_stats output Created: 10/Jan/20  Updated: 02/May/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Andreas Dilger Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: medium

Issue Links:
Related
is related to LU-11407 Improve stats data Resolved
is related to LU-12872 Adding more stats into JOBSTATS Resolved
Rank (Obsolete): 9223372036854775807

 Description   

It would be useful for debugging performance problems if a list of client NIDs were included in the "obdfilter.*.job_stats" output, so that it is possible to isolate which client(s) are generating a bad IO workload.

At the most basic level it would just print the list of NIDs in no particular order. Storing the NIDs in an rbtree or hash table, rather than a linked list, would likely make it faster to check on each new RPC whether the NID is already in the list.
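As a rough illustration of the hash-based lookup, here is a minimal user-space sketch in Python. It is not Lustre code: the class name JobNidTracker, its methods, and the job ID and NID strings are hypothetical, and an in-kernel version would use the kernel's own hash or rbtree primitives instead.

```python
from collections import defaultdict


class JobNidTracker:
    """Hypothetical per-job NID tracker (illustrative only, not a Lustre API)."""

    def __init__(self):
        # Maps job ID -> set of NIDs seen for that job. A set is hash based,
        # so the "already in the list?" check does not scan every entry.
        self._nids = defaultdict(set)

    def record_rpc(self, job_id, nid):
        """Record that `nid` sent an RPC for `job_id`.

        Returns True if the NID was newly added, False if already known.
        """
        seen = self._nids[job_id]
        if nid in seen:
            return False
        seen.add(nid)
        return True

    def nids(self, job_id):
        """Return the NIDs recorded for `job_id`, in no particular order."""
        # .get() avoids creating an empty entry for an unknown job.
        return list(self._nids.get(job_id, ()))
```

The common case, an RPC from a client that is already recorded, costs a single hash lookup and no allocation.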

It would be desirable to print the NIDs as ranges (e.g. 10.0.101.22-75,10.0.102.45-92 or similar) so that the output is not so verbose.  This merging is only needed when the job_stats file is read, so it need not be done at stats collection time.
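The read-time merging could look roughly like the following Python sketch. It makes several assumptions: NIDs are IPv4 addresses with an optional "@network" suffix, only consecutive final octets within the same first three octets and network are merged, and the output follows the example format above rather than any existing Lustre NID list syntax. The function name compress_nids is hypothetical.

```python
from itertools import groupby


def compress_nids(nids):
    """Merge IPv4 NIDs into comma-separated ranges on the last octet.

    Hypothetical helper: assumes each NID looks like "a.b.c.d" or "a.b.c.d@net".
    """
    parsed = set()
    for nid in nids:
        addr, _, net = nid.partition("@")
        a, b, c, d = (int(x) for x in addr.split("."))
        parsed.add((net, a, b, c, d))  # the set also drops duplicates

    out = []
    # Sort numerically, then group by (network, first three octets).
    for (net, a, b, c), group in groupby(sorted(parsed), key=lambda t: t[:4]):
        hosts = [t[4] for t in group]
        runs = []
        start = prev = hosts[0]
        for h in hosts[1:]:
            if h == prev + 1:      # extends the current consecutive run
                prev = h
                continue
            runs.append((start, prev))
            start = prev = h
        runs.append((start, prev))

        suffix = f"@{net}" if net else ""
        for lo, hi in runs:
            last = str(lo) if lo == hi else f"{lo}-{hi}"
            out.append(f"{a}.{b}.{c}.{last}{suffix}")
    return ",".join(out)
```

Because sorting and merging happen only in this function, the collection path can keep the NIDs unordered, and the cost of producing ranges is paid only when the file is read.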


Generated at Sat Feb 10 02:58:34 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.