[LU-9221] Create pid-based hash to enhance Jobstats performance Created: 16/Mar/17  Updated: 25/Apr/19  Resolved: 21/Sep/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.9.0
Fix Version/s: Lustre 2.11.0

Type: Improvement Priority: Minor
Reporter: Ben Evans (Inactive) Assignee: Ben Evans (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Attachments: JobID Cache Proposal.docx, JobID Cache Proposal.docx
Issue Links:
Related
is related to LUDOC-381 Improve documentation for jobstats Open
Epic/Theme: Performance
Rank (Obsolete): 9223372036854775807

 Description   

Currently, jobstats obtains the JobID either by dedicating an entire node to a single job (LU-7195) or by reading the userspace environment on each call. The first approach is very fast but can be inaccurate; the second imposes a significant performance penalty on the client. This patch creates a cache of JobID names keyed by PID. It has several fallback mechanisms so that the most accurate information available is reported.
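
To illustrate the idea only (this is not the code from the patch, which lives in the Lustre client in C), here is a minimal userspace sketch of a PID-keyed JobID cache with fallbacks. The class name `JobIdCache`, the `ttl` expiry, the `SLURM_JOB_ID` default, and the final `pid.<pid>` label are all assumptions made for the example; the ticket does not specify the real fallback order or expiry policy.

```python
import time
from typing import Callable, Dict, Optional, Tuple


def read_jobid_from_environ(pid: int, var: str = "SLURM_JOB_ID") -> Optional[str]:
    """Slow path: scan /proc/<pid>/environ for a scheduler variable.

    The variable name is an assumption; sites configure their own.
    """
    try:
        with open(f"/proc/{pid}/environ", "rb") as f:
            data = f.read()
    except OSError:
        return None  # process gone, no permission, or no /proc
    prefix = var.encode() + b"="
    for entry in data.split(b"\0"):
        if entry.startswith(prefix):
            return entry[len(prefix):].decode(errors="replace")
    return None


class JobIdCache:
    """Hypothetical PID -> JobID cache with time-based expiry."""

    def __init__(
        self,
        lookup: Callable[[int], Optional[str]],
        node_jobid: Optional[str] = None,
        ttl: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lookup = lookup          # expensive per-process lookup
        self._node_jobid = node_jobid  # node-wide JobID, if one is set
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[int, Tuple[str, float]] = {}

    def get(self, pid: int) -> str:
        now = self._clock()
        hit = self._entries.get(pid)
        if hit is not None and now - hit[1] < self._ttl:
            return hit[0]  # fast path: no environment scan
        jobid = self._lookup(pid)
        if jobid is None:
            jobid = self._node_jobid  # fallback: node-wide value
        if jobid is None:
            jobid = f"pid.{pid}"      # last resort: hypothetical label
        self._entries[pid] = (jobid, now)
        return jobid
```

A caller would construct `JobIdCache(read_jobid_from_environ)` once and call `get(pid)` per I/O request, so the environment is scanned at most once per PID per expiry interval. Expiring entries is one simple way to limit the damage from PID reuse; the race conditions and leaks raised in review are exactly the kind of problem a real in-kernel version has to handle more carefully.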



 Comments   
Comment by Cory Spitz [ 21/Apr/17 ]

Patch up for review is at https://review.whamcloud.com/#/c/25208/.

Comment by Cory Spitz [ 18/May/17 ]

@hdoreau, the OpenSFS LWG is looking to you to be the reviewer of record for this enhancement. Can you accept? We're getting close to code cut-off for v2.10 and we were hoping that you would have time for reviews. Thanks!

Comment by Peter Jones [ 20/May/17 ]

Could still make 2.10 if reviews complete in time but otherwise will be in 2.10.1

Comment by Henri Doreau (Inactive) [ 23/May/17 ]

@cory: I'd be glad to help, but I will not have much time to review Ben's patch before LUG...

As seen in the first revisions of the patch, introducing a new cache like this comes with a risk of subtle race conditions and memory leaks. Reviewing and testing it seriously therefore cannot be done in a rush.

Comment by Ben Evans (Inactive) [ 26/May/17 ]

JobID Cache Proposal.docx
Updated doc

Comment by Ben Evans (Inactive) [ 14/Jun/17 ]

https://jira.hpdd.intel.com/browse/LUDOC-381

Comment by Andrew Perepechko [ 31/Jul/17 ]

Another interesting tool that might be useful for implementing this feature is the Linux key management facility (see man keyctl).

One can create a key with some data attached and link it to one of the available keyrings (thread, process, session, default session, etc.). The key will be stored on a temporary or permanent basis. The cache can then be easily searched from the kernel using request_key().

It's also possible to create a key type so that requested keys are created using a userspace upcall (see /sbin/request-key, /etc/request-key.conf).
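
As a rough userspace illustration of the keyring idea above (not something the merged patch does), the sketch below builds `keyctl` command lines that store a JobID in the session keyring and look it up again. The key description `lustre_jobid` and the helper names are hypothetical; the `user` key type and the `@s` session keyring shorthand are standard `keyctl` usage.

```python
import shutil
import subprocess
from typing import List, Optional


def add_key_cmd(jobid: str, keyring: str = "@s",
                desc: str = "lustre_jobid") -> List[str]:
    """argv for: keyctl add user <desc> <data> <keyring>."""
    return ["keyctl", "add", "user", desc, jobid, keyring]


def request_key_cmd(desc: str = "lustre_jobid") -> List[str]:
    """argv for: keyctl request user <desc> (prints the key ID)."""
    return ["keyctl", "request", "user", desc]


def store_and_fetch(jobid: str) -> Optional[str]:
    """Store a JobID in the session keyring and read it back.

    Returns None if keyctl is not installed or any step fails.
    """
    if shutil.which("keyctl") is None:
        return None
    try:
        subprocess.run(add_key_cmd(jobid), check=True,
                       capture_output=True, text=True)
        key_id = subprocess.run(request_key_cmd(), check=True,
                                capture_output=True, text=True).stdout.strip()
        out = subprocess.run(["keyctl", "print", key_id], check=True,
                             capture_output=True, text=True)
        return out.stdout.strip()
    except subprocess.CalledProcessError:
        return None
```

Because the key is linked to the session keyring, processes in the same session can find it; this is the property that would let a job launcher set the value once. An in-kernel consumer would call request_key() directly instead of shelling out.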

Comment by Gerrit Updater [ 21/Sep/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/25208/
Subject: LU-9221 jobstats: Create a pid-based hash for jobid values
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 08479b74ec3599ee91e14f3f646389bb0aca4575

Comment by Peter Jones [ 21/Sep/17 ]

Landed for 2.11
