[LU-7195] Allow for static string content for jobstats jobid_var Created: 22/Sep/15 Updated: 02/Feb/17 Resolved: 25/Oct/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0, Lustre 2.5.3 |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Jesse Hanley | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch | ||
| Environment: |
RHEL 6.6 |
||
| Description |
|
We've been benchmarking I/O performance (mainly metadata operations) with job stats enabled. There is a potential performance impact when using the environment-variable setup; the degradation appears to be associated with the environment-variable lookup. The impact when using the special procname_uid setting is negligible.

To counter this, we would like the ability to supply a static string that is not evaluated as an environment variable but is simply passed along with the RPC. I propose using a prefix on the jobid_var variable to indicate that the value should be passed through, not evaluated. I think a symbol like @ would make sense for this prefix, based on my assumption that environment variable names compliant with IEEE Std 1003.1-2001 will not contain the at-sign. This would allow administrators to set the jobid statically at job start, or from the client's hostname, etc., without the overhead of the environment lookup. It also lets us take this out of the user's control without resorting to read-only variables in their environments.

Examples of use:

Associating traffic per-host: lctl set_param jobid_var="@$(hostname)"
Associating traffic with a specific string: lctl set_param jobid_var="@benchmarking"

From my understanding, this would be a pretty straightforward change to the obd class, within the lustre_get_jobid function. I have a potential patch I can push to master if this is a behavior we want supported. Thanks! |
| Comments |
| Comment by James A Simmons [ 22/Sep/15 ] |
|
As a note, we are seeing a 9% performance loss for each job due to job stats reading the environment variables. |
| Comment by Oleg Drokin [ 22/Sep/15 ] |
|
The upstream kernel client has a different mechanism where every node has a node-wide jobid setting, intended to be set in the job prologue. This is a limited solution anyway, because it provides only a single setting for the entire node, so it won't work if multiple jobs are running. |
| Comment by James A Simmons [ 22/Sep/15 ] |
|
Yep. This looks like the solution that is needed. Will port it. |
| Comment by Gerrit Updater [ 22/Sep/15 ] |
|
James Simmons (uja.ornl@yahoo.com) uploaded a new patch: http://review.whamcloud.com/16598 |
| Comment by Andreas Dilger [ 27/Sep/15 ] |
|
Since lots of users are already using job stats, it also makes sense to improve the performance of the existing code. When Oleg's patch to remove the environment variable access was going upstream, Peng Tao and I also implemented a cache mechanism for the jobid so that it didn't need to access the environment very often. I'll have to see if I can find a version of that patch.

The other concern is that some sites run multiple different jobs on the same nodes, so a single global jobid assigned to the node will not work for them.

James, it would be good to know what you were testing when you hit this performance loss, since I thought we tested it ourselves and didn't see anything close to that. I wonder if something has changed in newer kernels that would make it so much worse? It might be that this only shows up for metadata-heavy jobs, and not IO jobs. Another possible difference is how many environment variables are set, since this could affect the parsing time significantly. |
| Comment by Jesse Hanley [ 29/Sep/15 ] |
|
Hey Andreas,

These were actually from some runs I did. Yes, your assumption is right: this is from metadata-heavy jobs. From my IOR runs I didn't see any noticeable impact. I was comparing run times of mdtest. Here are the parameters I used on a 2.7 client (shared directory):

mpirun -n 8 -N 8 mdtest -n 131072 -d output/run -F -C -T -r -N 8

I was benchmarking the overhead since we do have some metadata-heavy jobs. I did about a dozen runs like this with jobid_var set to disable, an environment variable, and procname_uid. In the environment-variable case, I tested with the target variable both undefined and defined. There was very little detectable overhead when using procname_uid, which I expected since it's a pretty easy lookup. When set to an environment variable, it was about a 5% hit, with worse behavior for file creations in a shared directory (in the 7% to 9% range). Does this help? |
| Comment by Andreas Dilger [ 30/Sep/15 ] |
|
Jesse, thanks for the additional information. The results definitely make more sense in this regard. I guess there isn't much surprise that there is some overhead for metadata-heavy workloads, since the jobid value is cached in the inode, but with a file create workload the inodes are never re-used.

I don't know if there is anything that could be done to improve performance for a per-task jobid, since the jobid is already kept in the process task struct in the kernel, just in an inefficient-to-access ASCII string format. There isn't any spare space in the task struct for keeping extra data, although it might be possible to cache the jobid in the process "env". There was a patch to do this posted on LKML at one point (https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg528724.html), but the whole jobid functionality was ripped out of upstream so it never landed.

It might be worthwhile to see if it could be revived and the lu_env cached jobid could also be used to populate lli_jobid and then md_op_data to pass the jobid down to the MDC code for storing in pb_jobid before it gets down to ptlrpc_set_add_req(), similar to how it happens in the IO path. |
| Comment by Andreas Dilger [ 30/Sep/15 ] |
|
Slightly updated version of patch to cache jobid in vvp_env. This still needs to be updated to copy the jobid into md_op_data to pass down to the MDC layer. |
| Comment by Gerrit Updater [ 24/Oct/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16598/ |
| Comment by Peter Jones [ 25/Oct/15 ] |
|
Landed for 2.8 |
| Comment by Jian Yu [ 26/Oct/15 ] |
|
I created |
| Comment by Gerrit Updater [ 02/Feb/17 ] |
|
Ben Evans (bevans@cray.com) uploaded a new patch: https://review.whamcloud.com/25208 |
| Comment by James A Simmons [ 02/Feb/17 ] |
|
Please create a new ticket. |