[LU-694] Job Stats Created: 21/Sep/11 Updated: 22/Nov/14 Resolved: 04/Jun/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.3.0 |
| Fix Version/s: | Lustre 2.3.0 |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Niu Yawei (Inactive) | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: | |
| Rank (Obsolete): | 4306 |
| Description |
|
This feature collects filesystem operation stats for the jobs running on Lustre. When a job scheduler (SLURM, for instance) is in use, the Lustre client packs the job ID into each request (open, unlink, write, ...), and the server collects that information and exposes it via procfs.
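A rough end-to-end sketch of that flow (the fsname 'testfs' and the SLURM variable are placeholders; the detailed commands are in the comments below):
lctl conf_param testfs.sys.jobid_var=SLURM_JOB_ID    # on the MGS: choose the job ID source
# run jobs as usual; client RPCs now carry the job ID
lctl get_param mdt.*.job_stats                       # on the MDS: per-job metadata counters
lctl get_param obdfilter.*.job_stats                 # on the OSS: per-job I/O counters
|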
| Comments |
| Comment by Niu Yawei (Inactive) [ 21/Sep/11 ] |
Jobstats is disabled by default, which can be verified by checking 'jobid_var' (/proc/fs/lustre/jobid_var) on a client:
lctl get_param jobid_var
jobid_var=disable
To enable Jobstats for a certain job scheduler, set 'jobid_var' to the appropriate value. For example, to enable Jobstats for SLURM on a fs named 'testfs':
lctl conf_param testfs.sys.jobid_var=SLURM_JOB_ID
To disable Jobstats on 'testfs':
lctl conf_param testfs.sys.jobid_var=disable
If no job scheduler is running on the system, or the user just wants to collect stats per process name & UID:
lctl conf_param testfs.sys.jobid_var=procname_uid
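For a quick, non-persistent test on a single client node, the same parameter can also be set locally (a sketch; 'lctl set_param' only affects the local node and does not survive a remount):
lctl set_param jobid_var=procname_uid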
Metadata operation stats are collected on the MDT and can be read via lctl get_param mdt.*.job_stats:
lctl get_param mdt.lustre-MDT0000.job_stats
job_stats:
- job_id: bash.0
snapshot_time: 1352084992
open: { samples: 2, unit: reqs }
close: { samples: 2, unit: reqs }
mknod: { samples: 0, unit: reqs }
link: { samples: 0, unit: reqs }
unlink: { samples: 0, unit: reqs }
mkdir: { samples: 0, unit: reqs }
rmdir: { samples: 0, unit: reqs }
rename: { samples: 0, unit: reqs }
getattr: { samples: 3, unit: reqs }
setattr: { samples: 0, unit: reqs }
getxattr: { samples: 0, unit: reqs }
setxattr: { samples: 0, unit: reqs }
statfs: { samples: 0, unit: reqs }
sync: { samples: 0, unit: reqs }
samedir_rename: { samples: 0, unit: reqs }
crossdir_rename: { samples: 0, unit: reqs }
- job_id: dd.0
snapshot_time: 1352085037
open: { samples: 1, unit: reqs }
close: { samples: 1, unit: reqs }
mknod: { samples: 0, unit: reqs }
link: { samples: 0, unit: reqs }
unlink: { samples: 0, unit: reqs }
mkdir: { samples: 0, unit: reqs }
rmdir: { samples: 0, unit: reqs }
rename: { samples: 0, unit: reqs }
getattr: { samples: 0, unit: reqs }
setattr: { samples: 0, unit: reqs }
getxattr: { samples: 0, unit: reqs }
setxattr: { samples: 0, unit: reqs }
statfs: { samples: 0, unit: reqs }
sync: { samples: 2, unit: reqs }
samedir_rename: { samples: 0, unit: reqs }
crossdir_rename: { samples: 0, unit: reqs }
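A small sketch for pulling only the non-zero counters out of a listing like the one above (it assumes the YAML-style layout shown here and a standard awk):
lctl get_param -n mdt.*.job_stats | awk '
    /job_id:/  { job = $3 }                  # lines like: - job_id: bash.0
    /samples:/ { n = $4; sub(/,/, "", n)     # lines like: open: { samples: 2, ...
                 if (n + 0 > 0) print job, $1, n }'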
Data operation stats are collected on the OST and can be read via lctl get_param obdfilter.*.job_stats:
lctl get_param obdfilter.lustre-OST0000.job_stats
job_stats:
- job_id: bash.0
snapshot_time: 1352085025
read: { samples: 0, unit: bytes, min: 0, max: 0, sum: 0 }
write: { samples: 1, unit: bytes, min: 4, max: 4, sum: 4 }
setattr: { samples: 0, unit: reqs }
punch: { samples: 0, unit: reqs }
sync: { samples: 0, unit: reqs }
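Similarly, a sketch for summing the bytes written per job across all OSTs on a server (it assumes the 'write:' line format shown above, where the last number is the 'sum:' value in bytes):
lctl get_param -n obdfilter.*.job_stats | awk '
    /job_id:/ { job = $3 }
    /write:/  { bytes[job] += $(NF-1) }      # next-to-last field is the sum: value
    END { for (j in bytes) print j, bytes[j], "bytes written" }'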
One can clear the job stats for a certain MDT or OST by writing to the proc file 'job_stats'. To clear stats for all jobs on testfs-OST0001:
lctl set_param obdfilter.testfs-OST0001.job_stats=clear
To clear stats for job "dd.0" on lustre-MDT0000:
lctl set_param mdt.lustre-MDT0000.job_stats=dd.0
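A sketch for wiping every job entry on all local targets in one call ('lctl set_param' accepts wildcards and multiple parameter=value pairs):
lctl set_param mdt.*.job_stats=clear obdfilter.*.job_stats=clear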
By default, if a job has no activity for 600 seconds, its stats will be cleared. This expiration value can be changed; for instance, to change the cleanup interval to just over an hour (4000 seconds) for the MDT:
lctl conf_param lustre.mdt.job_cleanup_interval=4000
'job_cleanup_interval' can be set to 0 to disable the auto-cleanup.
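Because idle entries expire after 'job_cleanup_interval' seconds, any monitoring tool has to sample job_stats more often than that. A minimal collector-loop sketch (the log path is hypothetical; the interval is half the default 600 s):
while true; do
    date +%s >> /var/log/lustre_job_stats.log          # timestamp each sample
    lctl get_param mdt.*.job_stats obdfilter.*.job_stats >> /var/log/lustre_job_stats.log 2>/dev/null
    sleep 300
done
|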
| Comment by Niu Yawei (Inactive) [ 21/Sep/11 ] |
| Comment by Niu Yawei (Inactive) [ 10/Nov/11 ] |
|
follow-up patch which moves 'jobid_var' to global: http://review.whamcloud.com/1683 |
| Comment by Shuichi Ihara (Inactive) [ 30/Jan/12 ] |
|
Hello, Niu |
| Comment by Niu Yawei (Inactive) [ 30/Jan/12 ] |
|
Hi, Ihara. The patch will not land in 2.2; which version it will land in has not been decided yet. |
| Comment by Richard Henwood (Inactive) [ 27/Apr/12 ] |
|
I have been advised that a filesystem name may not uniquely identify a Lustre filesystem. I am not sure what a better choice would be for the commands above, but some thought about an alternative to the fs name would be valuable. |
| Comment by Niu Yawei (Inactive) [ 29/Apr/12 ] |
Hi, Richard, the fs name should be unique within a single MGS namespace, and most 'lctl conf_param' usage relies on the fsname to identify a filesystem. Do you suggest that we set the jobstats parameters per target server rather than per fs? I'm not sure I followed your comment correctly. |
| Comment by Nathan Rutman [ 30/Apr/12 ] |
|
The intent of the MGS was to provide config info for all the filesystems at a site, so the fs name is unique. If multiple MGS's are being used, on different nodes, the filesystem name could overlap – but you'd have to be masochistic to use the same filesystem name for two different filesystems at a single site. |
| Comment by Christopher Morrone [ 30/Apr/12 ] |
|
Nathan, it boggles my mind as well. But I know for a fact that folks out there have done it, because they complained about LMT not being able to handle two filesystems having exactly the same name. They seemed to think it was Livermore's responsibility to factor in additional information like IP addresses to uniquely identify filesystems with the same name. I of course declined. But one has to sympathize with the users. Configuring Lustre is so horribly bad that something like the filesystem name is completely non-obvious to most people. You set it once in some cryptic way, and it is none too clear from that point forward how it is used at all. Which I suppose is a long-winded way of agreeing that filesystem names really need to be unique, and that bending over backwards to differentiate filesystems with the same name is a path to madness. But we also need to promote the name to a first-class object that is used in a sane way in the command-line tools and throughout Lustre. We also need to clearly document filesystem name usage. |
| Comment by Richard Henwood (Inactive) [ 02/May/12 ] |
|
Thanks for the input. I agree that fs names are useful, for example:
So, how about supporting <mount point|fsname>? If the mount point is not valid, then return an error.
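One way a tool could map a mount point back to its fsname with existing utilities (a sketch; it assumes 'lfs getname' is available and prints '<fsname>-<instance>' for the given path, and that the fsname contains no dashes):
# hypothetical helper: print the fsname of a Lustre mount point
fsname_of() {
    lfs getname "$1" | awk '{ sub(/-[^- ]*$/, "", $1); print $1 }'
}
lctl conf_param "$(fsname_of /mnt/testfs)".sys.jobid_var=SLURM_JOB_ID
|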
| Comment by Andreas Dilger [ 22/Nov/14 ] |
|
Just updated the examples in this bug to be clearer, since it showed up in a Google search. I prefer not to use "lustre" as the fsname in examples, since it is very non-obvious that this needs to be replaced with the actual fsname and is not a fixed part of the parameter being specified (like the "sys.jobid_var" part is). |