[LU-16228] create lljobstats command - Whamcloud Community JIRA

Details

Type: New Feature
Resolution: Fixed
Priority: Minor
Fix Version/s: Lustre 2.16.0
Affects Version/s: None
Labels:
None

Description

In DDN-3356, by Andreas Dilger:

We don't have a tool to do this today, but it would make sense to write a simple tool "lljobstat" to show the top jobs on a server in order to simplify debugging of high load problems, since this is a reasonably frequent request.

It should be included with the base Lustre RPMs, so it must not have any complex external dependencies that are not included in the base OS distro (el7, el8, sles15, ubuntu22).

It should read all of the local "..job_stats" files (all by default, or only --ost or --mdt, or a specific job_stats file if given as an argument) at a 10s interval (configurable, either "-i N" or as the last argument) and print the top jobs, e.g. 5 (configurable with "-c N"), one line per job, similar to "iostat -x -k -z 10". It should show something useful when run with minimal arguments (e.g. just the interval), so that users can easily determine which jobs are driving the most load.

Since job_stats has a large number of stats, it is not possible to fit all of them in a single 80-column line, so any operations that have samples = 0 should not be shown. Priority for display should be to show read, write (counts, if non-zero), read_bytes, write_bytes (in MiB/s units, if non-zero), then the top metadata ops by count. It probably makes sense to use abbreviations for the names, like llobdstat, so that more can fit onto the line (cx: create, dx: destroy, st: statfs, pu: punch, etc). The newer llstat and llobdstat check whether the terminal width is over 80 and show more fields, but this doesn't have to be in the first version.
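
As a rough illustration of the display rules above (skip zero counters, show read/write first, then the remaining ops by count), here is a minimal Python sketch. The names `ABBREV`, `PRIORITY`, `select_fields` and `mib_per_sec` are hypothetical and the abbreviation table is only an example; this is not the code that landed.

```python
# Illustrative abbreviation table (an assumption, not the tool's real table).
# cx, dx, st and pu follow the examples in the description.
ABBREV = {"read": "rd", "write": "wr", "create": "cx",
          "destroy": "dx", "statfs": "st", "punch": "pu"}

# Counters shown first when non-zero, per the stated display priority.
PRIORITY = ("read", "write")


def select_fields(ops, max_fields=6):
    """Return (abbrev, count) pairs for non-zero ops, priority ops first."""
    fields = [(ABBREV[name], ops[name]) for name in PRIORITY if ops.get(name, 0) > 0]
    others = sorted(
        ((name, n) for name, n in ops.items() if name not in PRIORITY and n > 0),
        key=lambda item: (-item[1], item[0]),  # highest count first, then by name
    )
    fields += [(ABBREV.get(name, name), n) for name, n in others]
    return fields[:max_fields]


def mib_per_sec(nbytes, interval):
    """Convert a byte count accumulated over `interval` seconds to MiB/s."""
    return nbytes / interval / (1024 * 1024)
```

The `max_fields` cap stands in for the 80-column limit; a real tool would more likely measure the rendered line width.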

To determine the "top" jobs, it probably makes sense to sum the operations for the same job name across all watched job_stats files, then sort by total count of operations (read+write, but not bytes) and include this as the second item shown ("ops: N") after the job name ("job: name", with escaping/quoting if needed). The timestamp should be shown for each interval.
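
The ranking described above could be sketched as follows, assuming each job_stats file has already been parsed into a dict of `{job_id: {op_name: sample_count}}` that holds operation counts only (no byte totals). The function name `top_jobs` and the input shape are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def top_jobs(per_file_stats, count=5):
    """Sum op counts per job across all files and return the busiest jobs.

    per_file_stats: iterable of {job_id: {op_name: samples}}, one per file.
    Returns a list of (job_id, total_ops, {op_name: samples}) tuples.
    """
    totals = defaultdict(Counter)
    for stats in per_file_stats:
        for job_id, ops in stats.items():
            totals[job_id].update(ops)  # Counter.update adds counts together

    ranked = sorted(
        totals.items(),
        key=lambda kv: (-sum(kv[1].values()), kv[0]),  # most ops first, ties by name
    )
    return [(job, sum(ops.values()), dict(ops)) for job, ops in ranked[:count]]
```

Sorting ties by job name keeps the output stable between intervals, which makes successive reports easier to compare by eye.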

Given that the input is YAML, the output could also be YAML, but only if it can be formatted nicely for human readability (one line per job, no excessive quoting). The main users of this will be people, since monitoring tools will likely read and process all of the job_stats output directly.
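
One way to get YAML that stays readable is to emit each job as a single flow-style mapping. The sketch below is only an illustration under stated assumptions: `format_job` and `format_report` are hypothetical names, the column width of 17 is arbitrary, and job names are assumed not to need quoting.

```python
def format_job(job_id, total_ops, fields, width=17):
    """Render one job as a one-line YAML list item with a flow-style mapping."""
    key = job_id + ":"
    body = ", ".join([f"ops: {total_ops}"] + [f"{name}: {n}" for name, n in fields])
    return f"- {key:<{width}}{{{body}}}"


def format_report(timestamp, jobs):
    """Render a full interval report; jobs is a list of (job_id, total, fields)."""
    lines = [f"timestamp: {timestamp}", "top_jobs:"]
    lines += [format_job(job_id, total, fields) for job_id, total, fields in jobs]
    return "\n".join(lines)
```

A caller would typically pass `int(time.time())` as the timestamp so each interval is stamped in Unix seconds.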

Attachments

glljobstat (16 kB, 17/Aug/23 4:38 AM)
lljobstat (8 kB, 18/Aug/23 7:03 AM)

Issue Links

is related to

LU-16231 Lustre stats header incorrectly using boot time (Resolved)

LU-16251 Fill jobid in an atomic way (Resolved)

LU-16110 Make output of jobs_stats and rename_stats valid YAML (Resolved)

LU-17352 Enhance lljobstat to read existing job_stats files (Resolved)

Activity

Andreas Dilger added a comment - 17/Aug/23 4:39 AM

bolausson, I pushed the "simple" version of your patch, but it is reporting an error that is causing test failures:

 lljobstat -n 1 -i 0 -c 1000
  Traceback (most recent call last):
    File "/usr/bin/lljobstat", line 15, in <module>
       from yaml import CLoader as Loader, CDumper as Dumper
  ImportError: cannot import name 'CLoader'
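
For context, PyYAML only exposes `CLoader` and `CDumper` when it was built against libyaml, so an unconditional import fails on hosts without the C bindings. A common way to handle this is to fall back to the pure-Python classes, roughly as below. The helper names `get_loader` and `parse_yaml` are hypothetical, and this is not necessarily the fix that was applied to the patch.

```python
def get_loader():
    """Return PyYAML's C loader when available, else the pure-Python one."""
    try:
        # Only importable when PyYAML was built with libyaml support.
        from yaml import CLoader as Loader
    except ImportError:
        # Always present in PyYAML, but slower on large inputs.
        from yaml import Loader
    return Loader


def parse_yaml(text):
    """Parse YAML text with whichever loader is available."""
    import yaml
    return yaml.load(text, Loader=get_loader())
```

The same pattern applies to `CDumper` and `Dumper` if the tool also writes YAML through the library.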

Peter Jones added a comment - 27/Jan/23 4:19 AM

Landed for 2.16

Gerrit Updater added a comment - 27/Jan/23 12:33 AM

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/48888/
Subject: LU-16228 utils: add lljobstat util
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: e2812e877314bc101efdc5a235c7fae8f7424f96

Andreas Dilger added a comment - 25/Jan/23 12:45 AM

It looks like the newly-added sanity.sh test_205e needs to add a version check for interop testing:

trevis-82vm3: sh: lljobstat: command not found

There is a version check in test_205d already.

Gerrit Updater added a comment - 17/Oct/22 5:38 AM

"Feng Lei <flei@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/48888
Subject: LU-16228 utils: add lljobstat util
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 777dd8757b0a121daf22f275f7eeb5a2b00ea62f

Andreas Dilger added a comment - 12/Oct/22 8:18 AM

Feng Lei, this looks mostly good. I would say that the comment is large enough that it shouldn't be printed each time, maybe just document the abbreviations in the man page or if "-h" is used. I would suggest "ln" for link (to match the command name).

The "top_jobs:" should have an underscore so it is a single word, even though I know YAML does not require this, since it makes parsing easier with scripts (e.g. "awk '/keyname:/ { print $2 }'").
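
To see why a single-word key helps field-based scripts, the sketch below mimics `awk '/pattern/ { print $2 }'` in Python. The helper name `second_field` is hypothetical; it only illustrates how whitespace splitting treats "top jobs:" versus "top_jobs:".

```python
import re


def second_field(text, pattern):
    """Mimic awk '/pattern/ { print $2 }' on whitespace-separated fields."""
    results = []
    for line in text.splitlines():
        if re.search(pattern, line):
            parts = line.split()
            # awk prints an empty string when the field does not exist.
            results.append(parts[1] if len(parts) > 1 else "")
    return results
```

With a key such as "timestamp:" the value is the second field, while with "top jobs:" the second field is part of the key itself, so scripts have to special-case it.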

Feng Lei added a comment - 12/Oct/22 6:50 AM

adilger, is output like this OK?

# ./lljobstat
# Abbr.:
# cr: create,    op: open,      cl: close,     mn: mknod,     lk: link,     
# ul: unlink,    mk: mkdir,     rm: rmdir,     mv: rename,    ga: getattr,  
# sa: setattr,   gx: getxattr,  sx: setxattr,  st: statfs,    sy: sync,     
# rd: read,      wr: write,     pu: punch,     mi: migrate,   fa: fallocate,
# dt: destroy,   gi: get_info,  si: set_info,  qc: quotactl,  pa: prealloc, 
timestamp: 1665557039
top jobs:
- touch.500:       {ops: 6, op: 1, cl: 1, mn: 1, ga: 1, sa: 2}
- rm.0:            {ops: 6, cl: 2, ul: 1, rm: 1, ga: 1, st: 1}
- chown.0:         {ops: 3, ga: 2, sa: 1}
- bash.0:          {ops: 2, ga: 2}
- mkdir.0:         {ops: 2, mk: 1, st: 1}

Andreas Dilger added a comment - 11/Oct/22 8:38 AM

No, the time should be the current Unix timestamp in seconds:

# lctl get_param llite.*.stats
llite.testfs-ffff89b1b9c27000.stats=
snapshot_time             1665476432.161461498 secs.nsecs
ioctl                     502 samples [reqs]
getattr                   290 samples [usec] 56 1059 48623 11761597
getxattr                  2 samples [usec] 975 30159 31134 910515906
inode_permission          298 samples [usec] 61 566 52783 11517621
opencount                 295 samples [reqs] 1 1 295 295
# date +%s
1665476439

There is a bug on master where the timestamp incorrectly prints the boot-relative time instead of the wallclock time. See LU-16231.

Feng Lei added a comment - 11/Oct/22 4:29 AM

adilger, to confirm: is snapshot_time designed to be uptime (the seconds since the last OS bootup), not clock time? For example:

# lctl get_param *.*.job_stats | grep snapshot
  snapshot_time:   5754772.790688109 secs.nsecs

It is significantly different from epoch seconds:

# date +%s
1665461988

But similar to system uptime:

# cat /proc/uptime
5755466.00 22244003.59
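
The comparison made above can be written down as a small check. This is a sketch only: `looks_boot_relative`, `to_wallclock` and `read_uptime` are hypothetical helpers, and the conversion assumes the boot time is simply the current time minus the uptime.

```python
def read_uptime(path="/proc/uptime"):
    """Return the system uptime in seconds (Linux only)."""
    with open(path) as f:
        return float(f.read().split()[0])


def looks_boot_relative(snapshot_time, now, uptime):
    """True if snapshot_time is closer to the uptime than to the epoch time."""
    return abs(snapshot_time - uptime) < abs(snapshot_time - now)


def to_wallclock(snapshot_time, now, uptime):
    """Convert a boot-relative timestamp to Unix seconds."""
    boot_time = now - uptime  # approximate epoch time of the last boot
    return boot_time + snapshot_time
```

With the values shown above (snapshot_time 5754772.79, epoch 1665461988, uptime 5755466.00) the check reports a boot-relative timestamp.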

Feng Lei added a comment - 11/Oct/22 2:41 AM

Command Synopsis:

lljobstat [-i|--interval NUM] [-c|--count NUM] [--mdt|--ost|--param PARAM_PATH] 
  -i NUM: interval in seconds, default 10
  -c NUM: how many jobs are displayed, default 5
  --mdt: check only mdt job_stats
  --ost: check only ost job_stats
  --param PARAM_PATH: check specified PARAM_PATH, e.g., *.lustre-*.job_stats
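
For illustration, the synopsis above maps onto Python's `argparse` roughly as follows. This is a sketch of the proposed options only, with a hypothetical `build_parser` helper; the landed tool may define them differently.

```python
import argparse


def build_parser():
    """Build a parser matching the proposed lljobstat synopsis."""
    parser = argparse.ArgumentParser(prog="lljobstat")
    parser.add_argument("-i", "--interval", type=int, default=10, metavar="NUM",
                        help="interval in seconds, default 10")
    parser.add_argument("-c", "--count", type=int, default=5, metavar="NUM",
                        help="how many jobs are displayed, default 5")

    # --mdt, --ost and --param are alternatives, so only one may be given.
    source = parser.add_mutually_exclusive_group()
    source.add_argument("--mdt", action="store_true",
                        help="check only mdt job_stats")
    source.add_argument("--ost", action="store_true",
                        help="check only ost job_stats")
    source.add_argument("--param", metavar="PARAM_PATH",
                        help="check specified PARAM_PATH, e.g. *.lustre-*.job_stats")
    return parser
```

For example, `build_parser().parse_args(["-i", "2", "--ost"])` yields an interval of 2, the default count of 5, and `ost` set to true.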

Andreas Dilger added a comment - 10/Oct/22 3:43 AM

The timestamp should be Unix seconds like the other timestamps reported by Lustre. That avoids time zone issues and simplifies log correlation.

People

Assignee: Feng Lei

Reporter: Feng Lei

Votes: 0

Watchers: 10

Dates

Created: 09/Oct/22 6:54 AM

Updated: 13/Jun/24 5:59 PM

Resolved: 27/Jan/23 4:19 AM