[LU-16164] Check params and make them valid YAML if they are designed to be Created: 16/Sep/22 Updated: 02/Nov/23 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Feng Lei | Assignee: | Feng Lei |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Attachments: |
|
| Description |
|
From https://review.whamcloud.com/#/c/fs/lustre-release/+/48417/8/lustre/tests/sanity.sh@1154 It would be good to run "lctl get_param -R '*'" on both the client and server to see which parameters are in YAML format, and add checks for them so that they don't break in the future. |
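One way such a check could look in a test script (a sketch only; the helper name `is_structured_yaml` is hypothetical, and it assumes python3 with the PyYAML module is available on the node, since yq may not be installed everywhere). The idea is that a parameter only counts as meaningful YAML if its output parses to a mapping or list, not one big plain scalar:

```shell
#!/bin/bash
# Hypothetical helper: read text on stdin and succeed only if it parses
# as YAML *and* yields a mapping or list.  A bare multi-line string (as
# most *.stats files produce) is rejected even though yq accepts it.
# Assumes python3 with PyYAML on the test node.
is_structured_yaml() {
	python3 -c '
import sys, yaml
try:
    doc = yaml.safe_load(sys.stdin.read())
except yaml.YAMLError:
    sys.exit(1)
sys.exit(0 if isinstance(doc, (dict, list)) else 1)
'
}

# Intended use in a test (not runnable without a Lustre setup):
#   lctl get_param -n osc.*.import | is_structured_yaml ||
#       error "osc import is not valid YAML"
```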
| Comments |
| Comment by Feng Lei [ 21/Sep/22 ] |
|
Check all the params with a script:

#!/bin/bash
>read_fail_list
>invalid_yaml_list
for i in `lctl get_param -R -N "*"`; do
	echo ===$i===
	lctl get_param $i > /dev/null || echo $i >> read_fail_list
	lctl get_param -n $i | yq
	if [[ $? != 0 ]]; then
		echo $i >> invalid_yaml_list
	fi
done

This produces two lists (in the attachment): one of parameters that cannot be read, the other of parameters whose output is not valid YAML. Currently there are 75 parameters that cannot be read and 51 with invalid YAML:

# wc read_fail_list
75 75 2733 read_fail_list
# wc invalid_yaml_list
51 51 1844 invalid_yaml_list
| Comment by Feng Lei [ 21/Sep/22 ] |
|
Even when files pass the yq check, some are parsed as a single string rather than meaningful YAML. For example:

# lctl get_param -n mgs.*.*.stats
snapshot_time             1663721269.216694664 secs.nsecs
req_waittime              15358 samples [usec] 12 67506 3474626 11719248936
req_qdepth                15358 samples [reqs] 0 0 0 0
...

yq accepts this, but parses the whole output as one string:
# lctl get_param -n mgs.*.*.stats | yq
"snapshot_time 1663721319.904973718 secs.nsecs req_waittime 15368 samples [usec] 12 67506 3477317 11720202797 req_qdepth 15368 samples [reqs] 0 0 0 0 req_active 15368 samples [reqs] 1 2 15369 15371 req_timeout 15368 samples [sec] 1 10 15377 15467 reqbuf_avail 45997 samples [bufs] 62 64 2913042 184496438 ldlm_plain_enqueue 50 samples [reqs] 1 1 50 50 mgs_connect 1 samples [usec] 156 156 156 24336 mgs_target_reg 4 samples [usec] 573 8409 17020 105176030 mgs_config_read 9 samples [usec] 125 959 4866 3121456 obd_ping 15151 samples [usec] 2 5710 866463 192267777 llog_origin_handle_open 39 samples [usec] 99 1632 13234 7631248 llog_origin_handle_next_block 76 samples [usec] 92 1099 25799 12133241 llog_origin_handle_read_header 38 samples [usec] 71 1073 11972 5535674 snapshot_time 1663721319.905164765 secs.nsecs"
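The same degenerate parse can be reproduced without lctl (a sketch assuming python3 with PyYAML, which applies the same YAML plain-scalar rules yq does here): the multi-line stats text folds into a single plain scalar.

```shell
# Feed a stats-style snippet to a YAML parser: the whole block is read
# as one multi-line plain scalar, i.e. a string, not a mapping.
# (Assumes python3 with PyYAML installed.)
printf '%s\n' \
    'snapshot_time             1663721269.216694664 secs.nsecs' \
    'req_waittime              15358 samples [usec] 12 67506 3474626 11719248936' \
    'req_qdepth                15358 samples [reqs] 0 0 0 0' |
python3 -c '
import sys, yaml
doc = yaml.safe_load(sys.stdin.read())
print(type(doc).__name__)
'
# prints "str"
```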
|
| Comment by Andreas Dilger [ 02/Nov/23 ] |
|
The "*.*.stats" files are mostly not YAML formatted. The osc/mdc.*.import files should be YAML, as should the obdfilter/mdt.*.exports.*.export files.

On the server there are:
- ldlm.services.ldlm_canceld.nrs_policies and ldlm.services.ldlm_cbd.nrs_policies
- ost.OSS.*.nrs_policies, mds.MDS.*.nrs_policies, and mgs.MGS.mgs.nrs_policies
- mdd.*.lfsck_layout
- mdt.*.exports.*.reply_data
- mdt.*.recovery_status

There are also the quota files:
- osd-ldiskfs.*.quota_slave*.acct_group,acct_project,acct_user
- osd-ldiskfs.*.quota_slave*.info
- osd-ldiskfs.*.quota_slave*.limit_group,limit_project,limit_user
- osd-ldiskfs.*.oi_scrub
- qmt.*.dt-0x0.glb-grp,glb-prj,glb-usr

On the client:
- llite.*.max_cached_mb
- llite.*.statahead_stats
- llite,osc,mdc.*.unstable_stats
- osc.*.osc_cached_mb
- mdc.*.mdc_cached_mb
- mdc.*.state

The lwp.*.srpc_info file looks like it could/should be in YAML format, but needs a few small improvements (a colon after "gc internal" and "gc next").
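The colon suggestion matters because of how YAML tokenizes: without a colon the line is swallowed into a plain scalar, while with one it becomes a real key/value pair. A quick illustration (assuming python3 with PyYAML; the value 10 is invented for the example):

```shell
# "gc internal 10" is a bare string to a YAML parser; adding the colon
# turns it into a real mapping entry.  (The value here is made up.)
python3 - <<'EOF'
import yaml
print(type(yaml.safe_load("gc internal 10")).__name__)
print(yaml.safe_load("gc internal: 10"))
EOF
# prints:
#   str
#   {'gc internal': 10}
```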