[LU-16164] Check params and make them valid YAML if they are designed to be Created: 16/Sep/22  Updated: 02/Nov/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Feng Lei Assignee: Feng Lei
Resolution: Unresolved Votes: 0
Labels: None

Attachments: HTML File invalid_yaml_list     HTML File read_fail_list    
Issue Links:
Related
is related to LU-16110 Make output of jobs_stats and rename_... Resolved

 Description   

From https://review.whamcloud.com/#/c/fs/lustre-release/+/48417/8/lustre/tests/sanity.sh@1154

It would be good to run "lctl get_param -R '*'" on both the client and server to see which parameters are in YAML format, and add checks for them so that they don't break in the future.



 Comments   
Comment by Feng Lei [ 21/Sep/22 ]

Check all the params with a script:

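#!/bin/bash
# Record parameters that cannot be read and parameters whose
# output yq rejects as YAML.

# Start with empty result lists.
> read_fail_list
> invalid_yaml_list

for i in $(lctl get_param -R -N "*"); do
        echo "===$i==="
        lctl get_param "$i" > /dev/null || echo "$i" >> read_fail_list
        # The exit status of the pipeline is that of yq.
        if ! lctl get_param -n "$i" | yq; then
                echo "$i" >> invalid_yaml_list
        fi
done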

This produces two lists, both attached: one of the files that cannot be read, and one of the files whose output is not valid YAML.

Currently 75 files cannot be read, and 51 files contain invalid YAML.

# wc read_fail_list 
  75   75 2733 read_fail_list
# wc invalid_yaml_list 
  51   51 1844 invalid_yaml_list
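
To see which subsystems account for most of the entries, the lists can be grouped by the first component of each parameter name. This is a minimal sketch, assuming each list holds one dot-separated parameter name per line; count_by_subsystem is a hypothetical helper name, not part of lctl:

```shell
#!/bin/bash
# Hypothetical helper: count the entries in a list file by the first
# dot-separated component of each name (e.g. "mgs" for
# "mgs.MGS.mgs.nrs_policies"). Blank lines are skipped.
# Output is "<count> <component>", largest count first.
count_by_subsystem() {
        awk -F. 'NF { n[$1]++ } END { for (s in n) print n[s], s }' "$1" |
                sort -k1,1nr -k2,2
}

# Example:
# count_by_subsystem invalid_yaml_list
```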
Comment by Feng Lei [ 21/Sep/22 ]

Even when a file passes the yq check, it may be parsed as a single string rather than as meaningful YAML. For example:

# lctl get_param -n mgs.*.*.stats
snapshot_time             1663721269.216694664 secs.nsecs
req_waittime              15358 samples [usec] 12 67506 3474626 11719248936
req_qdepth                15358 samples [reqs] 0 0 0 0
... 

yq parses this output as a single string, not as a meaningful YAML document:

# lctl get_param -n mgs.*.*.stats | yq
"snapshot_time             1663721319.904973718 secs.nsecs req_waittime              15368 samples [usec] 12 67506 3477317 11720202797 req_qdepth                15368 samples [reqs] 0 0 0 0 req_active                15368 samples [reqs] 1 2 15369 15371 req_timeout               15368 samples [sec] 1 10 15377 15467 reqbuf_avail              45997 samples [bufs] 62 64 2913042 184496438 ldlm_plain_enqueue        50 samples [reqs] 1 1 50 50 mgs_connect               1 samples [usec] 156 156 156 24336 mgs_target_reg            4 samples [usec] 573 8409 17020 105176030 mgs_config_read           9 samples [usec] 125 959 4866 3121456 obd_ping                  15151 samples [usec] 2 5710 866463 192267777 llog_origin_handle_open   39 samples [usec] 99 1632 13234 7631248 llog_origin_handle_next_block 76 samples [usec] 92 1099 25799 12133241 llog_origin_handle_read_header 38 samples [usec] 71 1073 11972 5535674 snapshot_time             1663721319.905164765 secs.nsecs" 
Comment by Andreas Dilger [ 02/Nov/23 ]

The "*.*.stats" files are mostly not YAML formatted.
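
Where yq accepts such a file, it is only because the whole text folds into one multi-line plain scalar: no line contains a "key:" entry. A rough line-based pre-check for that case is sketched below. It is a heuristic rather than a YAML parser, and has_mapping_keys is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: succeed if at least one line on stdin looks like
# a YAML mapping entry, i.e. "key:" followed by whitespace or end of
# line, optionally after a "- " list marker. Heuristic only.
has_mapping_keys() {
        grep -Eq '^[[:space:]]*(-[[:space:]]+)?[^[:space:]#:][^:]*:([[:space:]]|$)'
}

# Example, using the stats parameter shown earlier in this ticket:
# lctl get_param -n 'mgs.*.*.stats' | has_mapping_keys || echo "no mapping keys"
```

Asking the YAML parser itself for the top-level type of the document would be more reliable; this check only needs grep.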

The osc/mdc.*.import files should be YAML, as should the obdfilter/mdt.*.exports.*.export files.

On the server, the YAML files appear to be ldlm.services.ldlm_canceld.nrs_policies and ldlm.services.ldlm_cbd.nrs_policies, along with ost.OSS.*.nrs_policies, mds.MDS.*.nrs_policies and mgs.MGS.mgs.nrs_policies, as well as mdd.*.lfsck_layout, mdt.*.exports.*.reply_data, and mdt.*.recovery_status.

There are also the quota files osd-ldiskfs.*.quota_slave*.{acct_group,acct_project,acct_user}, osd-ldiskfs.*.quota_slave*.info, osd-ldiskfs.*.quota_slave*.{limit_group,limit_project,limit_user}, osd-ldiskfs.*.oi_scrub, and qmt.*.dt-0x0.{glb-grp,glb-prj,glb-usr}.

On the client, there are llite.*.max_cached_mb, llite.*.statahead_stats, {llite,osc,mdc}.*.unstable_stats, osc.*.osc_cached_mb, mdc.*.mdc_cached_mb, and mdc.*.state.

The lwp.*.srpc_info file looks like it could/should be in YAML format, but needs a few small improvements (a colon after "gc interval" and "gc next").
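
A regression check for the files listed above could look roughly like the sketch below. It uses only the lctl and yq invocations from the script earlier in this ticket; check_yaml_params is a hypothetical helper name, and the patterns in the example are copied from the client-side list.

```shell
#!/bin/bash
# Hypothetical helper: for every parameter matching the given patterns,
# print its name if yq rejects its output. Returns non-zero if at least
# one parameter was rejected.
check_yaml_params() {
        local rc=0 pattern name
        for pattern in "$@"; do
                # Expand the pattern into parameter names.
                for name in $(lctl get_param -N "$pattern" 2>/dev/null); do
                        if ! lctl get_param -n "$name" | yq > /dev/null 2>&1; then
                                echo "$name"
                                rc=1
                        fi
                done
        done
        return "$rc"
}

# Example:
# check_yaml_params 'llite.*.max_cached_mb' 'llite.*.statahead_stats' 'mdc.*.state'
```

As noted in the earlier comment, acceptance by yq alone does not show that the output is structured YAML, so a real test would probably also check the parsed result.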

Generated at Sat Feb 10 03:24:35 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.