[LU-14789] sanity 133f and 133g are no longer effective Created: 24/Jun/21  Updated: 15/May/23  Resolved: 29/Jul/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.14.0
Fix Version/s: Lustre 2.15.0

Type: Task
Priority: Minor
Reporter: John Hammond
Assignee: Cyril Bordage
Resolution: Fixed
Votes: 0
Labels: tests

Issue Links:
Related
is related to LU-16828 build find cmdline correctly for sani... Resolved
is related to LU-10401 sanity test_133g: timeout during MDT ... Resolved
is related to LU-14788 sanity test_133g: crash in __proc_lne... Resolved
is related to LU-13091 conf-sanity "lctl list_param" test to... Open

 Description   

In https://review.whamcloud.com/38567 (LU-10401 tests: add -F so list_param prints entry type) we replaced find $proc_dirs ... with $LCTL list_param -FR '*'. However, badarea_io expects file paths, not parameter names, so every open silently fails and the tests no longer test anything.
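
The mismatch between parameter names and paths can be illustrated without a Lustre node. The following is only a sketch: the sample list stands in for real `lctl list_param -FR '*'` output, and the `writable_params` and `classify` helpers are hypothetical names that do not come from sanity.sh.

```shell
# Hypothetical sample of `lctl list_param -FR '*'` output. With -F, a trailing
# '=' marks a writable parameter and a trailing '/' marks a directory.
sample_params='at_early_margin=
at_extra=
mdt.lustre-MDT0000.exports/'

# Keep the writable entries and strip the marker, as the test pipeline does.
writable_params() {
    printf '%s\n' "$sample_params" | grep '=' | tr -d '='
}

# Report whether each name exists as a file relative to the current directory.
# A bare parameter name normally does not, so opening it would fail.
classify() {
    while read -r name; do
        if [ -e "$name" ]; then
            echo "path: $name"
        else
            echo "not a path: $name"
        fi
    done
}

result=$(writable_params | classify)
echo "$result"
```

Run from a directory that has no files with these names, every entry is reported as "not a path", which is the situation badarea_io ends up in when it is handed parameter names.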



 Comments   
Comment by John Hammond [ 25/Jun/21 ]

ys could you take a look?

Comment by John Hammond [ 29/Jun/21 ]

cbordage could you take a look at this instead of ys? Let me know if you have any questions. The patch will probably need to be based on your fix for LU-14788.

Comment by Cyril Bordage [ 02/Jul/21 ]

I do not see the link with LU-14788

Would adding the path to the parameters (by running a "find") be a good approach? I checked, and not all parameters can be found that way. How should those be checked?

Comment by John Hammond [ 02/Jul/21 ]

On master, run

badarea_io /sys/kernel/debug/lnet/portal_rotor

with and without your fix for LU-14788. Without your fix it should produce an oops. With your fix it will not.

We had a test case for this. In https://review.whamcloud.com/38567 the test was made ineffective. Look at the commands used in 133g before and after 38567. Run them manually.

Comment by Cyril Bordage [ 05/Jul/21 ]

Yes, that is what I did. That is why I asked my previous question. To be clearer, I will rephrase.

I saw that badarea_io needs a path, and LU-14788 changes the complete path to the parameter only. With that, the test is skipped.

My question was: is it enough to find the corresponding path from list_param? What should be done with parameters that cannot be found? If this is not clear, I will push a patch to have a base for discussion.
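
One possible shape for the approach asked about here is to resolve each name with "find" and keep a separate list of names that resolve to nothing, so they can be reported rather than silently dropped. This is a sketch under assumptions, not the landed patch: `resolve_param` is a hypothetical helper, the temporary tree stands in for roots such as /proc/fs/lustre/ and /sys/fs/lustre/, and only names that match a file's basename are handled (dotted parameter names would need their own mapping).

```shell
# Stand-in tree so the sketch runs without Lustre.
root=$(mktemp -d)
trap 'rm -rf "$root"' EXIT
mkdir -p "$root/proc" "$root/sys"
: > "$root/proc/at_extra"
: > "$root/sys/at_early_margin"

# Print every path under the given roots whose basename equals NAME.
# Usage: resolve_param NAME ROOT...
resolve_param() {
    name=$1
    shift
    find "$@" -name "$name" 2>/dev/null
}

found=""
missing=""
for param in at_extra at_early_margin no_such_param; do
    paths=$(resolve_param "$param" "$root/proc" "$root/sys")
    if [ -n "$paths" ]; then
        # Resolved: collect the path(s), one per line.
        found="$found$paths
"
    else
        # Unresolved: remember the name so it can be reported.
        missing="$missing $param"
    fi
done

printf 'resolved paths:\n%s' "$found"
echo "unresolved parameters:$missing"
```

Whether unresolved names should fail the test or only be logged is a policy choice that the sketch leaves open.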

Comment by Gerrit Updater [ 08/Jul/21 ]

Cyril Bordage (cbordage@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/44184
Subject: LU-14789 tests: make sanity 133f and 133g working
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 505630d47080929e5de4d52f5c37322824ce75db

Comment by Gerrit Updater [ 27/Jul/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/44184/
Subject: LU-14789 tests: make sanity 133f and 133g working
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 5a28f3bc4bd769d61dde54fdbf7ad11a16b47224

Comment by Peter Jones [ 29/Jul/21 ]

Landed for 2.15

Comment by Gerrit Updater [ 18/Jan/22 ]

"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/46156
Subject: LU-14789 tests: fix sanity 133f/g/h again
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 847a4ebac3147a73c1543530dd69a98e27e86089

Comment by Gerrit Updater [ 15/May/23 ]

"Li Xi <lixi@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50993
Subject: LU-14789 sanity: wrong argument for find in test_133g
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8570167c142a31f3642467db724da75deee748f6

Comment by Li Xi [ 15/May/23 ]

The following huge output from test_133g is not correct. It shows that "find" is run without its "-name $PARAM" argument.

 

== sanity test 133g: Check reads/writes of server lustre proc files with bad area io ========================================================== 16:13:38 (1683821618)
CMD: onyx-103vm7 /usr/sbin/lctl get_param -n version 2>/dev/null
CMD: onyx-103vm7 /usr/sbin/lctl list_param -FR '*' | grep '=' |
tr -d = | egrep -v 'force_lbug|changelog_mask|daemon_file' |
xargs -n 1 find /proc/fs/lustre/
/sys/fs/lustre/
/sys/kernel/debug/lnet/
/sys/kernel/debug/lustre/ -name |
xargs -n 1 badarea_io
onyx-103vm7: find: 'at_early_margin': No such file or directory
onyx-103vm7: find: 'at_extra': No such file or directory
/proc/fs/lustre/
/proc/fs/lustre/lmv
/proc/fs/lustre/lod
/proc/fs/lustre/lod/lustre-MDT0000-mdtlov
/proc/fs/lustre/lod/lustre-MDT0000-mdtlov/pool
/proc/fs/lustre/lod/lustre-MDT0000-mdtlov/pools
/proc/fs/lustre/lod/lustre-MDT0000-mdtlov/mdt_obd
/proc/fs/lustre/lod/lustre-MDT0000-mdtlov/target_obd
/proc/fs/lustre/lov
/proc/fs/lustre/lov/lustre-MDT0000-mdtlov
/proc/fs/lustre/mdc
/proc/fs/lustre/mdt
/proc/fs/lustre/mdt/lustre-MDT0000
/proc/fs/lustre/mdt/lustre-MDT0000/exports
/proc/fs/lustre/mdt/lustre-MDT0000/exports/0@lo
/proc/fs/lustre/mdt/lustre-MDT0000/exports/0@lo/hash
/proc/fs/lustre/mdt/lustre-MDT0000/exports/0@lo/uuid
/proc/fs/lustre/mdt/lustre-MDT0000/exports/0@lo/stats
/proc/fs/lustre/mdt/lustre-MDT0000/exports/0@lo/export
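
The log above looks consistent with the directory list containing newlines: once the command string is re-parsed by a shell, "find" ends at the first line break and "-name" lands on a later line. That reading is an assumption drawn from the logged command, not something confirmed in this ticket. A minimal reproduction with temporary directories (all names below are hypothetical) might look like this:

```shell
root=$(mktemp -d)
trap 'rm -rf "$root"' EXIT
mkdir -p "$root/a" "$root/b"
: > "$root/a/at_extra"
: > "$root/b/other"

# Directory list with one entry per line, as in the logged command.
dirs="$root/a
$root/b"

# Broken: the newline inside $dirs ends the find command early when the
# string is parsed by sh, so xargs runs `find <first dir> at_extra` and
# -name never reaches find. The second line fails as a separate command.
broken_cmd="echo at_extra | xargs -n 1 find $dirs -name"
broken_out=$(sh -c "$broken_cmd" 2>/dev/null) || true

# Fixed: flatten the list to a single line before building the command.
# (Assumes the directory names contain no whitespace.)
flat_dirs=$(printf '%s\n' "$dirs" | tr '\n' ' ')
fixed_cmd="echo at_extra | xargs -n 1 find $flat_dirs -name"
fixed_out=$(sh -c "$fixed_cmd")

printf 'broken output:\n%s\n' "$broken_out"
printf 'fixed output:\n%s\n' "$fixed_out"
```

In the broken variant "find" lists the whole first directory, much like the long /proc/fs/lustre/ listing in the log, while the flattened variant prints only the matching file.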

 

Should I re-open this ticket or open a new one?

Comment by Peter Jones [ 15/May/23 ]

Given that the original fix has already gone out in a major release, I would suggest opening a new ticket linked to this one. Then it will be clearer tracking-wise which versions contain which fixes.

Comment by Peter Jones [ 15/May/23 ]

Ah. It looks like you have done so already - 

 
