[LU-4748] sanity test 116b: error: set_param: /proc/{fs,sys}/{lnet,lustre}/17%: Found no match
Created: 11/Mar/14  Updated: 28/May/14  Resolved: 20/Mar/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0, Lustre 2.5.1
Fix Version/s: Lustre 2.6.0, Lustre 2.5.2

Type: Bug
Priority: Critical
Reporter: Jian Yu
Assignee: Emoly Liu
Resolution: Fixed
Votes: 0
Labels: None
Environment:

Lustre Build: http://build.whamcloud.com/job/lustre-b2_5/40/
Distro/Arch: RHEL6.5/x86_64
MDSCOUNT=4


Issue Links:
Related
is related to LU-4762 lctl set_param should return error wh... Resolved
Severity: 3
Rank (Obsolete): 13069

 Description   

Sanity test 116b failed as follows when run with MDSCOUNT=4:

== sanity test 116b: QoS shouldn't LBUG if not enough OSTs found on the 2nd pass == 07:33:42 (1394462022)
CMD: shadow-26vm12 lctl get_param -n lov.*mdtlov*.qos_threshold_rr
CMD: shadow-26vm12 lctl set_param lov.*mdtlov*.qos_threshold_rr 0
lov.lustre-MDT0000-mdtlov.qos_threshold_rr=0
lov.lustre-MDT0001-mdtlov.qos_threshold_rr=0
lov.lustre-MDT0002-mdtlov.qos_threshold_rr=0
lov.lustre-MDT0003-mdtlov.qos_threshold_rr=0
CMD: shadow-26vm12 lctl set_param fail_loc=0x147
fail_loc=0x147
total: 20 creates in 0.09 seconds: 228.53 creates/second
CMD: shadow-26vm12 lctl set_param fail_loc=0
fail_loc=0
CMD: shadow-26vm12 lctl set_param lov.*mdtlov*.qos_threshold_rr 17% 17% 17% 17%
shadow-26vm12: error: set_param: /proc/{fs,sys}/{lnet,lustre}/17%: Found no match
lov.lustre-MDT0000-mdtlov.qos_threshold_rr=17%
lov.lustre-MDT0001-mdtlov.qos_threshold_rr=17%
lov.lustre-MDT0002-mdtlov.qos_threshold_rr=17%
lov.lustre-MDT0003-mdtlov.qos_threshold_rr=17%
 sanity test_116b: @@@@@@ FAIL: test_116b failed with 3

Maloo report: https://maloo.whamcloud.com/test_sets/9d5c1624-a861-11e3-a16f-52540035b04c

The same test passed with MDSCOUNT=2.



 Comments   
Comment by Peter Jones [ 11/Mar/14 ]

Guidance from Di:

"If the failure is caused by resetting the original qos_threshold_rr,
do_facet $SINGLEMDS lctl set_param lov.*mdtlov*.qos_threshold_rr $old_rr
we probably need to add true at the end of the test. Though I did not check why resetting threshold_rr failed."
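
A minimal sketch of what appending true would do, assuming the reset is the last command in the test function so that its exit status becomes the test's status. failing_reset is a hypothetical stand-in for the failing lctl set_param call, and the status 1 is arbitrary:

```shell
#!/bin/bash
# Hypothetical stand-in for the failing "lctl set_param ... $old_rr" call.
failing_reset() {
	return 1
}

# A function's status is that of its last command, so the failure leaks out.
test_without_true() {
	failing_reset
}

# "|| true" forces a zero status; it hides the failure rather than fixing it.
test_with_true() {
	failing_reset || true
}

rc_without=0
test_without_true || rc_without=$?
rc_with=0
test_with_true || rc_with=$?
echo "without true: $rc_without, with true: $rc_with"
```

This only masks the error from the reset; it does not explain why the reset fails.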

Comment by Emoly Liu [ 11/Mar/14 ]

There is a problem in the test script when it reads $old_rr.

old_rr=$(do_facet $SINGLEMDS lctl get_param -n lov.*mdtlov*.qos_threshold_rr)
[root@centos6-1 tests]# ../utils/lctl get_param -n lov.*mdtlov*.qos_threshold_rr
17%
17%
17%
17%
[root@centos6-1 tests]# ../utils/lctl get_param lov.*mdtlov*.qos_threshold_rr
lov.lustre-MDT0000-mdtlov.qos_threshold_rr=17%
lov.lustre-MDT0001-mdtlov.qos_threshold_rr=17%
lov.lustre-MDT0002-mdtlov.qos_threshold_rr=17%
lov.lustre-MDT0003-mdtlov.qos_threshold_rr=17%

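Because four MDTs match the pattern, get_param -n returns four lines and $old_rr holds four values instead of one. When the test resets qos_threshold_rr with $old_rr, set_param receives "17% 17% 17% 17%" and reports the error above.

BTW, I tried MDSCOUNT=2 and it really did pass. I think something is wrong with how the parameter format is parsed.

I will push a patch to fix $old_rr first, and then work on the parameter format.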
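
The $old_rr problem can be reproduced without a Lustre setup. The sketch below is an illustration only: fake_get_param is a hypothetical stub that prints what get_param -n printed above, and taking the first line with head is one possible way to get a single value, not necessarily what the patch does.

```shell
#!/bin/bash
# Hypothetical stub for "lctl get_param -n lov.*mdtlov*.qos_threshold_rr"
# on an MDS node with four MDTs: one value per matching parameter.
fake_get_param() {
	printf '17%%\n17%%\n17%%\n17%%\n'
}

# Report how many words the caller passed.
count_args() {
	echo $#
}

# The command substitution keeps all four lines.
old_rr=$(fake_get_param)
# Unquoted expansion splits on the newlines: four arguments, not one.
many=$(count_args $old_rr)

# Keeping only the first line yields a single value (assumed workaround).
old_rr_first=$(fake_get_param | head -n 1)
one=$(count_args $old_rr_first)

echo "unfiltered: $many argument(s); first line only: $one argument(s)"
```

With the unfiltered value, a set_param call would receive four trailing words, which matches the "17% 17% 17% 17%" command line in the test log.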

Comment by Emoly Liu [ 11/Mar/14 ]

The patch to fix $old_rr is here: http://review.whamcloud.com/9580

Comment by Emoly Liu [ 13/Mar/14 ]

Backport to b2_5: http://review.whamcloud.com/9636

Comment by Emoly Liu [ 13/Mar/14 ]

BTW, I tried MDSCOUNT=2 and it really did pass. I think something is wrong with how the parameter format is parsed.

I created another ticket, LU-4762, to address the parameter format issue in "lctl set_param".

Comment by Jodi Levi (Inactive) [ 20/Mar/14 ]

Patch landed to master. An additional patch will be backported soon.

Comment by Andreas Dilger [ 28/May/14 ]

Patch landed to b2_5 for 2.5.2.
