[LU-4748] sanity test 116b: error: set_param: /proc/{fs,sys}/{lnet,lustre}/17%: Found no match Created: 11/Mar/14 Updated: 28/May/14 Resolved: 20/Mar/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.6.0, Lustre 2.5.1 |
| Fix Version/s: | Lustre 2.6.0, Lustre 2.5.2 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Jian Yu | Assignee: | Emoly Liu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Lustre Build: http://build.whamcloud.com/job/lustre-b2_5/40/ |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 13069 | ||||||||
| Description |
|
While running sanity test 116b with MDSCOUNT=4, it failed as follows:
== sanity test 116b: QoS shouldn't LBUG if not enough OSTs found on the 2nd pass == 07:33:42 (1394462022)
CMD: shadow-26vm12 lctl get_param -n lov.*mdtlov*.qos_threshold_rr
CMD: shadow-26vm12 lctl set_param lov.*mdtlov*.qos_threshold_rr 0
lov.lustre-MDT0000-mdtlov.qos_threshold_rr=0
lov.lustre-MDT0001-mdtlov.qos_threshold_rr=0
lov.lustre-MDT0002-mdtlov.qos_threshold_rr=0
lov.lustre-MDT0003-mdtlov.qos_threshold_rr=0
CMD: shadow-26vm12 lctl set_param fail_loc=0x147
fail_loc=0x147
total: 20 creates in 0.09 seconds: 228.53 creates/second
CMD: shadow-26vm12 lctl set_param fail_loc=0
fail_loc=0
CMD: shadow-26vm12 lctl set_param lov.*mdtlov*.qos_threshold_rr 17% 17% 17% 17%
shadow-26vm12: error: set_param: /proc/{fs,sys}/{lnet,lustre}/17%: Found no match
lov.lustre-MDT0000-mdtlov.qos_threshold_rr=17%
lov.lustre-MDT0001-mdtlov.qos_threshold_rr=17%
lov.lustre-MDT0002-mdtlov.qos_threshold_rr=17%
lov.lustre-MDT0003-mdtlov.qos_threshold_rr=17%
sanity test_116b: @@@@@@ FAIL: test_116b failed with 3
Maloo report: https://maloo.whamcloud.com/test_sets/9d5c1624-a861-11e3-a16f-52540035b04c
The same test passed with MDSCOUNT=2. |
| Comments |
| Comment by Peter Jones [ 11/Mar/14 ] |
|
Guidance from Di: "If the failure is caused by resetting the original threshold_rr, |
| Comment by Emoly Liu [ 11/Mar/14 ] |
|
There is a problem in the test script when getting $old_rr:
old_rr=$(do_facet $SINGLEMDS lctl get_param -n lov.*mdtlov*.qos_threshold_rr)
[root@centos6-1 tests]# ../utils/lctl get_param -n lov.*mdtlov*.qos_threshold_rr
17%
17%
17%
17%
[root@centos6-1 tests]# ../utils/lctl get_param lov.*mdtlov*.qos_threshold_rr
lov.lustre-MDT0000-mdtlov.qos_threshold_rr=17%
lov.lustre-MDT0001-mdtlov.qos_threshold_rr=17%
lov.lustre-MDT0002-mdtlov.qos_threshold_rr=17%
lov.lustre-MDT0003-mdtlov.qos_threshold_rr=17%
Because the wildcard matches the mdtlov device of every MDT, $old_rr holds four values ("17% 17% 17% 17%") instead of one, so the error happens when we reset qos_threshold_rr with it.
BTW, I tried MDSCOUNT=2, and it really did pass. I think there is also something improper in how the parameters format is parsed.
I will push a patch to fix $old_rr first, and then work on the parameters format. |
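The capture problem described above can be reproduced without a Lustre setup. The following is only a sketch and is not the patch that landed: `fake_get_param` is a hypothetical stand-in for `lctl get_param -n lov.*mdtlov*.qos_threshold_rr` on an MDS with four MDTs, and taking the first value with `head -n 1` is one possible way to get a single value, assumed here for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for
#   lctl get_param -n lov.*mdtlov*.qos_threshold_rr
# on an MDS with four MDTs: one value per matching device.
fake_get_param() {
    printf '17%%\n17%%\n17%%\n17%%\n'
}

# What the test captured: all four values end up in one variable.
old_rr=$(fake_get_param)

# Unquoted, $old_rr word-splits into four separate arguments, which is
# how "17% 17% 17% 17%" reached the set_param command line.
set -- $old_rr
arg_count=$#

# One possible fix (an assumption, not the landed change): keep only
# the first value before restoring the parameter.
first_rr=$(fake_get_param | head -n 1)

echo "captured $arg_count values, would restore with: $first_rr"
```

With a single value in hand, the restore command receives exactly one parameter/value pair, regardless of how many MDTs the wildcard matches.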
| Comment by Emoly Liu [ 11/Mar/14 ] |
|
The patch to fix $old_rr is here: http://review.whamcloud.com/9580 |
| Comment by Emoly Liu [ 13/Mar/14 ] |
|
Backport to b2_5: http://review.whamcloud.com/9636 |
| Comment by Emoly Liu [ 13/Mar/14 ] |
I created another ticket |
| Comment by Jodi Levi (Inactive) [ 20/Mar/14 ] |
|
Patch landed to master. An additional patch will be backported soon. |
| Comment by Andreas Dilger [ 28/May/14 ] |
|
Patch landed to b2_5 for 2.5.2. |