[LU-6921] sanityn 77f test failed Lustre: DEBUG MARKER: sanityn test_77f: @@@@@@ FAIL: failed to operate on TBF rules Created: 28/Jul/15 Updated: 18/May/16 Resolved: 20/Oct/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Vinayak (Inactive) | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch |
| Severity: | 3 |
| Epic: | test |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
stdout.log

== sanityn test 77f: check TBF JobID nrs policy == 15:43:56 (1438011836)
ost.OSS.ost_io.nrs_policies=tbf jobid
ost.OSS.ost_io.nrs_policies=tbf jobid
error: set_param: ost/OSS/ost_io/nrs_tbf_rule: Found no match
 sanityn test_77f: @@@@@@ FAIL: failed to operate on TBF rules
  Trace dump:
  = /usr/lib64/lustre/tests/../tests/test-framework.sh:4732:error_noexit()
  = /usr/lib64/lustre/tests/../tests/test-framework.sh:4763:error()
  = /usr/lib64/lustre/tests/sanityn.sh:2943:tbf_rule_operate()
  = /usr/lib64/lustre/tests/sanityn.sh:3008:test_77f()
  = /usr/lib64/lustre/tests/../tests/test-framework.sh:5010:run_one()
  = /usr/lib64/lustre/tests/../tests/test-framework.sh:5047:run_one_logged()
  = /usr/lib64/lustre/tests/../tests/test-framework.sh:4864:run_test()
  = /usr/lib64/lustre/tests/sanityn.sh:3043:main()
Dumping lctl log to /tmp/test_logs/1438011830/sanityn.test_77f.*.1438011837.log
FAIL 77f (1s)
sanityn: FAIL: test_77f failed to operate on TBF rules
Stopping clients: fre0107,fre0108 /mnt/lustre2 (opts:)
Stopping client fre0108 /mnt/lustre2 opts: |
| Comments |
| Comment by Vinayak (Inactive) [ 28/Jul/15 ] |
|
sanityn test_77e and sanityn test_77g failed for the same reason:

== sanityn test 77e: check TBF NID nrs policy == 14:29:25 (1438073965)
ost.OSS.ost_io.nrs_policies=tbf nid
ost.OSS.ost_io.nrs_policies=tbf nid
error: set_param: ost/OSS/ost_io/nrs_tbf_rule: Found no match
 sanityn test_77e: @@@@@@ FAIL: failed to operate on TBF rules
  Trace dump:
  = /home/build/lustre-xx/lustre/tests/../tests/test-framework.sh:4732:error_noexit()
  = /home/build/lustre-xx/lustre/tests/../tests/test-framework.sh:4763:error()
  = /home/build/lustre-xx/lustre/tests/sanityn.sh:2943:tbf_rule_operate()
  = /home/build/lustre-xx/lustre/tests/sanityn.sh:2957:test_77e()
  = /home/build/lustre-xx/lustre/tests/../tests/test-framework.sh:5010:run_one()
  = /home/build/lustre-xx/lustre/tests/../tests/test-framework.sh:5047:run_one_logged()
  = /home/build/lustre-xx/lustre/tests/../tests/test-framework.sh:4864:run_test()
  = /home/build/lustre-xx/lustre/tests/sanityn.sh:2987:main()
Dumping lctl log to /tmp/test_logs/1438073937/sanityn.test_77e.*.1438073966.log
FAIL 77e (2s)
cleanup: ======================================================
== sanityn test complete, duration 30 sec == 14:29:27 (1438073967)
sanityn: FAIL: test_77e failed to operate on TBF rules
cli-1: warning: 'lctl conf_param' is deprecated, use 'lctl set_param -P' instead
cli-1: warning: 'lctl conf_param' is deprecated, use 'lctl set_param -P' instead

== sanityn test 77g: Change TBF type directly == 15:02:47 (1438075967)
ost.OSS.ost_io.nrs_policies=tbf nid
ost.OSS.ost_io.nrs_policies=tbf nid
ost.OSS.ost_io.nrs_policies=tbf jobid
ost.OSS.ost_io.nrs_policies=tbf jobid
error: set_param: ost/OSS/ost_io/nrs_tbf_rule: Found no match
 sanityn test_77g: @@@@@@ FAIL: failed to operate on TBF rules
  Trace dump:
  = /home/build/lustre-xx/lustre/tests/../tests/test-framework.sh:4732:error_noexit()
  = /home/build/lustre-xx/lustre/tests/../tests/test-framework.sh:4763:error()
  = /home/build/lustre-xx/lustre/tests/sanityn.sh:2943:tbf_rule_operate()
  = /home/build/lustre-xx/lustre/tests/sanityn.sh:3064:test_77g()
  = /home/build/lustre-xx/lustre/tests/../tests/test-framework.sh:5010:run_one()
  = /home/build/lustre-xx/lustre/tests/../tests/test-framework.sh:5047:run_one_logged()
  = /home/build/lustre-xx/lustre/tests/../tests/test-framework.sh:4864:run_test()
  = /home/build/lustre-xx/lustre/tests/sanityn.sh:3076:main()
Dumping lctl log to /tmp/test_logs/1438075942/sanityn.test_77g.*.1438075968.log
FAIL 77g (2s) |
| Comment by Andreas Dilger [ 06/Aug/15 ] |
|
Hi Li Xi, Wang Shilong, |
| Comment by Li Xi (Inactive) [ 07/Aug/15 ] |
|
Strange, looks like ost/OSS/ost_io/nrs_tbf_rule is missing |
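One way to confirm whether the parameter really is missing is to list it on the node in question. The snippet below is only a sketch: `tbf_rule_param_present` is a hypothetical helper, and the `LCTL` override is an assumption added so the check can be tried without a Lustre installation. It relies only on `lctl list_param`, which fails when the given parameter path matches nothing on the local node.

```shell
# Hypothetical helper, not part of test-framework.sh.
# Succeeds if nrs_tbf_rule is visible on the local node, fails otherwise.
# LCTL can be overridden for illustration; it defaults to the real lctl.
tbf_rule_param_present() {
	${LCTL:-lctl} list_param ost.OSS.ost_io.nrs_tbf_rule >/dev/null 2>&1
}

if tbf_rule_param_present; then
	echo "nrs_tbf_rule is present on this node"
else
	echo "nrs_tbf_rule is missing on this node"
fi
```

Running this on both the client and the OSS would show on which node the path fails to match.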
| Comment by Li Xi (Inactive) [ 13/Aug/15 ] |
|
Hi Vinayak, which branch did you test? Was it a branch with TBF NRS policy? |
| Comment by Vinayak (Inactive) [ 13/Aug/15 ] |
|
Hi Li Xi,

>> Was it a branch with TBF NRS policy?

I used the following to find the TBF-related changes:

[root@cli-1 lustre-release]# git branch -a | grep master
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/master
[root@cli-1 lustre-release]# git log --oneline | grep -i TBF
fb14b7b LU-6668 test: regression tests for NRS TBF policy
e7ab554 LU-5580 ptlrpc: policy switch directly in tbf
75752e9 LU-3319 procfs: Move NRS TBF proc handling to seq_files
0539dc5 LU-4832 ptlrpc: fix incorrect name string in nrs_tbf
33e35c0 LU-3558 ptlrpc: Add the NRS TBF policy

Please let me know if you want any other info or anything else checked on my side. |
| Comment by Gerrit Updater [ 08/Sep/15 ] |
|
Vinayak (vinayakswami.hariharmath@seagate.com) uploaded a new patch: http://review.whamcloud.com/16305 |
| Comment by Vinayak (Inactive) [ 08/Sep/15 ] |
|
Can anyone please tell me how this part of the script in sanityn.sh (test_77e, 77f, 77g) is expected to behave?
tbf_rule_operate ost0 "start\ localhost\ {0@lo}\ 1000"
It is failing for me. Is the behavior the same on your side? It looks like ost0 is not interpreted correctly on my side. If it passes for you, note that I am using a 4-node setup (2 OSTs, 1 MDS, 2 clients). |
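To make the question concrete, the sketch below shows how the two arguments of that call appear to be used, judging from the `lctl set_param ost.OSS.ost_io.nrs_tbf_rule=...` command visible in the test logs: the first argument is a facet name that must resolve to a host, and the second is the rule string with its spaces kept backslash-escaped. `tbf_rule_cmd_sketch` is a hypothetical helper that only prints the command; it is not the real `tbf_rule_operate()` and never runs `lctl`.

```shell
# Hypothetical helper: prints the command that would be sent to the facet.
# It does not contact any node and does not run lctl.
tbf_rule_cmd_sketch() {
	facet=$1   # facet name, e.g. ost0; must map to a real host
	rule=$2    # rule string; spaces stay escaped as "\ "
	printf '%s: lctl set_param ost.OSS.ost_io.nrs_tbf_rule=%s\n' "$facet" "$rule"
}

# Inside double quotes, "\ " is kept literally as backslash + space.
tbf_rule_cmd_sketch ost0 "start\ localhost\ {0@lo}\ 1000"
# prints: ost0: lctl set_param ost.OSS.ost_io.nrs_tbf_rule=start\ localhost\ {0@lo}\ 1000
```

If the facet name does not map to the node that actually exports `nrs_tbf_rule`, the command lands on the wrong host, which would be consistent with a "Found no match" error.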
| Comment by Vinayak (Inactive) [ 28/Sep/15 ] |
|
Hello Andreas, I have rebased the patch (http://review.whamcloud.com/#/c/16305/). Please let me know if anything else needs to be done. |
| Comment by Saurabh Tandan (Inactive) [ 29/Sep/15 ] |
|
Encountered the same issue for sanityn test_77g:
20:26:40:CMD: onyx-38vm4 lctl set_param ost.OSS.ost_io.nrs_tbf_rule=start\ dd_runas\ {dd.500}\ 50
20:26:40:onyx-38vm4: error: set_param: setting /proc/fs/lustre/ost/OSS/ost_io/nrs_tbf_rule=start dd_runas {dd.500} 50: Invalid argument
20:26:41:ost.OSS.ost_io.nrs_tbf_rule=start dd_runas {dd.500} 50
20:26:42: sanityn test_77g: @@@@@@ FAIL: failed to operate on TBF rules
|
| Comment by Vinayak (Inactive) [ 29/Sep/15 ] |
|
Hello Saurabh, can you please try the patch http://review.whamcloud.com/#/c/16305/ and check whether it fixes the problem? Thanks, |
| Comment by Kalpak Shah (Inactive) [ 20/Oct/15 ] |
|
http://review.whamcloud.com/#/c/16305/ is ready to be merged - Andreas and Li have given positive reviews. |
| Comment by Gerrit Updater [ 20/Oct/15 ] |
|
Andreas Dilger (andreas.dilger@intel.com) merged in patch http://review.whamcloud.com/16305/ |
| Comment by Andreas Dilger [ 20/Oct/15 ] |
|
I finally figured out why this test wasn't failing in our testing: facet_host() uses $ost_HOST for any facet named ostX if there isn't an explicit $ost0_HOST set in the configuration. In any case, the patch has been landed to master for 2.8.0. |
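The fallback described in this comment can be sketched roughly as follows. This is a simplified illustration, not the actual `facet_host()` from test-framework.sh: `facet_host_sketch` and the host names `oss-default` and `oss-node1` are made up for the example.

```shell
# Hypothetical sketch of the lookup order described above.
# Not the real facet_host(); plain POSIX sh, so variables are not local.
facet_host_sketch() {
	facet=$1
	# Look up <facet>_HOST indirectly, e.g. ost0_HOST for facet "ost0".
	eval "host=\${${facet}_HOST:-}"
	if [ -z "$host" ]; then
		case $facet in
		# Any facet named ostX without its own ostX_HOST uses ost_HOST.
		ost[0-9]*) host=${ost_HOST:-} ;;
		esac
	fi
	printf '%s\n' "$host"
}

# Example configuration (assumed values).
ost_HOST=oss-default
ost1_HOST=oss-node1

facet_host_sketch ost1   # prints oss-node1 (explicit ost1_HOST)
facet_host_sketch ost0   # prints oss-default (no ost0_HOST, falls back)
```

Under this behavior, a facet name such as ost0 still resolves to a usable host whenever `ost_HOST` is set, so a test that names a nonexistent facet can pass on some configurations and fail on others.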