[LU-11430] sanity test 271d: too many arguments
Created: 25/Sep/18  Updated: 18/Dec/18  Resolved: 06/Nov/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.0
Fix Version/s: Lustre 2.12.0

Type: Bug
Priority: Minor
Reporter: Jian Yu
Assignee: Mikhail Pershin
Resolution: Fixed
Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

sanity test 271d failed as follows in DNE test sessions:

== sanity test 271d: DoM: read on open (1K file in reply buffer) ===================================== 05:03:01 (1534136581)
CMD: trevis-7vm4 /usr/sbin/lctl get_param -n version 2>/dev/null ||
				/usr/sbin/lctl lustre_build_version 2>/dev/null ||
				/usr/sbin/lctl --version 2>/dev/null | cut -d' ' -f2
warning: '-M' deprecated, use '--mdt-index' or '-m' instead
1+0 records in
1+0 records out
1000 bytes (1.0 kB) copied, 0.000146744 s, 6.8 MB/s
1+0 records in
1+0 records out
1000 bytes (1.0 kB) copied, 0.00174574 s, 573 kB/s
Append to the same page
/usr/lib64/lustre/tests/sanity.sh: line 17034: [: too many arguments
/usr/lib64/lustre/tests/sanity.sh: line 17034: 6
1
1
1: syntax error in expression (error token is "1
1
1")

Maloo reports:
https://testing.whamcloud.com/test_sets/792d8c78-9ff2-11e8-8ee3-52540065bddc
https://testing.whamcloud.com/test_sets/5b537c1a-c0ec-11e8-a9d9-52540065bddc
https://testing.whamcloud.com/test_sets/794a6aa2-bed1-11e8-b748-52540065bddc

The failure is affecting patch testing on master branch.



 Comments   
Comment by James Nunez (Inactive) [ 10/Oct/18 ]

We also see this failure in sanity test 271e in review-dne-* testing. For example: https://testing.whamcloud.com/test_sets/8e26abfe-c8d4-11e8-b589-52540065bddc

I think the issue is

	local num=$(lctl get_param -n mdc.*.stats |
		awk '/ost_read/ {print $2}')
	local ra=$(lctl get_param -n mdc.*.stats |
		awk '/req_active/ {print $2}')
	local rw=$(lctl get_param -n mdc.*.stats |
		awk '/req_waittime/ {print $2}')

	[ -z $num ] || error "$num READ RPC occured"
	[ $ra == $rw ] || error "$((ra - rw)) resend occured"
	echo "... DONE"

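For DNE, getting the param mdc.*.stats and filtering it with awk for req_waittime and req_active returns multiple values, one for each MDC. The unquoted multi-line values then break both the [ $ra == $rw ] test and the $((ra - rw)) arithmetic, which causes the errors seen here.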
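
The failure can be reproduced without a Lustre client. The following is a minimal sketch, assuming bash: the variables are stand-ins for the awk output on a client with four MDCs. The ra values are the ones visible in the error log in the description; the rw values are hypothetical.

```shell
#!/bin/bash
# Stand-in for the output of
#   lctl get_param -n mdc.*.stats | awk '/req_active/ {print $2}'
# on a client with four MDCs. The ra values match the error log in the
# description; the rw values are hypothetical.
ra=$(printf '6\n1\n1\n1')
rw=$(printf '6\n1\n1\n1')

# Unquoted, each variable word-splits into four operands, so "[" sees
# many more arguments than a binary comparison allows and rejects them.
test_msg=$( { [ $ra == $rw ]; } 2>&1 )
test_status=$?
echo "status=$test_status: $test_msg"

# Arithmetic expansion evaluates the whole multi-line string as one
# expression and fails on the second value. The subshell keeps the
# error from affecting the rest of this script.
arith_msg=$( ( echo "$((ra - rw))" ) 2>&1 )
echo "$arith_msg"
```

Quoting the variables would silence the "too many arguments" message, but the comparison would still be made on multi-line strings, so the values have to come from a single MDC for the check to be meaningful.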

I'm guessing we need to be more selective about which MDC we pull the stats from.
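
One way to be more selective could look like the sketch below. It is not the patch that was eventually merged. The helper name mdc_stat, the stubbed lctl function and its sample numbers are all hypothetical, and the sketch assumes that MDC device names embed the target name with a four-digit hexadecimal index (MDT0000, MDT0001, ...).

```shell
#!/bin/bash
# Hypothetical stand-in for lctl so the sketch runs without a Lustre
# client. It is called as "lctl get_param -n <pattern>" and prints
# made-up counters for two MDCs.
lctl() {
	case "$3" in
	*MDT0000*)
		printf 'req_waittime 6 samples [usec]\nreq_active 6 samples [reqs]\n' ;;
	*MDT0001*)
		printf 'req_waittime 1 samples [usec]\nreq_active 1 samples [reqs]\n' ;;
	*)
		# A bare mdc.*.stats pattern matches every MDC.
		printf 'req_waittime 6 samples [usec]\nreq_active 6 samples [reqs]\n'
		printf 'req_waittime 1 samples [usec]\nreq_active 1 samples [reqs]\n' ;;
	esac
}

# Hypothetical helper: print one counter for the MDC that talks to the
# MDT with the given index (0 -> MDT0000).
mdc_stat() {
	local mdt_index=$1 name=$2
	local pattern
	pattern=$(printf 'mdc.*MDT%04x*.stats' "$mdt_index")
	# Quote the pattern so the shell does not expand it as a file glob.
	lctl get_param -n "$pattern" |
		awk -v name="$name" '$1 == name {print $2}'
}

ra=$(mdc_stat 0 req_active)
rw=$(mdc_stat 0 req_waittime)
echo "ra=$ra rw=$rw"
[ "$ra" == "$rw" ] || echo "$((ra - rw)) resend occurred"
```

On a real client the lctl stub would be dropped, and the MDT index would presumably be taken from the test file itself, for example with lfs getstripe -m.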

Mike - any thoughts on this failure?

Comment by Mikhail Pershin [ 25/Oct/18 ]

Yes, indeed. I will prepare a patch ASAP.

Comment by Gerrit Updater [ 26/Oct/18 ]

Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33490
Subject: LU-11430 tests: get MDC stats by index
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: d3f7d8f1ac35abe37cc86b7ef82e6e76ab1dd575

Comment by Gerrit Updater [ 06/Nov/18 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33490/
Subject: LU-11430 tests: get MDC stats by index
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: c50bd08b35757d20e606c10bb5b4474011dd07f6
