[LU-2635] Interop 2.1.3<->2.4 failure on test suite sanity test_27d: setstripe: invalid option 'S' Created: 17/Jan/13  Updated: 22/Feb/13  Resolved: 22/Feb/13

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: None

Type: Bug Priority: Blocker
Reporter: Maloo Assignee: Jian Yu
Resolution: Fixed Votes: 0
Labels: LB
Environment:

server: 2.4
client: 2.1.3


Severity: 3
Rank (Obsolete): 6164

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/75c2e2be-5b55-11e2-b205-52540035b04c.

The sub-test test_27d failed with the following error:

setstripe failed

== sanity test 27d: create file with default settings ================== 13:14:06 (1357334046)
setstripe: invalid option -- 'S'
Create a new file with a specific striping pattern or
set the default striping pattern on an existing directory or
delete the default striping pattern from an existing directory
usage: setstripe [--size|-s stripe_size] [--count|-c stripe_count]
                 [--index|-i|--offset|-o start_ost_index]
                 [--pool|-p <pool>] <directory|filename>
       or 
       setstripe -d <directory>   (to delete default striping)
	stripe_size:  Number of bytes on each OST (0 filesystem default)
	              Can be specified with k, m or g (in KB, MB and GB
	              respectively)
	start_ost_index: OST index of first stripe (-1 default)
	stripe_count: Number of OSTs to stripe over (0 default, -1 all)
	pool:         Name of OST pool to use (default none)
 sanity test_27d: @@@@@@ FAIL: setstripe failed 


 Comments   
Comment by Zhenyu Xu [ 22/Jan/13 ]

The 2.1.3 client test script (sanity test_27d) does not use setstripe with '-S', while the 2.4 test script does. I don't know the interop test mechanism: does the test use the client's test script or the server's?

Intuitively I think it should use the client's test script, but the test result shows the opposite.

Comment by Bob Glossman (Inactive) [ 23/Jan/13 ]

It's the scripts on the client that get executed. However, I see the following in the node-provisioning logs:

' > /root/autotest_config.sh" on client-32vm2 via client-32vm2
12:35:55:
12:35:55:Executing "mv /usr/lib64/lustre/tests /usr/lib64/lustre/tests-old" on client-32vm2 via client-32vm2
12:35:55:
12:35:55:Executing "scp -r client-32vm7:/usr/lib64/lustre/tests /usr/lib64/lustre/tests" on client-32vm2 via client-32vm2

From that it appears the test scripts from a server are being copied onto the client during provisioning. Seems like a test setup flaw to me.

Comment by Zhenyu Xu [ 23/Jan/13 ]

Chris,

Could you take a look at it?

Comment by Chris Gearing (Inactive) [ 24/Jan/13 ]

We have 2 modes:

1. The client runs client scripts, the servers run server scripts.
2. Both run the server scripts.

The implementation works like this: each branch has a version number.
master: 5
b1_8: 0
b2_4: 5
b2_3: 5
b2_2: 0
b2_1: 0

If the versions are the same then the branches are compatible; if they differ then the server's test scripts are copied to the client.
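
The rule above can be sketched in shell roughly as follows. This is only an illustration: the version numbers are copied from the list in this comment, while the function names `branch_script_version` and `scripts_to_run_on_client` are hypothetical and are not taken from the autotest code.

```shell
#!/bin/bash
# Hypothetical sketch of the branch-compatibility rule described above.
# The numbers come from the list in this comment; the function names are
# made up for illustration and are not part of autotest.

# Print the script version number assigned to a branch.
branch_script_version() {
    case "$1" in
        master|b2_4|b2_3) echo 5 ;;
        b1_8|b2_2|b2_1)   echo 0 ;;
        *)                return 1 ;;   # unknown branch
    esac
}

# Usage: scripts_to_run_on_client <client_branch> <server_branch>
# Prints "client" if the client keeps its own test scripts, or
# "server" if the server's scripts would be copied onto the client.
scripts_to_run_on_client() {
    local client_ver server_ver
    client_ver=$(branch_script_version "$1") || return 1
    server_ver=$(branch_script_version "$2") || return 1
    if [ "$client_ver" -eq "$server_ver" ]; then
        echo client
    else
        echo server
    fi
}
```

With this sketch, `scripts_to_run_on_client b2_1 master` prints `server`, which matches the combination reported in this ticket.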

2.1 and 2.4 are a case in point. This idea came out of a discussion with Andreas.

Shall we turn it off for now to see if that cures this problem?

Comment by Bob Glossman (Inactive) [ 24/Jan/13 ]

Worth turning it off for a trial. I suspect that will cure this problem, but might generate a whole set of new ones.

If we want to continue to use server scripts on clients, we will need to version check all uses of setstripe -S in server scripts. There are many more than just this one in sanity test 27d.
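
One possible shape for such a check, as a rough sketch only: instead of comparing version numbers, look at the usage text that the client's setstripe prints and pick the stripe-size flag from it. The helper name `stripe_size_flag` is hypothetical, and the assumption that a client accepting `-S` lists `|-S ` in its usage message is mine, not something confirmed in this ticket.

```shell
#!/bin/bash
# Hypothetical helper, not part of the Lustre test framework.
# Usage: stripe_size_flag "<setstripe usage text>"
# Assumption: a setstripe that accepts -S shows "|-S " in its usage
# message; an older one (as in the failure log above) shows "|-s ".
stripe_size_flag() {
    local usage="$1"
    case "$usage" in
        *"|-S "*) echo "-S" ;;   # newer option spelling is advertised
        *)        echo "-s" ;;   # fall back to the older spelling
    esac
}
```

Given the usage text from the failure log in the description (`[--size|-s stripe_size]`), this returns `-s`, so a script could build its setstripe command line from the returned flag rather than hard-coding `-S`.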

Comment by Zhenyu Xu [ 24/Jan/13 ]

I can see the dilemma here: our client-side test scripts also run remote commands on the MDS/OST, so when the server and client scripts are not compatible, running only the client scripts or only the server scripts will cause problems either way.

For this incompatible interop test, I guess this situation is inevitable.

Comment by Jodi Levi (Inactive) [ 21/Feb/13 ]

We will be fixing this on b2_1.

Comment by Jian Yu [ 22/Feb/13 ]

I triggered autotest to perform the Lustre b2_1 and master interop testing on the latest builds:
Lustre b2_1 client: http://build.whamcloud.com/job/lustre-b2_1/176
Lustre master server: http://build.whamcloud.com/job/lustre-master/1269

sanity test 27d passed: https://maloo.whamcloud.com/test_sets/ffcd3138-7cb3-11e2-a108-52540035b04c

== sanity test 27d: create file with default settings ================== 21:36:54 (1361511414)
/mnt/lustre/d27/fdef has type file OK
4+0 records in
4+0 records out
16384 bytes (16 kB) copied, 0.00525656 s, 3.1 MB/s
Resetting fail_loc on all nodes...CMD: client-19vm1.lab.whamcloud.com,client-19vm2,client-19vm3,client-19vm4 lctl set_param -n fail_loc=0 2>/dev/null || true
done.

While fixing TT-1053, Chris turned off the autotest feature of running server scripts on the client node, so the issue in this ticket no longer occurs.
