[LU-1189] restore_lustre_params() needs run on active server nodes Created: 05/Mar/12  Updated: 05/Sep/17  Resolved: 26/Apr/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0, Lustre 2.1.4
Fix Version/s: Lustre 2.4.0, Lustre 2.1.5

Type: Bug Priority: Minor
Reporter: Maloo Assignee: Jian Yu
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
Severity: 3
Rank (Obsolete): 5935

 Description   

This issue was created by maloo for Minh Diep <mdiep@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/8f2e25a2-662c-11e1-92b1-5254004bbbd3.

When the last subtest is skipped, the entire test is marked as failed even though everything else passed.



 Comments   
Comment by Minh Diep [ 04/Apr/12 ]

I can't reproduce this issue in the lab. I noticed that the failures are all in the 'failover' test group. I wonder if this issue is specific to failover testing.

Comment by Jian Yu [ 21/Dec/12 ]

Lustre Tag: v2_1_4_RC1
Lustre Build: http://build.whamcloud.com/job/lustre-b2_1/159/
Distro/Arch: RHEL6.3/x86_64
Test Group: failover

The same issue occurred: https://maloo.whamcloud.com/test_sets/064bc9ea-4bb6-11e2-aa80-52540035b04c

Comment by Minh Diep [ 09/Jan/13 ]

Actually, I found an error after all the subtests completed:

09:02:17:PASS 11b (376s)
09:02:17:
09:02:17: SKIP: replay-vbr test_12a skipping ALWAYS excluded test 12a
09:02:17:CMD: client-27vm3 lctl set_param -n mdt.lustre-MDT0000.commit_on_sharing 0
09:02:17:client-27vm3: error: set_param: /proc/{fs,sys}/{lnet,lustre}/mdt/lustre-MDT0000/commit_on_sharing: Found no match

I believe marking the report as failed is the correct behavior, since the error happens outside of any subtest.

Comment by Jian Yu [ 22/Feb/13 ]

The issue occurred in failover configuration.

After the replay-vbr test runs, the active MDS node has changed. However, restore_lustre_params() still tries to restore the saved params on the original (now inactive) node, where the mdt.lustre-MDT0000.commit_on_sharing parameter no longer exists, so set_param fails with "Found no match".

I'll submit a patch.
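
To illustrate the idea, here is a minimal sketch, not the actual patch. It assumes params are saved as "facet name=value" records and that the facet is resolved to its active node only at restore time. The node names mds-a and mds-b and the function restore_params_by_facet are hypothetical. facet_active_host and do_node are local stand-ins with simplified behavior, so the sketch runs without a cluster.

```shell
# Stand-in (assumption): report the node currently serving a facet.
# Here mds1 is pretended to have failed over from mds-a to mds-b.
facet_active_host() {
	case "$1" in
	mds1) echo "mds-b" ;;
	*) echo "unknown" ;;
	esac
}

# Stand-in (assumption): print the command instead of running it remotely.
do_node() {
	local node=$1
	shift
	echo "CMD: $node $*"
}

# Hypothetical helper: restore "facet name=value" records read from stdin.
# The facet is mapped to a node here, at restore time, so a failover that
# happened after the params were saved is taken into account.
restore_params_by_facet() {
	local facet name val
	while IFS=" =" read -r facet name val; do
		do_node "$(facet_active_host "$facet")" \
			"lctl set_param -n $name $val"
	done
}

# Record saved while mds1 was still served by mds-a.
restore_output=$(echo "mds1 mdt.lustre-MDT0000.commit_on_sharing=0" |
	restore_params_by_facet)
echo "$restore_output"
```

Because the record is keyed by facet rather than by hostname, the set_param command in this sketch is sent to mds-b, the node that currently serves the MDT, and not to the node that was active when the value was saved.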

Comment by Jian Yu [ 07/Mar/13 ]

Patch for Lustre master branch: http://review.whamcloud.com/5628
Patch for Lustre b2_1 branch: http://review.whamcloud.com/5627

Comment by Jian Yu [ 26/Apr/13 ]

Patch for Lustre master branch: http://review.whamcloud.com/5628

Hi Oleg, could you please land the above patch on master branch? The issue in this ticket is affecting the failover testing:
https://maloo.whamcloud.com/test_sets/5a2aa132-ad66-11e2-bbea-52540035b04c

Comment by Peter Jones [ 26/Apr/13 ]

Landed for 2.1.5 and 2.4

Generated at Sat Feb 10 01:14:20 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.