[LU-8618] ha.sh improvements Created: 15/Sep/16  Updated: 16/Aug/17  Resolved: 14/Aug/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.10.1, Lustre 2.11.0

Type: Improvement Priority: Minor
Reporter: Elena Gryaznova Assignee: Jian Yu
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
Rank (Obsolete): 9223372036854775807

 Description   

Proposed improvements to the ha.sh script:

  • Remove hardcoded SIMUL and IOR paths.
  • Add -p max failover period parameter.
  • Add -r dry run parameter.
  • Add "iozone" load.
  • Add the ability to set the number of MPI threads per client.
  • CRM is not always configured to fail targets back when their primary node returns. Add the ability to execute a failback command if required.
  • If a non-MPI load fails on only one client, the logs from the other clients are also needed. Dump logs from all clients.
  • Sometimes it is required to:
    run ha.sh with custom IOR and SIMUL parameters;
    start only a defined list of applications and none of the others;
    start MPI load instances on a defined number of clients.
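
As an illustration of the proposed -p and -r parameters, the following is a minimal sketch of how such options could be parsed and how a dry run could be honored. It is not taken from the patch: the function names (ha_parse_args, ha_run), the variable names, and the default value are all hypothetical, and the real ha.sh may handle these options differently.

```shell
# Hypothetical sketch only; none of these names come from ha.sh itself.

ha_max_failover_period=10   # placeholder default, not the script's real value
ha_dry_run=false

# Parse the two options named in the description:
#   -p <period>  maximum failover period
#   -r           dry run
ha_parse_args() {
    OPTIND=1    # reset so the function can be called more than once
    while getopts "p:r" ha_opt; do
        case "$ha_opt" in
        p) ha_max_failover_period=$OPTARG ;;
        r) ha_dry_run=true ;;
        *) echo "usage: ha.sh [-p period] [-r]" >&2; return 1 ;;
        esac
    done
}

# Run a command, or only print it when dry run is enabled.
ha_run() {
    if [ "$ha_dry_run" = true ]; then
        echo "DRY RUN: $*"
    else
        "$@"
    fi
}
```

With this sketch, calling ha_parse_args -r -p 30 and then ha_run on a command would print the command instead of executing it, which is one way a dry run could let a configuration be checked without triggering a real failover.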


 Comments   
Comment by Gerrit Updater [ 15/Sep/16 ]

Elena Gryaznova (elena.gryaznova@seagate.com) uploaded a new patch: http://review.whamcloud.com/22528
Subject: LU-8618 tests: ha.sh improvements
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: af00534480b715aaa66b5983cff409178e150224

Comment by Elena Gryaznova [ 07/Nov/16 ]

Jian,
Maloo does not have any passed ha.sh tests: https://testing.hpdd.intel.com/test_sets/query?utf8=?&test_set%5Btest_set_script_id%5D=ad4160c6-7bb3-11e6-8afd-5254006e85c2&status%5B%5D=PASS&query_bugs=&warn%5Bnotice%5D=true&hosts=&test_session%5Buser_id%5D=&test_session%5Bquery_recent_period%5D=&test_session%5Bstart_date%5D=&test_session%5Bend_date%5D=&test_node%5Bos_type_id%5D=&test_node%5Bdistribution_type_id%5D=&test_node%5Barchitecture_type_id%5D=&test_node%5Bfile_system_type_id%5D=&test_node%5Blustre_branch_id%5D=&test_node_network%5Bnetwork_type_id%5D=&commit=Update+results

Please advise what the next step is from my side. Thanks.

Comment by Jian Yu [ 09/Nov/16 ]

Hi Elena,
The ha.sh test script was not written with the Lustre test framework, so it is not yet supported by the autotest system. The next step is for me to file a ticket requesting that the autotest system be enhanced to support running ha.sh. However, this might not be implemented soon.

Comment by Elena Gryaznova [ 09/Nov/16 ]

Jian,
thank you.

I will update the SEA ticket with this information.

Thanks.

Comment by Jian Yu [ 09/Nov/16 ]

Thank you, Elena.

Comment by Gerrit Updater [ 13/Aug/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/22528/
Subject: LU-8618 tests: ha.sh improvements
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: d3a044086f5790fec2747c653dca26b8ec529e2d

Comment by Peter Jones [ 14/Aug/17 ]

Landed for 2.11

Comment by Gerrit Updater [ 14/Aug/17 ]

Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/28524
Subject: LU-8618 tests: ha.sh improvements
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: 811255493e62fd81a32d157688d8affb7545f057

Comment by Gerrit Updater [ 16/Aug/17 ]

John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/28524/
Subject: LU-8618 tests: ha.sh improvements
Project: fs/lustre-release
Branch: b2_10
Current Patch Set:
Commit: 8620fc0dcad90d30b4094f7163f531129815d05b

Generated at Sat Feb 10 02:19:07 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.