[LU-14270] enhance ha.sh to delay power up node Created: 22/Dec/20  Updated: 26/Feb/21  Resolved: 26/Feb/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.14.0
Fix Version/s: Lustre 2.15.0

Type: Improvement Priority: Minor
Reporter: Elena Gryaznova Assignee: Elena Gryaznova
Resolution: Fixed Votes: 0
Labels: None

Epic/Theme: patch
Rank (Obsolete): 9223372036854775807

 Description   

On a ClusterStor system, a node in the UNCLEAN state can be STONITHed after it has already been powered down and powered up again.
The patch adds ha_power_up_delay() to delay a node's power-up: when a failover pair is set, until the node's CRM state becomes OFFLINE; otherwise, for $NODE_UP_DELAY seconds.
LOAD_TIMEOUT is added to make ha_load_timeout tunable.
The failover pair list for all victims is set via the new -f option.
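
The delay logic described above might look roughly like the following. This is a minimal sketch and not the code from the patch: the function name power_up_delay_sketch, the helper crm_state_of, and the POLL_INTERVAL and POLL_LIMIT variables are hypothetical, and the default values are placeholders. Only NODE_UP_DELAY and the OFFLINE state come from the description.

```shell
#!/bin/bash
# Illustrative sketch only; names other than NODE_UP_DELAY are assumptions.

NODE_UP_DELAY=${NODE_UP_DELAY:-0}   # seconds to wait when no failover pair is set (placeholder default)
POLL_INTERVAL=${POLL_INTERVAL:-1}   # hypothetical: seconds between CRM state checks
POLL_LIMIT=${POLL_LIMIT:-60}        # hypothetical: give up after this many checks

# Hypothetical stand-in for a CRM query. A real version would ask the
# failover partner how it currently sees the victim node.
crm_state_of() {
    local victim=$1 partner=$2
    echo "OFFLINE"
}

# Delay power-up of a victim node.
#   $1 - victim node
#   $2 - failover partner (empty if no failover pair is set)
# Returns 0 once it is safe to power up, 1 if the poll limit is reached.
power_up_delay_sketch() {
    local victim=$1 partner=$2
    local tries=0

    if [ -z "$partner" ]; then
        # No failover pair: fall back to a fixed delay.
        sleep "$NODE_UP_DELAY"
        return 0
    fi

    # Failover pair set: wait until the victim is reported OFFLINE.
    while [ "$(crm_state_of "$victim" "$partner")" != "OFFLINE" ]; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$POLL_LIMIT" ]; then
            return 1
        fi
        sleep "$POLL_INTERVAL"
    done
    return 0
}
```

Bounding the poll loop is a choice made for this sketch so that a node stuck in UNCLEAN does not block the test run forever; the description does not say whether the patch applies such a limit.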



 Comments   
Comment by Gerrit Updater [ 22/Dec/20 ]

Elena Gryaznova (c17455@cray.com) uploaded a new patch: https://review.whamcloud.com/41070
Subject: LU-14270 tests: delay node's power up
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 81bd4db8f0e0a423a953ebc353b1a12be23a5f9f

Comment by Gerrit Updater [ 26/Feb/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/41070/
Subject: LU-14270 tests: delay node's power up
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: ce0b7ed04461d7909501a88f1a3c2982b765ccf4

Comment by Peter Jones [ 26/Feb/21 ]

Landed for 2.15

Generated at Sat Feb 10 03:08:17 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.