Details
-
Bug
-
Resolution: Fixed
-
Minor
-
Lustre 2.10.0
-
3
Description
REQFAIL is the number of times that a sleep is allowed to be
shorter than $MINSLEEP before the test is considered failed.
The result of
"DURATION / SERVER_FAILOVER_PERIOD * REQFAIL_PERCENT / 100"
may not be an integer (for example, 165.6), so the computed limit
is truncated and the test fails with:
"Failed to load the filesystem with I/O for a minimum
period of 120 166 times ( REQFAIL=165 )".
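The truncation can be reproduced with a minimal bash sketch. DURATION and SERVER_FAILOVER_PERIOD are taken from the failure log in this ticket. REQFAIL_PERCENT=100 is an assumption, inferred only because it reproduces the reported 165.6 and REQFAIL=165. REQFAIL_CEIL is a hypothetical name showing one possible way to round up; it is not necessarily the fix that was merged.

```shell
#!/bin/bash
# Values from the failure log in this ticket.
DURATION=49680
SERVER_FAILOVER_PERIOD=300
# Assumption: inferred so that the formula yields the reported 165.6.
REQFAIL_PERCENT=100

# Shell arithmetic is integer-only, so 49680 / 300 is truncated to 165.
REQFAIL=$(( DURATION / SERVER_FAILOVER_PERIOD * REQFAIL_PERCENT / 100 ))
echo "REQFAIL=$REQFAIL"

# The same formula in floating point, for comparison.
EXACT=$(LC_ALL=C awk -v d="$DURATION" -v p="$SERVER_FAILOVER_PERIOD" \
    -v r="$REQFAIL_PERCENT" 'BEGIN { printf "%.1f", d / p * r / 100 }')
echo "exact=$EXACT"

# Hypothetical alternative: multiply first, then round up with integer math.
DENOM=$(( SERVER_FAILOVER_PERIOD * 100 ))
REQFAIL_CEIL=$(( (DURATION * REQFAIL_PERCENT + DENOM - 1) / DENOM ))
echo "REQFAIL_CEIL=$REQFAIL_CEIL"
```

With these inputs the truncated limit is one lower than the number of failovers that fit into DURATION, which matches the "166 times ( REQFAIL=165 )" message.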
An example of the test failure:
==== Checking the clients loads AFTER failed client reintegrated -- failure NOT OK
WARNING: failover, client reintegration and check_client_loads time exceeded SERVER_FAILOVER_PERIOD - MINSLEEP!
Failed to load the filesystem with I/O for a minimum period of 120 166 times ( REQFAIL=165 ).
This iteration, the load was only applied for sleep=63 seconds.
Estimated max recovery time : 1475
Probably the hardware is taking excessively long time to boot.
Try to increase SERVER_FAILOVER_PERIOD (current is 300), bug 20918
2017-06-06 20:08:31 Terminating clients loads ...
Duration: 49680
Server failover period: 300 seconds
Exited after: 49810 seconds
Number of failovers before exit:
mds1 failed over 166 times
Status: FAIL: rc=6