[LU-9800] recovery-mds-scale test_failover_mds: test_failover_mds returned 1 Created: 26/Jul/17  Updated: 08/May/18

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.10.1, Lustre 2.11.0, Lustre 2.10.4
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: James Casper Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None
Environment:

trevis, failover
server: EL7, zfs, branch master, v2.10.50.3, b3612
client: EL7, branch master, v2.10.50.3, b3612


Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

https://testing.hpdd.intel.com/test_sessions/55045e59-2766-4676-91ae-45a2fa2f4e91

The dd client loads for test_failover_mds and test_failover_ost are running out of space.

This looks different from LU-5788 because the recovery-mds subtests ran for 22 hours; the
LU-5788 failures normally happened in less than a minute.

From both test_logs:

Client load failed 

From both run_dd_debug logs:

dd: error writing ‘/mnt/lustre/d0.dd-trevis-49vm5.trevis.hpdd.intel.com/dd-file’: No space left on device
964211+0 records in
964210+0 records out
+ '[' 1 -eq 0 ']'
2017-07-11 23:55:13: dd failed


 Comments   
Comment by James Casper [ 26/Sep/17 ]

2.10.1:
https://testing.hpdd.intel.com/test_sessions/790f2242-72e4-4458-b15b-c8948b16efef

Generated at Sat Feb 10 02:29:24 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.