[LU-6884] recovery-random-scale test_fail_client_mds: test_fail_client_mds returned 5
Created: 20/Jul/15  Updated: 14/Dec/21  Resolved: 14/Dec/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: WC Triage
Resolution: Cannot Reproduce
Votes: 0
Labels: None
Environment:

client and server: lustre-master build # 3093 RHEL6.6


Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This issue was created by maloo for sarah_lw <wei3.liu@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/529eb580-2793-11e5-9951-5254006e85c2.

The sub-test test_fail_client_mds failed with the following error:

test_fail_client_mds returned 5

The test failed after 59 MDS failovers. OST console log:

14:39:18:Lustre: lustre-OST0004: Bulk IO write error with 0dba3d09-d787-b9a0-adb0-2a4d766bd704 (at 10.1.5.232@tcp), client will retry: rc -110
14:39:18:Lustre: Skipped 4 previous similar messages
14:39:18:Lustre: lustre-OST0006: haven't heard from client 0dba3d09-d787-b9a0-adb0-2a4d766bd704 (at 10.1.5.232@tcp) in 55 seconds. I think it's dead, and I am evicting it. exp ffff88001a03d000, cur 1436538827 expire 1436538797 last 1436538772
14:39:18:Lustre: Skipped 6 previous similar messages
14:39:18:LustreError: 6776:0:(ldlm_lib.c:3077:target_bulk_io()) @@@ Eviction on bulk WRITE  req@ffff8800235f4080 x1506317723273632/t0(0) o4->0dba3d09-d787-b9a0-adb0-2a4d766bd704@10.1.5.232@tcp:362/0 lens 608/448 e 0 to 0 dl 1436538862 ref 1 fl Interpret:/0/0 rc 0/0
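For triage, the numbers in the eviction message above can be cross-checked mechanically. The following is a minimal sketch, not part of any Lustre tooling: the regular expression and the helper name parse_eviction are hypothetical, and it assumes that cur, expire and last are epoch seconds, with last being the time the OST last heard from the client. Under that assumption, cur - last should equal the "in 55 seconds" figure reported in the message. (Separately, rc -110 in the bulk IO line is -ETIMEDOUT on Linux.)

```python
import re

# Eviction line copied from the OST console log in this ticket.
EVICT_LINE = (
    "14:39:18:Lustre: lustre-OST0006: haven't heard from client "
    "0dba3d09-d787-b9a0-adb0-2a4d766bd704 (at 10.1.5.232@tcp) in 55 seconds. "
    "I think it's dead, and I am evicting it. exp ffff88001a03d000, "
    "cur 1436538827 expire 1436538797 last 1436538772"
)

# Hypothetical pattern written for this one message format only.
EVICT_RE = re.compile(
    r"(?P<target>\S+): haven't heard from client (?P<client>\S+) "
    r"\(at (?P<nid>[^)]+)\) in (?P<idle>\d+) seconds\..*"
    r"cur (?P<cur>\d+) expire (?P<expire>\d+) last (?P<last>\d+)"
)


def parse_eviction(line):
    """Return the fields of an eviction message as a dict, or None if no match."""
    match = EVICT_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    # Convert the numeric fields so they can be subtracted directly.
    for key in ("idle", "cur", "expire", "last"):
        info[key] = int(info[key])
    return info


if __name__ == "__main__":
    info = parse_eviction(EVICT_LINE)
    print("target:", info["target"], "client NID:", info["nid"])
    print("reported idle seconds:", info["idle"])
    print("cur - last:", info["cur"] - info["last"])
    print("cur - expire:", info["cur"] - info["expire"])
```

Running the same parser over the full OST console log would show whether the other evictions hidden behind the "Skipped ... similar messages" lines hit the same client and the same idle interval.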


 Comments   
Comment by Sarah Liu [ 21/Jul/15 ]

Another instance was seen with a RHEL7.1 server and a SLES11SP3 client:

https://testing.hpdd.intel.com/test_sets/80e032b4-2623-11e5-92e6-5254006e85c2
