[LU-6314] Interop 2.5.3<->2.7.0 recovery-small test_113: eviction happened
Created: 02/Mar/15  Updated: 10/Oct/21  Resolved: 10/Oct/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0, Lustre 2.8.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: WC Triage
Resolution: Won't Fix
Votes: 0
Labels: None
Environment:

server: 2.5.3
client: lustre-master build #26


Severity: 3
Rank (Obsolete): 17669

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/2e24e490-c04b-11e4-b6c3-5254006e85c2.

The sub-test test_113 failed with the following error:

eviction happened

MDS console

22:58:41:Lustre: DEBUG MARKER: == recovery-small test 113: ldlm enqueue dropped reply should not cause deadlocks == 22:54:11 (1425164051)
22:58:41:Lustre: DEBUG MARKER: lctl set_param fail_loc=0x8000030c
22:58:41:Lustre: *** cfs_fail_loc=30c, val=2147483648***
22:58:41:LustreError: 975:0:(ldlm_lib.c:2415:target_send_reply_msg()) @@@ dropping reply  req@ffff88006b0a5000 x1494390532739020/t0(0) o101->a4f92770-5343-a21f-9435-8ed753bc9a88@10.1.4.138@tcp:0/0 lens 592/568 e 0 to 0 dl 1425164058 ref 1 fl Interpret:/0/0 rc 0/0
22:58:41:Lustre: DEBUG MARKER: lctl set_param fail_loc=0
22:58:41:LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) ### lock callback timer expired after 151s: evicting client at 10.1.4.138@tcp  ns: mdt-lustre-MDT0000_UUID lock: ffff8800647f7500/0xbcf7cf507493690a lrc: 3/0,0 mode: PR/PR res: [0x20000b7b8:0x3:0x0].0 bits 0x1b rrc: 2 type: IBT flags: 0x60200000000020 nid: 10.1.4.138@tcp remote: 0xccf6d7b6b5a4d12e expref: 11 pid: 975 timeout: 4345202453 lvb_type: 0
22:58:41:LustreError: 0:0:(ldlm_lockd.c:344:waiting_locks_callback()) Skipped 2 previous similar messages
22:58:41:LNet: Service thread pid 975 completed after 150.55s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
22:58:41:Lustre: DEBUG MARKER: /usr/sbin/lctl mark  recovery-small test_113: @@@@@@ FAIL: eviction happened 
22:58:41:Lustre: DEBUG MARKER: recovery-small test_113: @@@@@@ FAIL: eviction happened


 Comments   
Comment by Oleg Drokin [ 02/Mar/15 ]

The fix this test checks for, LU-2827, landed only after 2.5.3 was cut, so this is just a case of the test not verifying that the server has the fix before running.
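
A minimal sketch of the kind of server-version guard the comment implies, so that test_113 is skipped rather than failed against a server without the fix. The helper names `version_code` and `server_lacks_fix` are hypothetical and written here only for illustration, and the threshold `2.5.4` is a placeholder: the ticket does not say which release first carries the LU-2827 fix.

```shell
# Hypothetical helper: pack "major.minor.patch" into one comparable integer.
version_code() {
    old_ifs=$IFS
    IFS=.
    set -- $1                       # split the version string on dots
    IFS=$old_ifs
    major=${1:-0}
    minor=${2:-0}
    patch=${3:-0}
    patch=${patch%%[!0-9]*}         # drop any non-numeric suffix such as "-rc1"
    echo $(( ($major << 16) | ($minor << 8) | ${patch:-0} ))
}

# Hypothetical guard: succeeds when the server is older than the first
# release assumed to contain the fix.
server_lacks_fix() {
    [ "$(version_code "$1")" -lt "$(version_code "$2")" ]
}

# Placeholder threshold (assumption, not taken from the ticket).
MIN_FIXED_VERSION=${MIN_FIXED_VERSION:-2.5.4}

# Server version 2.5.3 is the one listed in the Environment field.
if server_lacks_fix "2.5.3" "$MIN_FIXED_VERSION"; then
    echo "SKIP: server 2.5.3 is older than $MIN_FIXED_VERSION, LU-2827 fix assumed absent"
fi
```

With such a guard at the top of the sub-test, an interop run against a 2.5.3 server would report a skip instead of "eviction happened".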
