[LU-12439] Convert "Response timed out..." in lnet_finalize_expired_responses to CDEBUG
Created: 14/Jun/19  Updated: 30/Jul/19  Resolved: 30/Jul/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.13.0

Type: Improvement
Priority: Minor
Reporter: Chris Horn
Assignee: Chris Horn
Resolution: Fixed
Votes: 0
Labels: None

 Description   

I've noticed that this error message generates a lot of console noise when a router node goes down. For example, I rebooted two routers on a system with just 40 compute nodes. The first of these error messages appeared about a minute after I initiated the reboot:

Reboot started at - Fri Jun 14 14:34:35 CDT 2019

saturn-smw:/var/opt/cray/log/p2-current # grep -m 1 lnet_finalize_expired_responses console-20190614
2019-06-14T14:35:33.704182-05:00 c0-1c1s9n3 LNet: 10316:0:(lib-move.c:2888:lnet_finalize_expired_responses()) Response timed out: md = ffff8810119a32a8: nid = 485@gni4

In the time it took the routers to reboot, about 8 minutes, there were 797 entries from lnet_finalize_expired_responses in the console log:

saturn-smw:/var/opt/cray/log/p2-current # grep -c lnet_finalize_expired_responses console-20190614
797
saturn-smw:/var/opt/cray/log/p2-current #

I don't see much value in this message for system administrators, so I think it should be converted to a CDEBUG.
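
To illustrate the intent, here is a minimal user-space sketch of mask-gated logging. It is not the actual patch (which is not reproduced in this ticket) and not the real libcfs macros: the `SK_D_NET` and `SK_D_ERROR` bits, `sk_console_mask`, and both functions are hypothetical names, and the sketch assumes that only error-class messages reach the console by default. It also ignores any rate limiting the real error path may apply.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical mask bits for this sketch; the real libcfs values differ. */
#define SK_D_NET   (1u << 0)
#define SK_D_ERROR (1u << 1)

/* Assumption: by default only error-class messages reach the console. */
static unsigned int sk_console_mask = SK_D_ERROR;

/* Format one timeout message; return 1 if it is written to the console. */
static int sk_log_timeout(unsigned int mask, void *md, const char *nid)
{
    char line[128];

    snprintf(line, sizeof(line),
             "Response timed out: md = %p: nid = %s\n", md, nid);
    if (!(mask & sk_console_mask))
        return 0;               /* filtered out of the console log */
    fputs(line, stderr);
    return 1;
}

/* Count how many console lines n_timeouts expired responses would produce. */
static int sk_count_console_lines(unsigned int mask, int n_timeouts)
{
    int dummy_md = 0;           /* placeholder for a real MD pointer */
    int i, lines = 0;

    assert(n_timeouts >= 0);
    for (i = 0; i < n_timeouts; i++)
        lines += sk_log_timeout(mask, &dummy_md, "485@gni4");
    return lines;
}
```

Under these assumptions, `sk_count_console_lines(SK_D_ERROR, 797)` returns 797 while `sk_count_console_lines(SK_D_NET, 797)` returns 0: the same events are still handled, but a debug-class message stays out of the console unless an administrator opts in by widening the mask.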



 Comments   
Comment by Gerrit Updater [ 14/Jun/19 ]

Chris Horn (hornc@cray.com) uploaded a new patch: https://review.whamcloud.com/35233
Subject: LU-12439 lnet: Convert noisy timeout error to cdebug
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: d949d310c440bb7a3f264307cfa83407f7349b17

Comment by Gerrit Updater [ 30/Jul/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35233/
Subject: LU-12439 lnet: Convert noisy timeout error to cdebug
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: bd3ed8cb7165dc14f05fa4e7845aa1a4211ef6c4

Comment by Peter Jones [ 30/Jul/19 ]

Landed for 2.13
