Details
- Type: Bug
- Resolution: Fixed
- Priority: Minor
- Labels: None
- Affects Version/s: Lustre 2.12.3
- Environment: 2.12.3 RC1 (vanilla) on servers, CentOS 7.6, patched kernel; 2.12.0 + patches on clients
- Severity: 4
Description
After upgrading our servers on Fir (Sherlock's /scratch) to Lustre 2.12.3 RC1, we are noticing a lot of these messages on all Lustre servers:
LNetError: 49537:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.201@o2ib7 added to recovery queue. Health = 900
The NIDs reported belong to our Lustre routers, which are still running 2.12.0 + patches (they sit on the client clusters).
Attaching logs from all servers as lnet_recoveryq.log
This doesn't seem to have an impact on production and so far 2.12.3 RC1 has been just great for us (and we run it without additional patches now!). Thanks!
Stéphane
Attachments
- fir-io1-s1.log (932 kB)
- lnet_recoveryq.log (20 kB)
Issue Links
- is related to LU-13071 "LNet Health: reduce log severity" (Resolved)
Activity
Hi Olaf,
It doesn't look like this patch made it into 2.12.4. It still hasn't landed and needs an extra review:
https://review.whamcloud.com/#/c/37002/
The "inconsistent" message was reduced here:
f549927ea633b910a8c788fa970af742b3bf10c1 LU-11981 lnet: clean up error message
thanks
amir
Hi Amir,
What 2.12.4 patch(es) reduce the severity of the "added to recovery queue" messages?
thanks
It looks to me like the LU-11981 patch addresses only:
"Msg is in inconsistent state, don't perform health checking"
not
"added to recovery queue"
"We've also committed a patch which reduces the severity of these messages so they will not be displayed on the console."
Hi Amir,
Which patch?
Thanks
Hi Luis,
You can turn off health on your setup:
lnetctl set health_sensitivity 0
lnetctl set transaction_timeout 50   # or some value you'd like
lnetctl set retry_count 0
We've also committed a patch which reduces the severity of these messages so they will not be displayed on the console.
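As a side note on the commands above: if you want those values to survive a reboot, one option is to export the running configuration after changing it. This is a minimal sketch that assumes the packaged lnet.service re-imports /etc/lnet.conf at startup (the usual setup); adjust the path if your service loads a different file, and verify the exported YAML actually contains the global health settings on your version.
# confirm the new values took effect (2.12-era lnetctl)
lnetctl global show
# dump the running LNet configuration so it is re-applied the next time LNet is configured
lnetctl export > /etc/lnet.conf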
We are having the same errors on our clients. Our setup only uses a single IB interface everywhere; we are not using multi-rail anywhere. My understanding is that this health check is designed for multi-rail setups. Is that correct? If so, is there any way I can tell Lustre to skip these health checks? Thanks in advance. -Luis
Thanks for checking, Amir! No, I just see a constant (slow) flow of these messages on the server side, but all our routers are up and running. Attached kernel logs of an OSS as fir-io1-s1.log. The traffic on the routers can be pretty high though (close to 8-10 GB/s per EDR/HDR100 router right now). We're not using multi-rail LNet. According to lnet/peers, we might have run out of rtr credits, but at the moment it seems OK. BTW, is there a way to reset the counters in /sys/kernel/debug/lnet/peers? That would be helpful.
# clush -w @rtr_fir_[1-2] -b "cat /sys/kernel/debug/lnet/peers | egrep '^nid|o2ib7'"
--------------- sh-rtr-fir-1-1 ---------------
nid refs state last max rtr min tx min queue
10.0.10.3@o2ib7 1 up 176 8 8 8 8 6 0
10.0.10.105@o2ib7 3 up 36 8 6 -63 8 -1210 0
10.0.10.110@o2ib7 1 up 26 8 8 8 8 6 0
10.0.10.102@o2ib7 4 up 32 8 5 -70 8 -1157 0
10.0.10.107@o2ib7 1 up 94 8 8 -45 8 -1249 0
10.0.10.52@o2ib7 1 up 168 8 8 -16 8 -1645 0
10.0.10.104@o2ib7 2 up 170 8 7 -51 8 -1195 0
10.0.10.54@o2ib7 1 up 57 8 8 -8 8 -39 0
10.0.10.101@o2ib7 3 up 52 8 6 -54 8 -1150 0
10.0.10.106@o2ib7 1 up 7 8 8 -43 8 -1221 0
10.0.10.51@o2ib7 1 up 13 8 8 -16 8 -3546 0
10.0.10.103@o2ib7 4 up 152 8 5 -70 8 -1244 0
10.0.10.116@o2ib7 1 up 33 8 8 8 8 6 0
10.0.10.108@o2ib7 1 up 12 8 8 -43 8 -1151 0
10.0.10.53@o2ib7 1 up 165 8 8 -8 8 -46 0
--------------- sh-rtr-fir-1-2 ---------------
nid refs state last max rtr min tx min queue
10.0.10.3@o2ib7 1 up 176 8 8 8 8 6 0
10.0.10.105@o2ib7 1 up 166 8 8 -8 8 -39 0
10.0.10.110@o2ib7 1 up 82 8 8 8 8 6 0
10.0.10.102@o2ib7 1 up 83 8 8 -8 8 -73 0
10.0.10.107@o2ib7 1 up 32 8 8 -8 8 -30 0
10.0.10.52@o2ib7 1 up 61 8 8 -8 8 -741 0
10.0.10.104@o2ib7 3 up 45 8 6 -8 8 -33 0
10.0.10.54@o2ib7 1 up 26 8 8 -8 8 -18 0
10.0.10.101@o2ib7 1 up 120 8 8 -8 8 -19 0
10.0.10.106@o2ib7 1 up 35 8 8 -8 8 -34 0
10.0.10.51@o2ib7 1 up 100 8 8 -8 8 -123 0
10.0.10.103@o2ib7 2 up 24 8 7 -8 8 -28 0
10.0.10.116@o2ib7 1 up 82 8 8 8 8 6 0
10.0.10.108@o2ib7 1 up 18 8 8 -8 8 -72 0
10.0.10.53@o2ib7 2 up 111 8 7 -8 8 -51 0
--------------- sh-rtr-fir-1-3 ---------------
nid refs state last max rtr min tx min queue
10.0.10.3@o2ib7 1 up 176 8 8 8 8 6 0
10.0.10.105@o2ib7 2 up 179 8 7 -549 8 -1016 0
10.0.10.110@o2ib7 1 up 177 8 8 8 8 6 0
10.0.10.102@o2ib7 2 up 6 8 7 -793 8 -807 0
10.0.10.107@o2ib7 3 up 60 8 6 -664 8 -808 0
10.0.10.52@o2ib7 1 up 96 8 8 -1151 8 -1394 0
10.0.10.104@o2ib7 4 up 138 8 6 -789 8 -1020 0
10.0.10.54@o2ib7 1 up 58 8 8 -8 8 -36 0
10.0.10.101@o2ib7 2 up 17 8 7 -826 8 -827 0
10.0.10.106@o2ib7 1 up 176 8 8 -773 8 -782 0
10.0.10.51@o2ib7 1 up 33 8 8 -8 8 -3907 0
10.0.10.103@o2ib7 1 up 170 8 8 -1383 8 -1052 0
10.0.10.116@o2ib7 1 up 3 8 8 8 8 6 0
10.0.10.108@o2ib7 2 up 154 8 7 -688 8 -787 0
10.0.10.53@o2ib7 1 up 144 8 8 -8 8 -23 0
--------------- sh-rtr-fir-1-4 ---------------
nid refs state last max rtr min tx min queue
10.0.10.3@o2ib7 1 up 175 8 8 8 8 6 0
10.0.10.105@o2ib7 4 up 127 8 5 -8 8 -59 0
10.0.10.110@o2ib7 1 up 49 8 8 8 8 6 0
10.0.10.102@o2ib7 3 up 153 8 6 -8 8 -91 0
10.0.10.107@o2ib7 1 up 98 8 8 -8 8 -28 0
10.0.10.52@o2ib7 1 up 83 8 8 -8 8 -1007 0
10.0.10.104@o2ib7 3 up 122 8 6 -8 8 -42 0
10.0.10.54@o2ib7 1 up 43 8 8 -8 8 -39 0
10.0.10.101@o2ib7 2 up 160 8 7 -8 8 -22 0
10.0.10.114@o2ib7 1 down 9999 8 8 8 8 7 0
10.0.10.106@o2ib7 2 up 115 8 7 -8 8 -50 0
10.0.10.51@o2ib7 1 up 8 8 8 -8 8 -252 0
10.0.10.103@o2ib7 3 up 120 8 6 -8 8 -59 0
10.0.10.116@o2ib7 1 up 0 8 8 8 8 6 0
10.0.10.108@o2ib7 2 up 80 8 7 -8 8 -89 0
10.0.10.53@o2ib7 1 up 126 8 8 -8 8 -42 0
--------------- sh-rtr-fir-2-1 ---------------
nid refs state last max rtr min tx min queue
10.0.10.3@o2ib7 1 up 176 8 8 8 8 6 0
10.0.10.105@o2ib7 3 up 43 8 6 -16 8 -243 0
10.0.10.110@o2ib7 1 up 157 8 8 8 8 6 0
10.0.10.102@o2ib7 1 up 6 8 8 -16 8 -289 0
10.0.10.107@o2ib7 2 up 62 8 8 -16 7 -261 560
10.0.10.52@o2ib7 1 up 81 8 8 -8 8 -23 0
10.0.10.104@o2ib7 1 up 9 8 8 -8 8 -197 0
10.0.10.54@o2ib7 1 up 73 8 8 -8 8 -75 0
10.0.10.101@o2ib7 1 up 48 8 8 -16 8 -228 0
10.0.10.106@o2ib7 1 up 34 8 8 -16 8 -284 0
10.0.10.51@o2ib7 1 up 157 8 8 -8 8 -685 0
0.0.10.51@o2ib7 1 down 9999 8 8 8 8 7 0
10.0.10.103@o2ib7 2 up 38 8 7 -16 8 -259 0
10.0.10.116@o2ib7 1 up 158 8 8 8 8 6 0
10.0.10.108@o2ib7 1 up 37 8 8 -16 8 -376 0
10.0.10.53@o2ib7 1 up 87 8 8 -8 8 -410 0
--------------- sh-rtr-fir-2-2 ---------------
nid refs state last max rtr min tx min queue
10.0.10.3@o2ib7 1 up 176 8 8 8 8 6 0
10.0.10.105@o2ib7 3 up 58 8 6 -16 8 -237 0
10.0.10.110@o2ib7 1 up 12 8 8 8 8 6 0
10.0.10.102@o2ib7 1 up 56 8 8 -16 8 -297 0
10.0.10.107@o2ib7 3 up 84 8 6 -16 8 -229 0
10.0.10.52@o2ib7 1 up 132 8 8 -8 8 -14 0
10.0.10.104@o2ib7 4 up 95 8 6 -8 7 -217 72
10.0.10.54@o2ib7 1 up 130 8 8 -8 8 -72 0
10.0.10.101@o2ib7 1 up 99 8 8 -16 8 -250 0
10.0.10.106@o2ib7 2 up 58 8 7 -16 8 -308 0
10.0.10.51@o2ib7 1 up 141 8 8 -8 8 -353 0
10.0.10.103@o2ib7 2 up 35 8 7 -16 8 -313 0
10.0.10.116@o2ib7 1 up 15 8 8 8 8 6 0
10.0.10.108@o2ib7 2 up 37 8 7 -16 8 -267 0
10.0.10.53@o2ib7 1 up 117 8 8 -8 8 -323 0
--------------- sh-rtr-fir-2-3 ---------------
nid refs state last max rtr min tx min queue
10.0.10.3@o2ib7 1 up 176 8 8 8 8 6 0
10.0.10.105@o2ib7 2 up 42 8 7 -16 8 -212 0
10.0.10.110@o2ib7 1 up 107 8 8 8 8 6 0
10.0.10.102@o2ib7 2 up 59 8 7 -16 8 -338 0
10.0.10.107@o2ib7 2 up 91 8 7 -16 8 -215 0
10.0.10.52@o2ib7 1 up 1 8 8 -8 8 -17 0
10.0.10.104@o2ib7 2 up 42 8 7 -8 8 -226 0
10.0.10.54@o2ib7 1 up 76 8 8 -8 8 -86 0
10.0.10.101@o2ib7 1 up 55 8 8 -16 8 -248 0
10.0.10.106@o2ib7 1 up 21 8 8 -16 8 -358 0
10.0.10.51@o2ib7 1 up 136 8 8 -8 8 -424 0
10.0.10.103@o2ib7 2 up 13 8 7 -16 8 -259 0
10.0.10.116@o2ib7 1 up 107 8 8 8 8 6 0
10.0.10.108@o2ib7 1 up 24 8 8 -16 8 -396 0
10.0.10.53@o2ib7 1 up 157 8 8 -8 8 -549 0
--------------- sh-rtr-fir-2-4 ---------------
nid refs state last max rtr min tx min queue
10.0.10.3@o2ib7 1 up 176 8 8 8 8 6 0
10.0.10.105@o2ib7 3 up 44 8 6 -16 8 -292 0
10.0.10.110@o2ib7 1 up 52 8 8 8 8 6 0
10.0.10.102@o2ib7 1 up 28 8 8 -16 8 -318 0
10.0.10.107@o2ib7 1 up 29 8 8 -16 8 -271 0
10.0.10.52@o2ib7 1 up 1 8 8 -8 8 -19 0
10.0.10.104@o2ib7 3 up 48 8 6 -16 8 -294 0
10.0.10.54@o2ib7 1 up 28 8 8 -8 8 -53 0
10.0.10.101@o2ib7 1 up 19 8 8 -16 8 -254 0
10.0.10.106@o2ib7 1 up 37 8 8 -16 8 -342 0
10.0.10.51@o2ib7 1 up 89 8 8 -8 8 -470 0
10.0.10.103@o2ib7 1 up 31 8 8 -16 8 -311 0
10.0.10.116@o2ib7 1 up 53 8 8 8 8 6 0
10.0.10.108@o2ib7 2 up 9 8 7 -16 8 -261 0
10.0.10.53@o2ib7 1 up 20 8 8 -8 8 -332 0
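As a quick way to spot the peers in that output that have dipped into negative credits, here is a minimal sketch run locally on a router; it assumes the ten-column layout shown above (nid refs state last max rtr min tx min queue).
# print peers whose minimum rtr credits (column 7) or minimum tx credits (column 9) went negative
awk 'NR > 1 && ($7 < 0 || $9 < 0)' /sys/kernel/debug/lnet/peers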
Example of traffic on sh-rtr-fir-2-1 (10.0.10.209@o2ib7) at 10:03 AM this morning:
[root@sh-rtr-fir-2-1 ~]# collectl -sx
waiting for 1 second sample...
#<-----------InfiniBand----------->
# KBIn PktIn KBOut PktOut Errs
377711 2493K 9336463 2488K 0
397428 2070K 7594350 2066K 0
545513 2403K 8775817 2395K 0
521123 2781K 10089K 2775K 0
527162 2831K 10251K 2824K 0
416422 2516K 9368186 2511K 0
691311 2339K 8384500 2330K 0
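For scale, 9336463 KBOut per second works out to roughly 9.3 GB/s (about 9.6 GB/s if collectl is counting KiB), which is consistent with the 8-10 GB/s per-router figure mentioned above.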
At about the same time:
fir-io3-s2: Oct 20 10:00:17 fir-io3-s2 kernel: LNetError: 61521:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.209@o2ib7 added to recovery queue. Health = 900
fir-io3-s2: Oct 20 10:00:17 fir-io3-s2 kernel: LNetError: 61521:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) Skipped 79 previous similar messages
On the router sh-rtr-fir-2-1 itself, there are no new messages today (the messages below are from yesterday). The occasional client timeout on the cluster fabric (o2ib4 here) is normal, I think:
Oct 19 13:26:59 sh-rtr-fir-2-1.int kernel: LNetError: 10492:0:(o2iblnd_cb.c:3399:kiblnd_check_conns()) Timed out RDMA with 10.9.106.54@o2ib4 (103): c: 7, oc: 0, rc: 8
Oct 19 13:27:49 sh-rtr-fir-2-1.int kernel: LNetError: 10498:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 2)
Oct 19 13:27:56 sh-rtr-fir-2-1.int kernel: LNetError: 10501:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 2)
Oct 19 17:34:37 sh-rtr-fir-2-1.int kernel: LNetError: 10492:0:(o2iblnd_cb.c:3324:kiblnd_check_txs_locked()) Timed out tx: active_txs, 0 seconds
Oct 19 17:34:37 sh-rtr-fir-2-1.int kernel: LNetError: 10492:0:(o2iblnd_cb.c:3399:kiblnd_check_conns()) Timed out RDMA with 10.9.105.12@o2ib4 (107): c: 7, oc: 0, rc: 8
Oct 19 17:34:53 sh-rtr-fir-2-1.int kernel: LNetError: 10496:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 2)
Oct 19 17:35:07 sh-rtr-fir-2-1.int kernel: LNetError: 10495:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 2)
Oct 19 21:28:39 sh-rtr-fir-2-1.int kernel: LNetError: 10492:0:(o2iblnd_cb.c:3324:kiblnd_check_txs_locked()) Timed out tx: active_txs, 1 seconds
Oct 19 21:28:39 sh-rtr-fir-2-1.int kernel: LNetError: 10492:0:(o2iblnd_cb.c:3399:kiblnd_check_conns()) Timed out RDMA with 10.9.107.3@o2ib4 (107): c: 8, oc: 0, rc: 8
Our setup:
- sh-rtr-fir-1-[1-4] are our o2ib6 (Sherlock 1 cluster, FDR fabric) / o2ib7 (scratch storage, HDR100 fabric) routers (10.0.10.[201-204]@o2ib7)
- sh-rtr-fir-2-[1-4] are our o2ib4 (Sherlock 2 cluster, EDR fabric) / o2ib7 (scratch storage, HDR100 fabric) routers (10.0.10.[209-212]@o2ib7)
That's the health feature. When sending to a peer fails, the health value of that interface is adjusted. If there are multiple interfaces available, the healthiest one is selected. The interface is added to a recovery queue and pinged every second until it's fully healthy again.
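To watch this in practice, a hedged sketch of the relevant commands (the "Health = 900" in the console message is the peer NI health value, reduced from the maximum by the health_sensitivity decrement, 100 here; the -v verbosity levels below are what 2.12-era lnetctl accepts, though the exact level that prints the health stats may vary by version):
# health tunables currently in effect (health_sensitivity, retry_count, transaction_timeout, ...)
lnetctl global show
# local NI state, including health and recovery information
lnetctl net show -v 3
# per-peer NI health values
lnetctl peer show -v 3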
Are these particular routers down? Or did you see a burst of these messages early on and then they stopped?
Hello, we would also appreciate a backport of this patch (which just landed on master) to b2_12, as these messages are very verbose. Thanks!!