[LU-16548] LNet: lnd_timeout value reported by lnetctl may be different from what is actually used Created: 10/Feb/23  Updated: 09/Jul/23  Resolved: 09/Jul/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.16.0

Type: Bug Priority: Minor
Reporter: Serguei Smirnov Assignee: Frank Sehr
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

LNet calculates an lnd_timeout as a function of transaction_timeout and retry_count. This is the value displayed by "lnetctl global show". However, each LND may define its own timeout by setting its timeout module parameter to a positive value, which overrides the higher-level lnd_timeout defined by LNet.

It is misleading to report only the LNet-level timeout. Since the actual timeout is LND-specific, it should be reported among the settings of the corresponding LND.
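
The override described above can be illustrated with a minimal sketch. It is not the Lustre code: the function names are hypothetical, and the formula for the LNet-level value is an assumption, since this ticket only states that it is a function of transaction_timeout and retry_count.

```python
def lnet_level_lnd_timeout(transaction_timeout: int, retry_count: int) -> int:
    """LNet-level lnd_timeout, the value "lnetctl global show" reports.

    ASSUMPTION: the exact formula is not given in this ticket. This sketch
    splits the transaction timeout evenly across the initial attempt and
    each retry.
    """
    return (transaction_timeout - 1) // (retry_count + 1)


def effective_lnd_timeout(lnd_module_timeout: int,
                          transaction_timeout: int,
                          retry_count: int) -> int:
    """Timeout an LND would actually use (hypothetical helper).

    A positive LND "timeout" module parameter overrides the LNet-level
    value; otherwise the LNet-level value applies.
    """
    if lnd_module_timeout > 0:
        return lnd_module_timeout
    return lnet_level_lnd_timeout(transaction_timeout, retry_count)


if __name__ == "__main__":
    # Illustrative inputs only, not claimed to be Lustre defaults.
    print(effective_lnd_timeout(0, 150, 2))    # LND parameter unset: 49
    print(effective_lnd_timeout(100, 150, 2))  # LND parameter set: 100
```

With the LND parameter set to 100, the two values diverge (49 at the LNet level versus 100 in the LND), which is the mismatch this ticket asks to make visible per LND.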



 Comments   
Comment by Gerrit Updater [ 12/Apr/23 ]

"Frank Sehr <fsehr@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50620
Subject: LU-16548 lnet: lnd_timeout value reported by lnetctl may be
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: b216b7051dd327fbd1c6a4f911aa55901eec683b

Comment by Gerrit Updater [ 09/Jun/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50620/
Subject: LU-16548 lnet: report actual timeout used by lnd
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 56097c490465cb67a87639192b1fee396acbfd24

Comment by Peter Jones [ 09/Jun/23 ]

Landed for 2.16

Comment by Gerrit Updater [ 16/Jun/23 ]

"Frank Sehr <fsehr@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51342
Subject: LU-16548 lnet: report actual timeout used by lnd
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 15af8effceb91635d7c7f29cc68e43e73e3c1dba

Comment by Shaun Tancheff [ 23/Jun/23 ]

A fix for gnilnd build is needed.

Comment by Frank Sehr [ 23/Jun/23 ]

What fix is needed?

Comment by Shaun Tancheff [ 26/Jun/23 ]

Re-Opened waiting for ("LU-16548 lnet: Fixing missing gnilnd define CURRENT_LND_VERSION") to land.

Comment by Frank Sehr [ 26/Jun/23 ]

Thanks Shaun

Comment by Gerrit Updater [ 08/Jul/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51342/
Subject: LU-16548 lnet: Fixing missing gnilnd define CURRENT_LND_VERSION
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 094ae18ed8a995a1323833054bdfed613fa884c5

Comment by Peter Jones [ 09/Jul/23 ]

Landed for 2.16
