[LU-10672] lnet_notify() called incorrectly Created: 15/Feb/18  Updated: 16/Aug/22  Resolved: 03/Mar/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.11.0
Fix Version/s: Lustre 2.11.0

Type: Bug Priority: Minor
Reporter: John Hammond Assignee: James A Simmons
Resolution: Fixed Votes: 0
Labels: lnet

Issue Links:
Related
is related to LU-9019 Migrate lustre to standard 64 bit tim... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Recent test logs contain several messages of the form:

[15236.586059] LNet: 2816:0:(router.c:1822:lnet_notify()) Ignoring prediction from 10.9.5.183@tcp of 10.9.5.186@tcp down 15180764 seconds in the future

See for example https://testing.hpdd.intel.com/test_logs/58f7b44e-1224-11e8-a10a-52540065bddc/show_text
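
The wording of the message suggests that lnet_notify() compares when against its own notion of the current time and discards any report that lies ahead of it. A minimal sketch of such a guard follows, assuming that behaviour; the type and function names are hypothetical and are not taken from router.c.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's time64_t. */
typedef int64_t sketch_time64_t;

/*
 * Hypothetical guard modelled on the log message: returns 0 when 'when'
 * is not later than 'now', otherwise the number of seconds by which
 * 'when' lies in the future (the figure the warning would print).
 * Both arguments are assumed to be seconds on the same clock.
 */
static int64_t sketch_seconds_in_future(sketch_time64_t now,
					sketch_time64_t when)
{
	return when > now ? when - now : 0;
}
```

With illustrative values, a now of 15236 and a when of 15236 + 15180764 gives 15180764, the kind of figure seen in the log. A when taken from a different clock or expressed in a different unit trips the guard even though the caller meant a time in the past.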

lnet_notify() expects callers to pass an absolute time in seconds for its "when" parameter, but it looks like it is getting a relative value from LNetCtl():

        case IOC_LIBCFS_NOTIFY_ROUTER: {
                time64_t deadline = ktime_get_real_seconds() - data->ioc_u64[0];

                return lnet_notify(NULL, data->ioc_nid, data->ioc_flags,
                                   deadline);
        }

And it's getting a timestamp in jiffies in ksocknal_peer_failed():

        if (notify)
                lnet_notify(peer_ni->ksnp_ni, peer_ni->ksnp_id.nid, 0,
                            cfs_time_seconds(last_alive)); /* to jiffies */
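
To make the unit mismatch concrete, here is a small sketch of what happens when a value in seconds is scaled to ticks before being handed to a callee that interprets it as seconds. SKETCH_HZ and both helpers are hypothetical; the real tick rate is a kernel configuration choice, and this is not the cfs_time_seconds() implementation.

```c
#include <stdint.h>

/* Hypothetical tick rate, chosen only for illustration. */
#define SKETCH_HZ 1000

/* Scale seconds up to ticks: the conversion the callee does not expect. */
static int64_t sketch_seconds_to_ticks(int64_t seconds)
{
	return seconds * SKETCH_HZ;
}

/* Scale ticks back down to whole seconds (truncating). */
static int64_t sketch_ticks_to_seconds(int64_t ticks)
{
	return ticks / SKETCH_HZ;
}
```

Under these assumptions a last-alive time of 1000 seconds becomes 1000000 after scaling. A callee that reads that number as seconds sees a time far ahead of a current time of, say, 1010 seconds, so it treats the report as a prediction. Passing the unscaled value, or converting back with the inverse helper, keeps both sides in seconds.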

The other call sites should be audited as well.

This seems to be partially due to LU-9019.



 Comments   
Comment by Gerrit Updater [ 16/Feb/18 ]

James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/31339
Subject: LU-10672 lnet: pass in only time64_t to lnet_notify
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 9b19545486660cd7ab176e1c71894fcde86a07fa

Comment by Gerrit Updater [ 03/Mar/18 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/31339/
Subject: LU-10672 lnet: pass in only time64_t to lnet_notify
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 5b5538e9e728292f1cb5501228a13b8f4787dd97

Comment by Peter Jones [ 03/Mar/18 ]

Landed for 2.11

Comment by Gerrit Updater [ 16/Aug/22 ]

"Akash B <akash-b@hpe.com>" uploaded a new patch: https://review.whamcloud.com/48226
Subject: LU-10672 utils: snapshot support to foreign host
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 3730e0df3ac14a468741329361f28e8d75e8bcdc
