[LU-1158] nanosecond timestamp support for Lustre - Whamcloud Community JIRA

Details

Type: Improvement
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: Lustre 2.4.0
Labels:
- always_except

Story Points: 3
Rank (Obsolete): 4515

Description

The current Lustre network protocol has support for a 64-bit timestamp of seconds, but does not have a field for passing the nanosecond timestamp from clients to servers and back again.

It would be relatively straightforward to put 3x __u32 nanosecond timestamps in the reserved fields in struct obdo and struct mdt_body. These fields are currently always initialized to 0, so no protocol change or feature flag would be needed to begin using them for nanoseconds: just copy them in/out of the RPC structures. Old clients/servers will store 0 there and ignore any nanosecond timestamps that are sent to them (no differently than they do today).

It is more complex to add the nanosecond timestamps to struct ost_lvb, which is most commonly used for glimpse locks (stat) on OST objects. Fitting the extra 3x __u32 nanosecond timestamps into ost_lvb requires a structure change, which may in turn require a protocol change. If this structure is passed in a separate ptlrpc message buffer, older clients may simply ignore the larger size, which would avoid additional interoperability complexity.
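The reserved-field idea for struct obdo and struct mdt_body can be illustrated with a minimal sketch. The layout below is hypothetical and heavily simplified (one 64-bit seconds field followed by one 32-bit field); it is not the real Lustre wire format, and the function names are invented for illustration. It only shows why an old peer that always writes 0 and ignores the field stays compatible with a new peer that stores nanoseconds there.

```python
import struct

# Hypothetical, simplified wire layout (NOT the real struct obdo/mdt_body):
# a little-endian 64-bit seconds field followed by a 32-bit field that old
# peers treat as "reserved, always 0" and new peers treat as nanoseconds.
WIRE_FMT = "<QI"


def pack_new(sec, nsec):
    """New peer: fill the formerly reserved field with nanoseconds."""
    return struct.pack(WIRE_FMT, sec, nsec)


def pack_old(sec):
    """Old peer: the reserved field is always initialized to 0."""
    return struct.pack(WIRE_FMT, sec, 0)


def unpack_new(buf):
    """New peer: read both seconds and nanoseconds."""
    return struct.unpack(WIRE_FMT, buf)


def unpack_old(buf):
    """Old peer: read seconds and ignore whatever is in the reserved field."""
    sec, _reserved = struct.unpack(WIRE_FMT, buf)
    return sec
```

In this sketch the message size is the same for old and new peers, which is the property the description relies on. The ost_lvb case differs precisely because the structure would have to grow.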

Issue Links

is related to

- LU-12922 pjdfstest chown_00: POSIX compliance failed on lustre (Open)
- LU-18069 Incorrect timespec64_to_ns() in sles15sp5 kernel (Open)
- LU-4050 NFS reexport issue (Resolved)
- LU-18108 mdt_rec_reint template is not consistent (Resolved)
- LU-10934 integrate statx() API with Lustre (Resolved)
- LU-11971 Send file creation time to clients (Resolved)
- LUDOC-92 Nanosecond Time Stamps Doc Changes (Resolved)

is related to

- LU-17963 sometime touch command cannot change mtime (Resolved)
- LU-9019 Migrate lustre to standard 64 bit time kernel API (Resolved)

Trackbacks

Summer Intern: Plusses and deltas board Isami joined Whamcloud and leaves tomorrow after working the summer on site at TACC. He completed the nanosecond time stamp

Changelog 2.1 Changes from version 2.1.2 to version 2.1.3 Server support for kernels: 2.6.18-308.13.1.el5 (RHEL5) 2.6.32-279.2.1.el6 (RHEL6) Client support for unpatched kernels: 2.6.18-308.13.1.el5 (RHEL5) 2.6.32-279.2.1....

Project Ideas Information for Developers Working on Lustre will quickly introduce a developer to the always interesting and impressive world of HPC, where the systems are the largest in the world,...

Sub-Tasks

Progress

Subtle and infrequent failure of test 39a in sanityn.sh noticed after ns timestamp implementation

Resolved

WC Triage

Activity

Gerrit Updater added a comment - 24/Jul/24 12:22 AM - edited

~~"Feng Lei <flei@whamcloud.com>" uploaded a new patch:~~ https://review.whamcloud.com/c/fs/lustre-release/+/55849
~~Subject: LU-1158 general: interop of nanosecond timestamps~~
~~Project: fs/lustre-release~~
~~Branch: master~~
~~Current Patch Set: 1~~
~~Commit: c8f3fc718664abbc56eb432ecaca7c2faea00942~~

Gerrit Updater added a comment - 04/Dec/23 4:34 AM

"Feng Lei <flei@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53313
Subject: LU-1158 general: support nanosecond timestamps
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 3dc43cd08efc8347cdceed3939be7c33256a081c

Andreas Dilger added a comment - 20/Mar/18 9:45 PM

Clearly I wasn't thinking of only a __u32 nanoseconds timestamp, since that won't even last until the end of this comment, let alone to Y2038, but rather in addition to the existing 64-bit seconds field, as was discussed in the original description.

On the other hand, passing the entire timestamp as a 64-bit nanosecond timestamp gives us roughly 292 years before signed overflow, and could simplify the wire protocol change (no need to change the size of the fields) with the added complexity that there would need to be a connection flag to indicate if this client is sending timestamps in seconds or nanoseconds. Potentially we could just assume any value larger than 2^32 is going to be nanoseconds (it is unlikely that any current Lustre releases would still be running in 20 years, and conversely 2^60 ns is needed to get to 2006, so they are very unlikely to conflict), but using an OBD_CONNECT_NANOSECONDS flag is not too hard. The main complexity is that this flag would need to be checked in many places, possibly places where the client export is not easily available, so just making a decision based on the size of the timestamp is relatively safe, and the old seconds format could eventually be deprecated with little effort.
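The figures in this comment can be checked with a short sketch. The `decode_timestamp` helper below is a hypothetical illustration of the "decide by magnitude" heuristic, with the 2^32 threshold taken from the comment. It is not an existing Lustre function, and a year is assumed to be 365.25 days.

```python
from datetime import datetime, timezone

NSEC_PER_SEC = 10**9
SECONDS_PER_YEAR = 365.25 * 86400  # assumption: average Julian year


def years_until_signed_ns_overflow():
    """Years covered by a signed 64-bit count of nanoseconds since the epoch."""
    return (2**63 - 1) / NSEC_PER_SEC / SECONDS_PER_YEAR


def year_of_ns(value_ns):
    """Calendar year (UTC) of a nanoseconds-since-epoch value."""
    return datetime.fromtimestamp(value_ns // NSEC_PER_SEC, tz=timezone.utc).year


def decode_timestamp(value, threshold=2**32):
    """Hypothetical heuristic: values above the threshold are nanoseconds,
    smaller values are legacy seconds. Returns (seconds, nanoseconds)."""
    if value > threshold:
        return divmod(value, NSEC_PER_SEC)
    return value, 0
```

Because 2^60 ns since the epoch already falls in 2006, any present-day nanosecond timestamp lies far above the 2^32 threshold, while present-day timestamps in seconds lie below it, so the two ranges do not overlap in practice.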

James A Simmons added a comment - 20/Mar/18 5:32 PM

2^32 - 1 nanoseconds gives us 4.294967295 seconds until overflow, so a 32-bit nanosecond timestamp on its own is not very useful. As Artem pointed out, LNet already sends 64-bit time in seconds. We can use a 32-bit field to add the nanosecond value alongside the seconds already being sent. That is why I compared it to struct timespec64 = { time64_t tv_sec; long tv_nsec }

Artem, all the needed infrastructure to support the Linux kernel 64-bit time handling has been merged into the latest Lustre. Just don't use the cfs time wrappers, since they will be going away.
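A minimal sketch of the arithmetic behind this comparison follows. The helper names are illustrative only and are not taken from Lustre or the kernel. It shows that a 32-bit field alone spans only a few seconds of nanoseconds, while the sub-second part of a timestamp always fits in it when paired with a separate seconds field, as in struct timespec64.

```python
NSEC_PER_SEC = 10**9


def u32_ns_range_seconds():
    """Seconds representable by an unsigned 32-bit count of nanoseconds."""
    return (2**32 - 1) / NSEC_PER_SEC


def split_ns(total_ns):
    """Split nanoseconds-since-epoch into (tv_sec, tv_nsec)."""
    return divmod(total_ns, NSEC_PER_SEC)


def join_ns(tv_sec, tv_nsec):
    """Combine (tv_sec, tv_nsec) back into nanoseconds-since-epoch."""
    if not 0 <= tv_nsec < NSEC_PER_SEC:
        raise ValueError("tv_nsec must be in [0, 10^9)")
    return tv_sec * NSEC_PER_SEC + tv_nsec
```

Since the largest valid tv_nsec is 10^9 - 1, which is below 2^30, the nanosecond part needs only 30 of the 32 bits in a __u32.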

Andreas Dilger added a comment - 20/Mar/18 4:07 PM

Hmm, when does ns-since-epoch overflow? Maybe 2 extra bits from the 2^30 ns in a 32-bit field... That would simplify the protocol change, and give us 4*140 years extra?

Artem Blagodarenko (Inactive) added a comment - 20/Mar/18 2:23 PM

simmonsja So, do you think ~~LU-9019~~ helps "transmit struct timespec64 over the wire" somehow?

James A Simmons added a comment - 20/Mar/18 2:18 PM

Oh I see. You want to basically transmit struct timespec64 over the wire. I was thinking in terms of nanoseconds since the epoch being transmitted.

Andreas Dilger added a comment - 19/Mar/18 9:20 PM

James, I don't understand your comment. Why would we ever want more than a 32-bit field for nanoseconds? Surely there can't be more than 2^32 nanoseconds in a second? I understand that there can be leap seconds and other time adjustments that might result in over 10^9 nanoseconds in a second, but using a full 64-bit field for nanoseconds in the network protocol is just a waste of space.

Artem Blagodarenko (Inactive) added a comment - 19/Mar/18 5:49 PM

simmonsja thanks a lot for the answer! I am going to look at ~~LU-9019~~ now.

James A Simmons added a comment - 19/Mar/18 5:11 PM

This overlaps with the 64-bit time work I have been doing. Now that we support ktime_t, this can easily be handled. Just as a note, DO NOT use 32-bit fields for nanoseconds. This will not work after 2038, and because of that upstream will reject the patch.

Artem Blagodarenko (Inactive) added a comment - 19/Mar/18 3:30 PM

Are there currently any plans to rewrite the patch against the variable-sized LVB patch?

People

Assignee: Feng Lei

Reporter: Andreas Dilger

Votes: 0

Watchers: 18

Dates

Created: 01/Mar/12 2:10 PM

Updated: 17/Feb/25 1:31 AM