[LU-6492] less +F doesn't work Created: 24/Apr/15  Updated: 10/Oct/21  Resolved: 10/Oct/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.3
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Nick Tan (Inactive) Assignee: WC Triage
Resolution: Cannot Reproduce Votes: 0
Labels: None
Environment:

CentOS 6.4 x86_64


Epic/Theme: lustre-2.5.3
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

node 1 streams data into an ever-growing file (e.g. a log file).

node 2 can see the file changing with stat:

$ stat logs/*.3
File: `debug.o12716414.3'
Size: 33472307 Blocks: 65360 IO Block: 4194304 regular file
Device: 4a737072h/1249079410d Inode: 144115224750877831 Links: 1
Access: (0664/rw-rw-r-) Uid: ( 1087/ tzuyenw) Gid: ( 2117/teamcott)
Access: 2015-04-24 13:33:09.000000000 +0800
Modify: 2015-04-24 13:33:13.000000000 +0800
Change: 2015-04-24 13:33:13.000000000 +0800

$ stat logs/*.3
File: `debug.o12716414.3'
Size: 33485617 Blocks: 65360 IO Block: 4194304 regular file
Device: 4a737072h/1249079410d Inode: 144115224750877831 Links: 1
Access: (0664/rw-rw-r-) Uid: ( 1087/ tzuyenw) Gid: ( 2117/teamcott)
Access: 2015-04-24 13:33:09.000000000 +0800
Modify: 2015-04-24 13:33:18.000000000 +0800
Change: 2015-04-24 13:33:18.000000000 +0800

On node 2, a tail -f (which calls fstat() to determine if the file has changed) works:

13:34:45.548212 read(3, ""..., 8192) = 5118 <0.005387>
13:34:45.547101 write(1, ""..., 5118) = 5118 <0.001003>
13:34:45.555175 read(3, "", 8192) = 0 <0.000124>
13:34:45.555367 fstat(3,

{st_mode=S_IFREG|0664, st_size=33712023, ...}) = 0 <0.000100>
13:34:45.555566 nanosleep({1, 0}, NULL) = 0 <1.000124>
13:34:46.555811 fstat(3, {st_mode=S_IFREG|0664, st_size=33712023, ...}

) = 0 <0.000185>
13:34:46.556136 nanosleep(

{1, 0}, NULL) = 0 <1.000118>
13:34:47.556374 fstat(3, {st_mode=S_IFREG|0664, st_size=33712023, ...}) = 0 <0.000175>
13:34:47.556684 nanosleep({1, 0}

, NULL) = 0 <1.000134>
13:34:48.556948 fstat(3,

{st_mode=S_IFREG|0664, st_size=33712023, ...}) = 0 <0.000156>
13:34:48.557234 nanosleep({1, 0}, NULL) = 0 <1.000110>
13:34:49.557444 fstat(3, {st_mode=S_IFREG|0664, st_size=33712023, ...}

) = 0 <0.037993>
13:34:49.595556 read(3, ""..., 8192) = 8192 <0.333306>
...
13:34:49.937886 read(3, "", 8192) = 0 <0.000126>
13:34:49.938080 fstat(3,

{st_mode=S_IFREG|0664, st_size=33725333, ...}

) = 0 <0.000105>

Interestingly, even though the file size didn't change (we can see the st_size there), it evidently indicated some kind of change (perhaps mtime?) because tail knew to try the read() again, and the read() succeeded. After the successful read, it did another read/fstat, and at that point the st_size was updated.

However "less" in Follow mode (which doesn't do fstat() before read()) doesn't work:

13:33:11.949002 nanosleep(

{1, 0}, 0x7fff832a7c30) = 0 <1.000156>
13:33:12.949280 read(4, "", 5324) = 0 <0.000028>
13:33:12.949752 nanosleep({1, 0}

, 0x7fff832a7c30) = 0 <1.000155>
13:33:13.950040 read(4, "", 5324) = 0 <0.000031>
13:33:13.950496 nanosleep(

{1, 0}, 0x7fff832a7c30) = 0 <1.000148>
13:33:14.950767 read(4, "", 5324) = 0 <0.000029>
13:33:14.951249 nanosleep({1, 0}

, 0x7fff832a7c30) = 0 <1.000150>
13:33:15.951520 read(4, "", 5324) = 0 <0.000026>
13:33:15.952015 nanosleep(

{1, 0}, 0x7fff832a7c30) = 0 <1.000147>
13:33:16.952287 read(4, "", 5324) = 0 <0.000029>
13:33:16.952738 nanosleep({1, 0}

, 0x7fff832a7c30) = 0 <1.000148>
13:33:17.953009 read(4, "", 5324) = 0 <0.000026>
etc.

Despite the fact that the file is growing (as seen by stat in another terminal), the read() acts like the file descriptor is still at the end of the file. The logfile grows every 10 seconds, but you can leave that less running for hours and hours, it'll never see it.



 Comments   
Comment by Phil Schwan (Inactive) [ 18/Nov/15 ]

This doesn't seem all that minor to me – it's a pretty fundamental violation of the POSIX semantics on which Lustre was based, non?

This bug is basically saying that if you're extending a file on one node, you can't rely on read() returning the correct result on another node. That's major!

In fact, I'm sure we had automated tests for cases exactly like this, back in 2005. I'm surprised a bug like this could even make it into a release.

Comment by Robert Read (Inactive) [ 18/Nov/15 ]

Hey, Phil, nice to see you!

I was curious so I tried to reproduce this on 2.7.0 and CentOS 6.7 using this:

[ec2-user@client00 ~]$ for i in $(seq 100); do echo "$i $(date)"; sleep 10; done > /mnt/scratch/test/watchme 

Then ran "less +F" on a different client, and I am seeing the updates as they are written:

~
~
1 Wed Nov 18 17:21:41 UTC 2015
2 Wed Nov 18 17:21:51 UTC 2015
3 Wed Nov 18 17:22:01 UTC 2015
4 Wed Nov 18 17:22:11 UTC 2015
5 Wed Nov 18 17:22:21 UTC 2015
6 Wed Nov 18 17:22:31 UTC 2015
7 Wed Nov 18 17:22:41 UTC 2015
8 Wed Nov 18 17:22:51 UTC 2015
9 Wed Nov 18 17:23:01 UTC 2015
10 Wed Nov 18 17:23:11 UTC 2015
11 Wed Nov 18 17:23:21 UTC 2015
12 Wed Nov 18 17:23:31 UTC 2015
13 Wed Nov 18 17:23:41 UTC 2015
14 Wed Nov 18 17:23:51 UTC 2015
15 Wed Nov 18 17:24:01 UTC 2015
Waiting for data... (interrupt to abort)

So either a bug crept into 2.5.3 (seems unlikely) or we'll need a more sophisticated reproducer.

Comment by Phil Schwan (Inactive) [ 02/Dec/15 ]

Hóla, Roberto!

That's... interesting. It's 100% reproducible on my machine with a particular workload, but we've poked around a few others and some workstations behave differently – despite being (by all appearances) the same (same kernel, same version of Lustre, same mounts, same config).

So it is indeed not quite so simple as it first appeared. Fair enough.

Generated at Sat Feb 10 02:00:42 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.