[LU-3440] ftello() system call not grok'ing with expected file position location Created: 05/Jun/13  Updated: 10/Jun/13  Resolved: 10/Jun/13

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.5
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Karl W Schulz (Inactive) Assignee: Bob Glossman (Inactive)
Resolution: Duplicate Votes: 0
Labels: None

Attachments: File gib.c    
Issue Links:
Related
is related to LU-3044 LSeek SEEK_CUR gives incorrect value ... Resolved
Severity: 3
Rank (Obsolete): 8569

 Description   

We have an external chemistry application (GROMACS) which relies on the ftello() call during its checkpointing mechanism. On our Lustre 2.1.5 system, we see evidence that ftello() does not report the current EOF after multiple writes. To demonstrate, attached is a small reproducer that attempts to mimic the usage within the application: it iterates writing the same data and compares the result of ftello() immediately after the write (at which point the app assumes it is at the current EOF) with the result of ftello() after a seek to SEEK_END has been performed.
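
For readers without the attachment, the comparison described above can be sketched roughly as follows. This is not the attached gib.c: the helper name count_offset_mismatches, the "w" open mode, and the fill byte are assumptions made for illustration, and the real reproducer may open and write the file differently.

```c
#define _POSIX_C_SOURCE 200809L  /* expose ftello()/fseeko() under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not taken from gib.c): write `nbytes` bytes `iters`
 * times and compare ftello() right after each write with ftello() after an
 * explicit seek to SEEK_END. Returns the number of iterations in which the
 * two offsets differ, or -1 on an I/O or allocation error.
 */
static int count_offset_mismatches(const char *path, size_t nbytes, int iters)
{
    assert(path != NULL && nbytes > 0);

    FILE *fp = fopen(path, "w");   /* open mode is an assumption */
    if (fp == NULL)
        return -1;

    char *buf = malloc(nbytes);
    if (buf == NULL) {
        fclose(fp);
        return -1;
    }
    memset(buf, 'x', nbytes);      /* arbitrary fill byte */

    int mismatches = 0;
    for (int i = 0; i < iters; i++) {
        off_t before = ftello(fp);

        if (fwrite(buf, 1, nbytes, fp) != nbytes) {
            mismatches = -1;
            break;
        }
        off_t after_write = ftello(fp);   /* the app assumes this is EOF */

        if (fseeko(fp, 0, SEEK_END) != 0) {
            mismatches = -1;
            break;
        }
        off_t after_seek = ftello(fp);    /* EOF as reported after the seek */

        printf("iter = %d\n", i);
        printf("offset before write: %lld\n", (long long)before);
        printf("offset after write:  %lld\n", (long long)after_write);
        printf("offset after seek:   %lld\n", (long long)after_seek);

        if (after_write != after_seek)
            mismatches++;
    }

    free(buf);
    fclose(fp);
    return mismatches;
}
```

Calling count_offset_mismatches() with a path inside the file system under test, 2560 bytes per write, and 2 iterations corresponds to the write sizes seen in the output in this report; a nonzero return value would indicate the kind of deviation described here.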

On Lustre 2.1.5, the two values deviate after the first write iteration on our system with the attached reproducer. We do not see this on Lustre 1.8.6 or on a vanilla file system like ext3.

Below is example output from the reproducer on 3 file systems. The only outlier is in the first run, on Lustre 2.1.5, during iteration #1: the app expects to be at an offset of 5120 after the second write, but ftello() reports an offset of 6656.

-----------------------------------------------
Running in Lustre 2.1.5

staff$ ./a.out 2 512
size of the problem: 2 iters: 512

iter = 0
off_t offset before write to file: 0
off_t offset after write to file: 2560
off_t offset after seek: 2560

iter = 1
off_t offset before write to file: 2560
off_t offset after write to file: 6656
off_t offset after seek: 5120

-----------------------------------------------
Running in Lustre 1.8.6

./a.out 2 512
size of the problem: 2 iters: 512

iter = 0
off_t offset before write to file: 0
off_t offset after write to file: 2560
off_t offset after seek: 2560

iter = 1
off_t offset before write to file: 2560
off_t offset after write to file: 5120
off_t offset after seek: 5120

-----------------------------------------------
Running in vanilla ext3:

staff$ ./a.out 2 512
size of the problem: 2 iters: 512

iter = 0
off_t offset before write to file: 0
off_t offset after write to file: 2560
off_t offset after seek: 2560

iter = 1
off_t offset before write to file: 2560
off_t offset after write to file: 5120
off_t offset after seek: 5120
-----------------------------------------------



 Comments   
Comment by Bob Glossman (Inactive) [ 05/Jun/13 ]

dup of LU-3044? fixed in master, not in b2_1

Comment by Peter Jones [ 05/Jun/13 ]

Bob

Oleg agrees with your assessment. Could you please port the fix to b2_1 for TACC?

Thanks

Peter

Comment by Jodi Levi (Inactive) [ 10/Jun/13 ]

Duplicate of LU-3044

Generated at Sat Feb 10 01:33:55 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.