[LU-2580] cp with FIEMAP support creates completely sparse file Created: 07/Jan/13 Updated: 24/Apr/13 Resolved: 05/Mar/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.3.0, Lustre 2.4.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Kit Westneat (Inactive) | Assignee: | Peter Jones |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | LB | ||
| Environment: |
SLES 11 SP2 (client), Lustre 2.1.2 RHEL6 (server) |
||
| Attachments: |
|
||||||||||||
| Issue Links: |
|
||||||||||||
| Severity: | 2 | ||||||||||||
| Rank (Obsolete): | 6020 | ||||||||||||
| Description |
|
We are seeing an issue at KIT where cp will occasionally use the FIEMAP extension to create a completely sparse file instead of actually copying the file. It seems to occur under a workload that creates and deletes many files at once. Only a single client is involved, though; it is not a parallel workload. Relevant strace from 'bad' cp: strace from 'good' cp: The strace didn't print the stat block information, but I'm assuming st_blocks == 0 in the bad one. I will ask the customer to get a full strace -v to confirm, but it appears to be something similar to |
| Comments |
| Comment by Kit Westneat (Inactive) [ 07/Jan/13 ] |
|
There is a 41MB (800MB uncompressed) client debug log from the event. Would it be useful to upload it somewhere? |
| Comment by Peter Jones [ 07/Jan/13 ] |
|
Thanks for the ticket Kit. |
| Comment by Andreas Dilger [ 09/Jan/13 ] |
|
Kit, is this only happening when trying to copy a 1-stripe file that was recently created? Is the file created on the same node or a remote node? What does "stat" report on the file before it is copied? |
| Comment by Kit Westneat (Inactive) [ 09/Jan/13 ] |
|
Here is the stat; stracing the cp reports the same:
Response from customer:
> is this only happening when trying to copy a 1-stripe file that was recently created?
> Is the file created on the same node or a remote node?
> What does "stat" report on the file before it is copied? |
| Comment by Kit Westneat (Inactive) [ 17/Jan/13 ] |
|
Stats and md5sums of both good and bad cps. In the bad cp, the file reports only 1 used block; in the good one, it reports 32 blocks. The straces confirm that this is what cp sees. |
| Comment by Andreas Dilger [ 17/Jan/13 ] |
|
What version of fileutils is in use here? Was it part of the distro, or upgraded afterward? |
| Comment by Kit Westneat (Inactive) [ 22/Jan/13 ] |
|
From KIT: They are using normal fileutils (which are part of coreutils) of the SLES11 SP2 distribution. coreutils version is 8.12 and release is 6.23.1. (Source RPM is coreutils-8.12-6.23.1.src.rpm) |
| Comment by Kalpak Shah (Inactive) [ 20/Feb/13 ] |
|
I don't think this issue is related to FIEMAP. stat reported st_blocks=1 for the file and a size of 12899 bytes, so cp correctly called the FIEMAP ioctl. The problem seems to be that Lustre reports the wrong number of blocks for a recently created/written file. This fix leads stat to report st_blocks=1 instead of 0 - http://git.whamcloud.com/?p=fs/lustre-release.git;a=commitdiff;h=829845ac9ddbdfd170de215742c033ea1102db3e;hp=fc4b46df111bbf9d2207265d18b3f0d72f49502c |
| Comment by Kalpak Shah (Inactive) [ 21/Feb/13 ] |
|
A further note on the ftruncate that we see in the strace (instead of the fseek that I was expecting): even though Lustre says st_blocks=1, the fiemap ioctl says that no blocks are allocated, which leads to the ftruncate call with the size of the file. On SLES11 SP2 with coreutils-8.12-6.19.1, it looks like cp always sets the FIEMAP_FLAG_SYNC flag as well. |
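The sequence described in this comment can be illustrated with a small simulation. This is a sketch only, not the coreutils implementation: `copy_by_extents` and `demo` are hypothetical names, and the extent list is supplied by hand rather than obtained from the FIEMAP ioctl. It shows that when the extent list is empty, the final truncate to the source size produces a file of the right length that contains only zeros.

```python
import os
import tempfile


def copy_by_extents(src_path, dst_path, extents):
    """Copy only the (offset, length) ranges in `extents`, then set the size.

    `extents` stands in for whatever an extent-mapping call would return;
    here it is an assumption passed in by the caller.
    """
    size = os.stat(src_path).st_size
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for offset, length in extents:
            src.seek(offset)
            dst.seek(offset)
            dst.write(src.read(length))
        # Extend (or trim) the destination to the source size.
        # Ranges that were never written read back as zeros.
        dst.truncate(size)
    return size


def demo():
    """Copy one 12899-byte file twice: with one full extent, and with none."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        with open(src, "wb") as f:
            f.write(b"x" * 12899)

        good = os.path.join(tmp, "good")
        bad = os.path.join(tmp, "bad")
        copy_by_extents(src, good, [(0, 12899)])  # extent map covers the data
        copy_by_extents(src, bad, [])             # extent map reports nothing

        with open(good, "rb") as f:
            good_data = f.read()
        with open(bad, "rb") as f:
            bad_data = f.read()
        return good_data, bad_data
```

Calling `demo()` returns the contents of both copies. The second copy has the same length as the source but no data, which matches the symptom reported in this ticket.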
| Comment by Andreas Dilger [ 21/Feb/13 ] |
|
Kalpak, AFAIK the st_blocks value is only used to determine whether the file is sparse (st_blocks < st_size / 512) or dense (st_blocks >= st_size / 512). Dense files are copied via "while (read() > 0) write()", and for sparse files newer "cp" copies only the list of extents returned by FIEMAP. In both cases, my understanding is that st_blocks is not used for determining how much data is copied. The problem, as I see it, is that Lustre FIEMAP (which only returns something useful to "cp" for single-striped files) does not return FIEMAP_EXTENT_DELALLOC extents for pages that are only in the client cache and not yet on the OST. "cp" should be using FIEMAP_FLAG_SYNC, which should cause all of the cached extents to be flushed, but somehow this isn't happening. |
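The sparse/dense test quoted in this comment can be written out as a short sketch. The function names `looks_sparse` and `choose_copy_path` are hypothetical, and integer division is an assumption about how the comparison is evaluated. The sample numbers are taken from earlier comments (a size of 12899 bytes, 1 block in the bad case, 32 blocks in the good case), assuming they describe the same file.

```python
def looks_sparse(st_size, st_blocks, block_size=512):
    """Return True when fewer blocks are allocated than the size implies.

    Mirrors the comparison st_blocks < st_size / 512 from the comment above,
    using integer division (an assumption).
    """
    return st_blocks < st_size // block_size


def choose_copy_path(st_size, st_blocks):
    """Name the copy strategy that the heuristic would select."""
    if looks_sparse(st_size, st_blocks):
        return "extent copy"       # copy only the extents reported by FIEMAP
    return "read/write loop"       # while (read() > 0) write()


# 12899 // 512 == 25, so 1 block looks sparse and 32 blocks looks dense.
bad_case = choose_copy_path(12899, 1)
good_case = choose_copy_path(12899, 32)
```

With these inputs, the 1-block case takes the extent-copy path and the 32-block case takes the plain read/write loop, which is consistent with only the low-block-count copies ending up empty.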
| Comment by Peter Jones [ 27/Feb/13 ] |
|
Could you please clarify which versions of Lustre (and any patches) are being used here? You mention that the servers run Lustre 2.1.2, but what version of Lustre is being used on the client? |
| Comment by Kalpak Shah (Inactive) [ 04/Mar/13 ] |
|
Peter, the clients are running Lustre 2.3 on SLES11 SP2. |
| Comment by Peter Jones [ 04/Mar/13 ] |
|
Thanks Kalpak. With any patches applied? |
| Comment by Kalpak Shah (Inactive) [ 05/Mar/13 ] |
|
Update from KIT: With Lustre 2.3.0 on the client and patches 4477 and 4659 from
|
| Comment by Peter Jones [ 05/Mar/13 ] |
|
ok thanks for the update. That explains why we have been unable to reproduce this issue on the latest 2.4 code. I will close out this ticket. |
| Comment by Cory Spitz [ 28/Mar/13 ] |
|
This bug is still applicable to 2.1 when using cp built from coreutils 8.12, right? [I can't confirm that, but I think we're seeing this on 2.2] If so, and since b2_1 is still the current maintenance branch, do we want to land a fix there? |