Details
-
Bug
-
Resolution: Cannot Reproduce
-
Critical
-
None
-
Lustre 2.4.1
-
None
-
server: CentOS running Lustre 2.1.5 or Lustre 2.4.1
client: sles11sp2 running Lustre 2.4.1
Source can be found at github.com/jlan/lustre-nas. The tag for the client is 2.4.1-1nasC.
-
3
-
12006
Description
Users reported a data corruption problem. We have a test script that reproduces it.
When the script runs in a Lustre file system with a sles11sp2 host as the remote host, it fails: sum reports a checksum of 00000 for the copied file. It works if the remote host is running sles11sp1 or CentOS.
— cut here for test5.sh —
#!/bin/sh
host=${1:-endeavour2}
rm -fr zz hosts
cp /etc/hosts hosts
#fsync hosts
ssh $host "cd $PWD && mkdir -p zz && cp hosts zz/"
sum hosts zz/hosts
— cut here —
Good result:
./test5.sh r301i0n0
61609 41 hosts
61609 41 zz/hosts
Bad result:
./test5.sh r401i0n2
61609 41 hosts
00000 41 zz/hosts
Notes:
- If the copied file is small enough (e.g., /etc/motd), the script succeeds.
- If you uncomment the fsync, the script succeeds.
- When it fails, stat reports no blocks have been allocated to the zz/hosts file:
$ stat zz/hosts
File: `zz/hosts'
Size: 41820 Blocks: 0 IO Block: 2097152 regular file
Device: 914ef3a8h/2437870504d Inode: 163153538715835056 Links: 1
Access: (0644/-rw-r--r--) Uid: (10491/dtalcott) Gid: ( 1179/ cstaff)
Access: 2013-12-12 09:24:46.000000000 -0800
Modify: 2013-12-12 09:24:46.000000000 -0800
Change: 2013-12-12 09:24:46.000000000 -0800
- If you run in an NFS file system, the script usually succeeds, but it sometimes reports a "No such file" error on the sum of zz/hosts. After a few seconds, though, the file appears with the correct sum. (Typical NFS behavior.)
- Acts the same on nbp7 and nbp8.
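The symptom in the notes above (a nonzero size with zero allocated blocks) can be checked for mechanically. The following is a minimal sketch, not part of the original report: the function names classify_alloc and check_file are hypothetical, and it assumes GNU coreutils stat with its -c format option. Intentionally sparse files also show a nonzero size with zero blocks, so a "suspect" result is only a hint, not proof of this bug.

```sh
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of the original report.

# classify_alloc SIZE BLOCKS
# Print "suspect" when a file has a nonzero size but no allocated blocks
# (the state stat showed for zz/hosts in the failing case), otherwise "ok".
classify_alloc() {
    if [ "$1" -gt 0 ] && [ "$2" -eq 0 ]; then
        echo suspect
    else
        echo ok
    fi
}

# check_file PATH
# Read the size and block count with GNU stat and classify them.
# %s = size in bytes, %b = number of allocated blocks.
check_file() {
    out=$(stat -c '%s %b' "$1") || return 1
    # Split the two numbers into positional parameters.
    set -- $out
    classify_alloc "$1" "$2"
}
```

For example, running `check_file zz/hosts` right after test5.sh would be expected to print "suspect" in the failing case described above, assuming stat still reports Blocks: 0 at that point.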
Issue Links
- duplicates LU-3219: FIEMAP does not sync data or return cached pages (Resolved)