Details
- Type: Bug
- Resolution: Duplicate
- Priority: Critical
- Affects Version: Lustre 2.1.0
- Fix Version: None
- Environment: RHEL6
- Severity: 3
- Rank (Obsolete): 6592
Description
While porting the zero-copy patch to RHEL6 we found data corruption during IO while raid5/6 reconstruction was in progress. I think it should affect RHEL5 as well.
It is easy to reproduce with:
echo 32 > /sys/block/md0/md/stripe_cache_size
echo 0 > /proc/fs/lustre/obdfilter/<ost_name>/writethrough_cache_enable
echo 0 > /proc/fs/lustre/obdfilter/<ost_name>/read_cache_enable
and then fail one of the disks with
mdadm /dev/mdX --fail /dev/....
After that, verify that the data is correct.
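Taken together, the steps above can be sketched as a small script. The device name md0, the failing member disk, and the OST name are placeholders (the original report elides the exact disk), so adjust them for your setup:

```shell
#!/bin/sh
# Sketch of the reproduction steps above. MD_DEV, FAIL_DEV and OST are
# hypothetical placeholders -- substitute your own device and OST names.
MD_DEV=md0
FAIL_DEV=/dev/sdb        # member disk to fail (placeholder)
OST=lustre-OST0000       # obdfilter name of the OST (placeholder)

# Shrink the raid5/6 stripe cache so reconstruction exercises it heavily
echo 32 > /sys/block/$MD_DEV/md/stripe_cache_size

# Disable the OST read/writethrough caches so IO bypasses the page cache
echo 0 > /proc/fs/lustre/obdfilter/$OST/writethrough_cache_enable
echo 0 > /proc/fs/lustre/obdfilter/$OST/read_cache_enable

# Fail one member disk to force degraded-mode reconstruction during IO
mdadm /dev/$MD_DEV --fail $FAIL_DEV
```

With the array degraded, writing a file through Lustre and reading it back (as in the transcript below) should show mismatched checksums when the bug is present.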
[root@sjlustre1-o1 ~]# dd if=/dev/urandom of=test.1 oflag=direct bs=128k count=8
8+0 records in
8+0 records out
1048576 bytes (1.0 MB) copied, 0.157819 seconds, 6.6 MB/s
[root@sjlustre1-o1 ~]# md5sum test.1
4ec4d0b67a2b3341795706605e0b0a28 test.1
[root@sjlustre1-o1 ~]# md5sum test.1 > test.1.md5
[root@sjlustre1-o1 ~]# dd if=test.1 iflag=direct of=/lustre/stry/test.1 oflag=direct bs=128k
8+0 records in
8+0 records out
1048576 bytes (1.0 MB) copied, 0.319458 seconds, 3.3 MB/s
[root@sjlustre1-o1 ~]# dd if=/lustre/stry/test.1 iflag=direct of=test.2 oflag=direct bs=128k
8+0 records in
8+0 records out
1048576 bytes (1.0 MB) copied, 0.114691 seconds, 9.1 MB/s
[root@sjlustre1-o1 ~]# md5sum test.1 test.2
4ec4d0b67a2b3341795706605e0b0a28 test.1
426c976b75fa3ce5b5ae22b5195f85fd test.2
After further work, the problem was identified as two bugs in the zero-copy patch:
1) raid5 sets the UPTODATE flag on a stripe that holds stale pointers from DIO, then tries to copy data from those pointers during the READ phase.
2) an issue restoring pages from the stripe cache.
Please verify whether this is an issue in a RHEL5 environment (we do not have one at the moment).
llmount.sh uses loopback devices (even with OST_MOUNT_OPTS/MDS_MOUNT_OPTS cleared, as suggested by Alexey). These devices add a layer of indirection that may obscure the problem.
As an alternative to llmount.sh, I am creating the filesystem manually. I have used the following steps on RHEL5 and on my RHEL6 system, but have been unable to reproduce the bug reliably. As you suggest, I am working on a method to predictably perform zero-copy writes.
Eric, can you run these commands on your RHEL6 environment to confirm that these instructions reproduce the bug there?
Create a MD device.
Create MDS/MDT and mount.
Create OST on the MD device and mount on OSS.
Mount the Lustre fs.
# mount -t lustre 10.0.0.1@tcp0:/temp /mnt/lustre
...
# mount
/dev/sda1 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/sdb11 on /mnt/mdt type lustre (rw)
/dev/md0 on /mnt/ost1 type lustre (rw)
10.0.0.1@tcp0:/temp on /mnt/lustre type lustre (rw)
Reduce the stripe cache size and disable the OST caches, as in the description.
Copy a file onto Lustre, fail a drive, and copy the file back off.
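The setup steps above can be sketched as follows. The NID 10.0.0.1@tcp0, fsname temp, and the /dev/sdb11, /dev/md0, /mnt/mdt, /mnt/ost1 paths are taken from the mount output in this comment; the MD member disks and the mkfs.lustre option set are hypothetical, not the exact commands used:

```shell
#!/bin/sh
# Sketch of the manual setup above; member disks are placeholders.

# 1) Create an MD raid5 device (placeholder member partitions)
mdadm --create /dev/md0 --level=5 --raid-devices=3 \
    /dev/sdc1 /dev/sdd1 /dev/sde1

# 2) Create the combined MGS/MDT and mount it
mkfs.lustre --fsname=temp --mgs --mdt --index=0 /dev/sdb11
mkdir -p /mnt/mdt
mount -t lustre /dev/sdb11 /mnt/mdt

# 3) Create the OST on the MD device and mount it on the OSS
mkfs.lustre --fsname=temp --ost --index=0 --mgsnode=10.0.0.1@tcp0 /dev/md0
mkdir -p /mnt/ost1
mount -t lustre /dev/md0 /mnt/ost1

# 4) Mount the Lustre filesystem on the client
mkdir -p /mnt/lustre
mount -t lustre 10.0.0.1@tcp0:/temp /mnt/lustre
```

From here, apply the stripe_cache_size and cache tunables from the description, copy a file in, fail a member disk with mdadm --fail, and compare checksums after copying it back out.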