Details
-
Bug
-
Resolution: Fixed
-
Major
-
Lustre 2.2.0, Lustre 2.4.0, Lustre 2.1.1
-
3
-
4740
Description
Our testing has revealed that Lustre 2.1 is far more likely than 1.8 to return short reads and writes (the return value reports fewer bytes read or written than requested).
So far, the frequent reproducer is IOR in single-shared-file mode with a transfer size of 128MB, a block size of 256MB, 32 client nodes, and 512 tasks spread evenly over the clients.
The file is striped over only 2 OSTs.
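To make the reproducer's geometry concrete, here is a rough sketch of the arithmetic in C. It assumes IOR's usual segmented layout for a shared file (each task owns one contiguous block per segment), a single segment, and that "MB" means MiB; none of these are stated in the report. The struct and helper names are hypothetical and are not part of IOR or Lustre.

```c
#include <assert.h>
#include <stdint.h>

#define MIB ((uint64_t)1 << 20)

/* Hypothetical description of the reproducer's parameters. */
struct ior_layout {
    uint64_t transfer_size; /* bytes per read()/write() call */
    uint64_t block_size;    /* contiguous bytes owned by each task */
    unsigned tasks;         /* total tasks */
    unsigned nodes;         /* client nodes */
};

/* Values taken from the report, with MB read as MiB (assumption). */
static const struct ior_layout reproducer = {
    .transfer_size = 128 * MIB,
    .block_size    = 256 * MIB,
    .tasks         = 512,
    .nodes         = 32,
};

/* Number of I/O calls each task issues to cover its block. */
static uint64_t transfers_per_block(const struct ior_layout *l)
{
    assert(l->transfer_size > 0);
    return l->block_size / l->transfer_size;
}

/* Bytes covered by one segment of the shared file. */
static uint64_t segment_bytes(const struct ior_layout *l)
{
    return (uint64_t)l->tasks * l->block_size;
}

/* File offset of transfer number `xfer` for task `rank` within the
 * first segment, assuming the segmented layout described above. */
static uint64_t transfer_offset(const struct ior_layout *l,
                                unsigned rank, uint64_t xfer)
{
    assert(rank < l->tasks);
    assert(xfer < transfers_per_block(l));
    return (uint64_t)rank * l->block_size + xfer * l->transfer_size;
}

/* Tasks running on each client node when spread evenly. */
static unsigned tasks_per_node(const struct ior_layout *l)
{
    assert(l->nodes > 0);
    return l->tasks / l->nodes;
}
```

Under these assumptions each task issues two 128 MiB requests, each node runs 16 tasks, and a single segment of the shared file spans 128 GiB across the 2 OSTs.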
When the read() or write() return value is less than the requested amount, the size is, in every instance that I have seen thus far, a multiple of 1MB.
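For readers who want to check their own applications for this behavior, the following is a minimal sketch in C of a write loop that tolerates short writes and reports each one, including whether the short count is a multiple of 1 MiB. The helper name `write_fully` and its counter argument are hypothetical; only the POSIX `write()` call is assumed to exist. It is not taken from IOR or from the Lustre sources.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

#define MIB (1024 * 1024)

/* Hypothetical helper: write `len` bytes from `buf` to `fd`, retrying
 * after short writes. Each call that returns fewer bytes than requested
 * is logged to stderr and counted in *short_count (if non-NULL).
 * Returns the number of bytes written, or -1 on error. */
static ssize_t write_fully(int fd, const char *buf, size_t len,
                           unsigned *short_count)
{
    size_t done = 0;

    assert(buf != NULL);

    while (done < len) {
        size_t want = len - done;
        ssize_t n = write(fd, buf + done, want);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data moved */
            return -1;
        }
        if (n == 0)
            break;              /* no progress; stop rather than spin */

        if ((size_t)n < want) {
            if (short_count != NULL)
                (*short_count)++;
            fprintf(stderr, "short write: got %zd of %zu bytes%s\n",
                    n, want,
                    (n % MIB == 0) ? " (multiple of 1 MiB)" : "");
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

A read-side counterpart would follow the same pattern with `read()`, treating a return of 0 as end of file. An application that calls `write()` once and does not check the return value would silently lose data in the situation described here.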
I suspect that other loads will show the same problem. Our more common workloads with large transfer requests come from our file-per-process apps, though, so we'll run some tests to see whether it is easy to reproduce there as well.
Issue Links
- is duplicated by
- LU-2683 Client deadlock in cl_lock_mutex_get (Resolved)
- LU-1065 High rate of obd_ping failure with client <-> OST evictions (Resolved)