[LU-6385] write and read tests overlap with obdfilter-survey on osd-zfs Created: 19/Mar/15 Updated: 03/Feb/16 Resolved: 30/Oct/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Shuichi Ihara (Inactive) | Assignee: | Nathaniel Clark |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
master, RHEL7.1, OSD-ZFS |
||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
When obdfilter-survey runs on osd-zfs, the read test starts before all writes have completed.
# size=65536 nobjlo=1 nobjhi=1 thrlo=1 thrhi=1 tests_str="write read" \
/usr/bin/obdfilter-survey
Here is the iostat output while obdfilter-survey is running:
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
dm-6 0.00 0.00 0.00 5506.00 0.00 678.23 252.27 5.43 0.99 0.00 0.99 0.18 98.00
dm-6 0.00 0.00 0.00 6467.00 0.00 808.51 256.04 5.94 0.92 0.00 0.92 0.15 100.00
dm-6 0.00 0.00 0.00 6136.00 0.00 766.75 255.92 5.94 0.97 0.00 0.97 0.16 100.00
dm-6 0.00 0.00 0.00 6808.00 0.00 850.88 255.96 5.94 0.87 0.00 0.87 0.15 100.00
dm-6 0.00 0.00 0.00 7520.00 0.00 940.00 256.00 6.19 0.82 0.00 0.82 0.13 100.00
dm-6 0.00 0.00 0.00 8505.00 0.00 1063.01 255.97 6.50 0.76 0.00 0.76 0.12 100.00
dm-6 0.00 0.00 0.00 8847.00 0.00 1105.88 256.00 5.94 0.67 0.00 0.67 0.11 100.00
dm-6 0.00 0.00 0.00 9424.00 0.00 1168.48 253.93 5.92 0.63 0.00 0.63 0.10 98.80
dm-6 0.00 0.00 0.00 9631.00 0.00 1203.88 256.00 5.95 0.62 0.00 0.62 0.10 100.00
dm-6 0.00 0.00 250.00 8748.00 31.25 1093.38 255.97 6.37 0.68 5.20 0.55 0.11 100.00
dm-6 0.00 0.00 73.00 6805.00 9.12 850.62 256.00 5.84 0.87 42.63 0.42 0.15 100.00
dm-6 0.00 0.00 621.00 4144.00 77.62 517.88 255.95 4.28 0.93 5.04 0.31 0.21 100.00
dm-6 0.00 0.00 903.00 3501.00 112.87 437.62 256.00 3.98 0.90 3.32 0.28 0.23 100.00
dm-6 0.00 0.00 804.00 3631.00 100.50 454.00 256.06 3.96 0.83 3.35 0.27 0.23 100.00
dm-6 0.00 0.00 864.00 3522.00 108.00 440.12 255.94 3.98 0.98 3.83 0.28 0.23 100.10
dm-6 0.00 0.00 987.00 3512.00 123.37 439.00 256.00 3.99 0.87 2.97 0.28 0.22 100.00
dm-6 0.00 0.00 869.00 3515.00 108.62 438.07 255.39 3.96 0.92 3.55 0.27 0.23 100.00
dm-6 0.00 0.00 879.00 3493.00 109.87 432.71 254.17 3.98 0.91 3.42 0.28 0.23 100.00
dm-6 0.00 0.00 1001.00 4029.00 125.12 503.62 256.00 3.97 0.79 2.99 0.24 0.20 100.00
dm-6 0.00 0.00 465.00 5915.00 58.13 739.38 256.00 3.96 0.61 6.35 0.16 0.16 100.00
dm-6 0.00 0.00 3642.00 820.00 455.25 96.58 253.28 3.15 0.72 0.84 0.16 0.22 100.00
dm-6 0.00 0.00 4794.00 0.00 599.25 0.00 256.00 3.00 0.63 0.63 0.00 0.21 100.00
dm-6 0.00 0.00 4419.00 0.00 552.38 0.00 256.00 3.01 0.68 0.68 0.00 0.23 100.00
dm-6 0.00 0.00 4422.00 0.00 552.75 0.00 256.00 2.99 0.68 0.68 0.00 0.23 100.00
dm-6 0.00 0.00 4417.00 0.00 552.12 0.00 256.00 3.01 0.68 0.68 0.00 0.23 100.00
dm-6 0.00 0.00 4433.00 0.00 554.12 0.00 256.00 2.99 0.68 0.68 0.00 0.23 100.00
dm-6 0.00 0.00 4413.00 0.00 551.62 0.00 256.00 3.01 0.68 0.68 0.00 0.23 100.00
dm-6 0.00 0.00 4407.00 0.00 550.88 0.00 256.00 3.00 0.68 0.68 0.00 0.23 100.00
dm-6 0.00 0.00 4416.00 0.00 552.00 0.00 256.00 2.99 0.68 0.68 0.00 0.23 100.00
dm-6 0.00 0.00 4399.00 0.00 549.88 0.00 256.00 2.98 0.68 0.68 0.00 0.23 100.00
dm-6 0.00 0.00 4365.00 0.00 545.62 0.00 256.00 2.98 0.68 0.68 0.00 0.23 99.40
dm-6 0.00 0.00 4380.00 0.00 547.50 0.00 256.00 3.00 0.69 0.69 0.00 0.23 100.00
dm-6 0.00 0.00 4392.00 0.00 549.00 0.00 256.00 2.98 0.68 0.68 0.00 0.23 100.00
It shows that writes and reads overlap, which means the read test started before all data was flushed. |
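The overlap in the iostat trace can also be checked mechanically rather than by eye. The following is a minimal sketch, not part of obdfilter-survey or lustre-iokit: the helper names `parse_rows` and `overlapping_samples` and the 1 MB/s threshold are assumptions made for illustration, and the sample rows are copied from the trace in the description. It flags every sample in which the device reports both read and write throughput.

```python
from typing import List, Tuple

# Four rows copied from the iostat trace in the description
# (device name followed by 13 numeric columns).
SAMPLE = """\
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
dm-6 0.00 0.00 0.00 9631.00 0.00 1203.88 256.00 5.95 0.62 0.00 0.62 0.10 100.00
dm-6 0.00 0.00 250.00 8748.00 31.25 1093.38 255.97 6.37 0.68 5.20 0.55 0.11 100.00
dm-6 0.00 0.00 3642.00 820.00 455.25 96.58 253.28 3.15 0.72 0.84 0.16 0.22 100.00
dm-6 0.00 0.00 4794.00 0.00 599.25 0.00 256.00 3.00 0.63 0.63 0.00 0.21 100.00
"""


def parse_rows(text: str) -> List[Tuple[float, float]]:
    """Return (rMB/s, wMB/s) for each data line of the trace (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a 14-column sample line.
        if len(fields) != 14 or fields[0].startswith("Device"):
            continue
        rows.append((float(fields[5]), float(fields[6])))
    return rows


def overlapping_samples(rows: List[Tuple[float, float]],
                        threshold_mb: float = 1.0) -> List[int]:
    """Indices of samples where reads and writes both exceed the threshold."""
    return [i for i, (r, w) in enumerate(rows)
            if r > threshold_mb and w > threshold_mb]


rows = parse_rows(SAMPLE)
overlap = overlapping_samples(rows)
print(overlap)
```

Applied to the full trace in the description, the same check would flag the twelve consecutive samples in which both rMB/s and wMB/s are non-zero. The threshold only filters out background noise; any small positive value would give the same result for this trace.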
| Comments |
| Comment by Andreas Dilger [ 19/Mar/15 ] |
|
The osd-ldiskfs code flushes all writes to disk before the data is available for reads. However, osd-zfs does writeback caching of writes, so this behaviour isn't totally surprising. The OST code flushes the writes under DLM locks before the locks are cancelled and granted to the reading client, but I suspect obdfilter-survey isn't doing DLM locking of the data? |
| Comment by Wang Shilong (Inactive) [ 23/Mar/15 ] |
|
Hi Andreas Dilger, What do you mean by writeback caching of writes? |
| Comment by Andreas Dilger [ 23/Mar/15 ] |
|
For osd-ldiskfs, the writes are submitted synchronously to disk during RPC processing using submit_bio(), and the reply is only sent after the data write is committed to disk because of the wait_event() in osd_do_bio(). The client may still get an RPC reply before the transaction is committed to reduce latency. |
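As a userspace analogy only (this is not Lustre, osd-ldiskfs, or osd-zfs code), the difference discussed in this thread can be sketched with an ordinary file: a write that returns once the data is in a cache, versus a write that returns only after the data has been forced to stable storage. The function names `write_phase` and `read_phase` are hypothetical. A benchmark that wants its read phase to measure reads alone needs the second behaviour, or an explicit flush, between the two phases.

```python
import os
import tempfile


def write_phase(path: str, size: int, sync: bool) -> None:
    """Write `size` bytes; optionally wait until they reach stable storage."""
    with open(path, "wb") as f:
        f.write(b"\0" * size)
        f.flush()                 # move Python's userspace buffer into the kernel
        if sync:
            os.fsync(f.fileno())  # block until the kernel has flushed the file


def read_phase(path: str) -> int:
    """Read the file back and return the number of bytes read."""
    with open(path, "rb") as f:
        return len(f.read())


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "obj")
    # sync=True acts as the barrier: the read phase starts only after the flush.
    write_phase(path, 1 << 20, sync=True)
    nread = read_phase(path)
    print(nread)
```

With `sync=False` the read still returns correct data, because it is served from the cache, but the device may still be writing in the background while the read phase is being timed, which is the pattern visible in the iostat trace.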
| Comment by Gerrit Updater [ 23/Mar/15 ] |
|
Nathaniel Clark (nathaniel.l.clark@intel.com) uploaded a new patch: http://review.whamcloud.com/14143 |
| Comment by Gerrit Updater [ 18/Aug/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14143/ |
| Comment by Peter Jones [ 18/Aug/15 ] |
|
Landed for 2.8 |
| Comment by Nathaniel Clark [ 26/Oct/15 ] |
|
Patch http://review.whamcloud.com/14143 only fixes the issue when obdfilter-survey is called within the test framework from lustre/tests/obdfilter-survey.sh. The issue still exists when it is called as shipped in the lustre-iokit rpm. |
| Comment by Nathaniel Clark [ 26/Oct/15 ] |
|
Patch for master: http://review.whamcloud.com/16942 |
| Comment by Gerrit Updater [ 26/Oct/15 ] |
|
Nathaniel Clark (nathaniel.l.clark@intel.com) uploaded a new patch: http://review.whamcloud.com/16942 |
| Comment by Gerrit Updater [ 30/Oct/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16942/ |
| Comment by Peter Jones [ 30/Oct/15 ] |
|
Landed for 2.8 |