[LU-12687] Fast ENOSPC on direct I/O Created: 23/Aug/19 Updated: 07/Apr/22 Resolved: 24/Sep/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.14.0, Lustre 2.12.6 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Vladimir Saveliev | Assignee: | Vladimir Saveliev |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
On direct I/O a client gets a substantial amount of grant without consuming it, so direct I/O writes hit ENOSPC long before disk space is actually exhausted.

    [root@sl75master tests]# OSTSIZE=100000 sh llmount.sh
    ...
    Updated after 6s: wanted 'procname_uid' got 'procname_uid'
    disable quota as required
    [root@sl75master tests]# lfs df -h
    UUID                      bytes    Used  Available  Use%  Mounted on
    lustre-MDT0000_UUID      122.4M    1.9M     109.5M    2%  /mnt/lustre[MDT:0]
    lustre-OST0000_UUID       69.4M    1.2M      61.4M    2%  /mnt/lustre[OST:0]
    lustre-OST0001_UUID       69.4M    1.2M      61.4M    2%  /mnt/lustre[OST:1]

    filesystem_summary:      138.9M    2.5M     122.7M    2%  /mnt/lustre

    [root@sl75master tests]# dd if=/dev/zero of=/mnt/lustre/file bs=4k count=100 oflag=direct
    dd: error writing ‘/mnt/lustre/file’: No space left on device
    54+0 records in
    53+0 records out
    217088 bytes (217 kB) copied, 0.138233 s, 1.6 MB/s
    [root@sl75master tests]# |
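For convenience, below is a minimal standalone userspace sketch equivalent to the dd command above. The path, block size, and count mirror the dd invocation; everything else (buffer handling, error reporting) is an assumption about how one might package the reproducer.

```c
/* Minimal userspace sketch of the dd reproducer above: 100 x 4 KiB
 * O_DIRECT writes to a file on a small, nearly empty OST. */
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	const size_t bs = 4096;          /* bs=4k */
	const int count = 100;           /* count=100 */
	void *buf;
	int fd, i;

	/* O_DIRECT requires an aligned buffer */
	if (posix_memalign(&buf, bs, bs))
		return 1;
	memset(buf, 0, bs);

	fd = open("/mnt/lustre/file",
		  O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	for (i = 0; i < count; i++) {
		if (write(fd, buf, bs) != (ssize_t)bs) {
			/* On an unpatched client this hits ENOSPC long
			 * before the OSTs are actually full. */
			fprintf(stderr, "write %d failed: %s\n",
				i, strerror(errno));
			break;
		}
	}
	close(fd);
	free(buf);
	return 0;
}
```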
| Comments |
| Comment by Gerrit Updater [ 23/Aug/19 ] |
|
Vladimir Saveliev (c17830@cray.com) uploaded a new patch: https://review.whamcloud.com/35896 |
| Comment by Patrick Farrell (Inactive) [ 23/Aug/19 ] |
|
Vladimir, this is a dupe of an earlier ticket; we have a patch in work there. In fact, I think it's ready except for fixing the test. I think having direct I/O consume grants correctly, rather than not using grant at all, is our preferred solution, and that's what that patch does. |
| Comment by Patrick Farrell (Inactive) [ 23/Aug/19 ] |
|
Sorry, patch is: |
| Comment by Andreas Dilger [ 06/Sep/19 ] |
|
I agree with Patrick here that the other patch is the preferred approach. Conversely, if O_DIRECT consumes all of the grant but doesn't request more (as with the patch here), then any cached writes after this will not have any grant and will start with small (PAGE_SIZE) synchronous writes until the client gets some grant back. It would be great if that patch could be refreshed and landed. |
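As a rough illustration of the behaviour described here, the toy model below (not Lustre code; all names and numbers are hypothetical stand-ins for the real per-client grant accounting) shows how a DIO write that consumes grant without requesting more leaves subsequent buffered writes with nothing to cache against:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-client grant counter, in pages. */
static long client_grant_pages = 64;

/* O_DIRECT write: consumes grant for the pages it sends, but (as with the
 * patch on this ticket) does not ask the OST for more grant up front. */
static void dio_write(long pages)
{
	client_grant_pages -= pages;
	if (client_grant_pages < 0)
		client_grant_pages = 0;
}

/* Buffered write: with no grant left, dirty pages cannot be cached, so the
 * client falls back to small synchronous writes until replies restore grant. */
static bool buffered_write_must_be_sync(long pages)
{
	return client_grant_pages < pages;
}

int main(void)
{
	dio_write(64);    /* a large DIO write burns the whole grant */
	printf("grant left: %ld pages\n", client_grant_pages);
	printf("next buffered write synchronous? %s\n",
	       buffered_write_must_be_sync(1) ? "yes" : "no");
	return 0;
}
```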
| Comment by Alex Zhuravlev [ 27/Jan/20 ] |
|
I'm hitting this on almost every run of sanity.sh in test 248b |
| Comment by Andreas Dilger [ 09/Jul/20 ] |
|
Alex, if you are able to reproduce this easily, could you please verify patch https://review.whamcloud.com/35896? |
| Comment by Gerrit Updater [ 10/Jul/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35896/ |
| Comment by Peter Jones [ 11/Jul/20 ] |
|
Landed for 2.14 |
| Comment by Gerrit Updater [ 15/Jul/20 ] |
|
Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/39386 |
| Comment by Gerrit Updater [ 28/Jul/20 ] |
|
|
| Comment by Mikhail Pershin [ 30/Aug/20 ] |
|
Regarding the landed patch https://review.whamcloud.com/35896/ - I've noticed that the added code does pretty much the same as osc_enter_cache_try() but skips the following checks for the cl_dirty_max_pages limit:

    if (cli->cl_dirty_pages < cli->cl_dirty_max_pages) {
        if (atomic_long_add_return(1, &obd_dirty_pages) <= obd_max_dirty_pages) {
            ...

Instead, we just consume write grants unconditionally. Was that intentional, or should the same limit checks be added for the DIO case too? |
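For readers without the source at hand, here is a standalone sketch of the contrast being pointed out. The struct and counters are simplified stand-ins for cli->cl_dirty_pages, cli->cl_dirty_max_pages, obd_dirty_pages and obd_max_dirty_pages; this is not the actual osc_cache.c code, only a model of its shape:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Simplified stand-ins for the counters quoted above; not Lustre source. */
struct client {
	long dirty_pages;       /* models cli->cl_dirty_pages     */
	long dirty_max_pages;   /* models cli->cl_dirty_max_pages */
};

static atomic_long global_dirty_pages;       /* models obd_dirty_pages     */
static long global_max_dirty_pages = 1024;   /* models obd_max_dirty_pages */

/* Cached-write path, shaped like osc_enter_cache_try(): only take a page of
 * grant if both the per-client and the global dirty-page caps have room. */
static bool cached_enter_cache_try(struct client *cli)
{
	if (cli->dirty_pages < cli->dirty_max_pages) {
		if (atomic_fetch_add(&global_dirty_pages, 1) + 1 <=
		    global_max_dirty_pages) {
			cli->dirty_pages++;
			return true;
		}
		/* over the global cap: undo the bump and refuse */
		atomic_fetch_sub(&global_dirty_pages, 1);
	}
	return false;
}

/* DIO path as landed: the same counters are bumped unconditionally, with no
 * check against either cap. */
static void dio_consume_grant(struct client *cli)
{
	cli->dirty_pages++;
	atomic_fetch_add(&global_dirty_pages, 1);
}
```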
| Comment by Mikhail Pershin [ 30/Aug/20 ] |
|
Vladimir, Andreas, I reopened this due to a question I have about the already-landed patch; please check my previous comment. |
| Comment by Andreas Dilger [ 31/Aug/20 ] |
|
Hi Mike, I don't think that the max_dirty_pages limit applies in the case of O_DIRECT writes, because the client is not caching those pages locally. It is immediately sending the pages for write to the OST, so it doesn't actually need the grant or to check the dirty limit. The grant usage is only to prevent clients from holding grant that cannot be used by the O_DIRECT writes. In most cases, the client will get grant back immediately in the reply, unless the filesystem is low on free space. |
| Comment by Mikhail Pershin [ 31/Aug/20 ] |
|
Yes, I understand that. Meanwhile, we don't just consume grants but also increase the cl_dirty_pages and obd_dirty_pages counters while doing so, without regard to the limits, so technically these values can exceed their caps and prevent non-DIO writers from entering the cache in osc_enter_cache_try(). So the real question is: should we increase cl_dirty_pages and obd_dirty_pages at all, or would it be better to leave them untouched for DIO writes? Probably that is not a big deal because, as you said, the grant comes back quite quickly. |
| Comment by Vladimir Saveliev [ 07/Sep/20 ] |
osc_enter_cache_try() checks the dirty page counters in order to throttle dirty page generation. In the DIO case we want to submit the I/O as soon as possible, so I guess that is why the dirty page counter check is missing in the patch.
cl_dirty_pages and obd_dirty_pages are increased in the DIO case in order to minimize changes: osc_free_grant() remains unchanged that way.
I see no problem with that, as it decreases write RPC congestion on the network and on the server. |
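A small standalone sketch of the symmetry described here (simplified names, not the actual osc_free_grant() code): because the DIO path bumps the same counters as cached writes, one common release path can undo the accounting for both.

```c
#include <stdatomic.h>

/* Simplified stand-ins; not Lustre source. */
struct client {
	long dirty_pages;      /* models cli->cl_dirty_pages */
	long avail_grant;      /* models cli->cl_avail_grant */
};

static atomic_long global_dirty_pages;  /* models obd_dirty_pages */

/* Common completion path, loosely modeled on what osc_free_grant() does:
 * it works for both cached and O_DIRECT writes precisely because both
 * paths incremented the same dirty-page counters on the way in. */
static void release_grant_on_rpc_done(struct client *cli, long pages,
				      long grant_returned)
{
	cli->dirty_pages -= pages;
	atomic_fetch_sub(&global_dirty_pages, pages);
	cli->avail_grant += grant_returned;  /* grant handed back in the OST reply */
}
```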
| Comment by Mikhail Pershin [ 07/Sep/20 ] |
|
OK, if you see no problem with that, then the ticket can be closed. |
| Comment by Gerrit Updater [ 15/Sep/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39386/ |
| Comment by Peter Jones [ 24/Sep/20 ] |
|
Seems to be landed |