[LU-12687] Fast ENOSPC on direct I/O Created: 23/Aug/19  Updated: 07/Apr/22  Resolved: 24/Sep/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.14.0, Lustre 2.12.6

Type: Bug Priority: Minor
Reporter: Vladimir Saveliev Assignee: Vladimir Saveliev
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-4664 sync write should consume grant on cl... Resolved
is related to LU-12757 sanity-lfsck test 36a fails with '(N)... Resolved
is related to LU-13766 tgt_grant_check() lsrza-OST000a: cli ... Resolved
is related to LU-13847 sanity test_64f: grants mismatch Closed
is related to LU-4198 Improve IO performance when using DIR... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

On direct I/O, a client receives a substantial amount of grant without consuming it, so direct I/O writes hit ENOSPC long before the disk is actually full.
The example below shows dd oflag=direct failing to write 400 KB to a ~70 MB OST.

[root@sl75master tests]# OSTSIZE=100000 sh llmount.sh
...
Updated after 6s: wanted 'procname_uid' got 'procname_uid'
disable quota as required
[root@sl75master tests]# lfs df -h
UUID                       bytes        Used   Available Use% Mounted on
lustre-MDT0000_UUID       122.4M        1.9M      109.5M   2% /mnt/lustre[MDT:0]
lustre-OST0000_UUID        69.4M        1.2M       61.4M   2% /mnt/lustre[OST:0]
lustre-OST0001_UUID        69.4M        1.2M       61.4M   2% /mnt/lustre[OST:1]

filesystem_summary:       138.9M        2.5M      122.7M   2% /mnt/lustre

[root@sl75master tests]# dd if=/dev/zero of=/mnt/lustre/file bs=4k count=100 oflag=direct
dd: error writing ‘/mnt/lustre/file’: No space left on device
54+0 records in
53+0 records out
217088 bytes (217 kB) copied, 0.138233 s, 1.6 MB/s
[root@sl75master tests]# 


 Comments   
Comment by Gerrit Updater [ 23/Aug/19 ]

Vladimir Saveliev (c17830@cray.com) uploaded a new patch: https://review.whamcloud.com/35896
Subject: LU-12687 osc: consume grants for direct I/O
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 9492bed7688f492c756901c9c357dbacbace4485

Comment by Patrick Farrell (Inactive) [ 23/Aug/19 ]

Vladimir,

This is a dupe of LU-4664

We have a patch in progress there - in fact, I think it's ready except for fixing the test.

I think having direct I/O consume grants correctly, rather than not using grant at all, is our preferred solution, and that's what LU-4664 implements.  Is there a specific reason why direct I/O should not request grant, rather than fixing the consume side as in LU-4664?

Comment by Patrick Farrell (Inactive) [ 23/Aug/19 ]

Sorry, patch is:
https://review.whamcloud.com/#/c/9454/

Comment by Andreas Dilger [ 06/Sep/19 ]

I agree with Patrick here that the LU-4664 patch is the right way to go. If we don't consume grants on O_DIRECT writes, then it is possible for those writes to run out of space when there is still grant available on the client.

Conversely, if O_DIRECT consumes all of the grant but doesn't request more (as with the patch here), then if there are cached writes after this, the client will have no grant and will fall back to small (PAGE_SIZE) synchronous writes until it regains some.

It would be great if that patch could be refreshed and landed.

Comment by Alex Zhuravlev [ 27/Jan/20 ]

I'm hitting this almost every run of sanity.sh in 248b

Comment by Andreas Dilger [ 09/Jul/20 ]

Alex, if you are able to reproduce this easily, could you please verify patch https://review.whamcloud.com/35896 "LU-12687 osc: consume grants for direct I/O" fixes it for you? That is about to land.

Comment by Gerrit Updater [ 10/Jul/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35896/
Subject: LU-12687 osc: consume grants for direct I/O
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 05f326a7988a7a0d6954d1b0d318315526209ae6

Comment by Peter Jones [ 11/Jul/20 ]

Landed for 2.14

Comment by Gerrit Updater [ 15/Jul/20 ]

Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/39386
Subject: LU-12687 osc: consume grants for direct I/O
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: c89c8423cf730df97ddc19a8981f978c79fabdfa

Comment by Gerrit Updater [ 28/Jul/20 ]

Olaf Faaland-LLNL (faaland1@llnl.gov) uploaded a new patch: https://review.whamcloud.com/39517
Subject: LU-12687 canary: comment-only change
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: bcec79eb40dda0c26bffb9ec950a142034866df6

Comment by Mikhail Pershin [ 30/Aug/20 ]

Regarding the landed patch https://review.whamcloud.com/35896/ - I've noticed that the added code does pretty much the same as osc_enter_cache_try() but skips the following check:

	if (cli->cl_dirty_pages < cli->cl_dirty_max_pages) {
		if (atomic_long_add_return(1, &obd_dirty_pages) <=
		    obd_max_dirty_pages) {
                    ...

for the cl_dirty_max_pages limit. Instead, we always consume write grants unconditionally. Was that intentional, or should the same limit check be added for the DIO case too?

Comment by Mikhail Pershin [ 30/Aug/20 ]

Vladimir, Andreas, I reopened this due to a question I have about the already-landed patch; please check my previous comment.

Comment by Andreas Dilger [ 31/Aug/20 ]

Hi Mike, I don't think that the max_dirty_pages limit applies in the case of O_DIRECT writes, because the client is not caching those pages locally. It is immediately sending the pages for write to the OST, so it doesn't actually need the grant or to check the dirty limit. The grant usage is only to prevent clients from holding grant that cannot be used by the O_DIRECT writes. In most cases, the client will get grant back immediately in the reply, unless the filesystem is low on free space.

Comment by Mikhail Pershin [ 31/Aug/20 ]

Yes, I understand that. Meanwhile, we don't just consume grants: we also increase the cl_dirty_pages and obd_dirty_pages counters, and we do so without regard to limits, so technically these values can exceed their cap and prevent non-DIO writers from entering the cache in osc_enter_cache_try(). So the real question is: should we increase cl_dirty_pages and obd_dirty_pages at all, or is it better to leave them untouched for DIO writes? Probably it is not a big deal because, as you said, the grant comes back quite quickly.

Comment by Vladimir Saveliev [ 07/Sep/20 ]

I've noted that added code does pretty the same as osc_enter_cache_try() but skips the following checks:

osc_enter_cache_try() checks the dirty page counter in order to throttle dirty page generation. In the case of DIO we want to submit I/O as soon as possible; I guess that is why the dirty page counter check is missing in the patch.

So actually the question is - should we increase cl_dirty_pages and obd_dirty_pages at all? Or better keep them untouched with DIO writes. Probably that is not big deal because as you said - it returns back quite quickly

cl_dirty_pages and obd_dirty_pages are increased in the DIO case in order to minimize changes: osc_free_grant() remains unchanged that way.

technically these values can become greater than their cap and prevent cache entering in osc_enter_cache_try() for a non-DIO writers.

I see no problem with that, as it decreases write RPC congestion on the network and on the server.

Comment by Mikhail Pershin [ 07/Sep/20 ]

OK, if you see no problem with that, then the ticket can be closed.

Comment by Gerrit Updater [ 15/Sep/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39386/
Subject: LU-12687 osc: consume grants for direct I/O
Project: fs/lustre-release
Branch: b2_12
Current Patch Set:
Commit: ab95f0c361a1e0e65ac51fe15ada4cb1e7eeaa1e

Comment by Peter Jones [ 24/Sep/20 ]

Seems to have landed.
