    Description

      With direct I/O, a client receives a substantial amount of grant without consuming it, so direct I/O writes fail with ENOSPC long before the disk is actually full.
      The example below shows dd oflag=direct failing to write 400 KB to a ~70 MB OST.

      [root@sl75master tests]# OSTSIZE=100000 sh llmount.sh
      ...
      Updated after 6s: wanted 'procname_uid' got 'procname_uid'
      disable quota as required
      [root@sl75master tests]# lfs df -h
      UUID                       bytes        Used   Available Use% Mounted on
      lustre-MDT0000_UUID       122.4M        1.9M      109.5M   2% /mnt/lustre[MDT:0]
      lustre-OST0000_UUID        69.4M        1.2M       61.4M   2% /mnt/lustre[OST:0]
      lustre-OST0001_UUID        69.4M        1.2M       61.4M   2% /mnt/lustre[OST:1]
      
      filesystem_summary:       138.9M        2.5M      122.7M   2% /mnt/lustre
      
      [root@sl75master tests]# dd if=/dev/zero of=/mnt/lustre/file bs=4k count=100 oflag=direct
      dd: error writing ‘/mnt/lustre/file’: No space left on device
      54+0 records in
      53+0 records out
      217088 bytes (217 kB) copied, 0.138233 s, 1.6 MB/s
      [root@sl75master tests]# 
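
The failure mode described above can be illustrated with a toy model. The sketch below is not Lustre code: the structures and functions (toy_ost, toy_dio_client, toy_dio_write, toy_count_writes) are hypothetical, and the assumption that every write reply hands the client a fixed extra chunk of grant is a deliberate simplification. It only shows how grant that is handed out but never consumed can make an OST refuse writes while most of its space is still unwritten.

```c
#include <stdbool.h>

/* Hypothetical toy OST: granted space is treated as unavailable. */
struct toy_ost {
	long free_bytes;     /* space not yet written */
	long granted_bytes;  /* space promised to the client */
};

/* Hypothetical toy client holding grant received from the OST. */
struct toy_dio_client {
	long grant_bytes;
};

/*
 * One direct I/O write of 'bytes'. Returns false to model ENOSPC.
 * 'grant_chunk' is an ASSUMED fixed amount of extra grant per reply.
 * 'consume_grant' selects whether the write spends the client's grant.
 */
static bool toy_dio_write(struct toy_ost *ost, struct toy_dio_client *cli,
			  long bytes, long grant_chunk, bool consume_grant)
{
	if (consume_grant && cli->grant_bytes >= bytes) {
		/* Covered by grant: the space was reserved earlier. */
		cli->grant_bytes -= bytes;
		ost->granted_bytes -= bytes;
	} else if (ost->free_bytes - ost->granted_bytes < bytes) {
		/* Not covered by grant and no ungranted space left. */
		return false;
	}
	ost->free_bytes -= bytes;

	/* The reply grants more space, limited by what is still ungranted. */
	long ungranted = ost->free_bytes - ost->granted_bytes;
	long extra = grant_chunk < ungranted ? grant_chunk : ungranted;

	ost->granted_bytes += extra;
	cli->grant_bytes += extra;
	return true;
}

/* Count how many writes succeed before the toy OST reports ENOSPC. */
static long toy_count_writes(long ost_bytes, long bytes, long grant_chunk,
			     bool consume_grant, long max_writes)
{
	struct toy_ost ost = { ost_bytes, 0 };
	struct toy_dio_client cli = { 0 };
	long n = 0;

	while (n < max_writes &&
	       toy_dio_write(&ost, &cli, bytes, grant_chunk, consume_grant))
		n++;
	return n;
}
```

With a 64 MiB toy OST, 4 KiB writes, and an assumed 1 MiB of grant per reply, toy_count_writes() stops after 64 writes when grant is never consumed and completes all 100 requested writes when it is. These numbers are artifacts of the chosen parameters and are not meant to reproduce the 53 records in the dd output.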
      

          Activity

            [LU-12687] Fast ENOSPC on direct I/O
            pjones Peter Jones added a comment -

            Seems to be landed


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39386/
            Subject: LU-12687 osc: consume grants for direct I/O
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: ab95f0c361a1e0e65ac51fe15ada4cb1e7eeaa1e

            tappro Mikhail Pershin added a comment -

            OK, if you see no problem with that, then the ticket can be closed.

            vsaveliev Vladimir Saveliev added a comment -

            > I've noted that added code does pretty the same as osc_enter_cache_try() but skips the following checks:

            osc_enter_cache_try() checks the dirty page counter in order to throttle dirty page generation. In the DIO case we want to submit I/O as soon as possible, so I guess that is why the check of the dirty page counter is missing from the patch.

            > So actually the question is - should we increase cl_dirty_pages and obd_dirty_pages at all? Or better keep them untouched with DIO writes. Probably that is not big deal because as you said - it returns back quite quickly

            cl_dirty_pages and obd_dirty_pages are increased in the DIO case in order to minimize changes: osc_free_grant() remains unchanged that way.

            > technically these values can become greater than their cap and prevent cache entering in osc_enter_cache_try() for a non-DIO writers.

            I see no problem with that, as it decreases write RPC congestion on the network and on the server.

            tappro Mikhail Pershin added a comment - edited

            Yes, I understand that. However, we don't just consume grants: we also increase the cl_dirty_pages and obd_dirty_pages counters while doing so, and we do that without regard to their limits. Technically, these values can therefore exceed their caps and prevent cache entry in osc_enter_cache_try() for non-DIO writers. So the actual question is: should we increase cl_dirty_pages and obd_dirty_pages at all, or is it better to leave them untouched for DIO writes? That is probably not a big deal because, as you said, it comes back quite quickly.
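
The concern raised in this comment can be sketched with a small model. This is not the Lustre implementation: struct toy_client and the toy_* functions are hypothetical stand-ins, page-sized grant accounting is assumed, and grant returned in write replies is not modeled. The sketch only shows how a DIO path that bumps the dirty counter without checking the cap can temporarily block a cached writer that does check it.

```c
#include <stdbool.h>

#define TOY_PAGE_SIZE 4096L

/* Hypothetical, simplified client state; not the real struct client_obd. */
struct toy_client {
	long avail_grant;     /* bytes of grant held by the client */
	long dirty_pages;     /* pages currently accounted as dirty */
	long dirty_max_pages; /* cap on dirty pages for cached writes */
};

/* Cached-write path: needs grant AND room under the dirty-page cap. */
static bool toy_enter_cache_try(struct toy_client *cli)
{
	if (cli->avail_grant < TOY_PAGE_SIZE)
		return false;
	if (cli->dirty_pages >= cli->dirty_max_pages)
		return false;
	cli->avail_grant -= TOY_PAGE_SIZE;
	cli->dirty_pages++;
	return true;
}

/*
 * DIO path as discussed in the thread: consume grant when available and
 * bump the dirty counter, with no check against dirty_max_pages.
 */
static bool toy_dio_consume_grant(struct toy_client *cli)
{
	if (cli->avail_grant < TOY_PAGE_SIZE)
		return false;
	cli->avail_grant -= TOY_PAGE_SIZE;
	cli->dirty_pages++; /* may exceed dirty_max_pages */
	return true;
}

/* Write completion: the page is no longer accounted as dirty. */
static void toy_write_done(struct toy_client *cli)
{
	cli->dirty_pages--;
}
```

In this model, three in-flight DIO pages against a cap of two make toy_enter_cache_try() fail until enough DIO writes complete, which matches the point made in the thread that the effect is short-lived when writes complete quickly.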

            adilger Andreas Dilger added a comment -

            Hi Mike, I don't think that the max_dirty_pages limit applies in the case of O_DIRECT writes, because the client is not caching those pages locally. It is immediately sending the pages for write to the OST, so it doesn't actually need the grant or to check the dirty limit. The grant usage is only to prevent clients from holding grant that cannot be used by the O_DIRECT writes. In most cases, the client will get grant back immediately in the reply, unless the filesystem is low on free space.

            tappro Mikhail Pershin added a comment -

            Vladimir, Andreas, I reopened this because of a question I have about the already-landed patch; please check my previous comment.

            tappro Mikhail Pershin added a comment -

            Regarding the landed patch https://review.whamcloud.com/35896/ - I've noted that the added code does pretty much the same as osc_enter_cache_try() but skips the following checks:

            	if (cli->cl_dirty_pages < cli->cl_dirty_max_pages) {
            		if (atomic_long_add_return(1, &obd_dirty_pages) <=
            		    obd_max_dirty_pages) {
            			...

            for the cl_dirty_max_pages limit. Instead, we always just consume write grants. Was that intentional, or should the same limit checks be added for the DIO case too?

            gerrit Gerrit Updater added a comment - edited

            Olaf Faaland-LLNL (faaland1@llnl.gov) uploaded a new patch: https://review.whamcloud.com/39517
            Subject: LU-12687 canary: comment-only change
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: bcec79eb40dda0c26bffb9ec950a142034866df6

            gerrit Gerrit Updater added a comment -

            Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/39386
            Subject: LU-12687 osc: consume grants for direct I/O
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: c89c8423cf730df97ddc19a8981f978c79fabdfa

            People

              vsaveliev Vladimir Saveliev
              vsaveliev Vladimir Saveliev