
LU-7015: Grant space and reserved blocks percent parameters

Details

    • Type: Question/Request
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version: Lustre 2.8.0
    • Affects Version: Lustre 2.5.4
    • Labels: None
    • Environment: RHEL-6.6, lustre-2.5.4

    Description

      Space utilization on one of our systems is high, and as we work to prune some of this data, we're exploring some other space tunings.

      One of our admins noted the "cur_grant_bytes" osc parameter. When we looked at a few clients, we saw that this variable often exceeds max_dirty_mb, sometimes by an order of magnitude. We usually use 64MB of dirty cache per osc per client. Is there an upper limit to this cur_grant_bytes parameter? What are the side effects of setting this value to some lower value (or 0)? Can we reduce this client grant while there is active I/O, and can we do this for all osc connections simultaneously (for a collective of millions of osc connections) for a system? Is this documented well anywhere?
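
      For reference, a minimal sketch of how these values can be inspected on a client (standard lctl; exact parameter paths may vary slightly by release):

      # show the current grant and the dirty-cache limit for every OSC on this client
      lctl get_param osc.*.cur_grant_bytes
      lctl get_param osc.*.max_dirty_mb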

      Additionally, we are looking into tuning the reserved_blocks_percent parameter. The Lustre manual states that 5% is the minimum, but is that a sane value for all OST sizes?

      Thanks,

      Jesse

          Activity

            ezell Matt Ezell added a comment -

            I just ran a quick test on our TDS system. I took a newly mounted client and created 50 files striped across OST 0. I backgrounded 50 dd processes against those files and gathered logs with +cache enabled on the client and server.

            The first thing I noticed is that the server very quickly increased the grant to the client, maybe even before the client had a chance to realize it.

            00002000:00000020:4.0:1439910325.382841:0:36645:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 0 granting: 8388608
            00002000:00000020:4.0:1439910325.383027:0:31053:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 0 granting: 8388608
            00002000:00000020:4.0:1439910325.383615:0:36646:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 0 granting: 8388608
            00002000:00000020:4.0:1439910325.383775:0:36647:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 0 granting: 8388608
            00002000:00000020:4.0:1439910325.384272:0:36648:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 0 granting: 8388608
            00002000:00000020:4.0:1439910325.385007:0:36649:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 0 granting: 8388608
            00002000:00000020:4.0:1439910325.385154:0:36650:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 0 granting: 8388608
            00002000:00000020:6.0:1439910325.416668:0:36648:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 335872 granting: 8388608
            00002000:00000020:5.0:1439910325.417207:0:36649:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 0 granting: 8388608
            00002000:00000020:6.0:1439910325.417262:0:36645:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 0 granting: 8388608
            00002000:00000020:6.0:1439910325.433766:0:31053:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 29917184 granting: 8388608
            00002000:00000020:4.0:1439910325.433773:0:36646:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 8187904 granting: 8388608
            00002000:00000020:5.0:1439910325.433789:0:31052:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 22822912 granting: 8388608
            00002000:00000020:6.0:1439910325.434528:0:31054:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 25923584 granting: 8388608
            00002000:00000020:4.0:1439910325.434534:0:36647:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 25845760 granting: 8388608
            00002000:00000020:5.0:1439910325.591676:0:36650:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 32403456 granting: 8388608
            00002000:00000020:4.0:1439910325.591852:0:36652:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 32382976 granting: 8388608
            00002000:00000020:5.0:1439910325.591860:0:36647:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 32608256 granting: 8388608
            00002000:00000020:6.0:1439910325.593790:0:31054:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 30371840 granting: 8388608
            00002000:00000020:5.0:1439910325.595378:0:36651:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 29700096 granting: 8388608
            00002000:00000020:4.0:1439910325.595384:0:31052:0:(ofd_grant.c:662:ofd_grant()) atlastds-OST0000: cli ce2fb5d0-e502-410d-675d-3b8d0dd26305/ffff880806fe3c00 wants: 33554432 current grant 29696000 granting: 8388608
            

            The server granted it 56MB before the client even reported having a grant.

            I haven't read all of the grant-related code, so take this analysis with a grain of salt.

            Is the want parameter supposed to be an absolute or relative value?

            lustre/ofd/ofd_grant.c:ofd_grant()
                    /* Grant some fraction of the client's requested grant space so that
                     * they are not always waiting for write credits (not all of it to
                     * avoid overgranting in face of multiple RPCs in flight).  This
                     * essentially will be able to control the OSC_MAX_RIF for a client.
                     *
                     * If we do have a large disparity between what the client thinks it
                     * has and what we think it has, don't grant very much and let the
                     * client consume its grant first.  Either it just has lots of RPCs
                     * in flight, or it was evicted and its grants will soon be used up. */
                    if (curgrant >= want || curgrant >= fed->fed_grant + grant_chunk)
                               RETURN(0);
            

            This looks like want is being used as an absolute value. Assuming want should be absolute, do we also need a check to ensure that fed->fed_grant isn't much larger than want?

            lustre/ofd/ofd_grant.c:ofd_grant()
            grant = min(want, left);
            ...
                    /* Limit to ofd_grant_chunk() if not reconnect/recovery */
                    if ((grant > grant_chunk) && conservative)
                            grant = grant_chunk;
            ...
                    ofd->ofd_tot_granted += grant;
                    fed->fed_grant += grant;
            

            This looks like want is a relative value.

            So the client repeatedly says "I want 32MB" and the server takes that request, lowers it to grant_chunk (8MB), and grants it 8MB repeatedly until the client claims it has at least 32MB.

            According to Andreas in LU-3859, OBD_CONNECT_GRANT_SHRINK isn't set, so this is never cleaned up automatically. Is there a reason this is disabled?
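
            A quick way to check whether the flag was negotiated on a client (only a sketch; when enabled, grant_shrink shows up in the import's connect_flags):

            lctl get_param osc.*.import | grep grant_shrink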

            ezell Matt Ezell added a comment -

            It looks like ofd_grant_space_left() uses ofd->ofd_osfs.os_bavail, so it appears to take the reserved space into account.

            ezell Matt Ezell added a comment -

            I guess until we get usage down or a patch for this, we will need to periodically shrink grants on clients to avoid ENOSPC.

            The source of the question about reserved space was to better understand when a user might get ENOSPC. Would it be when a client has exhausted its grant and (kbytesfree - (tot_granted/1024)) <= 0, or does it use (kbytesavail - (tot_granted/1024)) <= 0?
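
            Something like the following on the OSS would show both candidate headroom numbers for one OST (a sketch, assuming the usual obdfilter proc directory, e.g. /proc/fs/lustre/obdfilter/<fsname>-OST0000/):

            # headroom if the threshold is based on kbytesfree vs. kbytesavail
            echo "$(cat kbytesfree)  - $(cat tot_granted)/1024" | bc
            echo "$(cat kbytesavail) - $(cat tot_granted)/1024" | bc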

            The Lustre Operations manual has a pretty strong warning about lowering the reserved space:

            Reducing the space reservation can cause severe performance degradation as the OST file system becomes more than 95% full, due to difficulty in locating large areas of contiguous free space. This performance degradation may persist even if the space usage drops below 95% again. It is recommended NOT to reduce the reserved disk space below 5%.

            But if that will give us a little headroom, we know grants will help keep us from getting too close to completely empty.

            hanleyja Jesse Hanley added a comment -

            Thanks Oleg for the detail!

            These servers were originally formatted with Lustre 2.4. When I checked, it looks like the OSTs are at 5% reserved:

            Block count: 3755999232
            Reserved block count: 187799961

            187799961 / 3755999232 * 100 ~= 5%

            With that being the case, can/should we lower this to a smaller reserved block count?
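
            If we do lower it, I assume it would be something like the following against each OST device (the device path is hypothetical and the 1% value is only an example):

            # check the current reservation
            tune2fs -l /dev/mapper/ost0 | grep -i 'reserved block'
            # lower the reservation to 1% of the blocks (tune2fs -r takes an absolute block count instead)
            tune2fs -m 1 /dev/mapper/ost0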

            Also, do I need to submit a new case about the server logic?

            Thanks!

            Jesse

            green Oleg Drokin added a comment -

            cur_grant_bytes is how much grant a client has received from this OST. It has no direct relation to max_dirty_mb, other than that max_dirty_mb cannot be higher than this.

            Technically the calculation for the grant request is

                            long max_in_flight = (cli->cl_max_pages_per_rpc <<
                                                  PAGE_CACHE_SHIFT) *
                                                 (cli->cl_max_rpcs_in_flight + 1);
                            oa->o_undirty = max(cli->cl_dirty_max_pages << PAGE_CACHE_SHIFT,
                                                max_in_flight);
            

            This is how much every client RPC requests: the larger of the in-flight RPC budget (the per-RPC size times max_rpcs_in_flight + 1) and max_dirty_mb.
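
            To put hypothetical numbers on that (a 4MB RPC size, max_rpcs_in_flight = 8, max_dirty_mb = 64):

            # max_in_flight = rpc_size * (max_rpcs_in_flight + 1) = 4MB * 9
            echo "4 * 1048576 * (8 + 1)" | bc    # 37748736 (36MB)
            # o_undirty = max(max_dirty_mb, max_in_flight) = 64MB in this case
            echo "64 * 1048576" | bc             # 67108864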

            Theoretically we should not exceed this value (want = o_undirty):

                    /* Grant some fraction of the client's requested grant space so that
                     * they are not always waiting for write credits (not all of it to
                     * avoid overgranting in face of multiple RPCs in flight).  This
                     * essentially will be able to control the OSC_MAX_RIF for a client.
                     *
                     * If we do have a large disparity between what the client thinks it
                     * has and what we think it has, don't grant very much and let the
                     * client consume its grant first.  Either it just has lots of RPCs
                     * in flight, or it was evicted and its grants will soon be used up. */
                    if (curgrant >= want || curgrant >= fed->fed_grant + grant_chunk)
                               RETURN(0);
            ...
                    grant = min(want, left);
                    /* round grant upt to the next block size */
                    grant = (grant + (1 << ofd->ofd_blockbits) - 1) &
                            ~((1ULL << ofd->ofd_blockbits) - 1);
                    /* Limit to ofd_grant_chunk() if not reconnect/recovery */
                    if ((grant > grant_chunk) && conservative)
                            grant = grant_chunk;
            ...
                    ofd->ofd_tot_granted += grant;
                    fed->fed_grant += grant;
            

            So I imagine the biggest case could be that if a client sends a bunch of requests while the grant is nearly at the max, then every one of those RPCs would return 2MB of grant each, which I guess theoretically should only allow a client to receive at most 2x of the max_dirty_mb or max_rpcs_in_flight megabytes (though if you are in recovery, then every request could bring as much grant).
            Overall there seems to be a bit of a logic flaw in the server-side granting logic, where after the initial checks want should be recalculated as want -= grant or something like that.

            While you can write a low value into the proc file, it only has the effect of immediately releasing the extra grant above the value you write there; the value does not stick, and the grant will keep accumulating according to the calculations above.
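
            For example, something like this on a client would immediately release anything above 64MB per OSC (only a sketch; the value shown is an example, and the grant grows back as described):

            lctl set_param osc.*.cur_grant_bytes=67108864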

            As for the reserved_blocks_percent, do you mean the ext4 reservation? I think recent e2fsprogs already reduces that to a smaller value by default for large filesystem sizes.

            ezell Matt Ezell added a comment -

            To include some numbers:

            # ls exports | wc -l
            20190
            # echo "$(cat tot_granted) / $(ls exports | wc -l)" | bc
            116961044
            # echo "100 * $(cat tot_granted) / 1024 / $(cat kbytestotal)" | bc
            15
            

            Our 20,000 clients average 116MB of grants per OST, resulting in 15% of the OST reserved for grants. That means when any OST hits 85% full, users start getting ENOSPC. I picked a random client, and its grant sizes range from 2MB to 343MB per OSC.


            People

              Assignee: Oleg Drokin (green)
              Reporter: Jesse Hanley (hanleyja)
              Votes: 0
              Watchers: 13
