Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Medium

    Description

      The default osd-ldiskfs.*.readcache_max_filesize parameter is currently "~0ULL", so the OSS will try to cache even very large files that do not fit into the total RAM size. Trying to cache such objects is inefficient.

      The default readcache_max_filesize should instead be tuned at startup to take the actual RAM size into account (e.g. (totalram_pages << PAGE_SHIFT) / 64) so that OSS RAM is better utilized caching objects that actually fit into memory.
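      A minimal userspace sketch of the proposed scaling, using sysconf() to stand in for the kernel's totalram_pages << PAGE_SHIFT (the real change would compute this once at OSD startup; the divisor of 64 follows the example above):

        /* Sketch: derive a RAM-scaled readcache_max_filesize default.
         * sysconf() stands in for the kernel's totalram_pages << PAGE_SHIFT. */
        #include <stdio.h>
        #include <unistd.h>

        int main(void)
        {
                unsigned long long ram_bytes =
                        (unsigned long long)sysconf(_SC_PHYS_PAGES) *
                        (unsigned long long)sysconf(_SC_PAGE_SIZE);

                /* RAM/64 caps cached objects at e.g. 2GB on a 128GB OSS,
                 * leaving room to cache many objects at once. */
                unsigned long long cache_max = ram_bytes / 64;

                printf("proposed readcache_max_filesize: %llu bytes\n", cache_max);
                return 0;
        }

      Until such a default lands, the limit can be tuned by hand, e.g. "lctl set_param osd-ldiskfs.*.readcache_max_filesize=2G" on a 128GB OSS.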

          Activity

            [LU-19147] don't try to cache large objects

            adilger Andreas Dilger added a comment:

            Note, however, that despite its name, readcache_max_filesize is actually a limit on the object size, so if a file is large but has many stripes (e.g. PFL), the "size" checked against this parameter may be quite misleading.

            It would be possible to realign this parameter to take the actual file size into account instead of the object size. This could potentially be done by looking at the "trusted.fid" xattr to determine the total stripe count of the current component and multiplying the object size by the stripe count, as sketched below. That would give a much better idea of the file size, though it still would not account for any part of the file beyond the current component.
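            A rough userspace illustration of that estimate, run as root against an object file on the mounted ldiskfs backend. The filter_fid/ost_layout struct layout below is an assumption based on my reading of Lustre's wire structs; real code would also have to handle the older, shorter trusted.fid variants that predate the layout fields:

              /* Sketch: estimate file size as object size times the stripe
               * count recorded in the object's "trusted.fid" xattr.
               * Struct layouts are assumptions mirroring Lustre's filter_fid;
               * older variants of this xattr are shorter and lack ff_layout. */
              #include <stdio.h>
              #include <stdint.h>
              #include <sys/stat.h>
              #include <sys/xattr.h>

              struct lu_fid { uint64_t f_seq; uint32_t f_oid; uint32_t f_ver; };
              struct ost_layout {               /* assumed field layout */
                      uint32_t ol_stripe_size;
                      uint32_t ol_stripe_count;
                      uint64_t ol_comp_start;
                      uint64_t ol_comp_end;
                      uint32_t ol_comp_id;
              } __attribute__((packed));
              struct filter_fid {
                      struct lu_fid ff_parent;
                      struct ost_layout ff_layout;
              } __attribute__((packed));

              int main(int argc, char *argv[])
              {
                      struct filter_fid ff;
                      struct stat st;
                      uint32_t stripes;

                      if (argc != 2 || stat(argv[1], &st) < 0)
                              return 1;
                      if (getxattr(argv[1], "trusted.fid", &ff, sizeof(ff)) <
                          (ssize_t)sizeof(ff))
                              return 1;  /* error or old/short xattr */

                      stripes = ff.ff_layout.ol_stripe_count ?
                                ff.ff_layout.ol_stripe_count : 1;
                      printf("estimated file size: %llu bytes\n",
                             (unsigned long long)st.st_size * stripes);
                      return 0;
              }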

            That said, the object size may be the better metric (even if it is more confusing for users) because it is the "local" object data that makes a difference for the OSS cache usage.

            As for the default value, it makes sense to scale it by RAM size. A large limit like RAM/4 would allow caching even large files on the OSS, but a more practical limit might be RAM/64 or RAM/128 (e.g. 1-2GB on a 128GB OSS) so that more files can be cached, keeping in mind that there are typically multiple OSTs per OSS and that a large file is likely to have multiple stripes on the same OSS.
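            To put illustrative numbers on that: with RAM/64 on a 128GB OSS the per-object limit would be 2GB, and with RAM/128 it would be 1GB; if such an OSS hosts several OSTs and a widely striped file places two of its objects there, that single file could still occupy 2-4GB of the cache even under the tighter limits.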


            People

              Assignee:
              wc-triage WC Triage
              Reporter:
              adilger Andreas Dilger
              Votes:
              0
              Watchers:
              3
