  1. Lustre
  2. LU-8591

allow specifying ZFS blocksize via ladvise



    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None


      As a continuation of LU-4931 "New feature of giving server/storage side advice of accessing file" and LU-4865 "osd-zfs: increase object block size dynamically as object grows" it is useful to be able to specify the ZFS blocksize for a file directly from the client, so that the OSS doesn't have to guess at this itself.

      This can be done via the "ladvise" functionality by adding a new LU_LADVISE_BLOCKSIZE advice. There are reserved fields in the lu_ladvise struct for passing arbitrary information as part of the request.

      struct lu_ladvise {
              __u16 lla_advice;       /* advice type */
              __u16 lla_value1;       /* values for different advice types */
              __u32 lla_value2;
              __u64 lla_start;        /* first byte of extent for advice */
              __u64 lla_end;          /* last byte of extent for advice */
              __u32 lla_value3;
              __u32 lla_value4;
      };

      The blocksize can be up to 16MB (or possibly more in the future), so if it is specified in bytes it should go into lla_value2. It might make sense to restrict it to power-of-two values, in which case the 16-bit lla_value1 would be enough (for example, by passing the base-2 logarithm of the size). It makes sense to "#define lla_blocksize lla_valueX" to map this to the correct field, so that there are no usage errors in userspace or in the kernel.

      It doesn't make sense to specify lla_start or lla_end for a single object, but the extent may be useful for PFL files, so osd-zfs should handle it transparently.

      The request should return an error from osd-ldiskfs. It should also return an error from osd-zfs if the requested blocksize doesn't match the current blocksize and is smaller than the current file size. Increasing the osd-zfs blocksize should be permitted, but the value should be "sticky" on the object (at least until the object is evicted from RAM) so that the dynamic blocksize heuristics from LU-7226 don't override the user-specified value.
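
      The field mapping and the error rules described above could look roughly like the following userspace sketch. It is only an illustration, not existing Lustre code: the "_sketch" names, the advice number, the object struct, and the specific errno values are all assumptions (the description only says "an error"), and the byte-count variant in lla_value2 is picked arbitrarily over the power-of-two variant.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the struct quoted in this ticket, with stdint types
 * standing in for the kernel's __u16/__u32/__u64. */
struct lu_ladvise_sketch {
	uint16_t lla_advice;
	uint16_t lla_value1;
	uint32_t lla_value2;
	uint64_t lla_start;
	uint64_t lla_end;
	uint32_t lla_value3;
	uint32_t lla_value4;
};

/* ASSUMPTION: placeholder advice number, not an allocated Lustre value. */
#define LU_LADVISE_BLOCKSIZE_SKETCH 0x7fff

/* Single mapping so callers never have to remember which lla_valueX is used.
 * ASSUMPTION: blocksize is passed in bytes, hence the 32-bit field. */
#define lla_blocksize lla_value2

#define SKETCH_MAX_BLOCKSIZE (16U << 20)	/* 16MB limit from the ticket */

static bool is_power_of_two(uint32_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Client side: fill in the advice, rejecting sizes that are not a power of
 * two or exceed 16MB. The extent fields are left untouched. */
static int ladvise_set_blocksize(struct lu_ladvise_sketch *la, uint32_t bytes)
{
	assert(la != NULL);
	if (!is_power_of_two(bytes) || bytes > SKETCH_MAX_BLOCKSIZE)
		return -EINVAL;
	la->lla_advice = LU_LADVISE_BLOCKSIZE_SKETCH;
	la->lla_blocksize = bytes;
	return 0;
}

/* Hypothetical stand-in for the server-side object state. */
enum backend_sketch { BACKEND_LDISKFS, BACKEND_ZFS };

struct obj_sketch {
	enum backend_sketch backend;
	uint32_t cur_blocksize;		/* current blocksize in bytes */
	uint64_t size;			/* current file size in bytes */
	bool blocksize_sticky;		/* set once the user has chosen a value */
};

/* Server side: apply the rules from the description.
 * - osd-ldiskfs: always an error (errno choice is an assumption).
 * - osd-zfs: error if the request differs from the current blocksize and
 *   is smaller than the current file size.
 * - otherwise accept and mark the value sticky so that dynamic heuristics
 *   would leave it alone. */
static int osd_apply_blocksize(struct obj_sketch *o, uint32_t req)
{
	assert(o != NULL);
	if (o->backend != BACKEND_ZFS)
		return -EOPNOTSUPP;
	if (req != o->cur_blocksize && req < o->size)
		return -EINVAL;
	o->cur_blocksize = req;
	o->blocksize_sticky = true;
	return 0;
}
```

      In this sketch a 1MB request on a 4KB ZFS object is accepted and marked sticky, while the same request on a 10MB object with a 128KB blocksize is rejected, matching the rule that a mismatching blocksize smaller than the file size is an error.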





              wc-triage WC Triage
              adilger Andreas Dilger