  1. Lustre
  2. LU-15520

OST object projid and quota reset


Details

    • Improvement
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.15.0
    • 9223372036854775807

    Description

      There needs to be a mechanism to reset the project IDs on the OST objects if they become inconsistent with the project ID on the MDT inodes. In some rare cases they can be different (in particular projid=0 on the OST objects, even though they have data). This can lead to inconsistencies between "du -sk <dir>" and "lfs quota -p <projid> <dir>" that are not corrected by running "e2fsck" or "tune2fs" or "lfs project -cr <dir>".

      The "lfs project -cr <dir>" command will only verify the projid on the MDT inode matches the parent directory, but it doesn't verify the projid on the OST objects also matches. In most cases, that is the correct behavior, since the MDT and OST projids should be in sync and checking every OST object can increase the RPC count by an order of magnitude.

      Running "lfs project -r -p 0 <dir>; lfs project -r -p <projid> <dir>" will reset the projid of every file, but this has the drawback that the project quota of the affected directory tree will be totally inaccurate for the whole time that this process is running (maybe hours), and affects objects that may have the correct projid already.

      One approach is to add a flag such as "lfs project -r -p <projid> --force-osts <dir>" that also sends an RPC to all of the OSTs to update their project ID, even if the MDT projid already matches. While this would still send an RPC for every object in the directory tree, it would not have to modify OST objects whose projid already matches. That would avoid a potentially large number of OST disk IOPS (double if the "-p 0; -p <projid>" method is used).
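
      The I/O saving described above can be illustrated with a toy model. This is only a sketch and not Lustre code: OstObject, force_osts and reset_twice are hypothetical names, and it assumes that each pass of the "-p 0; -p <projid>" method rewrites every object, as the "double" remark suggests.

```python
from dataclasses import dataclass


@dataclass
class OstObject:
    """Toy stand-in for an OST object (hypothetical, not a Lustre structure)."""
    projid: int


def force_osts(objects, projid):
    """Model the proposed --force-osts pass.

    One RPC is counted per object, but a disk update is counted only
    when the stored projid differs from the requested one.
    """
    rpcs = writes = 0
    for obj in objects:
        rpcs += 1
        if obj.projid != projid:
            obj.projid = projid
            writes += 1
    return rpcs, writes


def reset_twice(objects, projid):
    """Model the "-p 0; -p <projid>" workaround.

    Assumption: both passes rewrite every object unconditionally.
    """
    rpcs = writes = 0
    for target in (0, projid):
        for obj in objects:
            rpcs += 1
            obj.projid = target
            writes += 1
    return rpcs, writes
```

      With one stale object out of three, force_osts counts three RPCs and a single update, while reset_twice counts six of each.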

      It would also seem relatively easy to have OST write RPCs that carry a non-zero obdo.o_projid update OST objects that have projid==0. This would be the same one-shot mechanism for setting the projid on OST objects as is used for UID/GID on first write. It depends on the clients consistently setting o_projid in the bulk write and setattr RPCs, and on the OST checking it, but it can be done without interoperability concerns, especially since the o_projid would always relate to the inode and not to the process that is accessing the file.
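
      The one-shot rule could look roughly like the following sketch. It is a simplified model, not the OST write path: OstObject and apply_write_projid are hypothetical names, and only the o_projid field name comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class OstObject:
    """Toy stand-in for an OST object (hypothetical, not a Lustre structure)."""
    projid: int = 0


def apply_write_projid(obj, o_projid):
    """Apply the one-shot rule on a write or setattr RPC.

    The object's projid is set only if it is still 0 and the RPC
    carries a non-zero o_projid. An object that already has a non-zero
    projid is left alone. Returns True if the object was updated.
    """
    if o_projid != 0 and obj.projid == 0:
        obj.projid = o_projid
        return True
    return False
```

      Under this rule a later write carrying a different o_projid does not overwrite a projid that is already set, which matches the "first write" behaviour described for UID/GID.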

      It would also be useful if LFSCK verified that the MDT inode UID/GID/PROJID match those on the OST objects of a file, and updated them if different.
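
      As a rough illustration of that check, the sketch below compares the IDs on a model MDT inode with those on each of the file's OST objects and optionally repairs mismatches. Ids and verify_file are hypothetical names, and nothing here reflects how LFSCK is actually implemented.

```python
from dataclasses import dataclass


@dataclass
class Ids:
    """UID/GID/PROJID triple for an inode or object (hypothetical model)."""
    uid: int
    gid: int
    projid: int


def verify_file(mdt_inode, ost_objects, repair=True):
    """Return the indexes of OST objects whose IDs differ from the MDT inode.

    When repair is True, mismatched objects are updated in place to
    match the MDT inode, which is treated as authoritative.
    """
    mismatched = []
    for index, obj in enumerate(ost_objects):
        if obj != mdt_inode:  # dataclass equality compares all three fields
            mismatched.append(index)
            if repair:
                obj.uid = mdt_inode.uid
                obj.gid = mdt_inode.gid
                obj.projid = mdt_inode.projid
    return mismatched
```

      Calling verify_file with repair=False gives a report-only pass, so the repair step can be kept separate from detection.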


            People

              wc-triage WC Triage
              adilger Andreas Dilger
