  1. Lustre
  2. LU-20339

gss: client mount corrupts ASCII SSK keyfile (missing O_TRUNC on rewrite)


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • None
    • Lustre 2.17.0
    • None
    • 3
    • 9223372036854775807

      While testing the ASCII SSK key format recently added in LU-19073, I hit an error after converting a type=server SSK keyfile to the ASCII format and then mounting a client with it (the keyfile has no DH prime baked in yet):

      lgss_sk -a -m /root/tenantX.key        # binary -> ASCII (~3227 bytes)
      chmod 400 /root/tenantX.key
      mount -t lustre -o skpath=/root/tenantX.key <mgs-nids>:/<fs> /mnt/x
      

      The mount fails and the keyfile is left corrupted on disk:

      Generating DH parameters to turn /root/tenantX.key into a client key, this can take a while...
      File /root/tenantX.key does not have a complete key: got 3227 bytes, expected 2407 bytes
      mount.lustre: Error loading shared keys: Required key not available
      

      The same key in binary form mounts fine; only the ASCII form fails.

      The problem

      When a client mounts with a type=server key, sk_load_keyfile() (lustre/utils/gss/sk_utils.c) generates the DH prime on the fly and rewrites the keyfile in binary form via write_config_file(). On the overwrite path, write_config_file() opens the file with O_WRONLY | O_CREAT but without O_TRUNC. The binary image (sizeof(struct sk_keyfile_config), 2407 bytes) is smaller than the ASCII representation (3227 bytes here), so the rewrite replaces only the first 2407 bytes and leaves the remaining 820 bytes of ASCII as a stale tail; the file is still 3227 bytes long. The immediate re-read no longer finds the "Lustre SSK v1.0" magic at the start of the file, falls into the binary branch, and fails the size check (got 3227 bytes, expected 2407), which surfaces -ENOKEY as "Required key not available".
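      The re-read decision described above can be sketched as follows. This is an illustration only, not the code in sk_utils.c: ssk_classify() and the enum are hypothetical names, and treating any size other than 2407 bytes as an error is an assumption based on the log message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#ifndef ENOKEY
#define ENOKEY 126	/* Linux value; only needed where errno.h lacks it */
#endif

#define SSK_ASCII_MAGIC	"Lustre SSK v1.0"	/* magic quoted in this report */
#define SSK_BINARY_LEN	2407	/* sizeof(struct sk_keyfile_config) per this report */

enum ssk_format { SSK_FMT_ASCII, SSK_FMT_BINARY };

/*
 * Hypothetical classifier: returns the detected format, or -ENOKEY when the
 * buffer is neither an ASCII keyfile nor a binary image of the expected size.
 */
static int ssk_classify(const char *buf, size_t len)
{
	size_t magic_len = strlen(SSK_ASCII_MAGIC);

	assert(buf != NULL);

	/* ASCII keyfiles are recognised by the magic at offset 0. */
	if (len >= magic_len && memcmp(buf, SSK_ASCII_MAGIC, magic_len) == 0)
		return SSK_FMT_ASCII;

	/* Everything else is treated as binary and must match the struct size. */
	if (len != SSK_BINARY_LEN)
		return -ENOKEY;

	return SSK_FMT_BINARY;
}
```

      With a 3227-byte buffer whose first 2407 bytes have been overwritten by binary data, the magic check fails and the size check rejects the file, which matches the failure seen at mount time.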

      Only a shrinking rewrite (ASCII -> binary) triggers it; a binary-to-binary rewrite does not change the file size, which is why the problem went unnoticed until ASCII keyfiles were introduced (LU-19073).
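      The effect of a shrinking rewrite without O_TRUNC can be reproduced outside Lustre. The following is a minimal sketch under stated assumptions: write_filled(), file_size() and rewrite_demo() are hypothetical helpers, the file contents are filler bytes rather than real key material, and the two sizes are taken from the log above.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define ASCII_LEN	3227	/* ASCII keyfile size from the log above */
#define BINARY_LEN	2407	/* binary image size from the log above */

/* Open path with the given flags and write len copies of fill. 0 or -1. */
static int write_filled(const char *path, int flags, char fill, size_t len)
{
	char buf[4096];
	ssize_t rc;
	int fd;

	assert(len <= sizeof(buf));
	memset(buf, fill, len);

	fd = open(path, flags, 0600);
	if (fd < 0)
		return -1;
	rc = write(fd, buf, len);
	close(fd);

	return rc == (ssize_t)len ? 0 : -1;
}

/* Size of the file on disk, or -1 on error. */
static long file_size(const char *path)
{
	struct stat st;

	return stat(path, &st) == 0 ? (long)st.st_size : -1;
}

/*
 * Create a 3227-byte "ASCII" file, rewrite it with a 2407-byte "binary"
 * image using rewrite_flags, and return the resulting file size.
 */
static long rewrite_demo(const char *path, int rewrite_flags)
{
	long size;

	unlink(path);	/* start from a clean slate; failure is fine */
	if (write_filled(path, O_WRONLY | O_CREAT | O_EXCL, 'A', ASCII_LEN) < 0)
		return -1;
	if (write_filled(path, rewrite_flags, 'B', BINARY_LEN) < 0)
		return -1;

	size = file_size(path);
	unlink(path);
	return size;
}
```

      rewrite_demo(path, O_WRONLY | O_CREAT) returns 3227 because the stale tail is kept, while rewrite_demo(path, O_WRONLY | O_CREAT | O_TRUNC) returns 2407.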

      Proposed Fix
      Add O_TRUNC on the overwrite path in write_config_file() so a rewrite always replaces the full file contents; the create-only path keeps O_EXCL.
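      A rough sketch of the flag selection this implies is shown below. It is not the actual patch to write_config_file(): keyfile_open_flags() and keyfile_write() are hypothetical names, and the 0600 mode and the -errno return convention are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: pick open(2) flags for a keyfile write. */
static int keyfile_open_flags(bool overwrite)
{
	int flags = O_WRONLY | O_CREAT;

	/* Overwrite replaces the whole file; create-only must not clobber. */
	flags |= overwrite ? O_TRUNC : O_EXCL;
	return flags;
}

/* Hypothetical writer: returns 0 on success or -errno on failure. */
static int keyfile_write(const char *path, const void *buf, size_t len,
			 bool overwrite)
{
	const char *p = buf;
	int fd;

	assert(path != NULL && buf != NULL);

	fd = open(path, keyfile_open_flags(overwrite), 0600);
	if (fd < 0)
		return -errno;

	/* Loop to cope with short writes and EINTR. */
	while (len > 0) {
		ssize_t rc = write(fd, p, len);

		if (rc < 0) {
			int err = errno;

			if (err == EINTR)
				continue;
			close(fd);
			return -err;
		}
		p += rc;
		len -= (size_t)rc;
	}

	if (close(fd) < 0)
		return -errno;
	return 0;
}
```

      With these flags, overwriting a 3227-byte file with a 2407-byte image leaves a 2407-byte file, and a create-only write to an existing path still fails with -EEXIST.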

            mrasobarnett Matt Rásó-Barnett