Lustre / LU-11459

llsom_sync updates LSOM data for some files when called with non-existent user

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.12.0
    • 3

    Description

      llsom_sync does not behave correctly when it is given a non-existent changelog user.

      For example, I registered two changelog users on the MDS, set llite.*.xattr_cache = 0 on the client (vm4), then created a new file and overwrote an existing file:

      [vm3 ~]# dd if=/dev/urandom of=/lustre/scratch/ddfile3 bs=37k count=200
      200+0 records in
      200+0 records out
      7577600 bytes (7.6 MB) copied, 0.0464932 s, 163 MB/s
      
      [vm4 ~]# echo "aa" > /lustre/scratch/ddfile2
      

      Looking at the LSOM data, the size is correct for both files, but the block count has not been updated:

      [vm4 ~]# lfs getsom /lustre/scratch/ddfile2
      file: /lustre/scratch/ddfile2 size: 3 blocks: 0 flags: 4
      [vm4 ~]# lfs getsom /lustre/scratch/ddfile3
      file: /lustre/scratch/ddfile3 size: 7577600 blocks: 0 flags: 4
      

      Now sync the LSOM data, but pass a changelog user ID that does not exist (cli4):

      [vm4 ~]# llsom_sync --mdt scratch-MDT0000 --user cli4 -v /lustre/scratch/
      Start receiving records
      Processed changelog record index:5 type:XATTR(0xf) FID:[0x200000402:0x1:0x0]
      Processed changelog record index:6 type:CREAT(0x1) FID:[0x200000402:0x2:0x0]
      Processed changelog record index:7 type:XATTR(0xf) FID:[0x200000402:0x2:0x0]
      Processed changelog record index:8 type:CLOSE(0xb) FID:[0x200000402:0x2:0x0]
      Processed changelog record index:9 type:XATTR(0xf) FID:[0x200000402:0x1:0x0]
      Processed changelog record index:10 type:TRUNC(0xd) FID:[0x200000402:0x1:0x0]
      Processed changelog record index:11 type:XATTR(0xf) FID:[0x200000402:0x1:0x0]
      Processed changelog record index:12 type:CLOSE(0xb) FID:[0x200000402:0x1:0x0]
      finished reading [scratch-MDT0000]
      Start to sync 2 records.
      record 1651949901960989620:8, updated LSOM for fid [0x200000402:0x2:0x0] size:7577600 blocks:14800
      llsom_sync: cannot purge records for 'cli4': Invalid argument (22)
      llsom_sync: failed to clear changelog record: cli4:8: Invalid argument (22)
      [vm4 ~]# lfs getsom /lustre/scratch/ddfile3
      file: /lustre/scratch/ddfile3 size: 7577600 blocks: 14800 flags: 4
      [vm4 ~]# lfs getsom /lustre/scratch/ddfile2
      file: /lustre/scratch/ddfile2 size: 3 blocks: 0 flags: 4
      

      The changelog purge error is expected, since there is no user cli4. The problem is that one file's LSOM data (blocks) is updated while the other file's is not.
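
      To spot this kind of partial update without reading the output by eye, the "lfs getsom" lines can be checked mechanically. The following is a minimal Python sketch, assuming the one-line output format shown in the transcripts above. The helper names parse_getsom and stale_blocks are hypothetical, and "non-zero size with zero blocks" is only a heuristic, since a sparse file can legitimately report 0 blocks.

```python
import re

# Matches one line of `lfs getsom` output as it appears in this report.
GETSOM_RE = re.compile(
    r"^file: (?P<path>.+) size: (?P<size>\d+) "
    r"blocks: (?P<blocks>\d+) flags: (?P<flags>\d+)$"
)


def parse_getsom(line):
    """Parse one getsom line into a dict (hypothetical helper)."""
    m = GETSOM_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized getsom line: {line!r}")
    return {
        "path": m["path"],
        "size": int(m["size"]),
        "blocks": int(m["blocks"]),
        "flags": int(m["flags"]),
    }


def stale_blocks(lines):
    """Return paths with a non-zero size whose block count is still 0."""
    records = [parse_getsom(line) for line in lines if line.strip()]
    return [r["path"] for r in records if r["size"] > 0 and r["blocks"] == 0]


# The two getsom lines captured after the run with the bad user.
after_bad_user = [
    "file: /lustre/scratch/ddfile3 size: 7577600 blocks: 14800 flags: 4",
    "file: /lustre/scratch/ddfile2 size: 3 blocks: 0 flags: 4",
]
print(stale_blocks(after_bad_user))
```

      With the two lines captured after the bad-user run, only ddfile2 is reported as stale.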

      From the output of llsom_sync, it looks like the LSOM update is interrupted once the tool discovers that the user is not valid. There seem to be two issues here:
      1. When a bad user ID is given, we should update the LSOM data of either all files or none of them.
      2. We do not check the validity of the user at an appropriate time.

      Looking at the llsom_sync code, the changelog user is not checked until llapi_changelog_clear() is called to purge changelog records, and that call is what returns the error.
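
      The ordering problem can be illustrated with a small Python model. This is only a sketch of the control flow inferred from the verbose output, not the llsom_sync source: clear_changelog is a stand-in for llapi_changelog_clear(), the user name "cl1" is assumed (the report only names cl2), and sync_validate_first is a hypothetical fix whose validation step would need some real way to look up registered users.

```python
class ChangelogError(Exception):
    """Raised when a changelog operation is rejected."""


def clear_changelog(known_users, user, index):
    # Stand-in for llapi_changelog_clear(): rejects an unregistered user.
    if user not in known_users:
        raise ChangelogError(
            f"cannot purge records for '{user}': Invalid argument (22)"
        )


def sync_update_then_clear(records, known_users, user, lsom):
    # Order suggested by the verbose output: update one record, then purge it.
    for index, fid, size, blocks in records:
        lsom[fid] = (size, blocks)
        clear_changelog(known_users, user, index)  # first place a bad user is noticed


def sync_validate_first(records, known_users, user, lsom):
    # Hypothetical fix: refuse to start if the user is not registered.
    if user not in known_users:
        raise ChangelogError(f"unknown changelog user '{user}'")
    sync_update_then_clear(records, known_users, user, lsom)


def run(sync, user):
    known_users = {"cl1", "cl2"}  # "cl1" is assumed; the report only names cl2
    records = [
        (8, "[0x200000402:0x2:0x0]", 7577600, 14800),
        (12, "[0x200000402:0x1:0x0]", 3, 8),
    ]
    # (size, blocks) before the sync, as reported by lfs getsom.
    lsom = {
        "[0x200000402:0x2:0x0]": (7577600, 0),
        "[0x200000402:0x1:0x0]": (3, 0),
    }
    try:
        sync(records, known_users, user, lsom)
        error = None
    except ChangelogError as exc:
        error = str(exc)
    return lsom, error


for sync in (sync_update_then_clear, sync_validate_first):
    for user in ("cli4", "cl2"):
        lsom, error = run(sync, user)
        print(sync.__name__, user, lsom, error)
```

      In this model, the update-then-clear order with cli4 leaves the first FID updated and the second untouched, which matches the behavior reported here. Validating first gives the all-or-none behavior asked for in point 1.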

      With a valid changelog user, the LSOM data of all files is updated:

      [vm4 ~]# llsom_sync --mdt scratch-MDT0000 --user cl2 -v /lustre/scratch/
      Start receiving records
      Processed changelog record index:5 type:XATTR(0xf) FID:[0x200000402:0x1:0x0]
      Processed changelog record index:6 type:CREAT(0x1) FID:[0x200000402:0x2:0x0]
      Processed changelog record index:7 type:XATTR(0xf) FID:[0x200000402:0x2:0x0]
      Processed changelog record index:8 type:CLOSE(0xb) FID:[0x200000402:0x2:0x0]
      Processed changelog record index:9 type:XATTR(0xf) FID:[0x200000402:0x1:0x0]
      Processed changelog record index:10 type:TRUNC(0xd) FID:[0x200000402:0x1:0x0]
      Processed changelog record index:11 type:XATTR(0xf) FID:[0x200000402:0x1:0x0]
      Processed changelog record index:12 type:CLOSE(0xb) FID:[0x200000402:0x1:0x0]
      Processed changelog record index:13 type:XATTR(0xf) FID:[0x200000402:0x2:0x0]
      finished reading [scratch-MDT0000]
      Start to sync 2 records.
      record 1651949901960989620:8, updated LSOM for fid [0x200000402:0x2:0x0] size:7577600 blocks:14800
      record 1651949963251084516:12, updated LSOM for fid [0x200000402:0x1:0x0] size:3 blocks:8
      [vm4 ~]# lfs getsom /lustre/scratch/ddfile3
      file: /lustre/scratch/ddfile3 size: 7577600 blocks: 14800 flags: 4
      [vm4 ~]# lfs getsom /lustre/scratch/ddfile2
      file: /lustre/scratch/ddfile2 size: 3 blocks: 8 flags: 4
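
      As a sanity check on the block counts above, the following Python sketch reproduces them from the file sizes. It assumes that the blocks field counts 512-byte units and that space is allocated in 4 KiB units. Both assumptions are consistent with the numbers in this report but are not confirmed by it, and real counts depend on the OST backend and file layout.

```python
SECTOR = 512       # assumed unit of the LSOM "blocks" field
ALLOC_UNIT = 4096  # assumed allocation granularity on the OST


def expected_blocks(size):
    """Round size up to whole allocation units, then count 512-byte sectors."""
    units = -(-size // ALLOC_UNIT)  # integer ceiling division
    return units * (ALLOC_UNIT // SECTOR)


ddfile3_size = 37 * 1024 * 200  # dd bs=37k count=200
print(ddfile3_size, expected_blocks(ddfile3_size))
print(3, expected_blocks(3))  # "aa" plus a newline
```

      Under these assumptions the sketch gives 14800 blocks for the 7577600-byte file and 8 blocks for the 3-byte file, which matches the values reported after the sync with cl2.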
      

            People

              qian_wc Qian Yingjin
              jamesanunez James Nunez (Inactive)