Description
It looks like llsom_sync does not work correctly when it is given a non-existent changelog user.
For example, I registered two changelog users on the MDS, set llite.*.xattr_cache = 0 on the client (vm4), then created a new file and overwrote an existing file:
[vm3 ~]# dd if=/dev/urandom of=/lustre/scratch/ddfile3 bs=37k count=200
200+0 records in
200+0 records out
7577600 bytes (7.6 MB) copied, 0.0464932 s, 163 MB/s
[vm4 ~]# echo "aa" > /lustre/scratch/ddfile2
Looking at the LSOM data, we can see the size is correct, but the blocks are not updated.
[vm4 ~]# lfs getsom /lustre/scratch/ddfile2
file: /lustre/scratch/ddfile2 size: 3 blocks: 0 flags: 4
[vm4 ~]# lfs getsom /lustre/scratch/ddfile3
file: /lustre/scratch/ddfile3 size: 7577600 blocks: 0 flags: 4
Now, sync the LSOM data, but give it a bad changelog user ID.
[vm4 ~]# llsom_sync --mdt scratch-MDT0000 --user cli4 -v /lustre/scratch/
Start receiving records
Processed changelog record index:5 type:XATTR(0xf) FID:[0x200000402:0x1:0x0]
Processed changelog record index:6 type:CREAT(0x1) FID:[0x200000402:0x2:0x0]
Processed changelog record index:7 type:XATTR(0xf) FID:[0x200000402:0x2:0x0]
Processed changelog record index:8 type:CLOSE(0xb) FID:[0x200000402:0x2:0x0]
Processed changelog record index:9 type:XATTR(0xf) FID:[0x200000402:0x1:0x0]
Processed changelog record index:10 type:TRUNC(0xd) FID:[0x200000402:0x1:0x0]
Processed changelog record index:11 type:XATTR(0xf) FID:[0x200000402:0x1:0x0]
Processed changelog record index:12 type:CLOSE(0xb) FID:[0x200000402:0x1:0x0]
finished reading [scratch-MDT0000]
Start to sync 2 records.
record 1651949901960989620:8, updated LSOM for fid [0x200000402:0x2:0x0] size:7577600 blocks:14800
llsom_sync: cannot purge records for 'cli4': Invalid argument (22)
llsom_sync: failed to clear changelog record: cli4:8: Invalid argument (22)
[vm4 ~]# lfs getsom /lustre/scratch/ddfile3
file: /lustre/scratch/ddfile3 size: 7577600 blocks: 14800 flags: 4
[vm4 ~]# lfs getsom /lustre/scratch/ddfile2
file: /lustre/scratch/ddfile2 size: 3 blocks: 0 flags: 4
The changelog record purge error is to be expected since there is no user cli4. The problem is that one file's LSOM data (blocks) is updated while the other file's is not.
From the llsom_sync output, it looks like updating the LSOM data is interrupted as soon as llsom_sync finds out that the user is not valid. It seems like there are two issues here:
1. We should update the LSOM data of all files or none of the files when a bad user ID is given.
2. We are not checking the validity of the user at an appropriate time.
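The two points above amount to a validate-first ordering. The following is a minimal Python sketch of that control flow, not the actual llsom_sync code: `sync_current`, `sync_validated`, `purge`, and the set of registered users are hypothetical stand-ins, and the update-then-purge loop in `sync_current` is only inferred from the output above. The user name `cl1` is an assumption; only `cl2` appears in this report.

```python
class InvalidUser(Exception):
    """Stand-in for the EINVAL returned when the changelog user is unknown."""


def purge(user, index, registered_users):
    # Stand-in for the purge step: it rejects users that are not registered.
    if user not in registered_users:
        raise InvalidUser(f"cannot purge records for '{user}'")


def sync_current(records, user, registered_users, lsom):
    """Ordering inferred from the output: update a record, then purge it."""
    for index, fid, size, blocks in records:
        lsom[fid] = (size, blocks)              # LSOM is already modified...
        purge(user, index, registered_users)    # ...before the user is checked


def sync_validated(records, user, registered_users, lsom):
    """Suggested ordering: reject a bad user before touching any LSOM data."""
    if user not in registered_users:
        raise InvalidUser(f"unknown changelog user '{user}'")
    for index, fid, size, blocks in records:
        lsom[fid] = (size, blocks)
        purge(user, index, registered_users)


def run(sync, user):
    # The two pending records, with values taken from the llsom_sync
    # output in this report.
    records = [
        (8, "[0x200000402:0x2:0x0]", 7577600, 14800),
        (12, "[0x200000402:0x1:0x0]", 3, 8),
    ]
    lsom = {}
    try:
        sync(records, user, {"cl1", "cl2"}, lsom)  # "cl1" is assumed
    except InvalidUser:
        pass
    return lsom
```

In this model, `run(sync_current, "cli4")` leaves one of the two files updated, which matches the behaviour reported here, while `run(sync_validated, "cli4")` leaves both untouched.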
Looking at the llsom_sync code, we don't check the changelog user until we call llapi_changelog_clear() to purge changelog records, and that routine is what produces the error.
With a valid changelog user, all files' LSOM data is updated.
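One way to check the user earlier would be to compare it against the MDT's list of registered changelog users before any record is synced. The following is a minimal Python sketch of such a check, assuming the listing (as printed on the MDS by `lctl get_param mdd.*.changelog_users`) has one line per user starting with the user ID. `is_registered` is a hypothetical helper, and the sample text is illustrative rather than captured from this system; the exact layout differs between Lustre versions.

```python
def is_registered(listing: str, user: str) -> bool:
    """Return True if `user` is the first field of any line in the listing."""
    for line in listing.splitlines():
        fields = line.split()
        # Blank lines have no fields. Compare the whole first column so that
        # 'cl' or 'cli4' do not match 'cl2'.
        if fields and fields[0] == user:
            return True
    return False


# Illustrative listing only; index and idle values are made up.
SAMPLE = """\
mdd.scratch-MDT0000.changelog_users=
current index: 13
ID    index (idle seconds)
cl1   0 (120)
cl2   4 (35)
"""
```

With this sample, `is_registered(SAMPLE, "cl2")` is true and `is_registered(SAMPLE, "cli4")` is false, so the bad user from the run above would be rejected before any LSOM data is changed.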
[vm4 ~]# llsom_sync --mdt scratch-MDT0000 --user cl2 -v /lustre/scratch/
Start receiving records
Processed changelog record index:5 type:XATTR(0xf) FID:[0x200000402:0x1:0x0]
Processed changelog record index:6 type:CREAT(0x1) FID:[0x200000402:0x2:0x0]
Processed changelog record index:7 type:XATTR(0xf) FID:[0x200000402:0x2:0x0]
Processed changelog record index:8 type:CLOSE(0xb) FID:[0x200000402:0x2:0x0]
Processed changelog record index:9 type:XATTR(0xf) FID:[0x200000402:0x1:0x0]
Processed changelog record index:10 type:TRUNC(0xd) FID:[0x200000402:0x1:0x0]
Processed changelog record index:11 type:XATTR(0xf) FID:[0x200000402:0x1:0x0]
Processed changelog record index:12 type:CLOSE(0xb) FID:[0x200000402:0x1:0x0]
Processed changelog record index:13 type:XATTR(0xf) FID:[0x200000402:0x2:0x0]
finished reading [scratch-MDT0000]
Start to sync 2 records.
record 1651949901960989620:8, updated LSOM for fid [0x200000402:0x2:0x0] size:7577600 blocks:14800
record 1651949963251084516:12, updated LSOM for fid [0x200000402:0x1:0x0] size:3 blocks:8
[vm4 ~]# lfs getsom /lustre/scratch/ddfile3
file: /lustre/scratch/ddfile3 size: 7577600 blocks: 14800 flags: 4
[vm4 ~]# lfs getsom /lustre/scratch/ddfile2
file: /lustre/scratch/ddfile2 size: 3 blocks: 8 flags: 4