[LU-12026] verify that MDS stores atime/mtime/ctime during LSOM update Created: 27/Feb/19 Updated: 23/Apr/20 Resolved: 25/Nov/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.0, Lustre 2.13.0 |
| Fix Version/s: | Lustre 2.13.0, Lustre 2.12.4 |
| Type: | Task | Priority: | Minor |
| Reporter: | Andreas Dilger | Assignee: | Qian Yingjin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Description |
|
In order to make direct inode scanning on the MDT useful, in addition to storing the file size/blocks via LSOM on the MDT, we also need to store the atime/mtime/ctime on the MDT inodes when the LSOM attributes are updated. Currently the atime is already lazily updated on the MDS (at close time), but I'm not sure if the final mtime/ctime are sent to the MDS at close time, nor whether they are updated on the MDT inode by the MDS. If this is not being done, then any MDT-only scanning will be broken. |
| Comments |
| Comment by Andreas Dilger [ 27/Feb/19 ] |
|
It would also be useful to add a sanity test case for this, to verify that it is working properly (e.g. by checking the "lfs find" RPC count to verify that it didn't do a glimpse on the file when scanning for -mtime or -size). |
| Comment by Andreas Dilger [ 09/Sep/19 ] |
|
Yingjin, can you please at least look at the MDT code and/or run a quick manual test to verify that the mtime and ctime are updated on the MDT inode when the file is closed after a write? The test would need to create and write a file, sleep for 5s, then write it again, and then check the MDT with debugfs to see whether the inode timestamps are updated. If this is not working for 2.13 then we need to make a patch and backport it to 2.12/EXA5, otherwise MDT scanning will not work properly. If this is working properly then making a test is less critical. |
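When checking the MDT with debugfs, the timestamp fields are printed as a hex seconds value followed by a colon and a second hex word. The sketch below converts such a field to epoch seconds so it can be compared with the client-side "stat" output. The function names are hypothetical, the sample values are taken from the debugfs output later in this ticket, and the word after the colon is assumed to hold sub-second/extra bits, which this sketch ignores (so it is only meant for dates before 2038).

```shell
#!/bin/bash
# Convert a debugfs timestamp field such as "0x5d8986a8:9963a170" into
# epoch seconds. Only the part before the colon is used (assumption: the
# second word carries sub-second/extra bits that do not matter here).
debugfs_ts_to_epoch() {
    local hex_secs=${1%%:*}      # strip ":xxxxxxxx" and keep "0x5d8986a8"
    printf '%d\n' "$hex_secs"    # printf accepts 0x-prefixed hex integers
}

# Read "debugfs stat" output on stdin and print the epoch seconds of one
# field: ctime, atime, mtime or crtime.
debugfs_field_epoch() {
    local field=$1 raw
    # Lines look like: " mtime: 0x5d899333:00000000 -- Tue Sep 24 11:53:23 2019"
    raw=$(awk -v f="${field}:" '$1 == f { print $2; exit }')
    [ -n "$raw" ] && debugfs_ts_to_epoch "$raw"
}
```

For example, `debugfs -c -R 'stat ROOT/test' /dev/mapper/mds1_flakey | debugfs_field_epoch mtime` would print the MDT inode mtime in seconds, which can be compared directly with `stat -c %Y /mnt/lustre/test` on a client.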
| Comment by Qian Yingjin [ 18/Sep/19 ] |
|
Sure, I will work on it soon. Sorry for the late reply. |
| Comment by Qian Yingjin [ 24/Sep/19 ] |
|
Hi Andreas, I did a simple manual test on my local system that verifies the inode timestamps are updated:
/dev/mapper/mds1_flakey 125368 1956 112176 2% /mnt/lustre-mds1
/dev/mapper/ost1_flakey 325368 13512 284696 5% /mnt/lustre-ost1
/dev/mapper/ost2_flakey 325368 13508 284700 5% /mnt/lustre-ost2
192.168.150.128@tcp:/lustre 650736 27020 569396 5% /mnt/lustre
[root@qian tests]# dd if=/dev/zero of=/mnt/lustre/test bs=1k count=1
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 0.000959122 s, 1.1 MB/s
[root@qian tests]# sleep 5
[root@qian tests]# stat /mnt/lustre/test
File: '/mnt/lustre/test'
Size: 1024 Blocks: 8 IO Block: 4194304 regular file
Device: 2c54f966h/743766374d Inode: 144115205272502273 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: unconfined_u:object_r:unlabeled_t:s0
Access: 2019-09-24 10:59:52.000000000 +0800
Modify: 2019-09-24 10:59:52.000000000 +0800
Change: 2019-09-24 10:59:52.000000000 +0800
Birth: -
[root@qian tests]# debugfs -c -R 'stat ROOT/test' /dev/mapper/mds1_flakey
debugfs 1.45.2.wc1 (27-May-2019)
/dev/mapper/mds1_flakey: catastrophic mode - not reading inode or group bitmaps
Inode: 162 Type: regular Mode: 0644 Flags: 0x0
Generation: 667952766 Version: 0x00000001:00000001
User: 0 Group: 0 Project: 0 Size: 0
File ACL: 0
Links: 1 Blockcount: 0
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x5d8986a8:9963a170 -- Tue Sep 24 10:59:52 2019
atime: 0x5d8986a8:9963a170 -- Tue Sep 24 10:59:52 2019
mtime: 0x5d8986a8:9963a170 -- Tue Sep 24 10:59:52 2019
crtime: 0x5d8986a8:9963a170 -- Tue Sep 24 10:59:52 2019
Size of extra inode fields: 32
Extended attributes:
  trusted.lma (24) = 00 00 00 00 00 00 00 00 01 04 00 00 02 00 00 00 01 00 00 00 00 00 00 00
  lma: fid=[0x200000401:0x1:0x0] compat=0 incompat=0
  trusted.lov (56)
  security.selinux (37) = "unconfined_u:object_r:unlabeled_t:s0\000"
  trusted.link (46)
  trusted.som (24) = 04 00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 08 00 00 00 00 00 00 00
BLOCKS:
[root@qian tests]# dd if=/dev/zero of=/mnt/lustre/test bs=1k count=1
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 0.000835408 s, 1.2 MB/s
[root@qian tests]# stat /mnt/lustre/test
File: '/mnt/lustre/test'
Size: 1024 Blocks: 1 IO Block: 4194304 regular file
Device: 2c54f966h/743766374d Inode: 144115205272502273 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: unconfined_u:object_r:unlabeled_t:s0
Access: 2019-09-24 10:59:52.000000000 +0800
Modify: 2019-09-24 11:53:23.000000000 +0800
Change: 2019-09-24 11:53:23.000000000 +0800
Birth: -
[root@qian tests]# debugfs -c -R 'stat ROOT/test' /dev/mapper/mds1_flakey
debugfs 1.45.2.wc1 (27-May-2019)
/dev/mapper/mds1_flakey: catastrophic mode - not reading inode or group bitmaps
Inode: 162 Type: regular Mode: 0644 Flags: 0x0
Generation: 667952766 Version: 0x00000001:00000001
User: 0 Group: 0 Project: 0 Size: 0
File ACL: 0
Links: 1 Blockcount: 0
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x5d899333:00000000 -- Tue Sep 24 11:53:23 2019
atime: 0x5d8986a8:9963a170 -- Tue Sep 24 10:59:52 2019
mtime: 0x5d899333:00000000 -- Tue Sep 24 11:53:23 2019
crtime: 0x5d8986a8:9963a170 -- Tue Sep 24 10:59:52 2019
Size of extra inode fields: 32
Extended attributes:
  trusted.lma (24) = 00 00 00 00 00 00 00 00 01 04 00 00 02 00 00 00 01 00 00 00 00 00 00 00
  lma: fid=[0x200000401:0x1:0x0] compat=0 incompat=0
  trusted.lov (56)
  security.selinux (37) = "unconfined_u:object_r:unlabeled_t:s0\000"
  trusted.link (46)
  trusted.som (24) = 04 00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00
BLOCKS:
Should I write a test script in sanity.sh to verify it later?
Regards, Qian |
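The trusted.som bytes in the debugfs output above can be decoded by hand to see which LSOM size and block count the MDT has stored. The sketch below does this under an assumed on-disk layout: a little-endian 16-bit flags word, 6 reserved bytes, a 64-bit size and a 64-bit block count. That layout and the function names are assumptions made for illustration and should be checked against the Lustre headers, although the decoded values do agree with the client "stat" output in this ticket.

```shell
#!/bin/bash
# Turn little-endian hex bytes, as printed by debugfs (e.g. "00 04 00 00"),
# into a decimal integer. Each argument is one two-digit hex byte.
le_bytes_to_int() {
    local value=0 bits=0 byte
    for byte in "$@"; do
        value=$(( $value | (0x$byte << $bits) ))
        bits=$(( $bits + 8 ))
    done
    echo "$value"
}

# Decode a 24-byte trusted.som value given as one space-separated string.
# Assumed layout: bytes 1-2 flags, 3-8 reserved, 9-16 size, 17-24 blocks.
decode_som() {
    local valid size blocks
    set -- $1                     # split the string into one byte per argument
    if [ "$#" -ne 24 ]; then
        echo "expected 24 bytes, got $#" >&2
        return 1
    fi
    valid=$(le_bytes_to_int "$1" "$2")
    shift 8
    size=$(le_bytes_to_int "$1" "$2" "$3" "$4" "$5" "$6" "$7" "$8")
    shift 8
    blocks=$(le_bytes_to_int "$@")
    echo "valid=$valid size=$size blocks=$blocks"
}
```

Feeding it the first trusted.som value above gives a size of 1024 and 8 blocks, matching what the client reported after the first write, even though the MDT inode itself shows "Size: 0" and "Blockcount: 0".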
| Comment by Andreas Dilger [ 24/Sep/19 ] |
|
Qian, I'm happy for now that we have confirmed the timestamp attributes are being updated on the MDT inodes. However, it isn't really clear to me where this is being done. Looking at the mdt_mfd_close() code:
/* Update atime on close only. */
if ((open_flags & MDS_FMODE_EXEC || open_flags & MDS_FMODE_READ ||
     open_flags & MDS_FMODE_WRITE) && (ma->ma_valid & MA_INODE) &&
    (ma->ma_attr.la_valid & LA_ATIME)) {
        /* Set the atime only. */
        ma->ma_valid = MA_INODE;
        ma->ma_attr.la_valid = LA_ATIME;
        rc = mo_attr_set(info->mti_env, next, ma);
}
it seems that this is only updating atime but not mtime and ctime. For now it seems we are doing the right thing, but I'm not yet convinced that we are doing the right thing all the time. |
| Comment by Qian Yingjin [ 24/Sep/19 ] |
|
Yes, you're right!
After further testing, I found that in some test cases the timestamps are not being updated on the MDT. The timestamps were updated in the previous tests because the command "dd if=/dev/zero of=/mnt/lustre/test bs=1k count=2" truncates the file when it opens it. After adding "conv=notrunc", the timestamps on the MDT inode differ from those reported on the client:
[root@qian tests]# dd if=/dev/zero of=/mnt/lustre/test bs=1k count=2 conv=notrunc
2+0 records in
2+0 records out
2048 bytes (2.0 kB) copied, 0.00368011 s, 557 kB/s
[root@qian tests]# stat /mnt/lustre/test
File: '/mnt/lustre/test'
Size: 2048 Blocks: 8 IO Block: 4194304 regular file
Device: 2c54f966h/743766374d Inode: 144115205272502273 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: unconfined_u:object_r:unlabeled_t:s0
Access: 2019-09-24 10:59:52.000000000 +0800
Modify: 2019-09-24 21:28:32.000000000 +0800
Change: 2019-09-24 21:28:32.000000000 +0800
Birth: -
[root@qian tests]# debugfs -c -R 'stat ROOT/test' /dev/mapper/mds1_flakey
debugfs 1.45.2.wc1 (27-May-2019)
/dev/mapper/mds1_flakey: catastrophic mode - not reading inode or group bitmaps
Inode: 162 Type: regular Mode: 0644 Flags: 0x0
Generation: 667952766 Version: 0x00000001:00000001
User: 0 Group: 0 Project: 0 Size: 0
File ACL: 0
Links: 1 Blockcount: 0
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x5d8994b8:00000000 -- Tue Sep 24 11:59:52 2019
atime: 0x5d8986a8:9963a170 -- Tue Sep 24 10:59:52 2019
mtime: 0x5d8994b8:00000000 -- Tue Sep 24 11:59:52 2019
crtime: 0x5d8986a8:9963a170 -- Tue Sep 24 10:59:52 2019
Size of extra inode fields: 32
Extended attributes:
  trusted.lma (24) = 00 00 00 00 00 00 00 00 01 04 00 00 02 00 00 00 01 00 00 00 00 00 00 00
  lma: fid=[0x200000401:0x1:0x0] compat=0 incompat=0
  trusted.lov (56)
  security.selinux (37) = "unconfined_u:object_r:unlabeled_t:s0\000"
  trusted.link (46)
  trusted.som (24) = 04 00 00 00 00 00 00 00 00 08 00 00 00 00 00 00 08 00 00 00 00 00 00 00
BLOCKS:
I will patch the llite and MDT code to make it update mtime and ctime accordingly.
Regards, Qian |
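The finding above suggests that a manual check needs to rewrite the file without truncating it, otherwise the truncate on open updates the MDT timestamps anyway. A minimal sketch of such a check follows. The variable names, defaults and function names are hypothetical (the defaults simply reuse the paths from this test system), and it assumes the file sits directly under the filesystem root so that it appears as ROOT/<name> on the MDT.

```shell
#!/bin/bash
# Assumptions: these defaults match the test system in this ticket only.
MDT_DEV=${MDT_DEV:-/dev/mapper/mds1_flakey}
FILE=${FILE:-/mnt/lustre/test}
DEBUGFS=${DEBUGFS:-debugfs}
SLEEP_SECS=${SLEEP_SECS:-5}

# Print the mtime and ctime lines of the MDT inode backing $FILE.
mdt_times() {
    "$DEBUGFS" -c -R "stat ROOT/$(basename "$FILE")" "$MDT_DEV" 2>/dev/null |
        grep -E '^[[:space:]]*(mtime|ctime):'
}

# Write, wait, rewrite without truncation, then compare the MDT timestamps.
# Returns 0 if they changed, 1 if they did not, 2 if a write failed.
check_mdt_times() {
    local before after
    dd if=/dev/zero of="$FILE" bs=1k count=1 2>/dev/null || return 2
    before=$(mdt_times)
    sleep "$SLEEP_SECS"
    # conv=notrunc keeps dd from truncating the file when it opens it
    dd if=/dev/zero of="$FILE" bs=1k count=2 conv=notrunc 2>/dev/null || return 2
    after=$(mdt_times)
    if [ "$before" != "$after" ]; then
        echo "MDT mtime/ctime updated"
    else
        echo "MDT mtime/ctime NOT updated"
        return 1
    fi
}
```

Running `check_mdt_times` as root on the MDS node would report whether the second write reached the MDT inode. Since debugfs reads the block device directly, very recent updates may not be visible yet, so a result of "NOT updated" is worth confirming with a second look after a short delay.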
| Comment by Gerrit Updater [ 25/Sep/19 ] |
|
Yingjin Qian (qian@ddn.com) uploaded a new patch: https://review.whamcloud.com/36286 |
| Comment by Gerrit Updater [ 22/Oct/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36286/ |
| Comment by Gerrit Updater [ 26/Nov/19 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36869 |
| Comment by Gerrit Updater [ 17/Jan/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36869/ |