[LU-14765] sanity-flr test_44c: mirror split does not reduce block# Created: 16/Jun/21 Updated: 14/Apr/23 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.15.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Maloo | Assignee: | Zhenyu Xu |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | always_except |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for S Buisson <sbuisson@ddn.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/8d692dc1-910c-4ee7-baf0-21734795ed81

test_44c failed with the following error:

mirror split does not reduce block# 10485760 != 4194304

sanity-flr test_44c seems buggy. It starts by writing 10MB, but it reports a file size of 4MB:

** before mirror ops, file blocks=4096 KiB

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV |
| Comments |
| Comment by Andreas Dilger [ 24/Jun/21 ] |
|
Test was added by patch https://review.whamcloud.com/43168. Maybe a short sleep is needed before fetching the first attributes? |
| Comment by Andreas Dilger [ 24/Jun/21 ] |
|
Bobijam, can you please take a look? |
| Comment by Gerrit Updater [ 25/Jun/21 ] |
|
Bobi Jam (bobijam@hotmail.com) uploaded a new patch: https://review.whamcloud.com/44074 |
| Comment by Vladimir Saveliev [ 14/Dec/21 ] |
|
+1 on master |
| Comment by Andreas Dilger [ 06/Jan/22 ] |
|
+2 on master: |
| Comment by Qian Yingjin [ 26/Jan/22 ] |
|
+1 |
| Comment by Zhenyu Xu [ 03/Mar/22 ] |
|
The block number of a mirrored file cannot be accurate IMO, since getattr of the file only retrieves info from the OST objects of one mirror, not including those of the other mirrors. Mirror extend can only add the block numbers of one mirror on top of the already inaccurate block numbers of the file, and if multiple mirrors are added at once with the "-N <# of mirrors>" option, the situation is exacerbated. So I'm wondering whether we need to keep the semantic meaning of block # in a mirrored file. |
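To make the point above concrete, here is a minimal toy model in Python. It is not Lustre code: `MirroredFile`, `getattr_blocks`, `all_mirror_blocks`, and `extend` are hypothetical names, and the sketch assumes each added mirror is a full copy and that blocks are counted in 512-byte units. It only shows how the gap between a one-mirror view and the real allocation grows with the number of mirrors.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MirroredFile:
    # Blocks (assumed 512-byte units) allocated on the OST objects of each mirror.
    mirror_blocks: List[int] = field(default_factory=list)

    def getattr_blocks(self, mirror_index: int = 0) -> int:
        """Blocks seen when only one mirror's objects are queried."""
        return self.mirror_blocks[mirror_index]

    def all_mirror_blocks(self) -> int:
        """Blocks actually allocated across every mirror."""
        return sum(self.mirror_blocks)


def extend(f: MirroredFile, count: int) -> None:
    """Hypothetical stand-in for a mirror extend that adds `count` full copies."""
    f.mirror_blocks.extend([f.mirror_blocks[0]] * count)


one_mirror = 10 * 1024 * 1024 // 512  # a 10 MiB file in 512-byte blocks
f = MirroredFile([one_mirror])
for added in (0, 1, 3):
    g = MirroredFile(list(f.mirror_blocks))
    extend(g, added)
    # The one-mirror view stays flat while the real allocation grows.
    print(added, g.getattr_blocks(), g.all_mirror_blocks())
```

In this model the one-mirror view never changes, so the undercount is proportional to the number of extra mirrors.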
| Comment by Andreas Dilger [ 03/Mar/22 ] |
|
Shouldn't "stat" be using the blocks in the SOM xattr, and shouldn't that be kept correct? I would rather ensure this is kept updated for the few times that the SOM is put in "strict" mode, so that incremental mirror addition is also correct. For this test, it might make sense to have it only use "lfs getsom -b" to verify the SOM data directly. In order for that to work, I also think that the trusted.som xattr should never be cached on the client. |
| Comment by Zhenyu Xu [ 08/Mar/22 ] |
|
I don't see that the inode's block number (the number reported by "stat") corresponds to the SOM block number after several mirror extend calls, even when SOM is in "strict" mode. The updated SOM is not always passed back and translated to the inode's block number. |
| Comment by Andreas Dilger [ 08/Mar/22 ] |
|
I would think that when SOM sets the STRICT flag, all client writes are complete and committed, so the blocks count should be correct at this time? Alternately, should we allow the SOM blocks count to be increased from later client RPCs, like LSOM does, so that the size is correct/strict and the blocks at least get "more correct" over time? |
| Comment by Zhenyu Xu [ 09/Mar/22 ] |
|
Yes, writes do account for block number changes correctly, but a file with multiple mirrors will mess up the block number base (so I'm thinking even mirror extend needs to invalidate the block number in SOM). And its block number cannot be correct until it is split down to only one mirror. |
| Comment by Andreas Dilger [ 10/Mar/22 ] |
|
Bobijam, is there something unusual with how the SOM blocks count is updated? I'd think that it is just using the sum of the blocks counts for all mirror objects at the time the file is closed. If some of the mirrors are stale, then the blocks count may not be totally accurate, but after "lfs mirror resync" the client doing the resync should be totally up to date and can send this to the MDS. |
| Comment by Artem Blagodarenko (Inactive) [ 14/Mar/22 ] |
|
+1 on master |
| Comment by Zhenyu Xu [ 18/Mar/22 ] |
The initial issue raised is that 'stat' does not show the file's block count increasing after an extra mirror is merged. If one mirror is added, and supposing that SOM is strict and reflects all blocks of the file, then we can account for the block count accurately after the merge. If the SOM is not strict, then after the merge the SOM is still not accurate, and 'stat' needs to rely on glimpse to collect the file's block count; but glimpse would only choose one mirror and calculate the block count for that mirror, so the resulting block count would not be bigger than that of the file before the mirror extend. A similar scenario also happens on mirror split: if SOM is not accurate about the file's block count before the mirror operation, the block count would not be shown to decrease after the mirror split either. |
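The two cases described above can be sketched as a small Python model. This is an illustration only, not the client or MDS code path: `File`, `stat_blocks`, `mirror_extend`, and `mirror_split` are hypothetical names, the glimpse is assumed to always pick the first mirror, and SOM handling on split is deliberately left out because the comment does not describe it for the strict case.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class File:
    mirror_blocks: List[int]    # per-mirror allocated blocks
    som_blocks: Optional[int]   # blocks recorded in SOM, if any
    som_strict: bool            # whether SOM is trusted ("strict")


def stat_blocks(f: File) -> int:
    """Blocks reported to 'stat' in this toy model."""
    if f.som_strict and f.som_blocks is not None:
        return f.som_blocks
    # Non-strict: fall back to a glimpse, which looks at one mirror only.
    return f.mirror_blocks[0]


def mirror_extend(f: File) -> None:
    new = f.mirror_blocks[0]    # assume the new mirror is a full copy
    f.mirror_blocks.append(new)
    if f.som_strict and f.som_blocks is not None:
        f.som_blocks += new     # strict SOM can be updated accurately


def mirror_split(f: File) -> None:
    f.mirror_blocks.pop()       # SOM update on split is not modeled here


def observe_non_strict() -> List[int]:
    f = File([20480], som_blocks=None, som_strict=False)
    seen = [stat_blocks(f)]
    mirror_extend(f)
    seen.append(stat_blocks(f))
    mirror_split(f)
    seen.append(stat_blocks(f))
    return seen


def observe_strict_extend() -> List[int]:
    f = File([20480], som_blocks=20480, som_strict=True)
    before = stat_blocks(f)
    mirror_extend(f)
    return [before, stat_blocks(f)]


print("non-strict (before, extended, split):", observe_non_strict())
print("strict (before, extended):", observe_strict_extend())
```

With a non-strict SOM the reported value is flat across both operations, which is the shape of failure the test's "does not reduce block#" check trips on.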
| Comment by Emoly Liu [ 26/Apr/22 ] |
|
+1 on master: |
| Comment by Nikitas Angelinas [ 13/May/22 ] |
|
+1 on master: https://testing.whamcloud.com/test_sets/d5df04c7-cfe9-4a8e-99a9-e7b2beb3f55b |
| Comment by Gerrit Updater [ 18/May/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/44074/ |
| Comment by Andreas Dilger [ 09/Mar/23 ] |
|
This is causing user-visible issues in the field, since "du" with STRICT SOM reports block numbers that are too large after the mirror has been split from the file. There was some work in Rather than have the client do an extra "stat" of all mirrors on each close (which is expensive and mostly useless), I think the right answer is that this extra stat is only needed when doing the mirror split operation. That will ensure the right values are stored in the STRICT SOM, without adding overhead to the very common close operation. |
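A rough Python sketch of the trade-off proposed above, under stated assumptions: it is a toy model and not the contents of any patch, and `File`, `sum_all_mirrors`, `close_file`, and `split_last_mirror` are hypothetical names. It counts how many per-mirror stats are issued when the SOM blocks value is refreshed only at split time.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class File:
    mirror_blocks: List[int]   # per-mirror allocated blocks
    som_blocks: int            # blocks stored in the (strict) SOM
    mirror_stats: int = 0      # how many per-mirror stats were issued


def sum_all_mirrors(f: File) -> int:
    """Stat every remaining mirror; this is the expensive step."""
    f.mirror_stats += len(f.mirror_blocks)
    return sum(f.mirror_blocks)


def close_file(f: File) -> None:
    """Common path: no extra per-mirror stat is issued."""
    pass


def split_last_mirror(f: File, refresh_som: bool) -> None:
    f.mirror_blocks.pop()
    if refresh_som:
        # Rare path: pay for the extra stat only when a mirror is split.
        f.som_blocks = sum_all_mirrors(f)


# Without a refresh, the SOM keeps the pre-split total.
stale = File([20480, 20480], som_blocks=40960)
split_last_mirror(stale, refresh_som=False)
print(stale.som_blocks)

# With a refresh at split time, many closes cost nothing extra.
fixed = File([20480, 20480], som_blocks=40960)
for _ in range(1000):
    close_file(fixed)
split_last_mirror(fixed, refresh_som=True)
print(fixed.som_blocks, fixed.mirror_stats)
```

The design choice is that the cost scales with the number of split operations rather than with the number of closes.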
| Comment by Gerrit Updater [ 16/Mar/23 ] |
|
"Zhenyu Xu <bobijam@hotmail.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50310 |