[LU-10465] increase default stripe size to 4MB Created: 08/Jan/18 Updated: 05/Jan/24 Resolved: 18/Nov/23 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.11.0 |
| Fix Version/s: | Lustre 2.16.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Jian Yu | Assignee: | Mikhail Pershin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | LTS12 |
| Severity: | 3 |
| Description |
|
Patch https://review.whamcloud.com/25336 for |
| Comments |
| Comment by Jian Yu [ 08/Jan/18 ] |
|
Here is the patch to increase default stripe size to 4MB: https://review.whamcloud.com/27151 |
| Comment by Jian Yu [ 08/Jan/18 ] |
|
Hi Cliff, Could you please test the above patch according to the following suggestion from Andreas?
Thank you. |
| Comment by Cliff White (Inactive) [ 08/Jan/18 ] |
|
Saurabh is doing performance testing now, we'll get this into the schedule. |
| Comment by Saurabh Tandan (Inactive) [ 08/Jan/18 ] |
|
I will add this into my schedule and get back with results. |
| Comment by Andreas Dilger [ 08/Jan/18 ] |
|
In light of That said, I think it is less harmful if the default stripe size is larger than the RPC size, than if the stripe size is smaller than the RPC size. |
| Comment by Gerrit Updater [ 13/Feb/18 ] |
|
Jian Yu (jian.yu@intel.com) uploaded a new patch: https://review.whamcloud.com/31292 |
| Comment by Andreas Dilger [ 22/Feb/18 ] |
|
Ihara or Vitaly, do you have any performance test results from testing with the default stripe_size of 4MB (not the RPC size)? Do you already run with this in production? We're just looking to see if the default should be increased. It looks like a good improvement for ZFS, but our testing for ldiskfs is mixed, so it would be good to get some more feedback from more systems if possible. |
| Comment by Gerrit Updater [ 06/Mar/18 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/27151/ |
| Comment by Peter Jones [ 06/Mar/18 ] |
|
Landed for 2.11 |
| Comment by Andreas Dilger [ 08/Mar/18 ] |
|
Due to problems related to Rather than revert the whole patch, I would recommend to submit a new patch that is only changing the default stripe size, and leave the test fixes in place. That allows developers to specify different default stripe size without hitting unrelated failures, and simplifies testing in the future. Once the patch to change the default stripe size back to 1MB has landed this ticket should be moved to 2.12. |
| Comment by Gerrit Updater [ 08/Mar/18 ] |
|
Jian Yu (jian.yu@intel.com) uploaded a new patch: https://review.whamcloud.com/31589 |
| Comment by Gerrit Updater [ 12/Mar/18 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/31589/ |
| Comment by Peter Jones [ 12/Mar/18 ] |
|
We've backed off changing the default for 2.11 |
| Comment by Andreas Dilger [ 08/Jul/19 ] |
|
I think with Ihara, if you get a chance, could you please run a 4MB default stripe size for IOR FPP and SSF to see the performance impact. Vitaly, it would also be good to know if this change improves or hurts performance on your systems before it becomes the default. |
| Comment by Gerrit Updater [ 23/Jan/20 ] |
|
Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37318 |
| Comment by Andreas Dilger [ 24/Jan/20 ] |
|
Mike, it looks like there are still failures in the DoM tests when the default stripe size is changed to 4MB. Could you please take a look.
== sanity test 272c: DoM migration: DOM file to the OST-striped file (composite) ===================== 23:32:52 (1579822372)
lfs migrate: cannot create composite file '/mnt/lustre/d272c.sanity/.:VOLATILE:0000:5EB80E61': Invalid argument
error: lfs migrate: /mnt/lustre/d272c.sanity/f272c.sanity: cannot create volatile file: Operation not permitted
sanity test_272c: @@@@@@ FAIL: failed to migrate to the new composite layout |
| Comment by Mikhail Pershin [ 24/Jan/20 ] |
|
Andreas, this happens because the new component stripes are not aligned with the old ones. The original file has a 1MB DOM component and an undefined second component, so the second component gets 4MB stripes by default. The test tries to migrate to a PFL layout with 2MB as the first component stripe and the default for the second component. With a 1MB default it works, but with 4MB the boundaries are not aligned. So it seems the test should be changed to migrate to the same first stripe size. Meanwhile I wonder, is it OK that components are not aligned by stripe size in a file? E.g. for the original file we have [0, 1MB) for DOM and then 4MB stripes, so the whole file has stripes of 1MB, 4MB, 4MB, ... and each 4MB stripe is not aligned at 4MB from the beginning of the file. I am not sure whether that is a problem or not, but wouldn't it be better to adjust new component stripe sizes to be aligned with that component's start? |
| Comment by Andreas Dilger [ 24/Jan/20 ] |
|
The second chunk (start of second component, after DoM component) should just be a bit shorter, starting at 1MB and ending at 4MB, with a "hole" at the start where the DoM component is. Then the rest of the chunks in the second component would be properly sized/aligned at 4MB. It is done this way so that e.g. the DoM data could be written into the second component without having to move all the later data. |
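The chunk layout described above can be illustrated with a small calculation. This is only a sketch of the arithmetic, not Lustre code: the function name `chunk_extents` and the finite 12MB component end are hypothetical choices for illustration, and round-robin distribution of chunks across OST objects is ignored.

```python
MIB = 1 << 20

def chunk_extents(comp_start, comp_end, stripe_size):
    """Return [start, end) chunks of one component.

    Chunk boundaries fall on multiples of stripe_size measured from
    file offset 0, so a component that starts mid-stripe gets a
    shorter first chunk and full-sized, aligned chunks after it.
    """
    chunks = []
    pos = comp_start
    while pos < comp_end:
        # Next stripe boundary strictly after pos, counted from offset 0.
        boundary = (pos // stripe_size + 1) * stripe_size
        end = min(boundary, comp_end)
        chunks.append((pos, end))
        pos = end
    return chunks

# Second component starts at 1MB (after the DoM component), 4MB stripes.
# The 12MB end is an arbitrary value chosen to keep the example finite.
chunks = chunk_extents(1 * MIB, 12 * MIB, 4 * MIB)
for start, end in chunks:
    print(f"[{start // MIB}MB, {end // MIB}MB)")
```

Under these assumptions the first chunk covers [1MB, 4MB) and every later chunk starts on a 4MB boundary.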
| Comment by Mikhail Pershin [ 24/Jan/20 ] |
|
The test failed due to this error: (lod_lov.c:1934:lod_verify_striping()) stripe size isn't aligned, stripe_sz: 4194304, [0, 2097152) So this check in lod_verify_striping() is not quite correct, is it? If the original file has 1MB for the first component and 3MB, 4MB, ... for the second, then it can be migrated to 2MB for the first and then 2MB, 4MB, ... for the second component. |
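To make the reported failure concrete, here is a rough sketch of the kind of check the error message suggests. It is an assumption inferred from the message text, not a copy of `lod_verify_striping()`: the helper name `extent_end_is_aligned` and the `EOF` sentinel are hypothetical, and the sketch assumes only the component end is tested against the stripe size.

```python
MIB = 1 << 20
EOF = -1  # hypothetical sentinel for an open-ended component

def extent_end_is_aligned(start, end, stripe_size):
    """Assumed rule: a finite component end must be a multiple of stripe_size."""
    if end == EOF:
        return True  # an open-ended component has no end boundary to check
    return end % stripe_size == 0

# Values from the log: stripe_sz 4194304, extent [0, 2097152)
print(extent_end_is_aligned(0, 2097152, 4194304))  # rejected with the 4MB default
print(extent_end_is_aligned(0, 2097152, 1 * MIB))  # accepted with the 1MB default
```

With the values from the log, the 2MB component end is not a multiple of the 4MB stripe size, which matches the observation that the same test passes with a 1MB default.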
| Comment by Mikhail Pershin [ 24/Jan/20 ] |
|
I think I see where the problem is. The test does the following: $LFS migrate -E 2M -c1 -E -1 -c2 $dom without an explicitly set stripe size, so LOD uses the default size of 4MB and fails because that is larger than the whole 2MB component. Considering the user may not know the default MDT stripe size, I'd say it is not their fault for using a 2MB component, and either lfs or LOD should take care of this and reduce the stripe size to the component size, maybe? |
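The suggestion above, reducing the stripe size when it exceeds the component, could be sketched as follows. This is an illustration of the idea only and is not claimed to be what any of the patches on this ticket implement: `effective_stripe_size` and `EOF` are hypothetical names, and real stripe sizes have further constraints that the sketch ignores.

```python
MIB = 1 << 20
EOF = -1  # hypothetical sentinel for an open-ended component

def effective_stripe_size(default_stripe, comp_start, comp_end):
    """Clamp the default stripe size to the component length.

    Open-ended components keep the default unchanged.
    """
    if comp_end == EOF:
        return default_stripe
    return min(default_stripe, comp_end - comp_start)

# Layout from the test: -E 2M -c1 -E -1 -c2, with a 4MB default stripe size.
first = effective_stripe_size(4 * MIB, 0, 2 * MIB)
second = effective_stripe_size(4 * MIB, 2 * MIB, EOF)
print(first // MIB, second // MIB)
```

In this sketch the first component would end up with a 2MB stripe size, so its 2MB end falls on a stripe boundary, while the open-ended second component keeps the 4MB default.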
| Comment by Gerrit Updater [ 21/Feb/20 ] |
|
Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37661 |
| Comment by Gerrit Updater [ 23/Apr/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37661/ |
| Comment by Gerrit Updater [ 17/Sep/20 ] |
|
Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/39957 |
| Comment by Gerrit Updater [ 23/Sep/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/37318/ |
| Comment by Gerrit Updater [ 04/Nov/23 ] |
|
"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52989 |
| Comment by Gerrit Updater [ 18/Nov/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52989/ |
| Comment by Peter Jones [ 18/Nov/23 ] |
|
Landed for 2.16 |