[LU-12852] growing a PFL file with last stripe as -1 fails Created: 12/Oct/19 Updated: 20/May/20 Resolved: 08/Feb/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.8 |
| Fix Version/s: | Lustre 2.14.0, Lustre 2.12.5 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Mahmoud Hanafi | Assignee: | Emoly Liu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Attachments: |
| Issue Links: |
| Severity: | 2 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
If a file or directory is striped with the last component's stripe count set to -1, growing the file fails.
$ lfs setstripe -E 256M -c 1 -E 16G -c 4 -E -1 -S 4M -c -1 pfldir
$ echo hello > pfldir/test
$ echo helpo >> pfldir/test
-bash: echo: write error: No space left on device
$ lfs getstripe pfldir/test
pfldir/test
lcm_layout_gen: 3
lcm_mirror_count: 1
lcm_entry_count: 3
lcme_id: 1
lcme_mirror_id: 0
lcme_flags: init
lcme_extent.e_start: 0
lcme_extent.e_end: 268435456
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_pattern: raid0
lmm_layout_gen: 0
lmm_stripe_offset: 313
lmm_objects:
- 0: { l_ost_idx: 313, l_fid: [0x101390000:0x110967ab:0x0] }
lcme_id: 2
lcme_mirror_id: 0
lcme_flags: 0
lcme_extent.e_start: 268435456
lcme_extent.e_end: 17179869184
lmm_stripe_count: 4
lmm_stripe_size: 1048576
lmm_pattern: raid0
lmm_layout_gen: 0
lmm_stripe_offset: -1
lcme_id: 3
lcme_mirror_id: 0
lcme_flags: 0
lcme_extent.e_start: 17179869184
lcme_extent.e_end: EOF
lmm_stripe_count: -1
lmm_stripe_size: 4194304
lmm_pattern: raid0
lmm_layout_gen: 0
lmm_stripe_offset: -1
$ lfs setstripe -E 256M -c 1 -E 16G -c 4 -E -1 -S 4M -c 10 pfldir
$ echo hello > pfldir/test
$ echo helpo >> pfldir/test
This worked.
|
| Comments |
| Comment by Andreas Dilger [ 12/Oct/19 ] |
|
Hi Mahmoud, could you please collect the console logs from the client and MDS around the time that the error is hit, if there is anything printed. If nothing interesting is shown, please collect "lctl dk" logs from the client and MDS around this time. I tested this with my local 2.10.6 client and didn't have any problems.
$ lfs setstripe -E 32M -c 1 -S 1M -E 10G -c 4 -E -1 -c -1 -S 4M /myth/tmp/tmp/pfl
$ echo hello > /myth/tmp/tmp/pfl
$ echo hellop >> /myth/tmp/tmp/pfl
$ lfs getstripe /myth/tmp/tmp/pfl2
/myth/tmp/tmp/pfl2
lcm_layout_gen: 4
lcm_entry_count: 3
lcme_id: 1
lcme_flags: init
lcme_extent.e_start: 0
lcme_extent.e_end: 33554432
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_pattern: 1
lmm_layout_gen: 0
lmm_stripe_offset: 4
lmm_objects:
- 0: { l_ost_idx: 4, l_fid: [0x100040000:0x3121d7:0x0] }
lcme_id: 2
lcme_flags: init
lcme_extent.e_start: 33554432
lcme_extent.e_end: 10737418240
lmm_stripe_count: 4
lmm_stripe_size: 1048576
lmm_pattern: 1
lmm_layout_gen: 0
lmm_stripe_offset: 0
lmm_objects:
- 0: { l_ost_idx: 0, l_fid: [0x100000000:0x22cb28:0x0] }
- 1: { l_ost_idx: 1, l_fid: [0x100010000:0x1b19ec:0x0] }
- 2: { l_ost_idx: 2, l_fid: [0x100020000:0x26cea4:0x0] }
- 3: { l_ost_idx: 3, l_fid: [0x100030000:0x2223f3:0x0] }
lcme_id: 3
lcme_flags: init
lcme_extent.e_start: 10737418240
lcme_extent.e_end: EOF
lmm_stripe_count: 5
lmm_stripe_size: 4194304
lmm_pattern: 1
lmm_layout_gen: 0
lmm_stripe_offset: 0
lmm_objects:
- 0: { l_ost_idx: 0, l_fid: [0x100000000:0x22cb29:0x0] }
- 1: { l_ost_idx: 1, l_fid: [0x100010000:0x1b19ed:0x0] }
- 2: { l_ost_idx: 2, l_fid: [0x100020000:0x26cea5:0x0] }
- 3: { l_ost_idx: 3, l_fid: [0x100030000:0x2223f4:0x0] }
- 4: { l_ost_idx: 4, l_fid: [0x100040000:0x3121d8:0x0] }
How many OSTs are in the filesystem? Is there any chance that you have the patch https://review.whamcloud.com/35617? |
| Comment by Mahmoud Hanafi [ 12/Oct/19 ] |
|
There are 342 OSTs, and we don't have that patch. The client NID is 10.151.11.62@o2ib.
When you try lfs migrate, you get an error:
# lfs migrate text.txt
lfs migrate: cannot get group lock: No space left on device (28)
error: lfs migrate: /nobackupp2/whzhu/text.txt: cannot get group lock: No space left on device
r417i2n16 ~ # |
| Comment by Mahmoud Hanafi [ 12/Oct/19 ] |
|
Better debug |
| Comment by Andreas Dilger [ 12/Oct/19 ] |
|
Are you able to create a non-PFL file with "lfs setstripe -c -1" in this filesystem? With 342 OSTs this exceeds the normal 4KB limit for xattrs (160 stripes) unless the MDT has the "ea_inode" feature enabled. |
| Comment by Mahmoud Hanafi [ 12/Oct/19 ] |
|
lfs setstripe -c -1 gives 165 stripes.
# lfs getstripe /nobackupp2/mhanafi/WIDESTRIPE/test
/nobackupp2/mhanafi/WIDESTRIPE/test
lmm_stripe_count: 165
lmm_stripe_size: 1048576
lmm_pattern: raid0
lmm_layout_gen: 0
lmm_stripe_offset: 242
obdidx objid objid group
242 286064211 0x110cfe53 0
243 287444989 0x11220ffd 0
204 166993747 0x9f41f53 0
143 168014735 0xa03b38f 0
This is what I expect with PFL, but it tries to set more than this. We don't have the ea_inode feature enabled.
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery flex_bg dirdata sparse_super large_file huge_file uninit_bg dir_nlink extra_isize quota
I thought I had tested PFL in 2.10.6 and it used to work. We don't have 2.10.6 running anymore, so I can't test. |
| Comment by Andreas Dilger [ 12/Oct/19 ] |
|
Part of the problem is that with PFL layouts, there is not room for the full 165 stripes to fit into the ~4KiB xattr space, because each component consumes some space (maybe 3 stripes' worth each), and the layouts within those components also consume space (1 and 4 stripes respectively). That means it would be possible to declare a layout that could use maybe 150 stripes in the third component without exceeding the 4KB xattr limit. Alternately, the ea_inode feature can be enabled on the MDT using "tune2fs -O ea_inode /dev/<mdtdev>" in order to allow larger layouts (up to 2000 stripes). While this could potentially be done from the ldiskfs point of view while the MDT is mounted, the MDS code does not check for the maximum xattr size changing while it is mounted. Since this would need an MDT remount to take effect anyway, it may as well be done while the MDT is unmounted. In this case, e2fsprogs should be at least 1.44.5.wc1, but preferably the most recent version, 1.45.2.wc1. |
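The space accounting described in the comment above can be sketched numerically. Note that all struct sizes below (composite header, per-component entry, per-component layout header, per-stripe object descriptor) are illustrative assumptions for this sketch, not values taken from the Lustre headers, so the exact numbers will differ from a real MDT:

```python
# Rough estimate of how many stripes still fit in the last PFL component
# of a 4 KiB xattr.  All sizes are assumptions for illustration only.
XATTR_MAX = 4096      # ldiskfs in-inode xattr limit without ea_inode
COMP_HDR = 32         # assumed composite-layout header size
COMP_ENTRY = 48       # assumed per-component entry size
LMM_V3_HDR = 48       # assumed per-component lov_mds_md_v3 header size
OST_DATA = 24         # assumed per-stripe object-descriptor size

def max_last_component_stripes(prior_stripe_counts):
    """Stripes that fit in the final component after subtracting the
    composite header, the per-component entries and layout headers, and
    the stripes already consumed by earlier components."""
    n_comps = len(prior_stripe_counts) + 1
    used = COMP_HDR + n_comps * (COMP_ENTRY + LMM_V3_HDR)
    used += sum(prior_stripe_counts) * OST_DATA
    return (XATTR_MAX - used) // OST_DATA

# Three-component layout from the ticket: -c 1, -c 4, -c -1
print(max_last_component_stripes([1, 4]))  # → 152 with these assumed sizes
```

With these (assumed) sizes the last component can hold roughly 150 stripes, consistent with the estimate in the comment, and the limit shrinks as earlier components add stripes.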
| Comment by Mahmoud Hanafi [ 12/Oct/19 ] |
|
I think this is a bug. PFL should create the correct number of stripes as with the non-PFL file. |
| Comment by Andreas Dilger [ 15/Oct/19 ] |
|
Mahmoud, could you clarify what you mean by "correct number of stripes" in this case? Without the "ea_inode" feature, PFL will just not have as much space to store stripes as a non-PFL file. Hopefully by "correct number of stripes" you mean "whatever will still fit into the remaining xattr space", which is probably about 150 in your case, but will vary based on the number and size of the previous components. If you enable the "ea_inode" feature then you would actually be able to store the full 342 stripes in the last component. |
| Comment by Mahmoud Hanafi [ 16/Oct/19 ] |
|
"correct number of stripes" meaning what ever it can fit. Like the non-PFL case. It shouldn't fail.
|
| Comment by Andreas Dilger [ 17/Oct/19 ] |
|
It looks like there is already a function "lod_get_stripe_count()" that is supposed to be checking the maximum xattr size and restricting the stripe count to this limit. It may be that the calculation is slightly incorrect (e.g. not taking into account the xattr overhead), so changing this slightly would work:
- max_stripes = lov_mds_md_max_stripe_count(easize, LOV_MAGIC_V3);
+ max_stripes = lov_mds_md_max_stripe_count(easize, LOV_MAGIC_V3) - 1;
but I haven't tested this yet. |
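A minimal sketch of the clamping logic discussed in the comment above, in Python rather than the actual kernel C. The header and per-stripe sizes here are assumptions for illustration; the real lov_mds_md_max_stripe_count() in the Lustre tree uses the actual struct layouts:

```python
# Hedged sketch of the stripe-count clamp; sizes are assumptions only.
LMM_V3_HDR = 48   # assumed lov_mds_md_v3 header size
OST_DATA = 24     # assumed per-stripe object-descriptor size

def lov_mds_md_max_stripe_count(easize):
    """How many stripes a plain layout of easize bytes can hold."""
    return (easize - LMM_V3_HDR) // OST_DATA

def lod_get_stripe_count(requested, easize):
    """Clamp a requested count (-1 meaning 'stripe over all OSTs') to
    what fits, reserving one stripe's worth of space for extra xattr
    overhead, as the proposed one-line change does."""
    max_stripes = lov_mds_md_max_stripe_count(easize) - 1
    if requested == -1 or requested > max_stripes:
        return max_stripes
    return requested

print(lod_get_stripe_count(-1, 4096))  # → 167 with these assumed sizes
print(lod_get_stripe_count(4, 4096))   # → 4 (explicit counts that fit pass through)
```

The point of the "- 1" is simply to leave headroom for overhead that the raw division does not account for, so a -1 request never produces a layout that overflows the xattr.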
| Comment by Peter Jones [ 05/Dec/19 ] |
|
Could you please create a patch to address this issue? |
| Comment by Gerrit Updater [ 06/Dec/19 ] |
|
Emoly Liu (emoly@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36947 |
| Comment by Gerrit Updater [ 08/Feb/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36947/ |
| Comment by Peter Jones [ 08/Feb/20 ] |
|
Landed for 2.14 |
| Comment by Gerrit Updater [ 10/Feb/20 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37512 |
| Comment by Gerrit Updater [ 01/May/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37512/ |