[LU-2677] Adding LMA to OST object Created: 25/Jan/13 Updated: 29/Nov/17 Resolved: 11/Apr/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.4.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Di Wang | Assignee: | Mikhail Pershin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | MB |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 6250 |
| Description |
|
Per discussion with Alex, LMA should be added to the OST object EA as well. |
| Comments |
| Comment by nasf (Inactive) [ 25/Jan/13 ] |
|
Adding FID-in-LMA for OST objects means you need to check the FLD during OI scrub, but when OI scrub runs, the FLD or related init work may not be ready yet. So please make sure your patch does not break anything. |
| Comment by Andreas Dilger [ 25/Jan/13 ] |
|
Pasting a recent email thread about this topic:
|
| Comment by Andreas Dilger [ 25/Jan/13 ] |
|
Alex later replied:
|
| Comment by Andreas Dilger [ 25/Jan/13 ] |
|
Alex later replied again:
|
| Comment by Andreas Dilger [ 25/Jan/13 ] |
|
So, in conclusion, if this is going to be done for 2.4 (which would be the right place for it), then the "filter_fid" needs to be replaced with a simple xattr that just contains the OST object parent FID + stripe index for LFSCK Phase 2 and "ll_recover_lost_found_objs". It needs to fit within the 256-byte inode size used on the OSTs along with the LMA.
|
| Comment by Alex Zhuravlev [ 31/Jan/13 ] |
|
Andreas, what's the deadline for these changes? I'm fine to work on this, if needed, but the inspections are supposed to be the top priority at the moment. |
| Comment by Andreas Dilger [ 31/Jan/13 ] |
|
Alex, this can reasonably be considered a bug fix since it has a noticeable impact on performance, so it can be landed after the feature freeze. |
| Comment by Andreas Dilger [ 05/Mar/13 ] |
|
Alex, |
| Comment by Alex Zhuravlev [ 06/Mar/13 ] |
|
osd-ldiskfs does not store LMA on OST objects; osd-zfs does. I'd like to fix this in 2.4, if possible. I'm fine to do this myself, immediately. |
| Comment by Andreas Dilger [ 07/Mar/13 ] |
|
I notice that if we shrink lustre_mdt_attrs by another 8 bytes (removing "lma_flags", which is currently unused) then it would be possible to store both lustre_mdt_attrs and filter_fid in the same 256-byte ldiskfs inode. This is fine to change within the 2.4 release, since it was already modified in 2.4 due to removal of HSM and SOM xattrs, and 2.1 already has compatibility code for the smaller LMA xattr. Since LMA already has __u32 lma_compat and __u32 lma_incompat flags, we don't really need lma_flags very much. If you are planning to store both LMA and FF on ldiskfs OST objects, then it would also be possible to remove ff_objid and ff_seq from struct filter_fid, making it only 16 bytes in size + 20 byte header. That would need the code in ll_recover_lost_found_objs to be able to handle both the old and new filter_fid structure size, and check for FF (to determine if this is an OFD object) + LMA (for self FID) when rebuilding the object directories. With the smaller LMA and FF, there would even be 16 bytes free for future use, or in case more static fields are added to struct ext4_inode. |
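The size argument above can be checked with a little arithmetic. The following is only a sketch: the 20-byte header and the field sizes come from the comment, the 16-byte FID layout (u64 + u32 + u32) and the presence of a self FID in the LMA are assumptions about the on-disk structures, and the 96 bytes of available xattr space is inferred from the "16 bytes free" remark rather than measured.

```python
# Sketch of the in-inode xattr budget discussed above.
ENTRY_HEADER = 20        # per-xattr header size quoted in the comment
LU_FID = 8 + 4 + 4       # assumed FID layout: seq (u64) + oid (u32) + ver (u32)

# LMA without lma_flags: lma_compat (u32) + lma_incompat (u32) + self FID
LMA_VALUE = 4 + 4 + LU_FID
# filter_fid without ff_objid and ff_seq: parent FID only
FF_VALUE = LU_FID

# Assumption: space left for xattrs in a 256-byte ldiskfs inode,
# inferred from the "16 bytes free" figure in the comment.
XATTR_SPACE = 96


def xattr_footprint(value_size, header=ENTRY_HEADER):
    """Bytes consumed by one in-inode xattr: header plus value."""
    return header + value_size


used = xattr_footprint(LMA_VALUE) + xattr_footprint(FF_VALUE)
free = XATTR_SPACE - used

# The pre-change sizes (LMA with the 8-byte lma_flags, 32-byte filter_fid)
old_used = xattr_footprint(LMA_VALUE + 8) + xattr_footprint(FF_VALUE + 16)

print(f"new layout: {used} bytes used, {free} free")
print(f"old layout: {old_used} bytes used, fits={old_used <= XATTR_SPACE}")
```

Under these assumptions the old layout needs 104 bytes and does not fit, while the smaller LMA and FF together need 80 bytes.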
| Comment by Andreas Dilger [ 08/Mar/13 ] |
|
Moved this over to Mike, since he has fewer blocker bugs than Di or Alex. Alex, if you have the bandwidth and desire to do this yourself, please discuss with Mike and assign to yourself. |
| Comment by Mikhail Pershin [ 28/Mar/13 ] |
|
The patch at http://review.whamcloud.com/#change,5838 removes lma_flags from the LMA, and ff_seq and ff_objid from filter_fid. The tools are changed to handle both the old filter_fid and the new, smaller format. I had to change sanity.sh test_27z to not use debugfs until it understands the new filter_fid. |
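Handling both filter_fid sizes in a tool could look roughly like the sketch below. It is not the code from the patch: the function name `decode_filter_fid` is hypothetical, and the field order, the little-endian encoding, and the use of the FID's third field as the stripe index are assumptions about the on-disk format.

```python
import struct

# Assumed on-disk layouts (little-endian, no padding):
#   new: parent FID only          -> seq (u64), oid (u32), ver/stripe (u32)
#   old: parent FID + objid + seq -> as above, then two u64 fields
FF_NEW = struct.Struct("<QII")
FF_OLD = struct.Struct("<QIIQQ")


def decode_filter_fid(raw):
    """Decode a filter_fid xattr value, choosing the layout by its length."""
    if len(raw) == FF_OLD.size:
        seq, oid, ver, objid, ff_seq = FF_OLD.unpack(raw)
        return {"parent": (seq, oid, ver), "objid": objid, "seq": ff_seq}
    if len(raw) == FF_NEW.size:
        seq, oid, ver = FF_NEW.unpack(raw)
        # The object's own identity is no longer stored here; a tool
        # would have to take it from the LMA self FID instead.
        return {"parent": (seq, oid, ver), "objid": None, "seq": None}
    raise ValueError(f"unexpected filter_fid size: {len(raw)} bytes")


# Example values only; they are not read from a real OST object.
new_raw = FF_NEW.pack(0x200000400, 0x1, 0)
old_raw = FF_OLD.pack(0x200000400, 0x1, 0, 1, 0)
print(decode_filter_fid(new_raw))
print(decode_filter_fid(old_raw))
```

Dispatching on the value length keeps the tool compatible with objects written before and after the change without needing a version flag.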
| Comment by Andreas Dilger [ 06/Apr/13 ] |
|
http://review.whamcloud.com/5964 is the patch for e2fsprogs to allow using the smaller FF and LMA structures. |
| Comment by John Hammond [ 08/Apr/13 ] |
|
What depends on this xattr being correct? I ask because I was poking at layout swap this morning and noticed that it was not being updated accordingly.

# llmount.sh
...
# cd /mnt/lustre
# lfs setstripe -c2 f0
# dd if=/dev/zero bs=1M count=2 | tr '\0' 'X' > f0
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 0.0201865 s, 104 MB/s
# lfs getstripe f0
f0
lmm_stripe_count: 2
lmm_stripe_size: 1048576
lmm_layout_gen: 0
lmm_stripe_offset: 0
obdidx objid objid group
0 1 0x1 0
1 1 0x1 0
# touch f1
# ls -lh
total 2.0M
-rw-r--r-- 1 root root 2.0M Apr 8 13:54 f0
-rw-r--r-- 1 root root 0 Apr 8 13:54 f1
# lfs path2fid f0
[0x200000400:0x1:0x0]
# lfs path2fid f1
[0x200000400:0x3:0x0]
# umount /mnt/ost1
# mount /tmp/lustre-ost1 /mnt/ost1 -t ldiskfs -o loop
# ls -lh /mnt/ost1/O/0/d1/1
-rw-rw-rw- 1 root root 1.0M Apr 8 13:54 /mnt/ost1/O/0/d1/1
# ll_decode_filter_fid /mnt/ost1/O/0/d1/1
/mnt/ost1/O/0/d1/1: objid=4294967296 seq=1 parent=[0x200000400:0x1:0x0]
# umount /mnt/ost1
# mount /tmp/lustre-ost1 /mnt/ost1 -t lustre -o loop
# lfs swap_layouts f0 f1
# ls -lh
total 2.0M
-rw-r--r-- 1 root root 0 Apr 8 13:57 f0
-rw-r--r-- 1 root root 2.0M Apr 8 13:57 f1
# umount /mnt/ost1
# mount /tmp/lustre-ost1 /mnt/ost1 -t ldiskfs -o loop
# ls -lh /mnt/ost1/O/0/d1/1
-rw-rw-rw- 1 root root 1.0M Apr 8 13:57 /mnt/ost1/O/0/d1/1
# ll_decode_filter_fid /mnt/ost1/O/0/d1/1
/mnt/ost1/O/0/d1/1: objid=4294967296 seq=1 parent=[0x200000400:0x1:0x0] |
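The mismatch in the session above can be checked mechanically. The sketch below parses an ll_decode_filter_fid output line in the format shown in the session and compares its parent FID with the FID reported by lfs path2fid. The helper names are hypothetical, and comparing only the first two FID fields assumes the third field of the parent carries the stripe index rather than part of the file identity.

```python
import re

# Matches the ll_decode_filter_fid line format shown in the session above.
LINE_RE = re.compile(
    r"^(?P<path>\S+): objid=(?P<objid>\d+) seq=(?P<seq>\d+) "
    r"parent=\[(?P<fseq>0x[0-9a-fA-F]+):(?P<foid>0x[0-9a-fA-F]+):"
    r"(?P<fver>0x[0-9a-fA-F]+)\]$"
)


def parse_parent_fid(line):
    """Return the parent FID from one ll_decode_filter_fid output line."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised line: {line!r}")
    return tuple(int(m.group(k), 16) for k in ("fseq", "foid", "fver"))


def parse_fid(text):
    """Parse a FID printed as [seq:oid:ver], as lfs path2fid does."""
    parts = text.strip().strip("[]").split(":")
    if len(parts) != 3:
        raise ValueError(f"unrecognised FID: {text!r}")
    return tuple(int(p, 16) for p in parts)


def parent_is_stale(decode_line, owner_fid_text):
    """True if the object's recorded parent differs from its current owner."""
    parent = parse_parent_fid(decode_line)
    owner = parse_fid(owner_fid_text)
    # Compare seq and oid only (assumption: third field is the stripe index).
    return parent[:2] != owner[:2]


line = "/mnt/ost1/O/0/d1/1: objid=4294967296 seq=1 parent=[0x200000400:0x1:0x0]"
print(parent_is_stale(line, "[0x200000400:0x1:0x0]"))  # owner f0, before the swap
print(parent_is_stale(line, "[0x200000400:0x3:0x0]"))  # owner f1, after the swap
```

After the swap the data is visible through f1, yet the object still records f0's FID as its parent, which is the stale state the second call reports.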
| Comment by Andreas Dilger [ 08/Apr/13 ] |
|
John, could you please file the layout swap issue as a separate bug? It is a known issue with the current layout swap implementation, and while a solution was discussed at one point (IIRC, sending setattr to the OSTs to update the parents), I guess it has not been implemented yet. It isn't a totally critical failure, since the parent FID is only used in case of MDT corruption and it will eventually be fixed by LFSCK Phase 2, but it would still be better to get it correct during the swap. |
| Comment by John Hammond [ 08/Apr/13 ] |
|
I have created |
| Comment by Andreas Dilger [ 09/Apr/13 ] |
|
There is a prerequisite patch also: http://review.whamcloud.com/5967 |
| Comment by Jodi Levi (Inactive) [ 11/Apr/13 ] |
|
Landed for 2.4 |