[LU-3128] filter_fid on OST not updated during layout swap Created: 08/Apr/13 Updated: 27/Nov/17 Resolved: 27/Nov/17 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | John Hammond | Assignee: | Mikhail Pershin |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | mdd | ||
| Issue Links: |
|
||||||||||||
| Severity: | 3 | ||||||||||||
| Rank (Obsolete): | 7599 | ||||||||||||
| Description |
|
LMA on OST is not updated during layout swap.
# llmount.sh
...
# cd /mnt/lustre
# lfs setstripe -c2 f0
# dd if=/dev/zero bs=1M count=2 | tr '\0' 'X' > f0
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 0.0201865 s, 104 MB/s
# lfs getstripe f0
f0
lmm_stripe_count: 2
lmm_stripe_size: 1048576
lmm_layout_gen: 0
lmm_stripe_offset: 0
obdidx objid objid group
0 1 0x1 0
1 1 0x1 0
# touch f1
# ls -lh
total 2.0M
-rw-r--r-- 1 root root 2.0M Apr 8 13:54 f0
-rw-r--r-- 1 root root 0 Apr 8 13:54 f1
# lfs path2fid f0
[0x200000400:0x1:0x0]
# lfs path2fid f1
[0x200000400:0x3:0x0]
# umount /mnt/ost1
# mount /tmp/lustre-ost1 /mnt/ost1 -t ldiskfs -o loop
# ls -lh /mnt/ost1/O/0/d1/1
-rw-rw-rw- 1 root root 1.0M Apr 8 13:54 /mnt/ost1/O/0/d1/1
# ll_decode_filter_fid /mnt/ost1/O/0/d1/1
/mnt/ost1/O/0/d1/1: objid=4294967296 seq=1 parent=[0x200000400:0x1:0x0]
# umount /mnt/ost1
# mount /tmp/lustre-ost1 /mnt/ost1 -t lustre -o loop
# lfs swap_layouts f0 f1
# ls -lh
total 2.0M
-rw-r--r-- 1 root root 0 Apr 8 13:57 f0
-rw-r--r-- 1 root root 2.0M Apr 8 13:57 f1
# umount /mnt/ost1
# mount /tmp/lustre-ost1 /mnt/ost1 -t ldiskfs -o loop
# ls -lh /mnt/ost1/O/0/d1/1
-rw-rw-rw- 1 root root 1.0M Apr 8 13:57 /mnt/ost1/O/0/d1/1
# ll_decode_filter_fid /mnt/ost1/O/0/d1/1
/mnt/ost1/O/0/d1/1: objid=4294967296 seq=1 parent=[0x200000400:0x1:0x0]
|
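The reproducer inspects the backing object at /mnt/ost1/O/0/d1/1 after mounting the OST as ldiskfs. A minimal sketch of how that path relates to the objid reported by lfs getstripe is given below. It assumes group 0 and 32 hash subdirectories (d0 to d31) under O/0, which matches every path in this ticket; the function name `ost_object_path` and the `subdirs` parameter are hypothetical and not part of any Lustre tool.

```python
def ost_object_path(mount_point, objid, subdirs=32):
    """Build the ldiskfs path of an OST object in group 0.

    Assumption: objects are spread over `subdirs` directories named
    d<objid % subdirs>. Only group 0 is covered here, as in the ticket.
    """
    return "%s/O/0/d%d/%d" % (mount_point, objid % subdirs, objid)


# objid 1 from the lfs getstripe output in the description
print(ost_object_path("/mnt/ost1", 1))  # prints /mnt/ost1/O/0/d1/1
```

The same helper gives /mnt/ost1/O/0/d5/5 and /mnt/ost1/O/0/d2/2 for the objids used in the later comments.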
| Comments |
| Comment by Peter Jones [ 09/Apr/13 ] |
|
Bob will look into this |
| Comment by Bob Glossman (Inactive) [ 09/Apr/13 ] |
|
From the discussion in |
| Comment by John Hammond [ 10/Apr/13 ] |
|
Hi Bob. After the layout swap ll_decode_filter_fid should return the FID of the new parent.
# lfs setstripe -c2 f0
# touch f1
# dd if=/dev/zero of=f0 bs=1M count=2
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 0.02666 s, 78.7 MB/s
# lfs path2fid f0
[0x200000400:0x5:0x0]
# lfs path2fid f1
[0x200000400:0x6:0x0]
# lfs getstripe f0
f0
lmm_stripe_count: 2
lmm_stripe_size: 1048576
lmm_layout_gen: 0
lmm_stripe_offset: 1
obdidx objid objid group
1 5 0x5 0
0 5 0x5 0
# umount /mnt/ost1
# mount /tmp/lustre-ost1 /mnt/ost1 -t ldiskfs -o loop
# ls -l /mnt/ost1/O/0/d5/5
-rw-rw-rw- 1 root root 1048576 Apr 10 10:20 /mnt/ost1/O/0/d5/5
# ll_decode_filter_fid /mnt/ost1/O/0/d5/5
/mnt/ost1/O/0/d5/5: parent=[0x200000400:0x5:0x0] stripe=1 ### Correct.
# umount /mnt/ost1
# mount /tmp/lustre-ost1 /mnt/ost1 -t lustre -o loop
# lfs swap_layouts f0 f1
# ls -lh
total 2.0M
-rw-r--r-- 1 root root 0 Apr 10 10:22 f0
-rw-r--r-- 1 root root 2.0M Apr 10 10:22 f1
# umount /mnt/ost1
# mount /tmp/lustre-ost1 /mnt/ost1 -t ldiskfs -o loop
# ll_decode_filter_fid /mnt/ost1/O/0/d5/5
/mnt/ost1/O/0/d5/5: parent=[0x200000400:0x5:0x0] stripe=1 ### Incorrect parent should be [0x200000400:0x6:0x0].
|
| Comment by Bob Glossman (Inactive) [ 11/Apr/13 ] |
|
Does look like the problem still reproduces with the commit from
[root@centos1 lustre]# lfs setstripe -c2 f0
[root@centos1 lustre]# touch f1
[root@centos1 lustre]# dd if=/dev/zero of=f0 bs=1M count=2
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 0.00989071 s, 212 MB/s
[root@centos1 lustre]# lfs path2fid f0
[0x200000400:0x1:0x0]
[root@centos1 lustre]# lfs path2fid f1
[0x200000400:0x2:0x0]
[root@centos1 lustre]# lfs getstripe f0
f0
lmm_stripe_count: 2
lmm_stripe_size: 1048576
lmm_layout_gen: 0
lmm_stripe_offset: 1
obdidx objid objid group
1 2 0x2 0
0 2 0x2 0
[root@centos1 lustre]# umount /mnt/ost1
[root@centos1 lustre]# mount /tmp/lustre-ost1 /mnt/ost1 -t ldiskfs -o loop
[root@centos1 lustre]# ll_decode_filter_fid /mnt/ost1/O/0/d2/2
/mnt/ost1/O/0/d2/2: parent=[0x200000400:0x1:0x0] stripe=1
[root@centos1 lustre]# umount /mnt/ost1
[root@centos1 lustre]# mount /tmp/lustre-ost1 /mnt/ost1 -t lustre -o loop
[root@centos1 lustre]# lfs swap_layouts f0 f1
[root@centos1 lustre]# ls -lh
total 2.0M
-rw-r--r-- 1 root root 0 Apr 11 11:33 f0
-rw-r--r-- 1 root root 2.0M Apr 11 11:33 f1
[root@centos1 lustre]# umount /mnt/ost1
[root@centos1 lustre]#
[root@centos1 lustre]# mount /tmp/lustre-ost1 /mnt/ost1 -t ldiskfs -o loop
[root@centos1 lustre]# ll_decode_filter_fid /mnt/ost1/O/0/d2/2
/mnt/ost1/O/0/d2/2: parent=[0x200000400:0x1:0x0] stripe=1
|
| Comment by Mikhail Pershin [ 18/Apr/13 ] |
|
John, are you sure this occurred after |
| Comment by John Hammond [ 18/Apr/13 ] |
|
Mike, you're right about the description and the EAs. Sorry. I sort of copied that from the summary of
# git describe
2.3.64-7-gc4f7a77
# llmount.sh
...
# cd /mnt/lustre
# lfs setstripe -c2 f0
# dd if=/dev/zero of=f0 bs=1M count=2
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 0.0128964 s, 163 MB/s
# lfs path2fid f0
[0x200000400:0x1:0x0]
# lfs getstripe f0
f0
lmm_stripe_count: 2
lmm_stripe_size: 1048576
lmm_layout_gen: 0
lmm_stripe_offset: 1
obdidx objid objid group
1 2 0x2 0
0 2 0x2 0
# touch f1
# lfs path2fid f1
[0x200000400:0x3:0x0]
# lfs getstripe f1
f1
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_layout_gen: 0
lmm_stripe_offset: 1
obdidx objid objid group
1 3 0x3 0
# ls -lh
total 2.0M
-rw-r--r-- 1 root root 2.0M Apr 18 08:59 f0
-rw-r--r-- 1 root root 0 Apr 18 08:59 f1
#
# umount /mnt/ost1
# mount /tmp/lustre-ost1 /mnt/ost1 -t ldiskfs -o loop
#
# ls -lh /mnt/ost1/O/0/d2/2
-rw-rw-rw- 1 root root 1.0M Apr 18 08:59 /mnt/ost1/O/0/d2/2
# sys_listxattr /mnt/ost1/O/0/d2/2
'trusted.lma' '000000000000000000000000010000000200000000000000'
'trusted.fid' '00040000020000000100000001000000'
# ll_decode_filter_fid /mnt/ost1/O/0/d2/2
/mnt/ost1/O/0/d2/2: objid=4294967296 seq=1 parent=[0x200000400:0x1:0x1]
#
# umount /mnt/ost1
# mount /tmp/lustre-ost1 /mnt/ost1 -t lustre -o loop
#
# lfs swap_layouts f0 f1
# ls -lh
total 2.0M
-rw-r--r-- 1 root root 0 Apr 18 08:59 f0
-rw-r--r-- 1 root root 2.0M Apr 18 08:59 f1
#
# umount /mnt/ost1
# mount /tmp/lustre-ost1 /mnt/ost1 -t ldiskfs -o loop
#
# ls -lh /mnt/ost1/O/0/d2/2
-rw-rw-rw- 1 root root 1.0M Apr 18 08:59 /mnt/ost1/O/0/d2/2
# sys_listxattr /mnt/ost1/O/0/d2/2
'trusted.lma' '000000000000000000000000010000000200000000000000'
'trusted.fid' '00040000020000000100000001000000'
# ll_decode_filter_fid /mnt/ost1/O/0/d2/2
/mnt/ost1/O/0/d2/2: objid=4294967296 seq=1 parent=[0x200000400:0x1:0x1]
|
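The raw trusted.fid and trusted.lma values in the transcript above can be decoded by hand to check what ll_decode_filter_fid reports. The sketch below is an illustration only. It assumes the 16-byte trusted.fid value is a bare little-endian lu_fid (64-bit sequence, 32-bit object ID, 32-bit field carrying the stripe index), and that the 24-byte trusted.lma value is two 32-bit flag words followed by the object's own lu_fid. The function names are hypothetical and are not taken from the Lustre source.

```python
import struct


def decode_filter_fid(hex_value):
    """Decode a 16-byte trusted.fid value into (seq, oid, stripe).

    Assumption: little-endian u64 seq, u32 oid, u32 stripe index.
    """
    return struct.unpack("<QII", bytes.fromhex(hex_value))


def decode_lma(hex_value):
    """Decode a 24-byte trusted.lma value into (compat, incompat, self_fid).

    Assumption: two little-endian u32 flag words, then a lu_fid.
    """
    compat, incompat, seq, oid, ver = struct.unpack(
        "<IIQII", bytes.fromhex(hex_value))
    return compat, incompat, (seq, oid, ver)


def format_fid(fid):
    """Render a (seq, oid, ver) tuple in the [seq:oid:ver] notation."""
    return "[0x%x:0x%x:0x%x]" % fid


# Values copied from the sys_listxattr output above
parent = decode_filter_fid("00040000020000000100000001000000")
print(format_fid(parent))  # prints [0x200000400:0x1:0x1]

_, _, self_fid = decode_lma(
    "000000000000000000000000010000000200000000000000")
print(format_fid(self_fid))  # prints [0x100000000:0x2:0x0]
```

Under these assumptions the decoded parent matches the parent=[0x200000400:0x1:0x1] reported by ll_decode_filter_fid, and the xattr bytes are identical before and after the swap, so the object still points at f0 rather than at f1 ([0x200000400:0x3:0x0]).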
| Comment by Mikhail Pershin [ 19/Apr/13 ] |
|
John, am I right that it was not updated before |
| Comment by Alex Zhuravlev [ 19/Apr/13 ] |
|
Is this really a blocker? I would hope it's not. I think LOD, when replacing the layout, should take care of filter_fid: call ->do_xattr_set(XATTR_NAME_FID, ..), then OSP should llog the change and send the appropriate OST_SET_ATTR? |
| Comment by Andreas Dilger [ 22/Apr/13 ] |
|
Alex, you are completely right - this should be handled by layout swap code, but this isn't done yet. The LFSCK Phase 2 code will correct the parent FID on the OST objects, but I'd rather that it be done correctly during swap under normal conditions. |
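The expected behaviour discussed above can be stated as an invariant: after a swap, every OST object in a file's layout should record that file's FID as its parent. The toy model below illustrates this with the FIDs from the 10/Apr/13 reproducer. It is not Lustre code; the classes `OstObject` and `File`, the `update_parent` switch, and `stale_objects` are hypothetical names introduced only for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class OstObject:
    objid: int
    parent_fid: str  # what the filter_fid xattr would record
    stripe: int


@dataclass
class File:
    fid: str
    layout: list = field(default_factory=list)


def swap_layouts(a, b, update_parent):
    """Exchange layouts; optionally rewrite each object's parent FID."""
    a.layout, b.layout = b.layout, a.layout
    if update_parent:
        for f in (a, b):
            for obj in f.layout:
                obj.parent_fid = f.fid


def stale_objects(f):
    """Objects in f's layout whose recorded parent is not f."""
    return [o for o in f.layout if o.parent_fid != f.fid]


F0, F1 = "[0x200000400:0x5:0x0]", "[0x200000400:0x6:0x0]"
f0 = File(F0, [OstObject(5, F0, 0), OstObject(5, F0, 1)])
f1 = File(F1)

# Behaviour reported in this ticket: layouts move, parent FIDs do not.
swap_layouts(f0, f1, update_parent=False)
print(len(stale_objects(f1)))  # prints 2
```

Calling `swap_layouts` with `update_parent=True` models the behaviour the comments ask for, where no object is left pointing at the old parent.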
| Comment by Mikhail Pershin [ 11/Apr/14 ] |
|
I need to check whether the bug still exists in master, then either fix it or close the ticket. Lowering the priority to Major. |
| Comment by Jinshan Xiong (Inactive) [ 27/Nov/17 ] |
|
This problem will be fixed in |