[LU-3943] incorrect inode count in lfs df -i Created: 12/Sep/13 Updated: 02/Jun/14 Resolved: 27/Sep/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 1.8.7 |
| Fix Version/s: | Lustre 2.6.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Kit Westneat (Inactive) | Assignee: | Jian Yu |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 10432 |
| Description |
|
SFU recently reformatted their MDT to have more inodes, using -i. We did a file-level backup and restore. Everything looks good except client-side df and lfs df are both reporting the old inode count:
client1# lfs df -i
mds# df -i /dev/vg_lfs_scra/mdt
mds# dumpe2fs -h /dev/vg_lfs_scra/mdt
I'm pretty puzzled by this. Is there something I'm doing wrong? Any other information I can get you? |
| Comments |
| Comment by Peter Jones [ 13/Sep/13 ] |
|
Yu, Jian
Could you please advise on this one?
Thanks
Peter |
| Comment by Kit Westneat (Inactive) [ 23/Sep/13 ] |
|
Any updates? Thanks, |
| Comment by Jian Yu [ 24/Sep/13 ] |
|
Hi Kit, Am I correct that you changed the value of bytes-per-inode from 2048 to 1024 for the MDT? |
| Comment by Jian Yu [ 24/Sep/13 ] |
|
I just did an experiment on Lustre 1.8.7-wc1 and got the following results:
Format the filesystem with "-i 2048" by default for MDT:
[root@fat-amd-2 ~]# mkfs.lustre --mgs --mdt --fsname=lustre --device-size=240000000 --reformat /dev/sdc5
...
mkfs_cmd = mke2fs -j -b 4096 -L lustre-MDTffff -J size=400 -I 512 -i 2048 -q -O uninit_bg,dir_nlink,huge_file,flex_bg -E lazy_journal_init -F /dev/sdc5 60000000
Writing CONFIGS/mountdata
[root@fat-amd-2 ~]# mkfs.lustre --ost --fsname=lustre --mgsnode=fat-amd-2@tcp --device-size=240000000 --reformat /dev/sdc6
...
mkfs_cmd = mke2fs -j -b 4096 -L lustre-OSTffff -J size=400 -I 256 -i 69905 -q -O extents,uninit_bg,dir_nlink,huge_file,flex_bg -G 256 -E resize=4290772992,lazy_journal_init -F /dev/sdc6 60000000
Writing CONFIGS/mountdata
[root@fat-amd-2 ~]# mkdir -p /mnt/mds; mount -t lustre -o user_xattr /dev/sdc5 /mnt/mds
[root@fat-amd-2 ~]# mkdir -p /mnt/ost1; mount -t lustre /dev/sdc6 /mnt/ost1
[root@fat-amd-2 ~]# mount -t lustre -o user_xattr,flock fat-amd-2@tcp:/lustre /mnt/lustre
[root@fat-amd-2 ~]# lfs df /mnt/lustre
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 179966864 483840 167483384 0% /mnt/lustre[MDT:0]
lustre-OST0000_UUID 239105024 470300 226634660 0% /mnt/lustre[OST:0]
filesystem summary: 239105024 470300 226634660 0% /mnt/lustre
[root@fat-amd-2 ~]# df /mnt/lustre
Filesystem 1K-blocks Used Available Use% Mounted on
fat-amd-2@tcp:/lustre
239105024 470300 226634660 1% /mnt/lustre
[root@fat-amd-2 ~]# df /mnt/mds
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdc5 179966864 483840 167483384 1% /mnt/mds
[root@fat-amd-2 ~]# lfs df -i /mnt/lustre
UUID Inodes IUsed IFree IUse% Mounted on
lustre-MDT0000_UUID 120003328 25 120003303 0% /mnt/lustre[MDT:0]
lustre-OST0000_UUID 3517440 56 3517384 0% /mnt/lustre[OST:0]
filesystem summary: 120003328 25 120003303 0% /mnt/lustre
[root@fat-amd-2 ~]# df -i /mnt/lustre
Filesystem Inodes IUsed IFree IUse% Mounted on
fat-amd-2@tcp:/lustre
3517409 25 3517384 1% /mnt/lustre
[root@fat-amd-2 ~]# df -i /mnt/mds
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sdc5 120003328 25 120003303 1% /mnt/mds
Unmount and reformat the filesystem with "-i 1024" for MDT:
[root@fat-amd-2 ~]# mkfs.lustre --mgs --mdt --fsname=lustre --mkfsoptions="-i 1024" --device-size=240000000 --reformat /dev/sdc5
...
mkfs_cmd = mke2fs -j -b 4096 -L lustre-MDTffff -i 1024 -J size=400 -I 512 -q -O uninit_bg,dir_nlink,huge_file,flex_bg -E lazy_journal_init -F /dev/sdc5 60000000
Writing CONFIGS/mountdata
[root@fat-amd-2 ~]# mkfs.lustre --ost --fsname=lustre --mgsnode=fat-amd-2@tcp --device-size=240000000 --reformat /dev/sdc6
...
mkfs_cmd = mke2fs -j -b 4096 -L lustre-OSTffff -J size=400 -I 256 -i 69905 -q -O extents,uninit_bg,dir_nlink,huge_file,flex_bg -G 256 -E resize=4290772992,lazy_journal_init -F /dev/sdc6 60000000
Writing CONFIGS/mountdata
[root@fat-amd-2 ~]# mkdir -p /mnt/mds; mount -t lustre -o user_xattr /dev/sdc5 /mnt/mds
[root@fat-amd-2 ~]# mkdir -p /mnt/ost1; mount -t lustre /dev/sdc6 /mnt/ost1
[root@fat-amd-2 ~]# mount -t lustre -o user_xattr,flock fat-amd-2@tcp:/lustre /mnt/lustre
[root@fat-amd-2 ~]# lfs df /mnt/lustre
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 119901352 487936 107414036 0% /mnt/lustre[MDT:0]
lustre-OST0000_UUID 239105024 470300 226634660 0% /mnt/lustre[OST:0]
filesystem summary: 239105024 470300 226634660 0% /mnt/lustre
[root@fat-amd-2 ~]# df /mnt/lustre
Filesystem 1K-blocks Used Available Use% Mounted on
fat-amd-2@tcp:/lustre
239105024 470300 226634660 1% /mnt/lustre
[root@fat-amd-2 ~]# df /mnt/mds
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdc5 119901352 487936 107414036 1% /mnt/mds
[root@fat-amd-2 ~]# lfs df -i /mnt/lustre
UUID Inodes IUsed IFree IUse% Mounted on
lustre-MDT0000_UUID 240046264 25 240046239 0% /mnt/lustre[MDT:0]
lustre-OST0000_UUID 3517440 56 3517384 0% /mnt/lustre[OST:0]
filesystem summary: 240046264 25 240046239 0% /mnt/lustre
[root@fat-amd-2 ~]# df -i /mnt/lustre
Filesystem Inodes IUsed IFree IUse% Mounted on
fat-amd-2@tcp:/lustre
3517409 25 3517384 1% /mnt/lustre
[root@fat-amd-2 ~]# df -i /mnt/mds
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sdc5 240046264 25 240046239 1% /mnt/mds
The inode count was increased from 120003328 to 240046264 properly. |
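As a rough cross-check of the two counts above, the inode total can be estimated as device bytes divided by bytes-per-inode. This is only a sketch: `estimate_inodes` is a hypothetical helper, not part of Lustre or e2fsprogs, and the counts actually reported (120003328 and 240046264) are slightly higher than the estimate, presumably because mke2fs rounds the per-group inode count.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: estimate the inode count mke2fs would create for a
 * device of `blocks` filesystem blocks of `block_size` bytes, given the
 * "-i bytes_per_inode" ratio. Ignores per-block-group rounding.
 */
static uint64_t estimate_inodes(uint64_t blocks, uint64_t block_size,
                                uint64_t bytes_per_inode)
{
    assert(bytes_per_inode != 0);
    return blocks * block_size / bytes_per_inode;
}
```

With the 60000000 blocks of 4096 bytes from the mke2fs command lines above, this gives 120000000 for "-i 2048" and 240000000 for "-i 1024", in line with the observed doubling.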
| Comment by Kit Westneat (Inactive) [ 24/Sep/13 ] |
|
Yes, bytes per inode was changed and then the contents of the old MDT were copied to the new MDT. Is the inode count stored somewhere on the MDT? |
| Comment by Andreas Dilger [ 25/Sep/13 ] |
|
The inode count isn't stored explicitly by Lustre anywhere (of course it is in the superblock of the underlying filesystem). I suspect the problem you are seeing is that "lfs df -i" and "df -i" are showing the worst case for the number of files that can be created in the filesystem. In 1.8.7 there are several limits put on the statfs() value returned to the client to ensure that the reported number of free files can actually be created. The free inode count is limited by:
/*
 * We need to hack the return value for the free inode counts because
 * the current EA code requires one filesystem block per inode with EAs,
 * so it is possible to run out of blocks before we run out of inodes.
 *
 * This can be removed when the ext3 EA code is fixed.
 */
static int fsfilt_ext3_statfs(struct super_block *sb, struct obd_statfs *osfs)
{
        struct kstatfs sfs;
        int rc;

        memset(&sfs, 0, sizeof(sfs));
        rc = ll_do_statfs(sb, &sfs);
        if (!rc && sfs.f_bfree < sfs.f_ffree) {
                sfs.f_files = (sfs.f_files - sfs.f_ffree) + sfs.f_bfree;
                sfs.f_ffree = sfs.f_bfree;
        }

        statfs_pack(osfs, &sfs);
        return rc;
}
This was removed in 1.8.7 because it caused more confusion than necessary. I believe that you must be running an older MDS server version that still has this code, because your free inode count exactly matches the free blocks count.
If MDT inodes are created that do not consume OST objects (e.g. directories, internal log files, files explicitly striped with fewer than the default number of objects) or they do not consume extra MDT data blocks (e.g. most files excluding directories) then the number of free inodes in the filesystem will not decrease, and instead the total number of inodes will appear to increase. This was done because typically users of "df" or statfs() care about the free and used space and not the total space.
See the comment in ll_statfs_internal():
/* If we don't have as many objects free on the OST as inodes
 * on the MDS, we reduce the total number of inodes to
 * compensate, so that the "inodes in use" number is correct.
 */
if (obd_osfs.os_ffree < osfs->os_ffree) {
        osfs->os_files = (osfs->os_files - osfs->os_ffree) +
                         obd_osfs.os_ffree;
        osfs->os_ffree = obd_osfs.os_ffree;
}
If more OSTs are added, or if the default stripe count is reduced (if not 1) then the number of files that can be created in the filesystem will appear to increase. It is usually desirable for the MDT to be over-provisioned with inodes so that it will not run out before the OSTs run out of space. |
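Both limits quoted in this comment apply the same adjustment: cap the free inode count at some smaller resource count and shrink the total so that "inodes in use" stays the same. The following is a minimal user-space sketch of that arithmetic, not the Lustre code itself; `struct inode_counts` and `clamp_free_inodes` are hypothetical names standing in for the os_files/os_ffree fields of struct obd_statfs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the inode counters in struct obd_statfs. */
struct inode_counts {
    uint64_t files;  /* total inodes (os_files / f_files) */
    uint64_t ffree;  /* free inodes  (os_ffree / f_ffree) */
};

/*
 * Cap the free inode count at `limit` (free MDT blocks in the old server
 * code, free OST objects in the client code) and reduce the total by the
 * same amount, so that files - ffree ("inodes in use") is unchanged.
 */
static struct inode_counts clamp_free_inodes(struct inode_counts c,
                                             uint64_t limit)
{
    assert(c.ffree <= c.files);
    if (limit < c.ffree) {
        c.files = (c.files - c.ffree) + limit;
        c.ffree = limit;
    }
    return c;
}
```

Applied to the "-i 2048" experiment earlier in this ticket (MDT: 120003328 inodes, 120003303 free; OST: 3517384 free objects), this yields 3517409 total and 3517384 free, which are the values "df -i /mnt/lustre" reported on the client.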
| Comment by Kit Westneat (Inactive) [ 27/Sep/13 ] |
|
Ah, you're right: they are actually running 1.8.6, my fault. Thanks for the explanation, I think this can be closed. |
| Comment by Peter Jones [ 27/Sep/13 ] |
|
ok thanks Kit! |
| Comment by Andreas Dilger [ 18/Dec/13 ] |
|
I found a patch on one of my systems to fix the "lfs df -i" inode summary to match "df -i" if the OST free objects count is less than the MDT free inode count: |