Details
-
Bug
-
Resolution: Unresolved
-
Major
Description
Running e2fsck on an MDT with a very large REMOTE_PARENT_DIR is extremely slow if entries in that directory need to be repaired. On one system with a 60M-entry REMOTE_PARENT_DIR, each directory entry took about 1.4s to repair under PR_3_UNCONNECTED_DIR:
Unconnected directory inode 2102494 (/REMOTE_PARENT_DIR/???)
Connect to /lost+found? yes
Unconnected directory inode 2102510 (/REMOTE_PARENT_DIR/???)
Connect to /lost+found? yes
Unconnected directory inode 2102514 (/REMOTE_PARENT_DIR/???)
Connect to /lost+found? yes
Depending on how many unattached entries there are, this might take days, weeks, or even months to complete (1M files might take 2 weeks to repair).
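The estimate above is simple serial arithmetic: entries times seconds per entry, converted to days. A minimal sketch follows; the helper name repair_days() is hypothetical and is not part of e2fsprogs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: estimated wall-clock days to repair `entries`
 * unattached inodes one after another at `secs_per_entry` seconds each. */
static double repair_days(uint64_t entries, double secs_per_entry)
{
    assert(secs_per_entry >= 0.0);
    return (double)entries * secs_per_entry / 86400.0; /* 86400 s per day */
}
```

With the 1.4s figure measured above, repair_days(1000000, 1.4) comes to a little over 16 days, which matches the "2 weeks" estimate.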
Attaching ltrace to e2fsck showed that all of the time is spent in ext2fs_get_pathname(), which opens and iterates through all of the entries in the huge directory. ltrace itself slowed the per-entry repair time from 1s to 14s, but the fraction of time spent in this function is the same:
1638486316.336885 ext2fs_read_inode(0x18ab2f0, 0x261db5f2, 0x7ffdc97a2d00, 10) = 0 <0.000069>
1638486316.336977 ext2fs_link(0x18ab2f0, 11, 0x7ffdc97a2d80, 0x261db5f2) = 0 <0.001130>
1638486316.338130 ext2fs_read_inode(0x18ab2f0, 0x261db5f2, 0x7ffdc97a2bd0, 0xa626870) = 0 <0.000071>
1638486316.338223 ext2fs_icount_increment(0x383027b0, 0x261db5f2, 0, 0x18ab2b0) = 0 <0.000084>
1638486316.338329 ext2fs_icount_increment(0x1efa400, 0x261db5f2, 0, 0) = 0 <0.000073>
1638486316.338425 ext2fs_write_inode(0x18ab2f0, 0x261db5f2, 0x7ffdc97a2bd0, 0) = 0 <0.000094>
1638486316.338542 ext2fs_u32_list_test(0x1efa310, 0x261db5f2, 11, 0) = 0 <0.000069>
1638486316.338633 ext2fs_dir_iterate(0x18ab2f0, 0x261db5f2, 1, 0 <unfinished ...>
1638486316.338727 ext2fs_read_inode(0x18ab2f0, 0x83f7c001, 0x7ffdc97a28f0, 0) = 0 <0.000071>
1638486316.338819 ext2fs_icount_decrement(0x383027b0, 0x83f7c001, 0, 0x18ab2c0) = 0 <0.000080>
1638486316.338921 ext2fs_read_inode(0x18ab2f0, 11, 0x7ffdc97a28f0, 0) = 0 <0.000070>
1638486316.339014 ext2fs_icount_increment(0x383027b0, 11, 0, 0x18ab2a0) = 0 <0.000075>
1638486316.339111 ext2fs_icount_increment(0x1efa400, 11, 0, 0) = 0 <0.000071>
1638486316.339205 ext2fs_write_inode(0x18ab2f0, 11, 0x7ffdc97a28f0, 0) = 0 <0.000087>
1638486316.339313 <... ext2fs_dir_iterate resumed> ) = 0 <0.000679>
1638486316.339337 ext2fs_test_generic_bmap(0x1efa870, 0x261db5f3, 0x7f527a6cce48, 0x7f52765db010) = 4 <0.000070>
1638486316.339428 ext2fs_mark_generic_bmap(0x296ee90, 0x261db5f3, 4, 2) = 0 <0.000070>
1638486316.339521 ext2fs_mark_generic_bmap(0x296ee90, 0x83f7c001, 0x261db5f3, 0) = 1 <0.000070>
1638486316.339614 ext2fs_test_generic_bmap(0x1efa870, 0x261db5f4, 0x7f527a6cce54, 0x7f52765db010) = 8 <0.000069>
1638486316.339705 ext2fs_mark_generic_bmap(0x296ee90, 0x261db5f4, 8, 3) = 0 <0.000070>
1638486316.339798 ext2fs_mark_generic_bmap(0x296ee90, 0x1a3e72d1, 0x261db5f4, 0) = 1 <0.000073>
1638486316.339894 ext2fs_test_generic_bmap(0x1efa870, 0x261db5f5, 0x7f527a6cce60, 0x7f52765db010) = 16 <0.000069>
1638486316.339985 ext2fs_mark_generic_bmap(0x296ee90, 0x261db5f5, 16, 4) = 0 <0.000069>
1638486316.340077 ext2fs_mark_generic_bmap(0x296ee90, 0x83f7c001, 0x261db5f5, 0) = 1 <0.000069>
1638486316.340168 ext2fs_test_generic_bmap(0x1efa870, 0x261db5f6, 0x7f527a6cce6c, 0x7f52765db010) = 32 <0.000069>
1638486316.340260 ext2fs_mark_generic_bmap(0x296ee90, 0x261db5f6, 32, 5) = 0 <0.000068>
1638486316.340350 ext2fs_mark_generic_bmap(0x296ee90, 0x545ad40d, 0x261db5f6, 0) = 0 <0.000069>
1638486316.340443 ext2fs_mark_generic_bmap(0x296ee90, 0x545ad40c, 0x2434746, 0) = 0 <0.000069>
1638486316.340534 ext2fs_mark_generic_bmap(0x296ee90, 0x1ec6326d, 0x2434743, 0) = 16 <0.000069>
1638486316.340625 ext2fs_test_generic_bmap(0x1efa870, 0x261db5f7, 0x7f527a6cce78, 0x7f52765db010) = 64 <0.000070>
1638486316.340717 ext2fs_mark_generic_bmap(0x296ee90, 0x261db5f7, 64, 6) = 0 <0.000069>
1638486316.340811 dcgettext(0, 0x448684, 5, 335) = 0x448684 <0.000079>
1638486316.340916 __fprintf_chk(0x7f5363936400, 1, 0x44cb40, 12) = 12 <0.000095>
1638486316.341033 dcgettext(0, 0x44cc4d, 5, 12) = 0x44cc4d <0.000072>
1638486316.341128 __fprintf_chk(0x7f5363936400, 1, 0x44cb40, 9) = 9 <0.000088>
1638486316.341239 __fprintf_chk(0x7f5363936400, 1, 0x44cb40, 1) = 1 <0.000089>
1638486316.341350 dcgettext(0, 0x44ccb0, 5, 1) = 0x44ccb0 <0.000071>
1638486316.341445 __fprintf_chk(0x7f5363936400, 1, 0x44cb40, 5) = 5 <0.000089>
1638486316.341557 __fprintf_chk(0x7f5363936400, 1, 0x44cb40, 1) = 1 <0.000092>
1638486316.341673 __ctype_b_loc() = 0x7f5364a2d6f0 <0.000062>
1638486316.341758 __fprintf_chk(0x7f5363936400, 1, 0x44cafd, 0) = 9 <0.000094>
1638486316.341874 __fprintf_chk(0x7f5363936400, 1, 0x44cb40, 2) = 2 <0.000089>
1638486316.341985 __ctype_b_loc() = 0x7f5364a2d6f0 <0.000062>
1638486316.342069 ext2fs_get_pathname(0x18ab2f0, 0x261db5f7, 0, 0x7ffdc97a2ce0) = 0 <13.722823>
1638486330.064926 strlen("/REMOTE_PARENT_DIR/???") = 22 <0.000133>
It isn't currently possible to reduce the number of unattached inodes (LU-14168 might avoid attaching them to lost+found; see the options there), and it isn't possible to reduce the size of REMOTE_PARENT_DIR retroactively (LU-10329 and LU-15314 can avoid this in the future), so the ext2fs_get_pathname() function itself must be sped up by a few orders of magnitude.
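To illustrate where a few orders of magnitude could come from, here is a rough, self-contained sketch of one possible direction, not the e2fsprogs implementation or a proposed patch. It contrasts a full directory scan per lookup with a reverse index that is built once and then searched by inode number. The struct dent type and all function names are hypothetical, and the directory is modelled as a plain in-memory array rather than on-disk blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory stand-in for one directory entry. */
struct dent {
    uint32_t ino;
    const char *name;
};

/* Full scan per lookup: O(n) per call, so O(m * n) for m unattached inodes. */
static const char *scan_lookup(const struct dent *d, size_t n, uint32_t ino)
{
    for (size_t i = 0; i < n; i++)
        if (d[i].ino == ino)
            return d[i].name;
    return NULL;
}

/* Order entries by inode number without risking integer overflow. */
static int dent_cmp(const void *a, const void *b)
{
    uint32_t x = ((const struct dent *)a)->ino;
    uint32_t y = ((const struct dent *)b)->ino;
    return (x > y) - (x < y);
}

/* One-time O(n log n) pass: copy the entries and sort the copy by inode
 * number. Returns NULL on allocation failure; the caller frees the result. */
static struct dent *build_index(const struct dent *d, size_t n)
{
    assert(n > 0);
    struct dent *idx = malloc(n * sizeof(*idx));
    if (idx == NULL)
        return NULL;
    memcpy(idx, d, n * sizeof(*idx));
    qsort(idx, n, sizeof(*idx), dent_cmp);
    return idx;
}

/* O(log n) per lookup against the sorted copy. */
static const char *index_lookup(const struct dent *idx, size_t n, uint32_t ino)
{
    struct dent key = { ino, NULL };
    const struct dent *hit = bsearch(&key, idx, n, sizeof(*idx), dent_cmp);
    return hit ? hit->name : NULL;
}
```

The trade-off in this sketch is memory: the index holds one record per directory entry for the whole pass, which is significant for a 60M-entry directory. Whether that is acceptable inside e2fsck, or whether the name lookup should instead be skipped or bounded for very large directories, is left open here.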