[LU-14894] Parallel pass2 support for e2fsck Created: 30/Jul/21  Updated: 10/May/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major
Reporter: Wang Shilong (Inactive) Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: e2fsck, ldiskfs

Issue Links:
Duplicate
is duplicated by LU-14679 parallel e2fsck for pass2 directory s... Resolved
Related
is related to LU-14168 e2fsck should avoid moving files into... Open
is related to LU-14213 enable parallel e2fsck by default Open
is related to LU-8465 parallel e2fsck performance at scale Resolved
is related to LU-16170 parallel e2fsck summary inode count i... Open
is related to LU-16169 parallel e2fsck pass1 balanced group ... Resolved

 Description   

LU-8465 implemented parallel inode table scanning for e2fsck pass1, and parallel bitmap loading for e2fsck pass5.

After these improvements, e2fsck of a 50% full 1PB filesystem running with 256 threads improved pass1 from 7777s to 191s (40x faster), and pass5 from 10523s to 286s (36x faster).

However, pass2 improved only marginally, from 1363s to 1265s (7%). It now dominates the total e2fsck runtime (70% in this test) on large OSTs, and limits the aggregate speedup of e2fsck to "only" 10x despite the other improvements.
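The arithmetic behind these figures can be reproduced with a short sketch. It uses only the pass1, pass2, and pass5 timings quoted above; the other passes are not listed in this ticket, so the totals here are an assumption that covers these three passes only. The variable and function names are illustrative. On these inputs the aggregate speedup comes out at roughly 11x and pass2 accounts for a bit over 70% of the three-pass runtime, in line with the numbers quoted above.

```python
# Timings in seconds, taken from the description above: (before, after).
# Only these three passes are quoted; other passes are not included here.
timings = {
    "pass1": (7777, 191),
    "pass2": (1363, 1265),
    "pass5": (10523, 286),
}

def per_pass_speedup(timings):
    """Return the before/after ratio for each pass."""
    return {name: before / after for name, (before, after) in timings.items()}

def aggregate_speedup(timings):
    """Return total-before / total-after over the listed passes."""
    total_before = sum(before for before, _ in timings.values())
    total_after = sum(after for _, after in timings.values())
    return total_before / total_after

def share_of_runtime(timings, name):
    """Return the fraction of the post-change runtime spent in one pass."""
    total_after = sum(after for _, after in timings.values())
    return timings[name][1] / total_after

def speedup_bound_without(timings, name):
    """Upper bound on aggregate speedup if the named pass took zero time."""
    total_before = sum(before for before, _ in timings.values())
    rest_after = sum(after for n, (_, after) in timings.items() if n != name)
    return total_before / rest_after

if __name__ == "__main__":
    for name, ratio in per_pass_speedup(timings).items():
        print(f"{name}: {ratio:.1f}x")
    print(f"aggregate: {aggregate_speedup(timings):.1f}x")
    print(f"pass2 share: {share_of_runtime(timings, 'pass2'):.0%}")
    print(f"bound without pass2: {speedup_bound_without(timings, 'pass2'):.1f}x")
```

The last figure is only a ceiling: it shows how much headroom pass2 leaves, not what a parallel pass2 would actually achieve.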

Processing directories in parallel should also be possible, in order to further reduce e2fsck times on large OSTs, and especially on large MDT filesystems.



 Comments   
Comment by Gerrit Updater [ 30/Jul/21 ]

Wang Shilong (wshilong@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/44428
Subject: LU-14894 e2fsck: parallel pass2 support
Project: tools/e2fsprogs
Branch: master-lustre
Current Patch Set: 1
Commit: 40420a2f4dc1ae76176e7ce4fa39e600e38717ef

Comment by Andreas Dilger [ 30/Jul/21 ]

The description of this ticket mentions only parallel loading of directory blocks, not parallel checking. Is that true of the actual patch? It looked to me like it was also checking the directory blocks in parallel.

Comment by Wang Shilong (Inactive) [ 30/Jul/21 ]

This patch only implements parallel loading. Parallel checking is a bit more complex, since it touches many global variables.
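The split described here, loading in parallel but checking serially, can be illustrated with a minimal sketch. This is not the e2fsprogs code (which is C) and makes no claim about how the patch is structured. `load_block`, `check_block`, and the `stats` dictionary are hypothetical stand-ins: the first for reading a directory block, the second for a check that updates shared state, the third for the global variables mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor

def load_block(blkno):
    """Hypothetical stand-in for reading one directory block from disk.

    It touches no shared state, so it is safe to run in many threads.
    """
    return {"blkno": blkno, "entries": [f"name{blkno}_{i}" for i in range(3)]}

def check_block(block, stats):
    """Hypothetical stand-in for checking one directory block.

    It updates shared counters, which is why this sketch keeps it
    on a single thread rather than adding locking.
    """
    stats["blocks_checked"] += 1
    stats["entries_seen"] += len(block["entries"])

def pass2_sketch(block_numbers, num_threads=4):
    """Load blocks in parallel, then check them one at a time."""
    stats = {"blocks_checked": 0, "entries_seen": 0}
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Executor.map returns results in input order.
        blocks = list(pool.map(load_block, block_numbers))
    for block in blocks:
        check_block(block, stats)
    return stats

if __name__ == "__main__":
    print(pass2_sketch(range(10)))
```

Making the checking step parallel as well would require every update to the shared state to be protected or made per-thread and merged afterwards, which is the complexity the comment refers to.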
