[LU-12353] optimizations for ldiskfs quota updates Created: 29/May/19  Updated: 31/May/23  Resolved: 03/Dec/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.14.0, Lustre 2.16.0, Lustre 2.15.3

Type: Improvement Priority: Minor
Reporter: Andrew Perepechko Assignee: Andrew Perepechko
Resolution: Fixed Votes: 0
Labels: patch, performance

Issue Links:
Related
is related to LU-10034 LDISKFS Quota scalability improvement Closed

 Description   

We can slightly reduce quota usage accounting time by avoiding unnecessary on-disk updates.

A patch and explanation will be added.



 Comments   
Comment by Gerrit Updater [ 29/May/19 ]

Andrew Perepechko (c17827@cray.com) uploaded a new patch: https://review.whamcloud.com/34992
Subject: LU-12353 ldiskfs: speedup quota journalling
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 44215f0c452da18577475ca4c8a50e147e9e6a2b

Comment by Andrew Perepechko [ 29/May/19 ]

Why this patch improves performance:

Multiple threads will not block on dqio_mutex in dquot_commit().

If thread A is writing a dquot and another thread B is waiting for dqio_mutex, other threads can simply skip the dquot update. Thread B is guaranteed to write the latest data. Once it starts writing, it clears DQ_MOD_B, and new threads entering ldiskfs_mark_dquot_dirty() will have to queue at least one more dquot update.

So we are moving from the following worst case (threads ordered by the time they called ldiskfs_mark_dquot_dirty()):
thread 1) executing dquot_commit()
thread 2) waiting in dquot_commit()
...
thread N) waiting in dquot_commit()

to the following case:

thread 1) executing dquot_commit()
thread 2) waiting in dquot_commit()
thread 3) can exit ldiskfs_mark_dquot_dirty() immediately since thread 2 is guaranteed to write the latest data to disk
...
thread N) can exit ldiskfs_mark_dquot_dirty() immediately since thread 2 is guaranteed to write the latest data to disk

In the first case, some of thread 2, thread 3, ... thread N may find that DQ_MOD_B is already cleared and exit dquot_commit() without updating the disk buffer, but each of them still has to wait for dqio_mutex before it can find that out.
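
The skip described above can be modelled in userspace C. This is only a sketch under stated assumptions, not the patch uploaded to Gerrit: struct model_dquot, model_commit(), model_mark_dirty() and model_charge() are hypothetical names, an atomic flag stands in for DQ_MOD_B, and a pthread mutex stands in for dqio_mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct dquot (not the kernel type). */
struct model_dquot {
	atomic_bool     modified;    /* plays the role of DQ_MOD_B */
	pthread_mutex_t io_mutex;    /* plays the role of dqio_mutex */
	atomic_long     usage;       /* in-memory usage counter */
	long            on_disk;     /* last value copied to the "disk buffer" */
	int             disk_writes; /* number of disk buffer updates */
};

static void model_dquot_init(struct model_dquot *dq)
{
	atomic_init(&dq->modified, false);
	atomic_init(&dq->usage, 0);
	pthread_mutex_init(&dq->io_mutex, NULL);
	dq->on_disk = 0;
	dq->disk_writes = 0;
}

/*
 * Models dquot_commit(): every caller serializes on the mutex, but only
 * a caller that still finds the dirty flag set updates the disk buffer.
 * The flag is cleared before usage is read, so an update that arrives
 * later sets the flag again and queues a new write.
 */
static void model_commit(struct model_dquot *dq)
{
	pthread_mutex_lock(&dq->io_mutex);
	if (atomic_exchange(&dq->modified, false)) {
		dq->on_disk = atomic_load(&dq->usage);
		dq->disk_writes++;
	}
	pthread_mutex_unlock(&dq->io_mutex);
}

/*
 * Models the mark-dirty path with the early exit. Returns true if the
 * caller skipped the commit because a write is already pending.
 */
static bool model_mark_dirty(struct model_dquot *dq)
{
	/* Flag still set: a queued writer will pick up our usage change. */
	if (atomic_load(&dq->modified))
		return true;

	atomic_store(&dq->modified, true);
	model_commit(dq);
	return false;
}

/* Update the in-memory usage first, then request the on-disk update. */
static bool model_charge(struct model_dquot *dq, long delta)
{
	atomic_fetch_add(&dq->usage, delta);
	return model_mark_dirty(dq);
}
```

In this model, a caller that sees the flag set returns without touching io_mutex at all, which is the contention the explanation above is concerned with. The pending writer clears the flag and only then reads usage, so it copies a value that already includes the skipped caller's change.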

Atomicity concerns:

The updates are guaranteed to be atomic since we always call ldiskfs_mark_dquot_dirty() with a transaction handle. And if multiple threads enter this code at the same time, they share handles to the same transaction by jbd2 design.

Jan Kara mentioned on LKML that this is not the case for dqctl calls, which can call ext4_mark_dquot_dirty() without a jbd2 handle, but this case is not relevant for Lustre.

A significant dqio_mutex redesign was planned, so this patch was rejected on LKML, but we can at least slightly improve performance on the older kernels without any kernel fs/quota modification.

Comment by Andreas Dilger [ 11/Jun/20 ]

Shuichi, would you be able to test the referenced patch to see if it improves performance for us? This is not urgent, since I don't think we have quota enabled for, e.g., IO500, but if quota is always enabled on customer systems then it would be nice to get a 10-15% improvement (as mentioned in the Gerrit comments on this patch) "for free" if this 2-line patch helps.

Comment by Wang Shilong (Inactive) [ 11/Jun/20 ]

Actually, this could help IO500 since we have quota accounting by default.

Comment by Shuichi Ihara [ 11/Jun/20 ]

Andreas, sure. I will add this to my to-do list and test this patch too. Just to confirm: do we expect a performance improvement even with no quota enforcement, with just quota accounting on the quota slave? Quota accounting is always enabled by default, right?

Comment by Andrew Perepechko [ 11/Jun/20 ]

sihara,

Do we expect a performance improvement even with no quota enforcement, with just quota accounting on the quota slave?

Yes.

Comment by Gerrit Updater [ 03/Dec/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34992/
Subject: LU-12353 ldiskfs: speedup quota journalling
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: dad25f258e50895b4bd5fce30765599a7a490aa0

Comment by Peter Jones [ 03/Dec/20 ]

Landed for 2.14

Comment by Gerrit Updater [ 02/May/23 ]

"Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50853
Subject: LU-12353 ldiskfs: add ext4-dquot-commit-speedup patch to more series
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8843159caf504c8fc22402ad5b06594da378bae7

Comment by Gerrit Updater [ 12/May/23 ]

"Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50983
Subject: LU-12353 ldiskfs: add ext4-dquot-commit-speedup patch to more series
Project: fs/lustre-release
Branch: b2_15
Current Patch Set: 1
Commit: 2a33165775f7fefa9682432869e920ead225352c

Comment by Gerrit Updater [ 27/May/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50983/
Subject: LU-12353 ldiskfs: add ext4-dquot-commit-speedup patch to more series
Project: fs/lustre-release
Branch: b2_15
Current Patch Set:
Commit: 5305f6efdd3aae1b1034eb6a9d880f6db8559dbf

Comment by Gerrit Updater [ 31/May/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50853/
Subject: LU-12353 ldiskfs: add ext4-dquot-commit-speedup patch to more series
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: ec8f93afa0559ef6bfcdf701f4c1a50207901ef2
