[LU-8216] Quota updates are not properly journaled Created: 30/May/16  Updated: 12/Oct/16  Resolved: 22/Jun/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.9.0

Type: Bug Priority: Major
Reporter: Wang Shilong (Inactive) Assignee: Niu Yawei (Inactive)
Resolution: Fixed Votes: 0
Labels: patch
Environment:

Running Lustre on RHEL6.


Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

In our production system, when a Lustre server crashes, e2fsck usually reports quota accounting mismatches, and sometimes the differences are huge.

The problem is that the current Lustre quota code relies on ldiskfs quota accounting,
and if the ldiskfs quota is wrong, we need to run e2fsck or disable and re-enable quota to fix the accounting.

This encouraged me to look at the quota implementation for ldiskfs. While looking at the code,
I found a big problem with the RHEL6 quota code: quota updates are
not properly journaled.

Every time ext4_mark_dquot_dirty is called, we skip journaling and only add the quota update
to the dirty list. As a result, quota updates are only journaled in ext4_quota_write(), which can be called from 'sync_file', which only happens
during a sync call or umount.

This is a big problem if we hit a crash: we will lose many quota updates.
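
The failure mode described above can be illustrated with a small toy model. This is only a sketch, not ldiskfs or ext4 code: the class ToyQuotaStore, its methods, and the helper lost_blocks are hypothetical names invented for illustration, and "durable" simply stands in for whatever has reached the journal.

```python
class ToyQuotaStore:
    """Toy model of quota accounting (hypothetical, not kernel code)."""

    def __init__(self, journal_on_dirty):
        # journal_on_dirty=False models the behaviour reported in this ticket:
        # marking a dquot dirty only queues it on a dirty list.
        self.journal_on_dirty = journal_on_dirty
        self.in_memory = {}  # id -> usage, lost on crash
        self.dirty = set()   # ids whose usage has not been made durable
        self.durable = {}    # id -> usage that survives a crash

    def charge(self, uid, blocks):
        """Account some blocks to an id and mark its quota entry dirty."""
        self.in_memory[uid] = self.in_memory.get(uid, 0) + blocks
        self.mark_dirty(uid)

    def mark_dirty(self, uid):
        if self.journal_on_dirty:
            # Update becomes durable immediately.
            self.durable[uid] = self.in_memory[uid]
        else:
            # Update is only remembered in volatile memory.
            self.dirty.add(uid)

    def sync(self):
        """Flush the dirty list, as a sync call or umount would."""
        for uid in self.dirty:
            self.durable[uid] = self.in_memory[uid]
        self.dirty.clear()

    def crash(self):
        """Drop volatile state and return what is left after the crash."""
        self.in_memory = dict(self.durable)
        self.dirty.clear()
        return dict(self.durable)


def lost_blocks(journal_on_dirty):
    """Blocks missing from the accounting after a crash, for one id."""
    store = ToyQuotaStore(journal_on_dirty)
    store.charge(1000, 10)
    store.sync()           # first update reaches stable storage
    store.charge(1000, 5)  # second update happens after the last sync
    actual = store.in_memory[1000]
    recovered = store.crash().get(1000, 0)
    return actual - recovered
```

In this model, lost_blocks(False) reports the update made after the last sync as lost, while lost_blocks(True) reports nothing lost, which is the kind of mismatch e2fsck would then have to repair.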



 Comments   
Comment by Gerrit Updater [ 30/May/16 ]

Wang Shilong (wshilong@ddn.com) uploaded a new patch: http://review.whamcloud.com/20503
Subject: LU-8216 ldiskfs: fix journal quota files
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 9d519f9900fe1aa6c4d552fea20faf1346f0069a

Comment by Peter Jones [ 30/May/16 ]

Niu

Could you please review this proposed change?

Thanks

Peter

Comment by Gerrit Updater [ 14/Jun/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/20503/
Subject: LU-8216 ldiskfs: fix journal quota files
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: d59d553dc18189dfc5e43196f7c8a2bd6346b675

Comment by Bob Glossman (Inactive) [ 14/Jun/16 ]

I see the change http://review.whamcloud.com/20503 updates many of the ldiskfs patch series, but not those for el6.8, el7*, sles12*. Is this change not needed there or was it left out by mistake?

Comment by Wang Shilong (Inactive) [ 15/Jun/16 ]

Hello Bob Glossman, I think we still need it for el6.8; when I pushed the patch, el6.8 had probably not yet been merged into master.

Comment by Gerrit Updater [ 15/Jun/16 ]

Wang Shilong (wshilong@ddn.com) uploaded a new patch: http://review.whamcloud.com/20791
Subject: LU-8216 ldiskfs: fix journal quota files for rhel6.8
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 62056c1d43625696ec719d1f1d7e0f1d477b5c6e

Comment by Gerrit Updater [ 22/Jun/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/20791/
Subject: LU-8216 ldiskfs: fix journal quota files for RHEL6.8
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: e9d2e7d7ef86e322d1fbc9b01e549e133d8cecc5

Comment by Joseph Gmitter (Inactive) [ 22/Jun/16 ]

All patches have landed to master for 2.9.0

Comment by Bruno Travouillon (Inactive) [ 02/Sep/16 ]

For the record, when we discovered this journaling issue, we added the following mount options to our MDT and OSTs:

usrjquota=lquota.user,grpjquota=lquota.group,jqfmt=vfsv1

No fix needed.
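
A quick way to check whether an option string like the one above carries the journaled quota settings might look like the following sketch. The helper names parse_mount_options and has_journaled_quota are hypothetical, and treating all three options as required is an assumption made for illustration, not something stated in this ticket.

```python
def parse_mount_options(options):
    """Split a comma-separated mount option string into a dict.

    Flags without a value (for example "rw") map to None.
    """
    parsed = {}
    for item in options.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else None
    return parsed


def has_journaled_quota(options):
    """Return True if usrjquota, grpjquota and jqfmt all have a value.

    Assumption: all three are required, mirroring the option string
    quoted in the comment above.
    """
    parsed = parse_mount_options(options)
    return all(parsed.get(key) for key in ("usrjquota", "grpjquota", "jqfmt"))
```

For example, has_journaled_quota("usrjquota=lquota.user,grpjquota=lquota.group,jqfmt=vfsv1") is true, while a string without those options is rejected.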
