[LU-7899] osd_xattr_set() to batch actual EA update Created: 22/Mar/16  Updated: 21/Aug/17  Resolved: 09/Aug/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.10.1, Lustre 2.11.0

Type: Improvement Priority: Minor
Reporter: Alex Zhuravlev Assignee: Alex Zhuravlev
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Blocker
is blocking LU-7895 zfs metadata performance improvements Resolved
Related
is related to LU-8449 OSS crash with oom-killer started Resolved

 Description   

Moving EAs from the nvlist into the bonus/spill area is quite expensive. We can save some of that cost by collecting changes in the nvlist (which we already do) and calling sa_update() from osd_trans_stop().
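
The batching idea described above can be illustrated with a minimal, self-contained sketch. This is not the Lustre or ZFS code: every name below (struct sketch_obj, sketch_xattr_set(), sketch_flush(), sketch_trans_stop()) is hypothetical, the in-memory arrays only stand in for the nvlist, and sketch_flush() only stands in for the expensive sa_update() step. It assumes the expensive write can be deferred until the transaction stops.

```c
#include <assert.h>
#include <string.h>

#define MAX_EAS 8

/* Hypothetical stand-in for an OSD object. The arrays play the role of
 * the nvlist: EA changes are staged here and written out later. */
struct sketch_obj {
	char names[MAX_EAS][32];
	char values[MAX_EAS][64];
	int  nr_eas;
	int  dirty;        /* staged changes not yet written out */
	int  flush_count;  /* number of expensive writes performed */
};

/* Stand-in for the expensive step (sa_update() in the ticket). */
static void sketch_flush(struct sketch_obj *o)
{
	o->flush_count++;
	o->dirty = 0;
}

/* Stage one EA change in memory only. Returns 0 on success, or -1 if
 * the table is full or a string does not fit. */
static int sketch_xattr_set(struct sketch_obj *o, const char *name,
			    const char *value)
{
	int i;

	if (strlen(name) >= sizeof(o->names[0]) ||
	    strlen(value) >= sizeof(o->values[0]))
		return -1;

	/* Look for an existing EA with this name. */
	for (i = 0; i < o->nr_eas; i++)
		if (strcmp(o->names[i], name) == 0)
			break;

	if (i == o->nr_eas) {
		if (o->nr_eas == MAX_EAS)
			return -1;
		strcpy(o->names[i], name);
		o->nr_eas++;
	}

	strcpy(o->values[i], value);
	o->dirty = 1;	/* defer the expensive write */
	return 0;
}

/* Stand-in for osd_trans_stop(): write out staged changes once per
 * dirty object instead of once per sketch_xattr_set() call. */
static void sketch_trans_stop(struct sketch_obj **objs, int nr)
{
	int i;

	for (i = 0; i < nr; i++)
		if (objs[i]->dirty)
			sketch_flush(objs[i]);
}
```

In this sketch, three sketch_xattr_set() calls on one object followed by a single sketch_trans_stop() leave flush_count at 1, whereas flushing inside every set call would have cost three writes.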



 Comments   
Comment by Gerrit Updater [ 25/Mar/16 ]

Alex Zhuravlev (alexey.zhuravlev@intel.com) uploaded a new patch: http://review.whamcloud.com/19143
Subject: LU-7899 osd: batch EA updates
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 64ecfcbf26bdb21362adb91c62967a9b7db50faf

Comment by Gerrit Updater [ 11/Jul/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/19143/
Subject: LU-7899 osd: batch EA updates
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 6cd79ab5860c59c2a640a9e8ca4ee86eec050b43

Comment by Joseph Gmitter (Inactive) [ 13/Jul/16 ]

Patch has landed to master for 2.9.0

Comment by Gerrit Updater [ 11/Aug/16 ]

Oleg Drokin (oleg.drokin@intel.com) uploaded a new patch: http://review.whamcloud.com/21878
Subject: Revert "LU-7899 osd: batch EA updates"
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 9e26d30328dbe8bd01740ed9981b3a9ce62a4af0

Comment by Cliff White (Inactive) [ 11/Aug/16 ]

Testing on soak - no longer having soft lockups.

Comment by Gerrit Updater [ 11/Aug/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/21878/
Subject: Revert "LU-7899 osd: batch EA updates"
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 6fad3abf6f962d04989422cb44dfb7aa0835ad07

Comment by Peter Jones [ 11/Aug/16 ]

Oleg has reverted this change

Comment by Gerrit Updater [ 11/Aug/16 ]

Alex Zhuravlev (alexey.zhuravlev@intel.com) uploaded a new patch: http://review.whamcloud.com/21893
Subject: LU-7899 osd: batch EA updates
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 5cf0af5604a1eed3121d82d90ede98eb6207f3c7

Comment by Cliff White (Inactive) [ 24/May/17 ]

Soak testing started: 2017-05-24 21:29:31

Comment by Cliff White (Inactive) [ 25/May/17 ]

Immediately hit LU-9504. Have to roll up multiple fixes to get a possible working build, will work on that Friday.

Comment by Alex Zhuravlev [ 25/May/17 ]

thanks... working on that.

Comment by Alex Zhuravlev [ 25/May/17 ]

Hmm, I still can't reproduce it. Cliff, is it possible to run with just this patch and watch the slabtop output during the run?

Comment by Cliff White (Inactive) [ 25/May/17 ]

We need a decent working baseline to apply this patch to. That is the problem. LU-9504 currently kills the system too quickly; we must have a fix for that in the baseline.

Comment by Cliff White (Inactive) [ 19/Jul/17 ]

Since we have landed LU-9504, can this patch be rebased? I will be able to run it on soak.

Comment by Alex Zhuravlev [ 20/Jul/17 ]

Cliff, the patch has been rebased. Please give it a run. Thanks in advance.

Comment by Cliff White (Inactive) [ 21/Jul/17 ]

I am seeing more odd job failures. Things like:

07/21/2017 19:10:37: Process 0(soak-16.spirit.hpdd.intel.com): FAILED in mdtest_stat, unable to stat file: Input/output error
34395-simul.out:19:32:08: Process 10(soak-26.spirit.hpdd.intel.com): FAILED in simul_file_stat, stat failed: Input/output error
34409-simul.out:19:31:50: Process 12(soak-26.spirit.hpdd.intel.com): FAILED in simul_truncate, truncate failed: Cannot send after transport endpoint shutdown
34442-simul.out:19:10:37: Process 0(soak-16.spirit.hpdd.intel.com): FAILED in create_files, write in file /mnt/soaked/soaktest/test/simul/34442/simul_write.12: Input/output error
34454-mdtestssf.out:07/21/2017 19:10:37: Process 0(soak-16.spirit.hpdd.intel.com): FAILED in mdtest_stat, unable to stat file: Input/output error
34454-mdtestssf.out:07/21/2017 19:10:37: Process 1(soak-16.spirit.hpdd.intel.com): FAILED in mdtest_stat, unable to stat file: Input/output error
34462-simul.out:19:25:28: Process 26(soak-29.spirit.hpdd.intel.com): FAILED in simul_truncate, truncate failed: Input/output error

investigating.

Comment by Cliff White (Inactive) [ 21/Jul/17 ]

With this patch, I am seeing more jobs fail than succeed.

Comment by Alex Zhuravlev [ 24/Jul/17 ]

Cliff, any more details?

Comment by Cliff White (Inactive) [ 24/Jul/17 ]

What are you looking for from slabtop? No hard crashes yet; restarting this morning.

Comment by Cliff White (Inactive) [ 02/Aug/17 ]

Tested the latest version of the patch. It has run for 24 hours on soak so far with no significant errors.

Comment by Gerrit Updater [ 09/Aug/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/21893/
Subject: LU-7899 osd: batch EA updates
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 2c9ff6dffdf4320af95c9db9af07a416529275f0

Comment by Peter Jones [ 09/Aug/17 ]

Landed for 2.11

Comment by Gerrit Updater [ 11/Aug/17 ]

Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/28482
Subject: LU-7899 osd: batch EA updates
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: 214a6c3d66456ff5f73fce8eb06b0b8e164347f7

Comment by Gerrit Updater [ 21/Aug/17 ]

John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/28482/
Subject: LU-7899 osd: batch EA updates
Project: fs/lustre-release
Branch: b2_10
Current Patch Set:
Commit: d53a1183058ebfd52bf61b194d151369ba7c5087

Generated at Sat Feb 10 02:12:53 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.