[LU-7251] reduce commit callbacks in OSP Created: 04/Oct/15  Updated: 13/Feb/19  Resolved: 01/Nov/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.11.0, Lustre 2.10.4

Type: Improvement Priority: Minor
Reporter: Alex Zhuravlev Assignee: Alex Zhuravlev
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Blocker
is blocking LU-7895 zfs metadata performance improvements Resolved
Duplicate
is duplicated by LU-10170 OSP device could miss wakeup from com... Resolved
Related
is related to LU-10066 A potential bug on OSP setattr handling Open
is related to LU-10230 sanity test_239: 4336 not synced Resolved

 Description   

OSP can create a lot of commit callbacks to track changes to OST.
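
A minimal sketch of the general idea behind avoiding one callback per transaction, assuming a toy backend that commits everything opened so far in one batch. This is not the code from the patch: `Backend`, `Tracker`, and every field name here are hypothetical and exist only for illustration.

```python
class Backend:
    """Toy storage backend: commit() commits every transaction opened so far."""

    def __init__(self):
        self._callbacks = []

    def add_commit_callback(self, fn):
        self._callbacks.append(fn)

    def commit(self):
        # Run and clear the callbacks registered for the batch being committed.
        callbacks, self._callbacks = self._callbacks, []
        for fn in callbacks:
            fn()


class Tracker:
    """Tracks committed changes with at most one callback per commit batch."""

    def __init__(self, backend):
        self.backend = backend
        self.last_used_id = 0        # id handed to the most recent transaction
        self.last_committed_id = 0   # highest id known to be committed
        self.callback_pending = False
        self.callbacks_registered = 0

    def start_transaction(self):
        self.last_used_id += 1
        # Register a callback only if the current batch does not have one yet.
        if not self.callback_pending:
            self.callback_pending = True
            self.callbacks_registered += 1
            self.backend.add_commit_callback(self._on_commit)
        return self.last_used_id

    def _on_commit(self):
        # In this toy model the whole batch commits at once, so every id
        # handed out so far is now committed.
        self.last_committed_id = self.last_used_id
        self.callback_pending = False


backend = Backend()
tracker = Tracker(backend)
for _ in range(1000):
    tracker.start_transaction()
backend.commit()
print(tracker.callbacks_registered, tracker.last_committed_id)
```

The point of the sketch is only that a monotonically increasing id plus a single per-batch callback carries the same "what has committed" information as one callback per transaction. A real implementation would also need locking and a precise rule for which ids belong to which commit.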

Also, with the current tracking mechanism, we allocate 732 bytes in 3 chunks for every thandle and release them on commit. At 10K transactions/sec and a 5s commit interval, that is ~35MB held at any given time. We can get rid of this.
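
The ~35MB estimate can be checked with a short calculation. This sketch assumes the 732 figure is in bytes and that every thandle's allocation stays pinned for one full commit interval; the constant and function names are illustrative only.

```python
BYTES_PER_THANDLE = 732        # from the description; unit assumed to be bytes
TRANSACTIONS_PER_SEC = 10_000  # "10K transactions/sec"
COMMIT_INTERVAL_SEC = 5        # "5s commit interval"


def pinned_bytes(per_thandle, rate, interval):
    """Memory held by tracking data for transactions awaiting commit."""
    return per_thandle * rate * interval


total = pinned_bytes(BYTES_PER_THANDLE, TRANSACTIONS_PER_SEC, COMMIT_INTERVAL_SEC)
print(f"{total} bytes = {total / 2**20:.1f} MiB")
```

The product is 36,600,000 bytes, which is about 34.9 MiB and matches the ~35MB quoted above.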



 Comments   
Comment by Gerrit Updater [ 19/Nov/15 ]

Alex Zhuravlev (alexey.zhuravlev@intel.com) uploaded a new patch: http://review.whamcloud.com/17270
Subject: LU-7251 osp: do not assign commit callback to every thandle
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 34c397ae9abdfffd4febb07578b0adb056aef556

Comment by Shuichi Ihara (Inactive) [ 18/Aug/17 ]

We've tested the patches; they are useful and greatly improve the performance of batched metadata operations.
Here are quick test results with and without the patches. It's a kind of metadata stress test: mdtest creating 2.56M files, with 5 iterations (mdtest -d /scratch/dir0 -u -v -i 5 -n 10000 from 32 clients, 256 processes).
Without the patches, directory creation and file removal are impacted by async operations left over from the previous metadata tests (e.g. a lot of callback transactions lagging behind).

SUMMARY: (of 5 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   Directory creation:      42984.628      36953.381      39881.281       2292.603
   Directory stat    :     329218.490     322008.772     326116.622       2642.107
   Directory removal :      92163.630      79497.765      85630.907       4297.749
   File creation     :     117720.781      83613.736      99194.213      12354.962
   File stat         :     328629.182     314377.011     318491.614       5447.589
   File read         :     256663.372     243983.658     249932.488       4027.999
   File removal      :      74525.772      54910.856      65667.784       7710.949
   Tree creation     :        256.674         70.697        199.439         67.303
   Tree removal      :          8.069          7.260          7.550          0.285
V-1: Entering timestamp...

With the patches, directory creation and file removal improved a lot, and the results are stable.

SUMMARY: (of 5 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   Directory creation:      48695.989      41188.882      44081.677       2834.035
   Directory stat    :     310020.068     296526.272     303988.450       4788.302
   Directory removal :      93812.856      80760.823      88067.584       4621.696
   File creation     :     123718.206      91192.211     101872.614      11548.931
   File stat         :     307580.997     295884.726     302662.073       4837.231
   File read         :     273169.928     221255.575     258596.002      20265.696
   File removal      :     105689.008      88411.461      98718.108       6047.066
   Tree creation     :        252.958         36.398        180.527         78.435
   Tree removal      :          7.638          0.704          4.741          3.283
V-1: Entering timestamp...
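
To put numbers on the comparison, the relative change in the Mean column between the two summaries above can be computed as follows. The values are copied from the tables; the helper name `percent_change` and the dictionary layout are illustrative only.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Mean ops/sec copied from the two SUMMARY tables: (without patches, with patches)
MEANS = {
    "Directory creation": (39881.281, 44081.677),
    "Directory stat": (326116.622, 303988.450),
    "Directory removal": (85630.907, 88067.584),
    "File creation": (99194.213, 101872.614),
    "File stat": (318491.614, 302662.073),
    "File read": (249932.488, 258596.002),
    "File removal": (65667.784, 98718.108),
}

for operation, (before, after) in MEANS.items():
    print(f"{operation:20s} {percent_change(before, after):+6.1f}%")
```

File removal is the large mover at roughly +50%, and directory creation gains roughly 10%. The stat rates come out roughly 5 to 7% lower; with only 5 iterations and the standard deviations shown, small differences should be read with caution.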

Comment by Shuichi Ihara (Inactive) [ 11/Sep/17 ]

I've tested the latest patch (patchset 29) against the latest master and found no regressions at all.

master without patch

mpirun mdtest -n 10000 -u -v -d /scratch0/mdtest.out -F -i 3 -p 10 -w 0
32 client, 128 processes

SUMMARY: (of 3 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :     114523.666     108810.588     111770.237       2336.908
   File stat         :     303861.727     288229.127     294945.438       6568.839
   File read         :     162822.391     153932.256     157040.416       4092.317
   File removal      :     127130.002     116283.786     119919.433       5098.703
   Tree creation     :        420.030        180.980        317.848        100.626
   Tree removal      :         12.986         11.249         12.007          0.726
V-1: Entering timestamp...


mpirun mdtest -n 10000 -u -v -d /scratch0/mdtest.out -F -i 3 -p 10 -w 1048576
32 client, 128 processes

SUMMARY: (of 3 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :      41194.158      35728.404      37847.067       2394.493
   File stat         :     310838.872     307811.966     309721.094       1356.519
   File read         :     165343.510     164597.509     164868.278        337.128
   File removal      :     117608.944     115636.150     116642.341        805.876
   Tree creation     :        462.102        135.079        323.501        138.076
   Tree removal      :         12.555         11.170         11.884          0.566
V-1: Entering timestamp...

with patch

mpirun mdtest -n 10000 -u -v -d /scratch0/mdtest.out -F -i 3 -p 10 -w 0
32 client, 128 processes
SUMMARY: (of 3 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :     128146.138     112097.493     119258.116       6664.717
   File stat         :     306953.378     299898.991     304338.726       3155.872
   File read         :     165972.464     160745.027     164083.325       2367.357
   File removal      :     142539.301     123405.006     135905.318       8844.609
   Tree creation     :        424.966         98.191        287.812        138.468
   Tree removal      :         13.934         11.025         12.419          1.190
V-1: Entering timestamp...

-- finished at 09/12/2017 08:30:01 --

mpirun mdtest -n 10000 -u -v -d /scratch0/mdtest.out -F -i 3 -p 10 -w 1048576
32 client, 128 processes

SUMMARY: (of 3 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :      41306.861      40326.870      40692.381        437.100
   File stat         :     316061.726     311079.602     314214.401       2228.390
   File read         :     166757.778     164512.030     165589.820        919.037
   File removal      :     140835.710     113936.502     128198.607      11041.507
   Tree creation     :        479.815        190.251        372.237        129.394
   Tree removal      :         12.696         10.593         11.918          0.942
V-1: Entering timestamp...

Comment by Gerrit Updater [ 01/Nov/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/17270/
Subject: LU-7251 osp: do not assign commit callback to every thandle
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 0ba690a526be74c4cdffe7a7dd3031b4bd2b37d8

Comment by Peter Jones [ 01/Nov/17 ]

Landed for 2.11

Comment by Gerrit Updater [ 01/Nov/17 ]

Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/29879
Subject: LU-7251 osp: do not assign commit callback to every thandle
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: f7a99006cab0235872c6af165a016142c669d7f6

Comment by Gerrit Updater [ 12/Apr/18 ]

John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/29879/
Subject: LU-7251 osp: do not assign commit callback to every thandle
Project: fs/lustre-release
Branch: b2_10
Current Patch Set:
Commit: 236f73509cdcc83cdb56cdea376ff4a4e7f378c7
