[LU-5197] A performance regression of "FileRead" metadata operation Created: 14/Jun/14  Updated: 13/Jan/15  Resolved: 23/Jun/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0, Lustre 2.5.2
Fix Version/s: Lustre 2.6.0

Type: Bug Priority: Critical
Reporter: Shuichi Ihara (Inactive) Assignee: John Hammond
Resolution: Fixed Votes: 0
Labels: HB

Issue Links:
Related
is related to LU-4367 unlink performance regression on lust... Resolved
is related to LU-4398 mdt_object_open_lock() may not flush ... Resolved
Severity: 2
Rank (Obsolete): 14517

 Description   

There is a performance regression in the "FileRead" metadata operation caused by the fixes for LU-4398.
Please see the performance difference below, with and without the LU-4398 patches.

This is a simple mdtest run on a single client with a single thread.

#mpirun -np 1 -ppn 1 -hostfile ./hostfile ./mdtest -n 100000 -i 1 -p 5 -u -v -F -d mdtest.out

v2_5_2_RC1

SUMMARY: (of 1 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :       3478.187       3478.187       3478.187          0.000
   File stat         :       4514.273       4514.273       4514.273          0.000
   File read         :        515.113        515.113        515.113          0.000
   File removal      :       6499.513       6499.513       6499.513          0.000
   Tree creation     :       3908.951       3908.951       3908.951          0.000
   Tree removal      :        428.646        428.646        428.646          0.000

v2_5_2_RC1 + revert "LU-4398 mdt: acquire an open lock for write or execute" (97bfe7a3c0fc74fb0e56cbc1ea9cb827fb657b48)
SUMMARY: (of 1 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :       3575.570       3575.570       3575.570          0.000
   File stat         :       4708.052       4708.052       4708.052          0.000
   File read         :       5348.987       5348.987       5348.987          0.000
   File removal      :       6509.834       6509.834       6509.834          0.000
   Tree creation     :       3816.473       3816.473       3816.473          0.000
   Tree removal      :        400.143        400.143        400.143          0.000
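
For reference, the size of the regression follows directly from the two "File read" rows above. The snippet below is only a small arithmetic sketch; the two rates are copied from the summaries, and the helper name `slowdown` is made up for illustration.

```python
# File read rates (ops/sec) copied from the two mdtest summaries above.
with_patch = 515.113     # v2_5_2_RC1
with_revert = 5348.987   # v2_5_2_RC1 + revert of the LU-4398 patch


def slowdown(baseline, regressed):
    """Return how many times slower `regressed` is compared to `baseline`."""
    return baseline / regressed


ratio = slowdown(with_revert, with_patch)
print(f"File read is about {ratio:.1f}x slower with the LU-4398 patch applied")
```

The other operations differ by only a few percent between the two runs, so the regression appears to be confined to the read phase.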


 Comments   
Comment by Peter Jones [ 15/Jun/14 ]

John

You are the author of the patch in question. Do you have any suggestions here?

Thanks

Peter

Comment by John Hammond [ 15/Jun/14 ]

Investigating.

Comment by John Hammond [ 16/Jun/14 ]

Ihara,

Could you try with http://review.whamcloud.com/10725?

Also I see that mdtest_read() calls open() with O_RDWR. Would it be possible to change that to O_RDONLY and rerun?
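
To compare the two open modes outside of mdtest, something like the following could be used. This is a rough sketch, not mdtest's actual read path: the helper `read_rate`, the file naming, and the one-byte read are assumptions made for illustration, and the target directory would need to be on the Lustre mount for the numbers to be meaningful.

```python
import os
import tempfile
import time


def read_rate(directory, flags, count=1000):
    """Create `count` small files, then open/read/close each one with the
    given open flags and return the achieved rate in operations per second."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"file.{i}")
        with open(path, "wb") as f:
            f.write(b"x")
        paths.append(path)

    # Only the open/read/close loop is timed, not the file creation.
    start = time.perf_counter()
    for path in paths:
        fd = os.open(path, flags)
        try:
            os.read(fd, 1)
        finally:
            os.close(fd)
    elapsed = time.perf_counter() - start
    return count / elapsed


if __name__ == "__main__":
    # A temporary directory is used here so the sketch runs anywhere;
    # point `dir=` at a Lustre directory to exercise the MDT open path.
    for name, flags in (("O_RDWR", os.O_RDWR), ("O_RDONLY", os.O_RDONLY)):
        with tempfile.TemporaryDirectory() as d:
            print(f"{name}: {read_rate(d, flags):.0f} ops/sec")
```

If the O_RDONLY rate is close to the pre-regression figure while O_RDWR stays low, that would point at the write-open path changed under LU-4398.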

Comment by Andreas Dilger [ 16/Jun/14 ]

There were a few patches under LU-4367 that also affected the open performance that we were not considering as critical for 2.6.0, but may also help performance here.

Have you tested with any of those patches applied, in particular http://review.whamcloud.com/9696? The patch http://review.whamcloud.com/9697 also looks very similar to John's patch here.

Comment by Shuichi Ihara (Inactive) [ 19/Jun/14 ]

Sorry for the delay in testing. I got some test results today. File read performance is still bad.

branch          File read (ops/sec)
master          620
master+10725    623
master+9696     625

Comment by John Hammond [ 19/Jun/14 ]

Hi Ihara,

Could you provide the results for master + revert of 97bfe7a3c0fc74fb0e56cbc1ea9cb827fb657b48?

Thanks,

John

Comment by Shuichi Ihara (Inactive) [ 19/Jun/14 ]

John,

You mean revert 708d85a652a77f85153790e6cca1b7a2b91947cf (Revert "LU-4398 mdt: acquire an open lock for write or execute"), right? I couldn't find 97bfe7a3c0fc74fb0e56cbc1ea9cb827fb657b48 in master.

Anyway, here are the results of master + revert of it. The FileRead operation is back to normal.

SUMMARY: (of 1 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :       4331.477       4331.477       4331.477          0.000
   File stat         :       5040.508       5040.508       5040.508          0.000
   File read         :       5780.921       5780.921       5780.921          0.000
   File removal      :       7401.880       7401.880       7401.880          0.000
   Tree creation     :       3184.741       3184.741       3184.741          0.000
   Tree removal      :        366.283        366.283        366.283          0.000
V-1: Entering timestamp...

-- finished at 06/19/2014 23:53:31 --
Comment by Jodi Levi (Inactive) [ 23/Jun/14 ]

The LU-4398 patch that caused this issue was reverted.
