Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-1666

hdr->coh_page_guard contention (single shared file performance)

Details

    • 3
    • 4054

    Description

      Shared file performance with the CLIO code has been observed to be considerably worse than file per process. In particular, during reads it's easy for our clients to become cpu bound. This workload is of particular interest because we expect numerous codes to operate in this mode on Sequoia.

      Running oprofile clearly shows that the vast majority of cpu is going to contention on the hdr->coh_page_guard spin lock. Below are is a report of all functions consuming >0.1% of the cpu during the read phase.

      samples  cum. samples  %        cum. %     app name                 symbol name
      3815407  3815407       26.3390  26.3390    obdclass.ko              cl_vmpage_page
      3671445  7486852       25.3452  51.6842    vmlinux                  _atomic_dec_and_lock
      1215715  8702567        8.3925  60.0767    obdclass.ko              cl_page_find0
      876063   9578630        6.0478  66.1244    obdclass.ko              cl_object_top
      760239   10338869       5.2482  71.3726    obdclass.ko              cl_page_delete0
      262614   10601483       1.8129  73.1855    vmlinux                  copy_user_generic_string
      141309   10742792       0.9755  74.1610    obdclass.ko              cl_page_put
      111365   10854157       0.7688  74.9298    osc.ko                   osc_checksum_bulk
      105015   10959172       0.7250  75.6548    lvfs.ko                  lprocfs_counter_add
      99702    11058874       0.6883  76.3430    vmlinux                  unlock_page
      96783    11155657       0.6681  77.0112    lustre.ko                vvp_page_export
      87458    11243115       0.6038  77.6149    vmlinux                  put_page
      82223    11325338       0.5676  78.1825    lvfs.ko                  lprocfs_counter_sub
      77318    11402656       0.5338  78.7163    obdclass.ko              cl_object_attr_lock
      77143    11479799       0.5325  79.2488    obdclass.ko              cl_page_list_init
      75322    11555121       0.5200  79.7688    vmlinux                  kmem_cache_free
      74627    11629748       0.5152  80.2840    osc.ko                   osc_ap_completion
      

      A closer look at the top five consumers shows them all contented on this spin lock.

      vma      samples  cum. samples  %        cum. %     linenr info                 app name                 symbol name
      0000000000050f30 3815407  3815407       26.3390  26.3390    cl_page.c:723               obdclass.ko              cl_vmpage_page
        0000000000050f30 212      212            0.0056   0.0056    cl_page.c:723
        0000000000050f31 197      409            0.0052   0.0107    cl_page.c:723
        0000000000050f34 12       421           3.1e-04   0.0110    cl_page.c:723
        0000000000050f36 184      605            0.0048   0.0159    cl_page.c:723
        0000000000050f38 395      1000           0.0104   0.0262    cl_page.c:723
        0000000000050f39 118      1118           0.0031   0.0293    cl_page.c:723
        0000000000050f3d 164      1282           0.0043   0.0336    cl_page.c:723
        0000000000050f42 27       1309          7.1e-04   0.0343    libcfs_debug.h:198
        0000000000050f49 16       1325          4.2e-04   0.0347    cl_page.c:727
        0000000000050f53 191      1516           0.0050   0.0397    cl_page.c:723
        0000000000050f56 7        1523          1.8e-04   0.0399    cl_page.c:723
        0000000000050f59 6        1529          1.6e-04   0.0401    libcfs_debug.h:198
        0000000000050f68 170      1699           0.0045   0.0445    cl_page.c:728
        0000000000050f6b 128      1827           0.0034   0.0479    cl_page.c:728
        0000000000050f71 13       1840          3.4e-04   0.0482    cl_page.c:740
        0000000000050f74 8        1848          2.1e-04   0.0484    cl_page.c:740
        0000000000050f79 182      2030           0.0048   0.0532    cl_object.h:2626
        0000000000050f7c 343      2373           0.0090   0.0622    cl_page.c:741
        0000000000050f80 48       2421           0.0013   0.0635    cl_page.c:741
        0000000000050f88 3714181  3716602       97.3469  97.4104    cl_page.c:742
        0000000000050f8c 1089     3717691        0.0285  97.4389    cl_page.c:742
        0000000000050f91 233      3717924        0.0061  97.4450    cl_object.h:2637
        0000000000050f95 15918    3733842        0.4172  97.8622    cl_object.h:2626
        0000000000050f99 3835     3737677        0.1005  97.9627    cl_page.c:744
        0000000000050f9c 420      3738097        0.0110  97.9737    cl_page.c:744
        0000000000050f9e 1016     3739113        0.0266  98.0004    cl_page.c:744
        0000000000050fb1 3584     3742697        0.0939  98.0943    cl_page.c:743
        0000000000050fb5 161      3742858        0.0042  98.0985    cl_page.c:742
        0000000000050fba 480      3743338        0.0126  98.1111    spinlock.h:174
        0000000000050fbd 269      3743607        0.0071  98.1182    spinlock.h:174
        0000000000050fc0 48983    3792590        1.2838  99.4020    spinlock.h:174
        0000000000050fc3 2        3792592       5.2e-05  99.4020    cl_page.c:750
        0000000000050fcc 103      3792695        0.0027  99.4047    libcfs_debug.h:198
        0000000000050fd3 84       3792779        0.0022  99.4069    libcfs_debug.h:198
        0000000000050ff7 125      3792904        0.0033  99.4102    cl_page.c:752
        0000000000050ffb 29       3792933       7.6e-04  99.4110    cl_page.c:752
        0000000000050ffc 229      3793162        0.0060  99.4170    cl_page.c:752
        0000000000050ffe 117      3793279        0.0031  99.4200    cl_page.c:752
        0000000000051000 98       3793377        0.0026  99.4226    cl_page.c:752
        0000000000051001 204      3793581        0.0053  99.4280    cl_page.c:752
        0000000000051028 55       3793636        0.0014  99.4294    atomic_64.h:176
        000000000005102d 30       3793666       7.9e-04  99.4302    atomic_64.h:176
        0000000000051031 7819     3801485        0.2049  99.6351    cl_page.c:123
        0000000000051035 6        3801491       1.6e-04  99.6353    cl_page.c:124
        0000000000051039 69       3801560        0.0018  99.6371    cl_page.c:124
        000000000005103d 583      3802143        0.0153  99.6524    cl_page.c:124
        0000000000051041 365      3802508        0.0096  99.6619    atomic_64.h:93
        0000000000051045 11064    3813572        0.2900  99.9519    atomic_64.h:93
        0000000000051086 30       3813602       7.9e-04  99.9527    cl_page.c:750
        000000000005108d 1805     3815407        0.0473  100.000    cl_page.c:750
      
      0000000000052420 1215715  8702567        8.3925  60.0767    cl_page.c:402               obdclass.ko              cl_page_find0
        0000000000052420 95       95             0.0078   0.0078    cl_page.c:402
        0000000000052421 55       150            0.0045   0.0123    cl_page.c:402
        0000000000052424 40       190            0.0033   0.0156    cl_page.c:402
        000000000005242b 85       275            0.0070   0.0226    cl_page.c:402
        000000000005242f 86       361            0.0071   0.0297    cl_page.c:402
        0000000000052433 55       416            0.0045   0.0342    cl_page.c:402
        0000000000052437 4        420           3.3e-04   0.0345    cl_page.c:402
        000000000005243b 73       493            0.0060   0.0406    cl_page.c:402
        000000000005243f 58       551            0.0048   0.0453    cl_page.c:402
        0000000000052444 42       593            0.0035   0.0488    cl_page.c:402
        0000000000052448 5        598           4.1e-04   0.0492    cl_page.c:402
        000000000005244c 124      722            0.0102   0.0594    cl_page.c:402
        000000000005244f 49       771            0.0040   0.0634    cl_page.c:402
        0000000000052453 75       846            0.0062   0.0696    cl_object.h:2631
        0000000000052457 8013     8859           0.6591   0.7287    cl_page.c:402
        000000000005245a 60       8919           0.0049   0.7336    cl_page.c:402
        000000000005245d 32       8951           0.0026   0.7363    cl_object.h:2631
        0000000000052461 5510     14461          0.4532   1.1895    cl_page.c:409
        0000000000052465 3        14464         2.5e-04   1.1898    cl_page.c:409
        000000000005246e 35       14499          0.0029   1.1926    cl_page.c:410
        0000000000052473 119      14618          0.0098   1.2024    libcfs_debug.h:198
        000000000005247a 41       14659          0.0034   1.2058    cl_page.c:412
        0000000000052484 155      14814          0.0127   1.2185    libcfs_debug.h:198
        00000000000524a8 648      15462          0.0533   1.2718    cl_object.h:2626
        00000000000524ac 82       15544          0.0067   1.2786    atomic_64.h:93
        00000000000524b0 83372    98916          6.8579   8.1364    libcfs_debug.h:198
        00000000000524b8 3        98919         2.5e-04   8.1367    cl_page.c:417
        00000000000524c2 191      99110          0.0157   8.1524    libcfs_debug.h:198
        0000000000052512 86       99196          0.0071   8.1595    cl_object.h:2631
        0000000000052517 2425     101621         0.1995   8.3589    libcfs_debug.h:198
        000000000005251e 42       101663         0.0035   8.3624    cl_object.h:2631
        0000000000052522 767      102430         0.0631   8.4255    cl_page.c:339
        0000000000052530 154      102584         0.0127   8.4382    libcfs_debug.h:198
        0000000000052536 13       102597         0.0011   8.4392    thread_info.h:216
        000000000005253f 700      103297         0.0576   8.4968    cl_page.c:341
        0000000000052546 363      103660         0.0299   8.5267    cl_page.c:341
        0000000000052551 106      103766         0.0087   8.5354    cl_page.c:341
        0000000000052558 32       103798         0.0026   8.5380    cl_page.c:341
        0000000000052562 85       103883         0.0070   8.5450    cl_page.c:341
        0000000000052565 5        103888        4.1e-04   8.5454    cl_page.c:341
        0000000000052568 51       103939         0.0042   8.5496    cl_page.c:341
        000000000005256e 1        103940        8.2e-05   8.5497    libcfs_fail.h:81
        0000000000052575 52       103992         0.0043   8.5540    libcfs_fail.h:81
        0000000000052580 56       104048         0.0046   8.5586    cl_page.c:341
        0000000000052585 1        104049        8.2e-05   8.5587    cl_page.c:341
        0000000000052588 55600    159649         4.5734  13.1321    cl_page.c:341
        000000000005258a 197      159846         0.0162  13.1483    cl_page.c:341
        0000000000052591 6        159852        4.9e-04  13.1488    cl_page.c:341
        0000000000052598 30       159882         0.0025  13.1513    cl_page.c:341
        000000000005259d 96       159978         0.0079  13.1592    libcfs_debug.h:198
        00000000000525a4 2        159980        1.6e-04  13.1593    cl_page.c:341
        00000000000525ae 103      160083         0.0085  13.1678    libcfs_debug.h:198
        00000000000525b7 1        160084        8.2e-05  13.1679    cl_page.c:344
        00000000000525bb 21       160105         0.0017  13.1696    atomic_64.h:35
        00000000000525c2 869      160974         0.0715  13.2411    cl_page.c:345
        00000000000525c7 79       161053         0.0065  13.2476    cl_page.c:346
        00000000000525d2 56       161109         0.0046  13.2522    cl_page.c:348
        00000000000525d6 46       161155         0.0038  13.2560    cl_page.c:354
        00000000000525da 19       161174         0.0016  13.2575    cl_page.c:354
        00000000000525e1 86       161260         0.0071  13.2646    cl_page.c:326
        00000000000525e9 22       161282         0.0018  13.2664    cl_page.c:350
        00000000000525f0 19       161301         0.0016  13.2680    cl_page.c:354
        00000000000525f7 23       161324         0.0019  13.2699    cl_page.c:348
        00000000000525fb 82       161406         0.0067  13.2766    cl_page.c:351
        00000000000525ff 1        161407        8.2e-05  13.2767    list.h:30
        0000000000052603 18       161425         0.0015  13.2782    list.h:31
        0000000000052607 26       161451         0.0021  13.2803    cl_page.c:352
        000000000005260b 70       161521         0.0058  13.2861    list.h:30
        000000000005260f 16       161537         0.0013  13.2874    list.h:31
        0000000000052613 20       161557         0.0016  13.2891    cl_page.c:353
        0000000000052617 12       161569        9.9e-04  13.2900    list.h:30
        000000000005261b 89       161658         0.0073  13.2974    list.h:31
        000000000005261f 26       161684         0.0021  13.2995    cl_page.c:354
        0000000000052624 21       161705         0.0017  13.3012    cl_page.c:356
        0000000000052628 1605     163310         0.1320  13.4332    cl_page.c:357
        000000000005262c 30095    193405         2.4755  15.9087    cl_page.c:357
        0000000000052630 48       193453         0.0039  15.9127    cl_page.c:357
        0000000000052633 60       193513         0.0049  15.9176    cl_page.c:357
        0000000000052637 4        193517        3.3e-04  15.9180    cl_page.c:357
        0000000000052639 43       193560         0.0035  15.9215    cl_page.c:357
        000000000005263d 51       193611         0.0042  15.9257    cl_page.c:357
        0000000000052641 9        193620        7.4e-04  15.9264    cl_page.c:357
        0000000000052645 4        193624        3.3e-04  15.9268    cl_page.c:357
        0000000000052648 44       193668         0.0036  15.9304    cl_page.c:357
        0000000000052650 142      193810         0.0117  15.9421    cl_page.c:359
        0000000000052655 6356     200166         0.5228  16.4649    cl_page.c:359
        0000000000052658 568      200734         0.0467  16.5116    cl_page.c:359
        000000000005265d 124      200858         0.0102  16.5218    cl_page.c:360
        0000000000052660 63       200921         0.0052  16.5270    cl_page.c:360
        0000000000052663 3        200924        2.5e-04  16.5272    cl_page.c:360
        0000000000052666 24       200948         0.0020  16.5292    cl_page.c:360
        0000000000052669 88       201036         0.0072  16.5364    cl_page.c:360
        000000000005266b 47       201083         0.0039  16.5403    cl_page.c:362
        0000000000052674 103      201186         0.0085  16.5488    cl_page.c:357
        0000000000052679 4520     205706         0.3718  16.9206    cl_page.c:357
        000000000005267c 103      205809         0.0085  16.9290    cl_page.c:357
        0000000000052680 3        205812        2.5e-04  16.9293    cl_page.c:357
        0000000000052682 81       205893         0.0067  16.9360    cl_page.c:357
        0000000000052686 80       205973         0.0066  16.9425    cl_page.c:357
        000000000005268a 19       205992         0.0016  16.9441    atomic_64.h:93
        000000000005268e 11       206003        9.0e-04  16.9450    atomic_64.h:93
        0000000000052692 25935    231938         2.1333  19.0783    atomic_64.h:93
        0000000000052696 5512     237450         0.4534  19.5317    atomic_64.h:93
        000000000005269a 5961     243411         0.4903  20.0220    atomic_64.h:93
        00000000000526a6 87       243498         0.0072  20.0292    libcfs_debug.h:198
        0000000000052718 191      243689         0.0157  20.0449    cl_page.c:430
        000000000005271b 9        243698        7.4e-04  20.0457    cl_page.c:430
        0000000000052723 2761     246459         0.2271  20.2728    cl_page.c:438
        0000000000052726 1        246460        8.2e-05  20.2728    cl_page.c:430
        0000000000052729 17       246477         0.0014  20.2742    cl_page.c:438
        000000000005272f 59       246536         0.0049  20.2791    atomic_64.h:93
        0000000000052733 44355    290891         3.6485  23.9276    libcfs_debug.h:198
        000000000005273a 2        290893        1.6e-04  23.9277    libcfs_debug.h:198
        0000000000052761 50       290943         0.0041  23.9318    cl_page.c:497
        0000000000052764 53       290996         0.0044  23.9362    cl_page.c:497
        0000000000052768 33       291029         0.0027  23.9389    cl_page.c:497
        000000000005276c 80       291109         0.0066  23.9455    cl_page.c:497
        0000000000052770 58       291167         0.0048  23.9503    cl_page.c:497
        0000000000052774 77       291244         0.0063  23.9566    cl_page.c:497
        0000000000052778 45       291289         0.0037  23.9603    cl_page.c:497
        0000000000052779 155      291444         0.0127  23.9731    cl_page.c:497
        00000000000527b8 128      291572         0.0105  23.9836    cl_page.c:448
        0000000000052850 105      291677         0.0086  23.9922    cl_page.c:461
        0000000000052854 2        291679        1.6e-04  23.9924    cl_page.c:461
        0000000000052858 1        291680        8.2e-05  23.9925    cl_page.c:461
        000000000005285f 96       291776         0.0079  24.0004    cl_page.c:461
        0000000000052864 901444   1193220       74.1493  98.1496    cl_page.c:462
        0000000000052868 106      1193326        0.0087  98.1584    cl_page.c:462
        000000000005286c 4        1193330       3.3e-04  98.1587    cl_page.c:462
        0000000000052870 31       1193361        0.0025  98.1612    cl_page.c:462
        0000000000052874 60       1193421        0.0049  98.1662    cl_page.c:462
        000000000005287c 81       1193502        0.0067  98.1728    cl_page.c:463
        000000000005287e 2        1193504       1.6e-04  98.1730    cl_page.c:463
        0000000000052882 42       1193546        0.0035  98.1765    cl_page.c:463
        0000000000052888 58       1193604        0.0048  98.1812    cl_page.c:482
        000000000005288d 111      1193715        0.0091  98.1904    cl_page.c:482
        000000000005288f 16       1193731        0.0013  98.1917    cl_page.c:483
        0000000000052894 302      1194033        0.0248  98.2165    cl_page.c:483
        000000000005289a 32       1194065        0.0026  98.2192    cl_page.c:484
        000000000005289e 136      1194201        0.0112  98.2303    cl_page.c:484
        00000000000528a2 105      1194306        0.0086  98.2390    cl_page.c:485
        00000000000528a6 270      1194576        0.0222  98.2612    cl_page.c:487
        00000000000528aa 83       1194659        0.0068  98.2680    cl_page.c:487
        00000000000528ad 25       1194684        0.0021  98.2701    cl_page.c:487
        00000000000528b0 29       1194713        0.0024  98.2725    cl_page.c:487
        00000000000528b5 13876    1208589        1.1414  99.4138    spinlock.h:174
        00000000000528b8 14       1208603        0.0012  99.4150    spinlock.h:174
        00000000000528bb 6973     1215576        0.5736  99.9886    spinlock.h:174
        00000000000528be 18       1215594        0.0015  99.9900    cl_page.c:491
        00000000000528c7 18       1215612        0.0015  99.9915    libcfs_debug.h:198
        00000000000528ce 103      1215715        0.0085  100.000    libcfs_debug.h:198
      
      000000000004d840 876063   9578630        6.0478  66.1244    cl_object.c:167             obdclass.ko              cl_object_top
        000000000004d840 6463     6463           0.7377   0.7377    cl_object.c:167
        000000000004d841 9476     15939          1.0817   1.8194    cl_object.c:167
        000000000004d844 438      16377          0.0500   1.8694    cl_object.c:167
        000000000004d845 569      16946          0.0649   1.9343    cl_object.c:167
        000000000004d849 185      17131          0.0211   1.9555    cl_object.c:167
        000000000004d84e 276      17407          0.0315   1.9870    cl_object.h:2626
        000000000004d851 17130    34537          1.9553   3.9423    cl_object.h:2626
        000000000004d858 97       34634          0.0111   3.9534    cl_object.c:171
        000000000004d85b 254      34888          0.0290   3.9824    cl_object.c:171
        000000000004d862 424776   459664        48.4869  52.4693    cl_object.c:171
        000000000004d867 693      460357         0.0791  52.5484    list.h:186
        000000000004d86b 368051   828408        42.0119  94.5603    lu_object.h:748
        000000000004d86f 10       828418         0.0011  94.5615    lu_object.h:748
        000000000004d878 662      829080         0.0756  94.6370    libcfs.h:320
        000000000004d880 616      829696         0.0703  94.7073    libcfs.h:320
        000000000004d889 539      830235         0.0615  94.7689    libcfs.h:323
        000000000004d88d 3        830238        3.4e-04  94.7692    cl_object.h:2603
        000000000004d890 29       830267         0.0033  94.7725    lu_object.h:749
        000000000004d893 391      830658         0.0446  94.8172    cl_object.h:2603
        000000000004d895 630      831288         0.0719  94.8891    libcfs_debug.h:198
        000000000004d89c 159      831447         0.0181  94.9072    cl_object.c:175
        000000000004d8a6 614      832061         0.0701  94.9773    libcfs_debug.h:198
        000000000004d8cc 11       832072         0.0013  94.9786    cl_object.c:177
        000000000004d8cf 156      832228         0.0178  94.9964    cl_object.c:177
        000000000004d8d3 8        832236        9.1e-04  94.9973    cl_object.c:177
        000000000004d8d4 662      832898         0.0756  95.0728    cl_object.c:177
        000000000004d8d5 221      833119         0.0252  95.0981    cl_object.c:177
        000000000004d8d6 584      833703         0.0667  95.1647    cl_object.h:2603
        000000000004d8de 517      834220         0.0590  95.2237    cl_object.h:2587
        000000000004d8e2 29587    863807         3.3773  98.6010    cl_object.h:2587
        000000000004d8e6 9113     872920         1.0402  99.6412    cl_object.h:2603
        000000000004d8e9 3143     876063         0.3588  100.000    cl_object.h:2603
      
      00000000000510d0 760239   10338869       5.2482  71.3726    cl_page.c:1126              obdclass.ko              cl_page_delete0
        00000000000510d0 12       12             0.0016   0.0016    cl_page.c:1126
        00000000000510d1 38       50             0.0050   0.0066    cl_page.c:1126
        00000000000510d4 9        59             0.0012   0.0078    cl_page.c:1126
        00000000000510d6 14       73             0.0018   0.0096    cl_page.c:1126
        00000000000510d8 42       115            0.0055   0.0151    cl_page.c:1126
        00000000000510da 6        121           7.9e-04   0.0159    cl_page.c:1126
        00000000000510dc 6        127           7.9e-04   0.0167    cl_page.c:1126
        00000000000510dd 4        131           5.3e-04   0.0172    cl_page.c:1126
        00000000000510e1 26       157            0.0034   0.0207    cl_page.c:1126
        00000000000510e6 4        161           5.3e-04   0.0212    libcfs_debug.h:198
        00000000000510ed 32       193            0.0042   0.0254    cl_page.c:1128
        00000000000510f7 4        197           5.3e-04   0.0259    cl_page.c:1126
        00000000000510fa 8        205            0.0011   0.0270    cl_page.c:1126
        00000000000510fd 28       233            0.0037   0.0306    cl_page.c:1126
        0000000000051100 6        239           7.9e-04   0.0314    libcfs_debug.h:198
        000000000005110f 5        244           6.6e-04   0.0321    libcfs_debug.h:198
        0000000000051112 3        247           3.9e-04   0.0325    libcfs_debug.h:198
        000000000005111b 25       272            0.0033   0.0358    cl_page.c:102
        0000000000051120 2        274           2.6e-04   0.0360    cl_page.c:102
        0000000000051125 2        276           2.6e-04   0.0363    cl_page.c:1130
        000000000005112e 2        278           2.6e-04   0.0366    cl_page.c:1131
        0000000000051133 20       298            0.0026   0.0392    cl_page.c:1131
        0000000000051139 4        302           5.3e-04   0.0397    cl_page.c:1136
        000000000005113c 2        304           2.6e-04   0.0400    cl_page.c:1136
        0000000000051141 22       326            0.0029   0.0429    cl_page.c:1146
        0000000000051143 14       340            0.0018   0.0447    cl_page.c:1146
        0000000000051146 1        341           1.3e-04   0.0449    cl_page.c:1146
        0000000000051149 7        348           9.2e-04   0.0458    cl_page.c:1146
        000000000005114e 1        349           1.3e-04   0.0459    cl_page.c:1147
        0000000000051159 28       377            0.0037   0.0496    cl_page.c:1147
        0000000000051167 98       475            0.0129   0.0625    cl_page.c:1149
        0000000000051169 3        478           3.9e-04   0.0629    cl_page.c:1149
        0000000000051173 32       510            0.0042   0.0671    cl_page.c:102
        0000000000051178 167      677            0.0220   0.0891    cl_page.c:102
        000000000005117d 21       698            0.0028   0.0918    cl_page.c:102
        0000000000051180 5        703           6.6e-04   0.0925    cl_page.c:1170
        0000000000051185 117      820            0.0154   0.1079    cl_page.c:1170
        0000000000051189 24       844            0.0032   0.1110    cl_page.c:1170
        000000000005118e 62       906            0.0082   0.1192    cl_page.c:1170
        0000000000051190 38       944            0.0050   0.1242    cl_page.c:1170
        0000000000051194 855      1799           0.1125   0.2366    cl_page.c:1170
        0000000000051198 192      1991           0.0253   0.2619    cl_page.c:1170
        000000000005119d 31       2022           0.0041   0.2660    cl_page.c:1170
        00000000000511a0 41       2063           0.0054   0.2714    cl_page.c:1170
        00000000000511a3 4        2067          5.3e-04   0.2719    cl_page.c:1170
        00000000000511a5 159      2226           0.0209   0.2928    cl_page.c:1170
        00000000000511a9 143      2369           0.0188   0.3116    cl_page.c:1170
        00000000000511ad 44       2413           0.0058   0.3174    cl_page.c:1170
        00000000000511b2 6        2419          7.9e-04   0.3182    cl_page.c:1170
        00000000000511b7 152      2571           0.0200   0.3382    cl_page.c:1170
        00000000000511bc 24       2595           0.0032   0.3413    libcfs_debug.h:198
        00000000000511c3 3        2598          3.9e-04   0.3417    cl_page.c:1172
        00000000000511cd 28       2626           0.0037   0.3454    libcfs_debug.h:198
        00000000000511f1 7        2633          9.2e-04   0.3463    cl_page.c:1173
        00000000000511f2 16       2649           0.0021   0.3484    cl_page.c:1173
        00000000000511f4 4        2653          5.3e-04   0.3490    cl_page.c:1173
        00000000000511f6 7        2660          9.2e-04   0.3499    cl_page.c:1173
        00000000000511f8 11       2671           0.0014   0.3513    cl_page.c:1173
        00000000000511fa 21       2692           0.0028   0.3541    cl_page.c:1173
        00000000000511fb 4        2696          5.3e-04   0.3546    cl_page.c:1173
        0000000000051216 2        2698          2.6e-04   0.3549    cl_page.c:1150
        0000000000051219 4        2702          5.3e-04   0.3554    cl_page.c:1150
        000000000005121c 39       2741           0.0051   0.3605    cl_page.c:1150
        0000000000051228 82       2823           0.0108   0.3713    cl_page.c:1164
        000000000005122c 1095     3918           0.1440   0.5154    cl_page.c:1164
        0000000000051235 23       3941           0.0030   0.5184    cl_page.c:1165
        000000000005123d 120      4061           0.0158   0.5342    spinlock.h:174
        0000000000051243 2020     6081           0.2657   0.7999    spinlock.h:174
        000000000005124a 717      6798           0.0943   0.8942    cl_page.c:1155
        0000000000051253 19       6817           0.0025   0.8967    cl_object.h:2626
        0000000000051257 194      7011           0.0255   0.9222    cl_object.h:2626
        000000000005125a 2233     9244           0.2937   1.2159    cl_page.c:1160
        000000000005125e 44       9288           0.0058   1.2217    cl_page.c:1160
        0000000000051261 36       9324           0.0047   1.2265    cl_page.c:1160
        0000000000051266 750688   760012        98.7437  99.9701    cl_page.c:1161
        000000000005126a 120      760132         0.0158  99.9859    cl_page.c:1161
        0000000000051273 107      760239         0.0141  100.000    cl_page.c:1163
      

      We can start by trying the easy stuff to improve performance here. For example, as suggested in the cl_object_header structure definition aligning the spin locks on cache lines should help. This may be particularly true on Sequoias PowerPC arch where we have 68 slow cores and relatively long latencies to memory. Although, I don't see much contention in the oprofile results for the other locks in the cache line. But if that fails we're going to need to look at optimizing what's done under the spin locks. On first glance I see several while loops and for loops which is always concerning.

      Attachments

        1. fpp.fio
          0.2 kB
        2. ssf.fio
          0.3 kB

        Issue Links

          Activity

            [LU-1666] hdr->coh_page_guard contention (single shared file performance)

            Gregoire submitted a b2_1 version of this patch at http://review.whamcloud.com/5428.

            adilger Andreas Dilger added a comment - Gregoire submitted a b2_1 version of this patch at http://review.whamcloud.com/5428 .

            As proposed by Jinshan, I have open a separate ticket for the parallel read performance issue: LU-2481 "performance issue of parallel read".

            pichong Gregoire Pichon added a comment - As proposed by Jinshan, I have open a separate ticket for the parallel read performance issue: LU-2481 "performance issue of parallel read".

            Ned,
            It's true the LU-1666 patch gives some improvement at 16 tasks. Anyway, I would have expected the per-task performance to maintain at the level (or slightly lower) of the one-reader case. It's what I observe on ext4 for instance.

            pichong Gregoire Pichon added a comment - Ned, It's true the LU-1666 patch gives some improvement at 16 tasks. Anyway, I would have expected the per-task performance to maintain at the level (or slightly lower) of the one-reader case. It's what I observe on ext4 for instance.

            Hi Gregoire,

            You got a factor of two performance increase at 16 tasks, so why do you say there was no real improvement? I think the same optimization should apply with minimal adaptation to 2.1, since the same contention point exists there.

            nedbass Ned Bass (Inactive) added a comment - Hi Gregoire, You got a factor of two performance increase at 16 tasks, so why do you say there was no real improvement? I think the same optimization should apply with minimal adaptation to 2.1, since the same contention point exists there.

            One of our customer (TGCC - lustre 2.1.3) is facing a performance issue while reading in the same file from many tasks on one client at the same time. Performance drops while increasing the number of tasks.

            I have reproduced the issue and obtained these results:

                tasks     Mean    Total (MB/s)
            dd      1      339      339
            dd      2      107      214
            dd      4       78      312
            dd      8       64      512
            dd     16       26      428
            

            Applying the patch http://review.whamcloud.com/#change,3627 on lustre 2.1.3 gives no real improvement.

                tasks     Mean    Total (MB/s)
            dd      1      542      542
            dd      2      118      236
            dd      4       74      298
            dd      8       73      591
            dd     16       52      833
            

            The profiling shows contention in CLIO page routines.
            (I suppose the libpython entry comes from the mpirun synchronization of the 16 tasks)

            samples  %        linenr info                 app name                 symbol name
            1149893  34.8489  (no location information)   libpython2.6.so.1.0      /usr/lib64/libpython2.6.so.1.0
            338676   10.2640  dec_and_lock.c:21           vmlinux                  _atomic_dec_and_lock
            335241   10.1599  cl_page.c:404               obdclass.ko              cl_page_find0
            293448    8.8933  cl_page.c:729               obdclass.ko              cl_vmpage_page
            235856    7.1479  filemap.c:963               vmlinux                  grab_cache_page_nowait
            142686    4.3243  cl_page.c:661               obdclass.ko              cl_page_put
            80065     2.4265  filemap.c:667               vmlinux                  find_get_page
            46623     1.4130  copy_user_64.S:240          vmlinux                  copy_user_generic_string
            34127     1.0343  (no location information)   libpthread-2.12.so       pthread_rwlock_rdlock
            33685     1.0209  (no location information)   libc-2.12.so             getenv
            31725     0.9615  (no location information)   libpthread-2.12.so       pthread_rwlock_unlock
            30492     0.9241  swap.c:288                  vmlinux                  activate_page
            28528     0.8646  (no location information)   libc-2.12.so             __dcigettext
            23703     0.7183  (no location information)   libc-2.12.so             __strlen_sse42
            22345     0.6772  vvp_page.c:120              lustre.ko                vvp_page_unassume
            20646     0.6257  swap.c:183                  vmlinux                  put_page
            18525     0.5614  intel_idle.c:215            vmlinux                  intel_idle
            12774     0.3871  open.c:961                  vmlinux                  sys_close
            12702     0.3849  filemap.c:579               vmlinux                  unlock_page
            12685     0.3844  (no location information)   libc-2.12.so             __memcpy_ssse3
            12134     0.3677  cl_page.c:874               obdclass.ko              cl_page_invoid
            11734     0.3556  (no location information)   libpthread-2.12.so       __close_nocancel
            10573     0.3204  cl_page.c:884               obdclass.ko              cl_page_owner_clear
            10510     0.3185  (no location information)   libc-2.12.so             strerror_r
            9783      0.2965  cl_page.c:1068              obdclass.ko              cl_page_unassume
            9472      0.2871  (no location information)   oprofiled                /usr/bin/oprofiled
            9438      0.2860  (no location information)   libc-2.12.so             __strncmp_sse2
            8826      0.2675  rw.c:724                    lustre.ko                ll_readahead
            8769      0.2658  filemap.c:527               vmlinux                  page_waitqueue
            8095      0.2453  entry_64.S:470              vmlinux                  system_call_after_swapgs
            7756      0.2351  cl_page.c:561               obdclass.ko              cl_page_state_set0
            7615      0.2308  (no location information)   libc-2.12.so             __stpcpy_sse2
            7366      0.2232  entry_64.S:462              vmlinux                  system_call
            7097      0.2151  sched.c:4430                vmlinux                  find_busiest_group
            6898      0.2091  (no location information)   libperl.so               /usr/lib64/perl5/CORE/libperl.so
            6887      0.2087  (no location information)   libc-2.12.so             __ctype_b_loc
            6780      0.2055  wait.c:251                  vmlinux                  __wake_up_bit
            6631      0.2010  cl_page.c:139               obdclass.ko              cl_page_at_trusted
            6588      0.1997  (no location information)   libc-2.12.so             __strlen_sse2
            6336      0.1920  lov_io.c:239                lov.ko                   lov_page_stripe
            5289      0.1603  lov_io.c:208                lov.ko                   lov_sub_get
            5263      0.1595  ring_buffer.c:2834          vmlinux                  rb_get_reader_page
            4595      0.1393  lvfs_lib.c:76               lvfs.ko                  lprocfs_counter_add
            4373      0.1325  (no location information)   libc-2.12.so             strerror
            4347      0.1317  lov_io.c:252                lov.ko                   lov_page_subio
            4337      0.1314  cl_page.c:1036              obdclass.ko              cl_page_assume
            4310      0.1306  (no location information)   libc-2.12.so             __mempcpy_sse2
            3903      0.1183  lov_page.c:92               lov.ko                   lov_page_own
            3866      0.1172  mutex.c:409                 vmlinux                  __mutex_lock_slowpath
            3841      0.1164  lcommon_cl.c:1097           lustre.ko                cl2vm_page
            3698      0.1121  radix-tree.c:414            vmlinux                  radix_tree_lookup_slot
            3506      0.1063  vvp_page.c:109              lustre.ko                vvp_page_assume
            3491      0.1058  lu_object.c:1115            obdclass.ko              lu_object_locate
            3490      0.1058  (no location information)   ipmi_si.ko               port_inb
            3457      0.1048  ring_buffer.c:3221          vmlinux                  ring_buffer_consume
            3431      0.1040  cl_page.c:898               obdclass.ko              cl_page_owner_set
            

            Does the patch LU-1666 need any adaptations to be applied on lustre 2.1.3 branch ?
            Would this use case require additional patches to be fixed ?

            I can open another JIRA ticket if needed and provide additional information (log, dump, profiling) as you which.
            thanks.

            pichong Gregoire Pichon added a comment - One of our customer (TGCC - lustre 2.1.3) is facing a performance issue while reading in the same file from many tasks on one client at the same time. Performance drops while increasing the number of tasks. I have reproduced the issue and obtained these results: tasks Mean Total (MB/s) dd 1 339 339 dd 2 107 214 dd 4 78 312 dd 8 64 512 dd 16 26 428 Applying the patch http://review.whamcloud.com/#change,3627 on lustre 2.1.3 gives no real improvement. tasks Mean Total (MB/s) dd 1 542 542 dd 2 118 236 dd 4 74 298 dd 8 73 591 dd 16 52 833 The profiling shows contention in CLIO page routines. (I suppose the libpython entry comes from the mpirun synchronization of the 16 tasks) samples % linenr info app name symbol name 1149893 34.8489 (no location information) libpython2.6.so.1.0 /usr/lib64/libpython2.6.so.1.0 338676 10.2640 dec_and_lock.c:21 vmlinux _atomic_dec_and_lock 335241 10.1599 cl_page.c:404 obdclass.ko cl_page_find0 293448 8.8933 cl_page.c:729 obdclass.ko cl_vmpage_page 235856 7.1479 filemap.c:963 vmlinux grab_cache_page_nowait 142686 4.3243 cl_page.c:661 obdclass.ko cl_page_put 80065 2.4265 filemap.c:667 vmlinux find_get_page 46623 1.4130 copy_user_64.S:240 vmlinux copy_user_generic_string 34127 1.0343 (no location information) libpthread-2.12.so pthread_rwlock_rdlock 33685 1.0209 (no location information) libc-2.12.so getenv 31725 0.9615 (no location information) libpthread-2.12.so pthread_rwlock_unlock 30492 0.9241 swap.c:288 vmlinux activate_page 28528 0.8646 (no location information) libc-2.12.so __dcigettext 23703 0.7183 (no location information) libc-2.12.so __strlen_sse42 22345 0.6772 vvp_page.c:120 lustre.ko vvp_page_unassume 20646 0.6257 swap.c:183 vmlinux put_page 18525 0.5614 intel_idle.c:215 vmlinux intel_idle 12774 0.3871 open.c:961 vmlinux sys_close 12702 0.3849 filemap.c:579 vmlinux unlock_page 12685 0.3844 (no location information) libc-2.12.so __memcpy_ssse3 12134 0.3677 cl_page.c:874 obdclass.ko cl_page_invoid 11734 0.3556 (no location information) libpthread-2.12.so __close_nocancel 10573 0.3204 cl_page.c:884 obdclass.ko cl_page_owner_clear 10510 0.3185 (no location information) libc-2.12.so strerror_r 9783 0.2965 cl_page.c:1068 obdclass.ko cl_page_unassume 9472 0.2871 (no location information) oprofiled /usr/bin/oprofiled 9438 0.2860 (no location information) libc-2.12.so __strncmp_sse2 8826 0.2675 rw.c:724 lustre.ko ll_readahead 8769 0.2658 filemap.c:527 vmlinux page_waitqueue 8095 0.2453 entry_64.S:470 vmlinux system_call_after_swapgs 7756 0.2351 cl_page.c:561 obdclass.ko cl_page_state_set0 7615 0.2308 (no location information) libc-2.12.so __stpcpy_sse2 7366 0.2232 entry_64.S:462 vmlinux system_call 7097 0.2151 sched.c:4430 vmlinux find_busiest_group 6898 0.2091 (no location information) libperl.so /usr/lib64/perl5/CORE/libperl.so 6887 0.2087 (no location information) libc-2.12.so __ctype_b_loc 6780 0.2055 wait.c:251 vmlinux __wake_up_bit 6631 0.2010 cl_page.c:139 obdclass.ko cl_page_at_trusted 6588 0.1997 (no location information) libc-2.12.so __strlen_sse2 6336 0.1920 lov_io.c:239 lov.ko lov_page_stripe 5289 0.1603 lov_io.c:208 lov.ko lov_sub_get 5263 0.1595 ring_buffer.c:2834 vmlinux rb_get_reader_page 4595 0.1393 lvfs_lib.c:76 lvfs.ko lprocfs_counter_add 4373 0.1325 (no location information) libc-2.12.so strerror 4347 0.1317 lov_io.c:252 lov.ko lov_page_subio 4337 0.1314 cl_page.c:1036 obdclass.ko cl_page_assume 4310 0.1306 (no location information) libc-2.12.so __mempcpy_sse2 3903 0.1183 lov_page.c:92 lov.ko lov_page_own 3866 0.1172 mutex.c:409 vmlinux __mutex_lock_slowpath 3841 0.1164 lcommon_cl.c:1097 lustre.ko cl2vm_page 3698 0.1121 radix-tree.c:414 vmlinux radix_tree_lookup_slot 3506 0.1063 vvp_page.c:109 lustre.ko vvp_page_assume 3491 0.1058 lu_object.c:1115 obdclass.ko lu_object_locate 3490 0.1058 (no location information) ipmi_si.ko port_inb 3457 0.1048 ring_buffer.c:3221 vmlinux ring_buffer_consume 3431 0.1040 cl_page.c:898 obdclass.ko cl_page_owner_set Does the patch LU-1666 need any adaptations to be applied on lustre 2.1.3 branch ? Would this use case require additional patches to be fixed ? I can open another JIRA ticket if needed and provide additional information (log, dump, profiling) as you which. thanks.

            Please use patch set 20 of LU-744.

            jay Jinshan Xiong (Inactive) added a comment - Please use patch set 20 of LU-744 .

            This problem is imported in LU-744, I will work out a patch soon.

            jay Jinshan Xiong (Inactive) added a comment - This problem is imported in LU-744 , I will work out a patch soon.

            Yes, sorry I neglected to include it above.

            2012-08-16 15:46:27.275649 {DefaultControlEventListener} [mmcs]{16}.10.3: LustreError: 5769:0:(osc_cache.c:2338:osc_teardown_async_page()) extent c0000003642caf38@{[361184 -> 361199/361199], [3|1|-|active|wi|c00000029bfc0240], [1048576|16|+|-|c000000295f440b0|16|(null)]} trunc at 361195.
            2012-08-16 15:46:27.276084 {DefaultControlEventListener} [mmcs]{16}.10.3: LustreError: 5769:0:(osc_page.c:430:osc_page_delete()) page@c000000343b9ed40[2 c0000003c0a2ee98:2889515 ^(null)_c000000343b9ead0 4 0 1 (null) (null) 0x0]
            2012-08-16 15:46:27.276522 {DefaultControlEventListener} [mmcs]{16}.10.3: LustreError: 5769:0:(osc_page.c:430:osc_page_delete()) page@c000000343b9ead0[1 c00000036d6e4e68:361195 ^c000000343b9ed40_(null) 4 0 1 (null) (null) 0x0]
            2012-08-16 15:46:27.276915 {DefaultControlEventListener} [mmcs]{16}.10.3: LustreError: 5769:0:(osc_page.c:430:osc_page_delete()) vvp-page@c0000003633dce90(0:0:0) vm@c0000000015ddce8 71 2:0 0 2889515 lru
            2012-08-16 15:46:27.277362 {DefaultControlEventListener} [mmcs]{16}.10.3: LustreError: 5769:0:(osc_page.c:430:osc_page_delete()) lov-page@c000000349140930
            2012-08-16 15:46:27.277779 {DefaultControlEventListener} [mmcs]{16}.10.3: LustreError: 5769:0:(osc_page.c:430:osc_page_delete()) osc-page@c00000038c90d328: 1< 0x845fed 2 0 + - > 2< 23671275520 0 65536 0x0 0x420 | (null) c00000029159a7b0 c00000029bfc0240 > 3< + c0000003e9990740 0 0 0 > 4< 0 0 8 18677760 - | + - + - > 5< + - + - | 0 - | 64 - ->
            2012-08-16 15:46:27.278174 {DefaultControlEventListener} [mmcs]{16}.10.3: LustreError: 5769:0:(osc_page.c:430:osc_page_delete()) end page@c000000343b9ed40
            2012-08-16 15:46:27.278663 {DefaultControlEventListener} [mmcs]{16}.10.3: LustreError: 5769:0:(osc_page.c:430:osc_page_delete()) Trying to teardown failed: -16
            
            nedbass Ned Bass (Inactive) added a comment - Yes, sorry I neglected to include it above. 2012-08-16 15:46:27.275649 {DefaultControlEventListener} [mmcs]{16}.10.3: LustreError: 5769:0:(osc_cache.c:2338:osc_teardown_async_page()) extent c0000003642caf38@{[361184 -> 361199/361199], [3|1|-|active|wi|c00000029bfc0240], [1048576|16|+|-|c000000295f440b0|16|(null)]} trunc at 361195. 2012-08-16 15:46:27.276084 {DefaultControlEventListener} [mmcs]{16}.10.3: LustreError: 5769:0:(osc_page.c:430:osc_page_delete()) page@c000000343b9ed40[2 c0000003c0a2ee98:2889515 ^(null)_c000000343b9ead0 4 0 1 (null) (null) 0x0] 2012-08-16 15:46:27.276522 {DefaultControlEventListener} [mmcs]{16}.10.3: LustreError: 5769:0:(osc_page.c:430:osc_page_delete()) page@c000000343b9ead0[1 c00000036d6e4e68:361195 ^c000000343b9ed40_(null) 4 0 1 (null) (null) 0x0] 2012-08-16 15:46:27.276915 {DefaultControlEventListener} [mmcs]{16}.10.3: LustreError: 5769:0:(osc_page.c:430:osc_page_delete()) vvp-page@c0000003633dce90(0:0:0) vm@c0000000015ddce8 71 2:0 0 2889515 lru 2012-08-16 15:46:27.277362 {DefaultControlEventListener} [mmcs]{16}.10.3: LustreError: 5769:0:(osc_page.c:430:osc_page_delete()) lov-page@c000000349140930 2012-08-16 15:46:27.277779 {DefaultControlEventListener} [mmcs]{16}.10.3: LustreError: 5769:0:(osc_page.c:430:osc_page_delete()) osc-page@c00000038c90d328: 1< 0x845fed 2 0 + - > 2< 23671275520 0 65536 0x0 0x420 | (null) c00000029159a7b0 c00000029bfc0240 > 3< + c0000003e9990740 0 0 0 > 4< 0 0 8 18677760 - | + - + - > 5< + - + - | 0 - | 64 - -> 2012-08-16 15:46:27.278174 {DefaultControlEventListener} [mmcs]{16}.10.3: LustreError: 5769:0:(osc_page.c:430:osc_page_delete()) end page@c000000343b9ed40 2012-08-16 15:46:27.278663 {DefaultControlEventListener} [mmcs]{16}.10.3: LustreError: 5769:0:(osc_page.c:430:osc_page_delete()) Trying to teardown failed: -16

            Hi Ned, do you have any luck to grab the page debug info such as LU-1723?

            jay Jinshan Xiong (Inactive) added a comment - Hi Ned, do you have any luck to grab the page debug info such as LU-1723 ?

            No I hadn't noticed that bug, thanks for the pointer.

            nedbass Ned Bass (Inactive) added a comment - No I hadn't noticed that bug, thanks for the pointer.

            People

              jay Jinshan Xiong (Inactive)
              behlendorf Brian Behlendorf
              Votes:
              1 Vote for this issue
              Watchers:
              15 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: