Details

    Description

      The lvbo methods have to reallocate lu_env on every call, which can be quite expensive in CPU cycles at scale.
      The layers above can pass lu_env down so the existing one is reused.
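      A minimal standalone sketch of the idea follows, using hypothetical stub types and simplified signatures (the real change, patch https://review.whamcloud.com/32832/, reworks the lvbo method signatures inside fs/lustre-release; this only models the before/after cost):

        #include <stdlib.h>

        struct lu_env { void *le_ctx; };          /* stub for the real lu_env   */
        struct ldlm_resource { int lr_lvb_len; }; /* stub for the real resource */

        /* Stand-ins modelling the cost of lu_env_init()/lu_env_fini(). */
        static int lu_env_init(struct lu_env *env)
        {
                env->le_ctx = malloc(4096);       /* models per-call allocation */
                return env->le_ctx != NULL ? 0 : -1;
        }

        static void lu_env_fini(struct lu_env *env)
        {
                free(env->le_ctx);
        }

        /* Before: every lvbo call pays for env setup and teardown. */
        static int lvbo_update_before(struct ldlm_resource *res)
        {
                struct lu_env env;
                int rc = lu_env_init(&env);

                if (rc != 0)
                        return rc;
                /* ... refresh the lock value block using env ... */
                lu_env_fini(&env);
                return 0;
        }

        /* After: the env already initialized by the layer above is reused. */
        static int lvbo_update_after(const struct lu_env *env,
                                     struct ldlm_resource *res)
        {
                /* ... refresh the lock value block using the caller's env ... */
                return 0;
        }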

    Activity

            [LU-11164] lvbo_*() methods to reuse env
            pjones Peter Jones added a comment -

            Landed for 2.12. James, perf testing is a standard part of release testing.


            simmonsja James A Simmons added a comment -

            Before we close this, has anyone measured metadata performance? I wonder if there was any impact from this work.

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/32832/
            Subject: LU-11164 ldlm: pass env to lvbo methods
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: e02cb40761ff8aae3df76c4210a345420b6d4ba1

            paf Patrick Farrell (Inactive) added a comment -

            Thanks for the detailed benchmark info, Andreas, Ihara. (Sorry, I was not originally following the LU so did not see the earlier comment.)

            sihara Shuichi Ihara added a comment -

            Here are performance results with the patch: 4K random read from 768 processes on 32 clients, using the fio job file below.

            [randread]
            ioengine=sync
            rw=randread
            blocksize=4096
            iodepth=16
            direct=1
            size=4g
            runtime=60
            numjobs=24
            group_reporting
            directory=/cache1/fio.out
            filename_format=f.$jobnum.$filenum
            

            master without patch

            All clients: (groupid=0, jobs=32): err= 0: pid=0: Mon Oct  1 09:00:53 2018
               read: IOPS=564k, BW=2205MiB/s (2312MB/s)(129GiB/60006msec)
                clat (usec): min=210, max=228722, avg=1370.30, stdev=623.75
                 lat (usec): min=210, max=228722, avg=1370.45, stdev=623.75
               bw (  KiB/s): min= 1192, max=11160, per=0.13%, avg=2911.36, stdev=1003.58, samples=91693
               iops        : min=  298, max= 2790, avg=727.83, stdev=250.90, samples=91693
              lat (usec)   : 250=0.01%, 500=3.09%, 750=4.03%, 1000=11.72%
              lat (msec)   : 2=71.92%, 4=9.13%, 10=0.06%, 20=0.05%, 50=0.01%
              lat (msec)   : 100=0.01%, 250=0.01%
              cpu          : usr=0.19%, sys=1.70%, ctx=33884824, majf=0, minf=49349
              IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
                 submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
                 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
                 issued rwts: total=33867532,0,0,0 short=0,0,0,0 dropped=0,0,0,0
            

            master with patch 32832

            All clients: (groupid=0, jobs=32): err= 0: pid=0: Mon Oct  1 00:50:54 2018
               read: IOPS=639k, BW=2497MiB/s (2619MB/s)(146GiB/60062msec)
                clat (usec): min=202, max=303137, avg=1199.86, stdev=731.66
                 lat (usec): min=202, max=303137, avg=1200.01, stdev=731.65
               bw (  KiB/s): min=  624, max=12712, per=0.13%, avg=3319.10, stdev=1131.37, samples=91719
               iops        : min=  156, max= 3178, avg=829.76, stdev=282.84, samples=91719
              lat (usec)   : 250=0.01%, 500=4.35%, 750=4.64%, 1000=21.49%
              lat (msec)   : 2=65.65%, 4=3.70%, 10=0.07%, 20=0.08%, 50=0.01%
              lat (msec)   : 100=0.01%, 250=0.01%, 500=0.01%
              cpu          : usr=0.21%, sys=1.89%, ctx=38418553, majf=0, minf=54483
              IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
                 submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
                 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
                 issued rwts: total=38398687,0,0,0 short=0,0,0,0 dropped=0,0,0,0
            

            564k vs. 639k IOPS: the patch contributes a ~13% performance gain (639/564 ≈ 1.13).


            adilger Andreas Dilger added a comment -

            Patrick,
            attached is a flame graph (ost_io.svg) showing CPU usage for the OST under a high-throughput random read workload (fake IO was used so that no storage overhead is present, just network and RPC processing). In ost_lvbo_update() the lu_env_init() and lu_env_fini() functions are consuming over 10% of the OSS CPU for basically no benefit. The master-patch-32832.svg flame graph shows the ost_lvbo_update() CPU usage is down to 1.5% when the patch is applied, which resulted in a 6.3% performance improvement for random 4KB reads. Alex, could you please include these results in the commit comment so that it is clearer why we want to land this patch.

            The ost_io.svg graph is also showing find_or_create_page() using 4.25% of CPU, which drove the creation of patch https://review.whamcloud.com/32875 "LU-11347 osd: do not use pagecache for I/O".

            pjones Peter Jones added a comment -

            No, it's not - thanks for pointing that out!

            simmonsja James A Simmons added a comment -

            I see this is marked as upstream. Is that correct?
            jgmitter Joseph Gmitter (Inactive) added a comment -

            Patch is at https://review.whamcloud.com/#/c/32832/

            People

              Assignee: bzzz Alex Zhuravlev
              Reporter: bzzz Alex Zhuravlev