[LU-10400] Reduced stat performance with lustre 2.10 Created: 15/Dec/17  Updated: 22/Mar/18

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Tim McMullan Assignee: Saurabh Tandan (Inactive)
Resolution: Unresolved Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

We have been noticing decreased performance in any stat-intensive operation on Lustre 2.10.0 and 2.10.2 when compared to 2.7. The difference is more significant when testing on HDDs than on SSDs, but is visible for us on both. Between runs I am dropping caches on the client, MDS, and OSS via "echo 3 > /proc/sys/vm/drop_caches".

For example, in a single directory containing 100000 files:

Client Version         2.10.2   2.10.2   2.7      2.7
Server Version         2.10.2   2.7      2.10.2   2.7
ls -l time (seconds)   54       5        53       6
du -s time (seconds)   150      22       150      29
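
For reference, a minimal sketch of the measurement procedure described above; the mount point and directory name are placeholders, not the exact paths used:

# run on the client, MDS, and OSS before each measurement
echo 3 > /proc/sys/vm/drop_caches

# on a client, against the directory containing the 100000 test files
time ls -l /mnt/lustre/testdir > /dev/null
time du -s /mnt/lustre/testdir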

We are running 2.10 server on centos7 and 2.7 on rhel6.6.



 Comments   
Comment by Peter Jones [ 19/Dec/17 ]

Saurabh

Can you please see whether you can reproduce these results?

Thanks

Peter

Comment by Andreas Dilger [ 19/Dec/17 ]

Hi Tim, are there any tunable or formatting options that are used, or default file striping that is used at your site? We’d like to reproduce this locally to debug the problem, but want to make sure that what we are testing matches what you have.

Comment by Tim McMullan [ 20/Dec/17 ]

Hey Andreas,
The physical setup is 1 MDS/MGS and 1 OSS with 4 OSTs (2 SSD, 2 HDD). The stripe size is 1MB, and we tested with a stripe count of 2 so everything hits both OSTs of the same type (the SSD and HDD OSTs are in separate pools). The files we used were all 2MB. All the testing above was done on the HDDs.
 
I'm setting the following on the MGS, but otherwise the setup is default for both 2.7 and 2.10
lctl set_param -P llite.*.lazystatfs=1
lctl set_param -P osc.*.max_rpcs_in_flight=32
lctl set_param -P osc.*.max_dirty_mb=256
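
A quick way to confirm those values took effect on a mounted client (just a sanity-check sketch, using the same parameter names) is:

lctl get_param llite.*.lazystatfs
lctl get_param osc.*.max_rpcs_in_flight osc.*.max_dirty_mb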
 
I'm formatting with the following:
MDS - the mgs and mdt are on the same host sharing a drive, the underlying device is a RAID1 of 10k RPM disks.
mkfs.lustre --fsname=${name} --mgs /dev/sdc1
mkfs.lustre --fsname=${name} --mdt --mgsnode=${mgs_ip}@o2ib --index=0 /dev/sdc2
 
OSS -
ssd, no raid
mkfs.lustre --fsname=${name} --ost --mgsnode=${mgs_ip}@o2ib --index=0 /dev/sdb
mkfs.lustre --fsname=${name} --ost --mgsnode=${mgs_ip}@o2ib --index=1 /dev/sdc
 
 
10k RPM disks, no raid:
mkfs.lustre --fsname=${name} --ost --mgsnode=${mgs_ip}@o2ib --index=2 /dev/sdd
mkfs.lustre --fsname=${name} --ost --mgsnode=${mgs_ip}@o2ib --index=3 /dev/sde
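
For completeness, the pool and stripe-count-2 layout mentioned above would look roughly like this (pool commands run on the MGS; the pool names and test directory are placeholders, not the exact commands we ran):

lctl pool_new ${name}.ssd
lctl pool_add ${name}.ssd ${name}-OST[0-1]
lctl pool_new ${name}.hdd
lctl pool_add ${name}.hdd ${name}-OST[2-3]
lfs setstripe -c 2 -S 1M -p hdd /mnt/${name}/testdir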
 
Thank you!
--Tim
 

Comment by Saurabh Tandan (Inactive) [ 05/Jan/18 ]

Hi Tim,
Can you please clarify whether you are using el7 or el7.4 with 2.10.x?
Also, are you using two separate systems with 2.7 and 2.10.x, or upgrading from 2.7 to 2.10.x?
The statement "We are running 2.10 server on centos7 and 2.7 on rhel6.6." is creating a bit of confusion, so please clarify your Lustre version setup.

Comment by Allen Todd [ 05/Jan/18 ]

The lustre 2.10.x system is running: CentOS Linux release 7.4.1708 (Core)
The lustre 2.7 system is running: RedHatEnterpriseServer 6.6

Both filesystems are new builds in a lab with no preexisting data.

Comment by Saurabh Tandan (Inactive) [ 07/Mar/18 ]

I tried to verify the performance drop between Lustre versions 2.7.19.6 and 2.10.0 using the same kernel, but I was not able to identify any significant delta between their performance numbers for creation of 100000 files followed by a stat using 'time ls -l'. The numbers below are averages of 3 runs each. We will continue to investigate this issue and see if we can identify anything.
File creation using Touch

Build                Version    Real     user    sys      Kernel
b_ieel3_0 build 159  2.7.19.6   85.389   0.33    14.625   kernel-3.10.0-514.el7
b2_10 build 5        2.10.0     99.75    0.325   18.363   kernel-3.10.0-514.el7

time ls -l for touch

Build                Version    Real    user    sys     Kernel
b_ieel3_0 build 159  2.7.19.6   4.444   0.835   2.098   kernel-3.10.0-514.el7
b2_10 build 5        2.10.0     3.848   0.824   2.338   kernel-3.10.0-514.el7

File creation using Mcreate:

Build                Version    Real      user     sys       Kernel
b_ieel3_0 build 159  2.7.19.6   183.02    38.133   137.687   kernel-3.10.0-514.el7
b2_10 build 5        2.10.0     196.111   38.003   152.28    kernel-3.10.0-514.el7

time ls -l for Mcreate:

Build                Version    Real    user    sys     Kernel
b_ieel3_0 build 159  2.7.19.6   3.266   0.76    1.464   kernel-3.10.0-514.el7
b2_10 build 5        2.10.0     3.27    0.738   1.782   kernel-3.10.0-514.el7
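
For reference, a rough sketch of the creation phase for such a run (directory path is a placeholder; mcreate is the small helper shipped with lustre-tests, typically under /usr/lib64/lustre/tests), with the stat phase as in the original report:

mkdir /mnt/lustre/testdir
time bash -c 'for i in $(seq 1 100000); do touch /mnt/lustre/testdir/f$i; done'
# or, using mcreate instead of touch:
# time bash -c 'for i in $(seq 1 100000); do mcreate /mnt/lustre/testdir/f$i; done'
time ls -l /mnt/lustre/testdir > /dev/null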
Comment by Tim McMullan [ 22/Mar/18 ]

Thanks for checking it out! After your test I decided to try running the same test with the same Lustre version on the el6 and el7 kernels. I ran this with Lustre 2.8 on RHEL 6 and RHEL 7, since that is easy with the released packages. The results are below; the times appear to be significantly different between the two.

time ls -l 

Kernel                             real   user   sys
2.6.32-573.12.1.el6_lustre.x86_64  2.848  0.824  1.808
3.10.0-693.11.6.el7_lustre.x86_64  4.322  0.832  2.188

time du -s

Kernel                             real   user   sys
2.6.32-573.12.1.el6_lustre.x86_64  20.450 0.188  5.280
3.10.0-693.11.6.el7_lustre.x86_64  34.830 0.192  5.448

I'll keep looking and see what more I can come up with.  Thanks!

Comment by Patrick Farrell (Inactive) [ 22/Mar/18 ]

Tim,

That version of CentOS 7 includes the KPTI/Meltdown fix, and that version of CentOS 6 does not.  That's a huge difference, and should account for the differences you're seeing, unless you've specifically disabled KPTI.
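
For reference, on RHEL/CentOS 7 the current KPTI state can be checked through the debugfs knob the Red Hat backport provides (an illustration, not something from this thread; requires debugfs mounted):

cat /sys/kernel/debug/x86/pti_enabled   # 1 = KPTI active, 0 = disabled (e.g. booted with nopti)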

Comment by Tim McMullan [ 22/Mar/18 ]

I'm sorry Patrick, my mistake.  I grabbed some output for the wrong host...  

This is the run from 3.10.0-327.3.1.el7_lustre.x86_64 (packaged one for 2.8)

time ls -l

Kernel                             real   user   sys
2.6.32-573.12.1.el6_lustre.x86_64  2.848  0.824  1.808
3.10.0-327.3.1.el7_lustre.x86_64   3.391  0.820  1.876

time du -s

Kernel                             real   user   sys
2.6.32-573.12.1.el6_lustre.x86_64  20.450 0.188  5.280
3.10.0-327.3.1.el7_lustre.x86_64   32.417 0.252  5.272
