[LU-11754] ldiskfs performance degradation due to kernel swap hugging cpu Created: 11/Dec/18  Updated: 11/Dec/18

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.11.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Abe Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Hi,

With obdfilter-survey, ldiskfs performance is degraded by kernel swap hogging the CPU. The current configuration is as follows:

2 osts: ost1,ost2

/dev/sdc on /mnt/mdt type lustre (ro,context=unconfined_u:object_r:user_tmp_t:s0,svname=tempAA-MDT0000,mgs,osd=osd-ldiskfs,user_xattr,errors=remount-ro)
/dev/sdb on /mnt/ost1 type lustre (ro,context=unconfined_u:object_r:user_tmp_t:s0,svname=tempAA-OST0001,mgsnode=10.10.10.168@o2ib,osd=osd-ldiskfs,errors=remount-ro)
/dev/sda on /mnt/ost2 type lustre (ro,context=unconfined_u:object_r:user_tmp_t:s0,svname=tempAA-OST0002,mgsnode=10.10.10.168@o2ib,osd=osd-ldiskfs,errors=remount-ro)
[root@oss100 htop-2.2.0]#

[root@oss100 htop-2.2.0]# dkms status
lustre-ldiskfs, 2.11.0, 3.10.0-693.21.1.el7_lustre.x86_64, x86_64: installed
spl, 0.7.6, 3.10.0-693.21.1.el7_lustre.x86_64, x86_64: installed
[root@oss100 htop-2.2.0]#

sh ./obdsurvey-script.sh
Mon Dec 10 17:19:52 PST 2018 Obdfilter-survey for case=disk from oss100
ost 2 sz 512000000K rsz 1024K obj 2 thr 2 write 134.52 [ 49.99, 101.96] rewrite 132.09 [ 49.99, 78.99] read 2566.74 [ 258.96, 2068.71]
ost 2 sz 512000000K rsz 1024K obj 2 thr 4 write 195.73 [ 76.99, 128.98] rewrite
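
For comparing runs, the result lines above can be turned into numbers with a short script. The following is only a minimal sketch in Python, not part of the survey tooling: the function name `parse_survey_line` and the dictionary keys are hypothetical. It assumes that the first number after each of write/rewrite/read is the aggregate bandwidth and that the bracketed pair is a [min, max] range, which is how the obdfilter-survey output is usually read. A metric cut off mid-line, like the trailing `rewrite` above, is skipped rather than guessed.

```python
import re

# One metric, e.g. "write 134.52 [ 49.99, 101.96]".
# Assumption: first number = aggregate bandwidth, bracket = [min, max].
METRIC_RE = re.compile(
    r"\b(write|rewrite|read)\s+([\d.]+)\s*\[\s*([\d.]+)\s*,\s*([\d.]+)\s*\]"
)

# Leading parameters, e.g. "ost 2 sz 512000000K rsz 1024K obj 2 thr 2".
# A trailing "K" (as on sz and rsz) is dropped; the value is kept as printed.
PARAM_RE = re.compile(r"\b(ost|sz|rsz|obj|thr)\s+(\d+)K?\b")


def parse_survey_line(line):
    """Parse one obdfilter-survey result line into a dict.

    Metrics that are truncated (name present, numbers missing) are omitted.
    """
    result = {key: int(value) for key, value in PARAM_RE.findall(line)}
    for name, aggregate, low, high in METRIC_RE.findall(line):
        result[name] = {
            "aggregate": float(aggregate),
            "min": float(low),
            "max": float(high),
        }
    return result


# The two result lines from the run above, copied as they appear.
LINE_THR2 = (
    "ost 2 sz 512000000K rsz 1024K obj 2 thr 2 "
    "write 134.52 [ 49.99, 101.96] "
    "rewrite 132.09 [ 49.99, 78.99] "
    "read 2566.74 [ 258.96, 2068.71]"
)
LINE_THR4 = (
    "ost 2 sz 512000000K rsz 1024K obj 2 thr 4 "
    "write 195.73 [ 76.99, 128.98] rewrite"
)

if __name__ == "__main__":
    for line in (LINE_THR2, LINE_THR4):
        print(parse_survey_line(line))
```

Parsing the lines this way makes the gap between write and read bandwidth easy to compare across thread counts once the full output is available.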

[root@oss100 htop-2.2.0]# lctl dl
0 UP osd-ldiskfs tempAA-MDT0000-osd tempAA-MDT0000-osd_UUID 9
1 UP mgs MGS MGS 4
2 UP mgc MGC10.10.10.168@o2ib 65f231a0-8fd8-001d-6b0f-3e986f914178 4
3 UP mds MDS MDS_uuid 2
4 UP lod tempAA-MDT0000-mdtlov tempAA-MDT0000-mdtlov_UUID 3
5 UP mdt tempAA-MDT0000 tempAA-MDT0000_UUID 8
6 UP mdd tempAA-MDD0000 tempAA-MDD0000_UUID 3
7 UP qmt tempAA-QMT0000 tempAA-QMT0000_UUID 3
8 UP lwp tempAA-MDT0000-lwp-MDT0000 tempAA-MDT0000-lwp-MDT0000_UUID 4
9 UP osd-ldiskfs tempAA-OST0001-osd tempAA-OST0001-osd_UUID 4
10 UP ost OSS OSS_uuid 2
11 UP obdfilter tempAA-OST0001 tempAA-OST0001_UUID 5
12 UP lwp tempAA-MDT0000-lwp-OST0001 tempAA-MDT0000-lwp-OST0001_UUID 4
13 UP osp tempAA-OST0001-osc-MDT0000 tempAA-MDT0000-mdtlov_UUID 4
14 UP echo_client tempAA-OST0001_ecc tempAA-OST0001_ecc_UUID 2
15 UP osd-ldiskfs tempAA-OST0002-osd tempAA-OST0002-osd_UUID 4
16 UP obdfilter tempAA-OST0002 tempAA-OST0002_UUID 5
17 UP lwp tempAA-MDT0000-lwp-OST0002 tempAA-MDT0000-lwp-OST0002_UUID 4
18 UP osp tempAA-OST0002-osc-MDT0000 tempAA-MDT0000-mdtlov_UUID 4
19 UP echo_client tempAA-OST0002_ecc tempAA-OST0002_ecc_UUID 2
[root@oss100 htop-2.2.0]#

[root@oss100 htop-2.2.0]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 152.8T 0 disk /mnt/ost2
sdb 8:16 0 152.8T 0 disk /mnt/ost1
sdc 8:32 0 931.5G 0 disk /mnt/mdt
sdd 8:48 0 465.8G 0 disk
├─sdd1 8:49 0 200M 0 part /boot/efi
├─sdd2 8:50 0 1G 0 part /boot
└─sdd3 8:51 0 464.6G 0 part
  ├─centos-root 253:0 0 50G 0 lvm /
  ├─centos-swap 253:1 0 4G 0 lvm [SWAP]
  └─centos-home 253:2 0 410.6G 0 lvm /home
nvme0n1 259:2 0 372.6G 0 disk
└─md124 9:124 0 372.6G 0 raid1
nvme1n1 259:0 0 372.6G 0 disk
└─md124 9:124 0 372.6G 0 raid1
nvme2n1 259:3 0 372.6G 0 disk
└─md125 9:125 0 354G 0 raid1
nvme3n1 259:1 0 372.6G 0 disk
└─md125 9:125 0 354G 0 raid1

 

thanks,

Abe


Generated at Sat Feb 10 02:46:38 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.