  1. Lustre
  2. LU-11754

ldiskfs performance degradation due to kernel swap hogging CPU


Details

    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • Lustre 2.11.0
    • None
    • 3
    • 9223372036854775807

    Description

      Hi,

      With obdfilter-survey, ldiskfs performance is degraded by kernel swap hogging the CPU. The current configuration is as follows:

      2 OSTs: ost1, ost2

      /dev/sdc on /mnt/mdt type lustre (ro,context=unconfined_u:object_r:user_tmp_t:s0,svname=tempAA-MDT0000,mgs,osd=osd-ldiskfs,user_xattr,errors=remount-ro)
      /dev/sdb on /mnt/ost1 type lustre (ro,context=unconfined_u:object_r:user_tmp_t:s0,svname=tempAA-OST0001,mgsnode=10.10.10.168@o2ib,osd=osd-ldiskfs,errors=remount-ro)
      /dev/sda on /mnt/ost2 type lustre (ro,context=unconfined_u:object_r:user_tmp_t:s0,svname=tempAA-OST0002,mgsnode=10.10.10.168@o2ib,osd=osd-ldiskfs,errors=remount-ro)
      [root@oss100 htop-2.2.0]#

      [root@oss100 htop-2.2.0]# dkms status
      lustre-ldiskfs, 2.11.0, 3.10.0-693.21.1.el7_lustre.x86_64, x86_64: installed
      spl, 0.7.6, 3.10.0-693.21.1.el7_lustre.x86_64, x86_64: installed
      [root@oss100 htop-2.2.0]#

      sh ./obdsurvey-script.sh
      Mon Dec 10 17:19:52 PST 2018 Obdfilter-survey for case=disk from oss100
      ost 2 sz 512000000K rsz 1024K obj 2 thr 2 write 134.52 [ 49.99, 101.96] rewrite 132.09 [ 49.99, 78.99] read 2566.74 [ 258.96, 2068.71]
      ost 2 sz 512000000K rsz 1024K obj 2 thr 4 write 195.73 [ 76.99, 128.98] rewrite
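For comparing runs, the survey result lines above can be turned into numbers. The following is a minimal sketch, not part of obdfilter-survey itself: the function name `parse_survey_line` and the dictionary keys are hypothetical, and it assumes each phase is reported as an aggregate figure followed by a bracketed `[min, max]` pair, as in the lines shown. A phase that is cut off (such as the trailing `rewrite` in the last line) is skipped rather than guessed.

```python
import re

# Run parameters at the start of a result line: "ost 2 sz 512000000K rsz 1024K obj 2 thr 2"
HEADER_RE = re.compile(
    r"ost\s+(\d+)\s+sz\s+(\d+)K\s+rsz\s+(\d+)K\s+obj\s+(\d+)\s+thr\s+(\d+)"
)
# One complete phase: "write 134.52 [ 49.99, 101.96]"
PHASE_RE = re.compile(
    r"\b(write|rewrite|read)\s+([\d.]+)\s+\[\s*([\d.]+),\s*([\d.]+)\]"
)


def parse_survey_line(line):
    """Return run parameters and per-phase figures for one result line, or None."""
    header = HEADER_RE.search(line)
    if header is None:
        return None  # date banner or any other non-result line
    ost, sz_k, rsz_k, obj, thr = (int(g) for g in header.groups())
    # Only phases with all three numbers present are kept.
    phases = {
        name: {"aggregate": float(agg), "min": float(lo), "max": float(hi)}
        for name, agg, lo, hi in PHASE_RE.findall(line)
    }
    return {"ost": ost, "sz_k": sz_k, "rsz_k": rsz_k,
            "obj": obj, "thr": thr, "phases": phases}
```

Feeding it the `thr 2` line gives a `write` aggregate of 134.52 and a `read` aggregate of 2566.74, which makes the gap between write and read throughput easy to track across thread counts.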

      [root@oss100 htop-2.2.0]# lctl dl
      0 UP osd-ldiskfs tempAA-MDT0000-osd tempAA-MDT0000-osd_UUID 9
      1 UP mgs MGS MGS 4
      2 UP mgc MGC10.10.10.168@o2ib 65f231a0-8fd8-001d-6b0f-3e986f914178 4
      3 UP mds MDS MDS_uuid 2
      4 UP lod tempAA-MDT0000-mdtlov tempAA-MDT0000-mdtlov_UUID 3
      5 UP mdt tempAA-MDT0000 tempAA-MDT0000_UUID 8
      6 UP mdd tempAA-MDD0000 tempAA-MDD0000_UUID 3
      7 UP qmt tempAA-QMT0000 tempAA-QMT0000_UUID 3
      8 UP lwp tempAA-MDT0000-lwp-MDT0000 tempAA-MDT0000-lwp-MDT0000_UUID 4
      9 UP osd-ldiskfs tempAA-OST0001-osd tempAA-OST0001-osd_UUID 4
      10 UP ost OSS OSS_uuid 2
      11 UP obdfilter tempAA-OST0001 tempAA-OST0001_UUID 5
      12 UP lwp tempAA-MDT0000-lwp-OST0001 tempAA-MDT0000-lwp-OST0001_UUID 4
      13 UP osp tempAA-OST0001-osc-MDT0000 tempAA-MDT0000-mdtlov_UUID 4
      14 UP echo_client tempAA-OST0001_ecc tempAA-OST0001_ecc_UUID 2
      15 UP osd-ldiskfs tempAA-OST0002-osd tempAA-OST0002-osd_UUID 4
      16 UP obdfilter tempAA-OST0002 tempAA-OST0002_UUID 5
      17 UP lwp tempAA-MDT0000-lwp-OST0002 tempAA-MDT0000-lwp-OST0002_UUID 4
      18 UP osp tempAA-OST0002-osc-MDT0000 tempAA-MDT0000-mdtlov_UUID 4
      19 UP echo_client tempAA-OST0002_ecc tempAA-OST0002_ecc_UUID 2
      [root@oss100 htop-2.2.0]#

      [root@oss100 htop-2.2.0]# lsblk
      NAME            MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
      sda               8:0    0 152.8T  0 disk  /mnt/ost2
      sdb               8:16   0 152.8T  0 disk  /mnt/ost1
      sdc               8:32   0 931.5G  0 disk  /mnt/mdt
      sdd               8:48   0 465.8G  0 disk
      ├─sdd1            8:49   0   200M  0 part  /boot/efi
      ├─sdd2            8:50   0     1G  0 part  /boot
      └─sdd3            8:51   0 464.6G  0 part
        ├─centos-root 253:0    0    50G  0 lvm   /
        ├─centos-swap 253:1    0     4G  0 lvm   [SWAP]
        └─centos-home 253:2    0 410.6G  0 lvm   /home
      nvme0n1         259:2    0 372.6G  0 disk
      └─md124           9:124  0 372.6G  0 raid1
      nvme1n1         259:0    0 372.6G  0 disk
      └─md124           9:124  0 372.6G  0 raid1
      nvme2n1         259:3    0 372.6G  0 disk
      └─md125           9:125  0   354G  0 raid1
      nvme3n1         259:1    0 372.6G  0 disk
      └─md125           9:125  0   354G  0 raid1
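To put numbers on the CPU usage reported here, one option is to sample the page-reclaim counters in `/proc/vmstat` while the survey runs. This is a rough sketch under stated assumptions: it assumes "kernel swap" refers to the `kswapd` reclaim thread, and the function names `kswapd_counters`, `kswapd_rates`, and `sample` are hypothetical. Counters are matched by prefix because some kernels split them per memory zone (for example `pgscan_kswapd_normal`) while others expose a single `pgscan_kswapd` value.

```python
import time

# Counter families of interest: pages scanned and pages reclaimed by kswapd.
PREFIXES = ("pgscan_kswapd", "pgsteal_kswapd")


def kswapd_counters(vmstat_text):
    """Sum every /proc/vmstat counter whose name starts with one of PREFIXES."""
    totals = dict.fromkeys(PREFIXES, 0)
    for line in vmstat_text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # ignore anything that is not "name value"
        name, value = fields
        for prefix in PREFIXES:
            if name.startswith(prefix):
                totals[prefix] += int(value)
    return totals


def kswapd_rates(before, after, seconds):
    """Convert two counter snapshots into pages per second."""
    return {key: (after[key] - before[key]) / seconds for key in before}


def sample(path="/proc/vmstat", interval=5.0):
    """Take two snapshots `interval` seconds apart and return the rates."""
    with open(path) as f:
        first = kswapd_counters(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = kswapd_counters(f.read())
    return kswapd_rates(first, second, interval)
```

Calling `sample()` on the OSS during an obdfilter-survey run and again while idle would show whether reclaim scanning rises with the write phases. A high scan rate alone does not identify the cause, so it is best read alongside the per-thread CPU view from htop.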

       

      thanks,

      Abe

       

       

       


          People

            wc-triage WC Triage
            abea@supermicro.com Abe