Lustre / LU-11957

Slow clients


Details

    • Type: Bug
    • Resolution: Not a Bug
    • Priority: Critical
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.12.0
    • Labels: None
    • Environment: CentOS 7.6, Sherlock cluster. Clients: kernel 3.10.0-957.5.1.el7.x86_64, lustre-client 2.12.0 (from wc). Server: Fir, running Lustre 2.12.0 on kernel 3.10.0-957.1.3.el7_lustre.x86_64
    • Severity: 3

    Description

      Hello! We started production on 2.12 clients and 2.12 servers (scratch filesystem) last week (we still have the Oak servers on 2.10, also mounted on Sherlock). The cluster has stabilized, but we now have a major issue with slow clients. Some clients are slow, and we have spent all day trying to figure out why, without success. Other clients run just fine; only some of them are slow. Hopefully someone will have a clue, as this is making many users unhappy at the moment...

      Let's take two Lustre 2.12 clients on the same IB fabric (we have two separate fabrics on this cluster), using the same Lustre routers, the same hardware, and the same OS image:

      sh-ln05 is very slow at the moment, a simple dd to /fir leads to:

      [root@sh-ln05 sthiell]# dd if=/dev/zero of=seqddout1M bs=1M count=1000 conv=fsync
      1000+0 records in
      1000+0 records out
      1048576000 bytes (1.0 GB) copied, 131.621 s, 8.0 MB/s
      

      sh-ln06, writing the same file (same striping), runs just fine:

      [root@sh-ln06 sthiell]# dd if=/dev/zero of=seqddout1M bs=1M count=1000 conv=fsync
      1000+0 records in
      1000+0 records out
      1048576000 bytes (1.0 GB) copied, 1.52442 s, 688 MB/s
      

      Both of these nodes are in production under a medium load. On some other, less loaded nodes, I get 1.2 GB/s with the same dd.

      We started with large bulk I/O (16MB RPCs) and tried reverting to 4MB, but that didn't change anything. On the slow clients, we tried various things such as clearing the LDLM LRU, dropping caches, and disabling swap, with no luck. NRS is off, so the FIFO policy is in use.
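
      This is roughly what we ran on the slow clients (a sketch from memory; exact parameter names and values may have differed slightly):

      # clear the client's LDLM lock LRU
      lctl set_param ldlm.namespaces.*.lru_size=clear
      # drop page/dentry/inode caches
      echo 3 > /proc/sys/vm/drop_caches
      # disable swap
      swapoff -a
      # confirm the NRS policy in use (run on the OSS, not the client)
      lctl get_param ost.OSS.*.nrs_policies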

      We use DoM + PFL. Example for this file seqddout1M:

      [root@sh-ln06 sthiell]# lfs getstripe seqddout1M 
      seqddout1M
        lcm_layout_gen:    9
        lcm_mirror_count:  1
        lcm_entry_count:   6
          lcme_id:             1
          lcme_mirror_id:      0
          lcme_flags:          init
          lcme_extent.e_start: 0
          lcme_extent.e_end:   131072
            lmm_stripe_count:  0
            lmm_stripe_size:   131072
            lmm_pattern:       mdt
            lmm_layout_gen:    0
            lmm_stripe_offset: 0
      
          lcme_id:             2
          lcme_mirror_id:      0
          lcme_flags:          init
          lcme_extent.e_start: 131072
          lcme_extent.e_end:   16777216
            lmm_stripe_count:  1
            lmm_stripe_size:   4194304
            lmm_pattern:       raid0
            lmm_layout_gen:    0
            lmm_stripe_offset: 41
            lmm_objects:
            - 0: { l_ost_idx: 41, l_fid: [0x100290000:0xb3f46:0x0] }
      
          lcme_id:             3
          lcme_mirror_id:      0
          lcme_flags:          init
          lcme_extent.e_start: 16777216
          lcme_extent.e_end:   1073741824
            lmm_stripe_count:  2
            lmm_stripe_size:   4194304
            lmm_pattern:       raid0
            lmm_layout_gen:    0
            lmm_stripe_offset: 26
            lmm_objects:
            - 0: { l_ost_idx: 26, l_fid: [0x1001a0000:0xb3f5c:0x0] }
            - 1: { l_ost_idx: 19, l_fid: [0x100130000:0xb401e:0x0] }
      
          lcme_id:             4
          lcme_mirror_id:      0
          lcme_flags:          init
          lcme_extent.e_start: 1073741824
          lcme_extent.e_end:   34359738368
            lmm_stripe_count:  4
            lmm_stripe_size:   4194304
            lmm_pattern:       raid0
            lmm_layout_gen:    0
            lmm_stripe_offset: 9
            lmm_objects:
            - 0: { l_ost_idx: 9, l_fid: [0x100090000:0xb41eb:0x0] }
            - 1: { l_ost_idx: 43, l_fid: [0x1002b0000:0xb3f4a:0x0] }
            - 2: { l_ost_idx: 42, l_fid: [0x1002a0000:0xb408a:0x0] }
            - 3: { l_ost_idx: 2, l_fid: [0x100020000:0xb3f50:0x0] }
      
          lcme_id:             5
          lcme_mirror_id:      0
          lcme_flags:          0
          lcme_extent.e_start: 34359738368
          lcme_extent.e_end:   274877906944
            lmm_stripe_count:  8
            lmm_stripe_size:   4194304
            lmm_pattern:       raid0
            lmm_layout_gen:    0
            lmm_stripe_offset: -1
      
          lcme_id:             6
          lcme_mirror_id:      0
          lcme_flags:          0
          lcme_extent.e_start: 274877906944
          lcme_extent.e_end:   EOF
            lmm_stripe_count:  16
            lmm_stripe_size:   4194304
            lmm_pattern:       raid0
            lmm_layout_gen:    0
            lmm_stripe_offset: -1
      

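      This layout should correspond to a PFL template roughly like the following (a sketch; <dir> is a placeholder and the actual template used on the filesystem may differ in its details):

      # DoM up to 128K, then widen striping at the 16M / 1G / 32G / 256G boundaries
      lfs setstripe -E 128K -L mdt \
                    -E 16M  -c 1  -S 4M \
                    -E 1G   -c 2  -S 4M \
                    -E 32G  -c 4  -S 4M \
                    -E 256G -c 8  -S 4M \
                    -E -1   -c 16 -S 4M \
                    <dir>
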
      Other client config:
      osc.fir-OST*.max_dirty_mb=256
      osc.fir-OST*.max_pages_per_rpc=1024
      osc.fir-OST*.max_rpcs_in_flight=8
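
      For completeness, these are read and set on the clients with lctl; the 16MB vs 4MB bulk RPC size mentioned above corresponds to max_pages_per_rpc=4096 vs 1024 (with 4K pages). A sketch:

      # current values on a client
      lctl get_param 'osc.fir-OST*.max_pages_per_rpc' 'osc.fir-OST*.max_rpcs_in_flight' 'osc.fir-OST*.max_dirty_mb'
      # set the bulk RPC size back to 4MB (1024 x 4K pages) instead of 16MB (4096 x 4K pages)
      lctl set_param 'osc.fir-OST*.max_pages_per_rpc=1024'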

      A full Lustre debug log captured during a slow dd on sh-ln05 is attached to this ticket.
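
      For reference, the debug log was captured roughly as follows (a sketch; the debug mask and output path shown here are illustrative, not necessarily exactly what was used):

      lctl set_param debug=-1     # enable all debug flags on the client
      lctl clear                  # empty the kernel debug buffer
      dd if=/dev/zero of=seqddout1M bs=1M count=1000 conv=fsync
      lctl dk > /tmp/sh-ln05-dd.log   # dump the accumulated debug log to a file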

      Note: we are seeing the same behavior on /fir (2.12 servers) and on /oak (2.10 servers), so the problem really seems to originate from the Lustre client itself.

      We also checked the state of IB and everything looks good.
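
      The IB checks were along these lines (standard infiniband-diags tools; a sketch, not necessarily the exact commands used):

      ibstat                      # local HCA port state and rate
      iblinkinfo | grep -i down   # look for down links anywhere on the fabric
      ibqueryerrors               # port error counters across the fabric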

      Any idea how to track down the root cause of this major, seemingly random 2.12 client slowness?

      Thanks!!
      Stephane

Attachments

Activity

People

    • Assignee: Patrick Farrell (Inactive) (pfarrell)
    • Reporter: Stephane Thiell (sthiell)
    • Votes: 0
    • Watchers: 5
