
LU-6325: CPT bound ptlrpcd's are unimplemented

Details

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.8.0
    • None
    • 3
    • 17711

    Description

      ptlrpcd_select_pc() has the comment:

          #ifdef CFS_CPU_MODE_NUMA
          #warning "fix this code to use new CPU partition APIs"
          #endif

      In our own experimentation on large NUMA systems, we found substantial benefits from confining the existing ptlrpcd's to the NUMA node originating the IO, using the taskset command to set their CPU affinity. Unfortunately, this only works for one node at a time.
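
      As a concrete illustration of that confinement, the short userspace C program below binds the calling process to the CPUs (and preferred memory) of a single NUMA node with libnuma, roughly what taskset achieves for the existing ptlrpcd's. It is only a sketch of the idea, not Lustre code; the file name and default node are arbitrary.

          /* pin_to_node.c - confine the calling process to one NUMA node,
           * a userspace analogue of the taskset-based binding described above.
           * Build: cc pin_to_node.c -o pin_to_node -lnuma
           */
          #include <numa.h>
          #include <stdio.h>
          #include <stdlib.h>

          int main(int argc, char **argv)
          {
                  int node = (argc > 1) ? atoi(argv[1]) : 0;  /* target NUMA node */

                  if (numa_available() < 0) {
                          fprintf(stderr, "no NUMA support on this system\n");
                          return 1;
                  }

                  /* run only on the CPUs of the chosen node ... */
                  if (numa_run_on_node(node) != 0) {
                          perror("numa_run_on_node");
                          return 1;
                  }

                  /* ... and prefer memory allocations from that node as well */
                  numa_set_preferred(node);

                  printf("confined to NUMA node %d\n", node);
                  return 0;
          }

      For example, "./pin_to_node 3" confines the process to node 3 before it starts issuing IO.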

      To obtain the best case for all nodes, we need to create ptlrpcd's confined to each node and to select a ptlrpcd on the same node as the thread originating the IO.
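
      A minimal userspace sketch of that selection policy is shown below. The real change has to live in the kernel and work against the CPU partition APIs referenced by the warning above; here the pool layout, the names node_pool, pools, WORKERS_PER_NODE and select_local_worker(), and the worker ids are all invented for illustration.

          /* select_local.c - pick a worker from a per-NUMA-node pool so that
           * work is handled on the node that issued it.  Illustrative only.
           * Build: cc select_local.c -o select_local -lnuma
           */
          #define _GNU_SOURCE
          #include <numa.h>
          #include <sched.h>
          #include <stdatomic.h>
          #include <stdio.h>

          #define MAX_NODES        64
          #define WORKERS_PER_NODE  4

          struct node_pool {
                  atomic_uint next;                     /* round-robin cursor */
                  int workers[WORKERS_PER_NODE];        /* ids of workers bound to this node */
          };

          /* one pool per NUMA node; workers are assumed to have been created
           * and bound to their node at startup (e.g. with numa_run_on_node()) */
          static struct node_pool pools[MAX_NODES];

          /* Return the id of a worker bound to the caller's NUMA node. */
          static int select_local_worker(void)
          {
                  int cpu = sched_getcpu();
                  int node = (cpu >= 0) ? numa_node_of_cpu(cpu) : -1;
                  struct node_pool *pool;
                  unsigned int idx;

                  if (node < 0 || node >= MAX_NODES)
                          node = 0;                     /* fall back to node 0 */

                  pool = &pools[node];
                  idx = atomic_fetch_add(&pool->next, 1) % WORKERS_PER_NODE;
                  return pool->workers[idx];
          }

          int main(void)
          {
                  /* trivial setup: give every node workers 0..3, purely for the demo */
                  for (int n = 0; n < MAX_NODES; n++)
                          for (int w = 0; w < WORKERS_PER_NODE; w++)
                                  pools[n].workers[w] = w;

                  printf("selected worker %d on this node\n", select_local_worker());
                  return 0;
          }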

      We plan to submit a patch against master to complete this.

      Attachments

        Issue Links

          Activity

            pjones Peter Jones made changes -
            Link Original: This issue is related to LDEV-18 [ LDEV-18 ]
            pjones Peter Jones made changes -
            Link New: This issue is related to LDEV-19 [ LDEV-19 ]
            pjones Peter Jones made changes -
            Resolution New: Fixed [ 1 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            pjones Peter Jones added a comment -

            Landed for 2.8

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13972/
            Subject: LU-6325 ptlrpc: make ptlrpcd threads cpt-aware
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 2686b25c301f055a15d13f085f5184e6f5cbbe13

            yujian Jian Yu made changes -
            Link New: This issue is related to LDEV-18 [ LDEV-18 ]
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-6580 [ LU-6580 ]

            schamp Stephen Champion added a comment -

            I've done a performance evaluation with and without the current revision of http://review.whamcloud.com/#/c/13972/, and made the data available for review at https://docs.google.com/spreadsheets/d/1d_4-rvk6ja3msnZJFzT3L-A42jg_pNz7Ki6j1riMN_g. This data was collected using IOR on a 16-socket E5-4650 UV 2000 partition using a single FDR IB port. The IB card is adjacent to node 3; node 2 is also adjacent to node 3, and node 8 is the most distant node.

            Although there is a clear benefit, the read rates are not necessarily informative: since each IOR thread performs every read to the same address space and Intel Data Direct I/O is implemented on these processors, a read may only reach the L3 cache of a processor and never be committed to memory on the node originating the IO. This can be exploited effectively by someone developing applications for NUMA architectures, more so with this patch, but it is not the normal case we expect from end-user applications.

            So we are principally looking at write speeds to see the effect on transfers from local memory to the file system. For reference, a generic two-socket E5-2660 client was able to achieve 3.4 GB/s writes on the same file system.

            We can see consistent, albeit moderate, gains for IO originating from all nodes. A critical benefit not shown here is that the impact of IO to a Lustre filesystem on other nodes is substantially reduced, which is an important improvement to the reproducible performance of jobs on a shared system.

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14049/
            Subject: LU-6325 libcfs: shortcut to create CPT from NUMA topology
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: dd9533737c28bd47a4b10d15ed6a4f0b3353765a

            liang Liang Zhen (Inactive) added a comment -

            Thanks Olaf, I will look into it soon.

            People

              Assignee: liang Liang Zhen (Inactive)
              Reporter: schamp Stephen Champion
              Votes: 0
              Watchers: 14

              Dates

                Created:
                Updated:
                Resolved: