[LUDOC-40] Create Documentation for multi-threaded ptlrpcd - Whamcloud Community JIRA

Details

Type: New Feature
Resolution: Fixed
Priority: Major
Fix Version/s: None
Affects Version/s: None
Labels:
- releases

Rank (Obsolete):
7172

Description

Please write the documentation necessary for multithreaded ptlrpcd to be understood and used in Lustre 2.2. Please include any tunables that affect this feature. Only the raw content needs to be written - any grammar checking, spell checking, formatting, etc. will be done by a doc writer. The content can be appended to this ticket.

Attachments

Multi-threaded_ptlrpcd.docx
121 kB
01/Feb/12 9:57 PM

Activity

Richard Henwood (Inactive) added a comment - 05/Apr/12 11:16 AM

Merged.

Cliff White (Inactive) added a comment - 30/Mar/12 3:00 PM

Content now up for review http://review.whamcloud.com/2425

Cliff White (Inactive) added a comment - 27/Mar/12 9:29 AM

I am still unclear on the real differences between PDB_POLICY_FULL and PDB_POLICY_NEIGHBOR. Could you give an example of when either one would be useful?

nasf (Inactive) added a comment - 22/Feb/12 2:45 AM

The recommended mode is the default mode: one thread per core (counting hyper-threads). The point at which performance is best has not been verified.

The administrator can tune ptlrpcd_bind_policy when loading the module (insmod ptlrpcd.ko). But ptlrpcd_load_policy is used only inside Lustre code and is not tunable by the administrator. So the former should be part of the Lustre user manual; the latter is for developers and should be part of the Lustre internals documentation.
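
As a rough illustration of the administrator-facing side, the sketch below assembles the load command quoted in this ticket ("insmod ptlrpcd.ko ptlrpcd_bind_policy=xxx"). The helper name build_insmod_command is hypothetical, the numeric policy values are assumptions that should be checked against the Lustre headers for your release, and passing max_ptlrpcds the same way is likewise an assumption.

```python
from enum import IntEnum
import shlex


class BindPolicy(IntEnum):
    # Assumed numeric values; verify against the Lustre source before use.
    PDB_POLICY_NONE = 1
    PDB_POLICY_FULL = 2
    PDB_POLICY_PAIR = 3
    PDB_POLICY_NEIGHBOR = 4


def build_insmod_command(policy, module="ptlrpcd.ko", max_ptlrpcds=None):
    """Return the insmod argument list for a bind policy (hypothetical helper)."""
    policy = BindPolicy(policy)  # raises ValueError for an unknown value
    args = ["insmod", module, f"ptlrpcd_bind_policy={int(policy)}"]
    if max_ptlrpcds is not None:
        # Assumption: max_ptlrpcds is accepted as a module option in the same way.
        args.append(f"max_ptlrpcds={int(max_ptlrpcds)}")
    return args


if __name__ == "__main__":
    # Print the command rather than running it; loading a module needs root.
    print(shlex.join(build_insmod_command(BindPolicy.PDB_POLICY_NEIGHBOR)))
```

The function only builds the argument list, so it can be reviewed or logged before anything is actually loaded.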

Cliff White (Inactive) added a comment - 22/Feb/12 12:58 AM

Would the recommended maximum then be one thread per core (including hyper-threading)?
Is there a point where performance will decrease if the number of threads per core is >1? >5? etc.?

I am a little confused by the last response. In section 2.1 you list the PDB_POLICY options, and they are set by
"insmod ptlrpcd.ko ptlrpcd_bind_policy=xxx", which would imply the system admin tunes these.

In section 2.2 you list the PDL_POLICY options. From your response above, these are internal-only, never
touched by anyone other than Lustre developers? Just to confirm, they cannot be tuned by a system admin?

If true, then I think section 2.2 might not go in the general manual, but rather in some developer-focused or Lustre Internals documentation.

nasf (Inactive) added a comment - 21/Feb/12 7:46 PM

Q: The absolute minimum is 2 per node, regardless of number of cores?
A: Yes

Q: The default is one thread per core, including hyperthreading?
A: Yes, the default mode is one thread per hyper-thread (logical core).

Q: Is there any limit or maximum in the code?
A: Currently, there is no maximum limit, but I think we should set the maximum to the core count (including hyper-threads) on the node.
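
The three answers above (minimum of 2 per node, default of one thread per logical core, no enforced maximum) can be summarized in a small model. This is only a sketch of the rules as stated in this ticket, not the Lustre implementation; the function name and the convention that 0 means "unset" are assumptions made for illustration.

```python
import os

MIN_PTLRPCD_THREADS = 2  # stated above: absolute minimum of 2 per node


def effective_ptlrpcd_threads(max_ptlrpcds=0, logical_cpus=None):
    """Model the thread count implied by the answers in this ticket.

    max_ptlrpcds <= 0 is treated as "not set" (an assumption of this sketch).
    logical_cpus counts hyper-threads; it defaults to this machine's count.
    """
    if logical_cpus is None:
        logical_cpus = os.cpu_count() or 1
    # Default: one thread per logical core.
    requested = max_ptlrpcds if max_ptlrpcds > 0 else logical_cpus
    # No upper cap is applied, matching "there is no maximum limit" above.
    return max(MIN_PTLRPCD_THREADS, requested)


if __name__ == "__main__":
    print(effective_ptlrpcd_threads())
```

If the suggested cap at the logical core count were adopted, it would be one extra min() against logical_cpus in this model.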

Q: You mention large directory traversal and statahead as operations that are async RPC-intensive,
are there any other situations a user may need to be aware of?
A: Statahead is part of large directory traversal, and async glimpse lock (AGL) is also part of large directory traversal. Both are usually triggered by "ls -l", "du", "find", and similar system commands.
Another common async RPC case is I/O: in Lustre, most I/O is asynchronous.

Q: Is there any tuning of RPC behavior in this area, in other words for a specific type
of RPC or action, can a user force async or sync behavior?
A: Some existing proc interfaces, such as "max_rpcs_in_flight", may affect the efficiency of async RPC processing. But whether an RPC is sync or async depends on the Lustre internal implementation: the developer can specify that inside Lustre code, but there is no tunable interface for users to specify it from outside Lustre code.
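
For reference, a minimal sketch of inspecting the "max_rpcs_in_flight" proc interface mentioned above. The path layout under /proc/fs/lustre/osc/ is an assumption about the client's proc tree and may differ between releases; the function name is hypothetical.

```python
import glob
import os


def read_max_rpcs_in_flight(proc_root="/proc/fs/lustre"):
    """Return {osc_device_name: max_rpcs_in_flight} for each OSC found.

    proc_root is a parameter so the function can be pointed at another tree;
    the osc/<device>/max_rpcs_in_flight layout is assumed, not confirmed here.
    """
    pattern = os.path.join(proc_root, "osc", "*", "max_rpcs_in_flight")
    values = {}
    for path in sorted(glob.glob(pattern)):
        device = os.path.basename(os.path.dirname(path))
        with open(path) as handle:
            values[device] = int(handle.read().strip())
    return values


if __name__ == "__main__":
    # On a node without a mounted Lustre client this prints nothing.
    for device, value in read_max_rpcs_in_flight().items():
        print(f"{device}: {value}")
```

This only reads the current values; it does not change how many RPCs are in flight.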

Q: What is the parameter name for the ptlrpcd_load_policy? Is it "ptlrpcd_load_policy=XX"?
A: ptlrpcd_load_policy is not the name of a parameter. It is the name of a set of parameters used inside Lustre code to specify how to push an async RPC into a ptlrpcd queue. That means only Lustre developers can use such parameters; they are invisible outside Lustre code.

Peter Jones added a comment - 21/Feb/12 4:53 PM

Added Fanyong as a watcher so he sees Cliff's question

Cliff White (Inactive) added a comment - 21/Feb/12 4:13 PM

I would appreciate a response at your earliest convenience.
--------
New question - what is the parameter name for the ptlrpcd_load_policy? This
is not in the document.
is it "ptlrpcd_load_policy=XX" ??
---------

--------

For the max_ptlrpcds parameter:

The absolute minimum is 2 per node, regardless of number of cores?
The default is one thread per core, including hyperthreading?
Is there any limit or maximum in the code?

You mention large directory traversal and statahead as operations that are async RPC-intensive,
are there any other situations a user may need to be aware of?

Somewhat outside question:

Is there any tuning of RPC behavior in this area, in other words for a specific type
of RPC or action, can a user force async or sync behavior?

–
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com

Cliff White (Inactive) added a comment - 15/Feb/12 8:15 PM

Asked these in email, putting in bug for record, or in case anybody else wants to answer them here.
--------

For the max_ptlrpcds parameter:

The absolute minimum is 2 per node, regardless of number of cores?
The default is one thread per core, including hyperthreading?
Is there any limit or maximum in the code?

You mention large directory traversal and statahead as operations that are async RPC-intensive,
are there any other situations a user may need to be aware of?

Somewhat outside question:

Is there any tuning of RPC behavior in this area, in other words for a specific type
of RPC or action, can a user force async or sync behavior?

Peter Jones added a comment - 14/Feb/12 11:07 AM

Hi Cliff

Please can you integrate this material into the manual

Thanks

Peter

People

Assignee: Cliff White (Inactive)

Reporter: Bryon Neitzel (Inactive)

Votes: 0

Watchers: 2

Dates

Due: 30/Mar/12

Created: 19/Jan/12 1:58 PM

Updated: 05/Apr/12 11:16 AM

Resolved: 05/Apr/12 11:16 AM