LU-917: shared single file client IO submission with many cores starves OST DLM locks due to max_rpcs_in_flight
(a technical task under LU-874: Client eviction on lock callback timeout)

Details

    • Type: Technical task
    • Resolution: Won't Fix
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.1.0, Lustre 2.2.0
    • 10219

    Description

      In LU-874, shared single-file IOR testing was run with 512 threads on 32 clients (16 cores per client) writing 128MB chunks to a file striped over 2 OSTs. This showed clients timing out on DLM locks. The threads on a single client are writing to disjoint parts of the file (i.e. each thread has its own DLM extent that is not adjacent to the extents written by other threads on that client).

      For example, to reproduce this workload with 4 clients (A, B, C, D) against 2 OSTs (1, 2):

      Client  A B C D A B C D A B C D ...
      OST     1 2 1 2 1 2 1 2 1 2 1 2 ...

      While this IOR test is running, other tests are also running on different clients to create a very heavy IO load on the OSTs.

      It may be that some of the DLM locks granted by the OST are not having any IO requests sent under them, so those locks are never refreshed (the toy simulation after this list illustrates the first two points):

      • due to the number of active DLM locks on the client for a single OST being larger than the number of RPCs in flight, some of the locks may be starved: no BRW RPCs are sent under them to the OST, so their lock timeouts are never refreshed
      • due to the IO ordering of the BRW requests on the client, it may be that all of the pages for the lower-offset extent are sent to the OST before the pages for a higher-offset extent are ever sent
      • the high priority request queue on the OST may not be enough to help this if several locks on the client for one OST are canceled at the same time
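
      A back-of-the-envelope check of the first bullet: 512 threads on 32 clients is 16 threads per client, so a client can easily hold 16 or more active extent locks for one OST, while the historical default of max_rpcs_in_flight is only 8. The toy user-space simulation below (not Lustre code; all of the constants are illustrative assumptions) shows how strictly offset-ordered dispatch within that window leaves the higher-offset locks with no BRW traffic long past a 100-second callback timeout:

      /* Toy simulation (not Lustre source) of offset-ordered BRW dispatch with a
       * fixed RPC window.  Assumed, illustrative numbers: 16 dirty extent locks
       * on one client for one OST, max_rpcs_in_flight = 8, 32 BRW RPCs needed
       * per extent, 10 seconds of (slow) server time per window round, and a
       * lock is at risk if no BRW under it completes within a 100s timeout. */
      #include <stdbool.h>
      #include <stdio.h>

      #define NLOCKS              16
      #define MAX_RPCS_IN_FLIGHT   8
      #define RPCS_PER_LOCK       32
      #define ROUND_SECONDS       10
      #define LOCK_TIMEOUT       100

      int main(void)
      {
              int remaining[NLOCKS], last_io[NLOCKS] = { 0 };
              bool warned[NLOCKS] = { false };
              int now = 0;

              for (int i = 0; i < NLOCKS; i++)
                      remaining[i] = RPCS_PER_LOCK;

              for (;;) {
                      int slots = MAX_RPCS_IN_FLIGHT;

                      /* Strict offset order: drain the lowest-offset extent first. */
                      for (int i = 0; i < NLOCKS && slots > 0; i++)
                              while (remaining[i] > 0 && slots > 0) {
                                      remaining[i]--;
                                      last_io[i] = now + ROUND_SECONDS;
                                      slots--;
                              }
                      if (slots == MAX_RPCS_IN_FLIGHT)
                              break;          /* nothing left to send */
                      now += ROUND_SECONDS;

                      for (int i = 0; i < NLOCKS; i++)
                              if (remaining[i] > 0 && !warned[i] &&
                                  now - last_io[i] > LOCK_TIMEOUT) {
                                      warned[i] = true;
                                      printf("t=%3ds: lock %2d starved past the "
                                             "%ds timeout\n", now, i, LOCK_TIMEOUT);
                              }
              }
              return 0;
      }

      Under these assumptions every lock from roughly the third onward trips the timeout before its first BRW is ever sent, which is exactly the starvation pattern described above.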

      Some solutions that might help this, individually or in combination (a client-side dispatch sketch after this list illustrates how (2) and (3) could fit together):
      1. increase max_rpcs_in_flight to match the core count, but I think this is bad in the long run since it can dramatically increase the number of RPCs that each OST needs to handle at one time
      2. always allow at least one BRW RPC in flight for each lock that is being canceled
      3. prioritize ALL BRW RPCs for a blocked lock ahead of non-blocked BRW requests (e.g. like a high-priority request queue on the client)
      4. both (2) and (3) may be needed in order to avoid starvation as the client core count increases
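
      A sketch of how (2) and (3) might combine into a client-side dispatch policy. This is toy user-space code, not the osc layer; the struct, the blocked flag, and the constants are hypothetical stand-ins for the client's per-lock dirty state and blocking-callback state:

      /* Toy dispatch policy (not Lustre source) combining ideas (2) and (3):
       * with a fixed RPC window, first give every lock that is being canceled
       * ("blocked") at least one in-flight BRW, then spend whatever is left on
       * the remaining dirty pages of blocked locks, and only then on unblocked
       * locks in offset order. */
      #include <stdbool.h>
      #include <stdio.h>

      #define NLOCKS              16
      #define MAX_RPCS_IN_FLIGHT   8

      struct dirty_lock {
              int  id;
              int  dirty_rpcs;    /* BRW RPCs still needed under this lock */
              int  in_flight;
              bool blocked;       /* server has sent a blocking callback */
      };

      static int fill_window(struct dirty_lock *l, int n)
      {
              int slots = MAX_RPCS_IN_FLIGHT;

              /* Pass 1 (idea 2): one guaranteed RPC per blocked lock. */
              for (int i = 0; i < n && slots > 0; i++)
                      if (l[i].blocked && l[i].dirty_rpcs > 0 && l[i].in_flight == 0) {
                              l[i].in_flight++;
                              l[i].dirty_rpcs--;
                              slots--;
                      }

              /* Pass 2 (idea 3): remaining slots go to blocked locks first... */
              for (int i = 0; i < n && slots > 0; i++)
                      while (l[i].blocked && l[i].dirty_rpcs > 0 && slots > 0) {
                              l[i].in_flight++;
                              l[i].dirty_rpcs--;
                              slots--;
                      }

              /* ...and only then to unblocked locks in offset order. */
              for (int i = 0; i < n && slots > 0; i++)
                      while (!l[i].blocked && l[i].dirty_rpcs > 0 && slots > 0) {
                              l[i].in_flight++;
                              l[i].dirty_rpcs--;
                              slots--;
                      }

              return MAX_RPCS_IN_FLIGHT - slots;
      }

      int main(void)
      {
              struct dirty_lock locks[NLOCKS];
              int sent;

              for (int i = 0; i < NLOCKS; i++)
                      locks[i] = (struct dirty_lock){ .id = i, .dirty_rpcs = 32,
                                                      .blocked = (i >= 10) };

              sent = fill_window(locks, NLOCKS);
              printf("window filled with %d RPCs:\n", sent);
              for (int i = 0; i < NLOCKS; i++)
                      if (locks[i].in_flight > 0)
                              printf("  lock %2d%s: %d in flight\n", locks[i].id,
                                     locks[i].blocked ? " (blocked)" : "",
                                     locks[i].in_flight);
              return 0;
      }

      With 6 of the 16 locks under cancellation and an 8-RPC window, every blocked lock keeps at least one BRW in flight and the leftover slots still go to blocked locks first; unblocked writeback only proceeds once nothing is being canceled.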

      Attachments

        Activity


          jay Jinshan Xiong (Inactive) added a comment - close old tickets

          nrutman Nathan Rutman added a comment - Xyratex MRP-455 posted in LU-1239 with patch.

          nrutman Nathan Rutman added a comment:

          Chris, I appreciate your concerns here. There are good reasons why we must keep our bug tracking system internal: the privacy of our customers; our time tracking and billing systems; our requirement to track non-Lustre bugs as well.

          Perhaps something could be set up to automatically mirror Lustre bug comments out to Whamcloud's system. Please email me directly (nathan_rutman@xyratex.com) for further discussion on this topic, and let's leave this poor bug alone.

          morrone Christopher Morrone (Inactive) added a comment:

          Nathan, it really does the community a disservice to keep your issues secret. Telling us an internal Xyratex ticket number is of no use to us.

          I can only imagine that working in secret like this would make it more difficult to get patches landed as well. If outside developers aren't tapped into the discussion about the issue all along, it just increases the burden on you to present a complete and detailed explanation of both the problem and the solution. Should there be a disagreement about approach, you may find that you've wasted your time.

          LLNL has the same issue of dealing with multiple trackers. It is just one that needs to be accepted, I think. We use our internal tracker to discuss and track issues with admins and users, but keep most of the technical discussion in Jira where the world can see it.

          nrutman Nathan Rutman added a comment - It's difficult to track progress in two different places; our primary tracker is our own internal Jira.

          morrone Christopher Morrone (Inactive) added a comment - Why wait until you are done? I'd certainly like to be made aware of the problem and progress as you go along in a new ticket.

          nrutman Nathan Rutman added a comment:

          There are a few different issues here; I agree the rpcs_in_flight scenario seems to be one problem, but I was more interested in the limited-server-thread problem (even if it's not causing LU-874), because it is causing other problems as well. For example, we're tracking a bug (MRP-455) where we experience cascading client evictions because all MDS threads are stuck pending ldlm enqueues, leaving no room for PING or CONNECT RPCs. (That one is a direct result of a mishandled HP queue, but it made me realize we have no "wiggle room" in the code today. As with all our bugs, we'll submit it upstream when we're done.)

          morrone Christopher Morrone (Inactive) added a comment:

          Nathan, the issue is that the client is only allowed a fixed number of outstanding RPCs to the OST. Let's call that N. Now let's assume that the OST is processing RPCs very slowly (minutes each), but otherwise operating normally.

          If the OST revokes N+1 locks from the client now, the client stands a real risk of being evicted. In order to avoid eviction the client must constantly have RPCs enqueued on the server for EACH of the revoked locks. (We fixed some things in LU-874 to help make even that work.) Otherwise one of the locks will time out, and the client will be evicted.

          This ticket is looking at ways to alleviate the problem from the client side. I do worry that these client-side solutions increase the load on a server that is already heavily loaded.

          Ultimately, we need to look at making the OST smarter whether or not we decide that the client-side changes have value. The OST really needs to assume that if the client is making progress on other revoked locks, then it should extend all lock timers for that client in good faith.
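
          A minimal sketch of the "good faith" idea above, written as user-space toy code rather than anything from the OST's ldlm code; the structures and the 100-second timeout are assumptions. The point is only the policy: when a BRW under any revoked lock from a client completes, push out the deadlines of all of that client's revoked locks rather than just the one that saw IO:

          /* Toy illustration (not Lustre source): per-client "good faith"
           * prolongation.  Assumption: the server tracks every lock it has
           * revoked from a client and, on any sign of IO progress from that
           * client, extends every timer instead of only the timer of the lock
           * the IO happened to refresh. */
          #include <stdio.h>
          #include <time.h>

          #define NLOCKS        4
          #define LOCK_TIMEOUT  100   /* hypothetical callback timeout, seconds */

          struct revoked_lock {
                  int    id;
                  time_t deadline;    /* when the client would be evicted */
          };

          /* Called when a BRW under *any* revoked lock completes for this client. */
          static void prolong_all(struct revoked_lock *locks, int n, time_t now)
          {
                  for (int i = 0; i < n; i++)
                          locks[i].deadline = now + LOCK_TIMEOUT;
          }

          int main(void)
          {
                  struct revoked_lock locks[NLOCKS];
                  time_t now = time(NULL);

                  for (int i = 0; i < NLOCKS; i++)
                          locks[i] = (struct revoked_lock){ .id = i,
                                                            .deadline = now + LOCK_TIMEOUT };

                  /* Progress observed on lock 0 only; with per-lock refresh,
                   * locks 1..3 would keep their old deadlines and could still
                   * expire.  With the per-client policy every deadline moves. */
                  prolong_all(locks, NLOCKS, now + 90);

                  for (int i = 0; i < NLOCKS; i++)
                          printf("lock %d deadline extended to +%lds\n",
                                 locks[i].id, (long)(locks[i].deadline - now));
                  return 0;
          }

          That way a client that is demonstrably making progress, but is limited by its RPC window, is not evicted for the locks it simply has not reached yet.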

          "the high priority request queue on the OST may not be enough to help this if several locks on the client for one OST are canceled at the same time"

          You mean the HP thread can't handle multiple cancel callbacks before some time out? I was wondering why we don't reserve more threads for HP reqs, or, alternately, limit the number of threads doing any 1 op (i.e. no more than 75% of threads can be doing ldlm ops, and no more than 75% of threads can be doing io ops), so that we "balance" the load a little better and don't get stuck in these corner cases.

          nrutman Nathan Rutman added a comment - "the high priority request queue on the OST may not be enough to help this if several locks on the client for one OST are canceled at the same time" You mean the HP thread can't handle multiple cancel callbacks before some time out? I was wondering why we don't reserve more threads for HP reqs, or, alternately, limit the number of threads doing any 1 op (i.e. no more than 75% of threads can be doing ldlm ops, and no more than 75% of threads can be doing io ops), so that we "balance" the load a little better and don't get stuck in these corner cases.
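
          A toy admission check for the per-operation cap suggested above (hypothetical names and numbers; the real accounting would live in the ptlrpc service code): once one request class would occupy more than 75% of the service threads, further requests of that class wait, while lightweight requests such as PING/CONNECT are never capped:

          /* Toy admission check (not Lustre source) for the "no more than 75%
           * of threads per operation type" idea.  OP_LDLM / OP_IO and the pool
           * sizes are hypothetical stand-ins for a real service's request
           * classes and thread pool. */
          #include <stdbool.h>
          #include <stdio.h>

          enum op_type { OP_LDLM, OP_IO, OP_OTHER, OP_MAX };

          struct svc_pool {
                  int total_threads;
                  int busy[OP_MAX];   /* threads currently serving each class */
          };

          /* Return true if a thread may start serving a request of type @op. */
          static bool may_start(struct svc_pool *p, enum op_type op)
          {
                  int cap = p->total_threads * 3 / 4;   /* 75% per-class cap */

                  if (op == OP_OTHER)         /* PING/CONNECT etc. never capped */
                          return true;
                  return p->busy[op] < cap;
          }

          int main(void)
          {
                  struct svc_pool pool = { .total_threads = 16,
                                           .busy = { [OP_LDLM] = 12 } };

                  /* 12 of 16 threads already in ldlm ops: a 13th ldlm request
                   * waits, but IO and PING/CONNECT requests still find a thread. */
                  printf("ldlm admitted: %d\n", may_start(&pool, OP_LDLM));
                  printf("io   admitted: %d\n", may_start(&pool, OP_IO));
                  return 0;
          }

          With 12 of 16 threads already busy in ldlm ops, a 13th ldlm request is held back while an IO request or a PING still finds a thread, which is the "wiggle room" being asked for.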

          People

            Assignee: jay Jinshan Xiong (Inactive)
            Reporter: adilger Andreas Dilger
            Votes: 0
            Watchers: 3