Details
-
Improvement
-
Resolution: Fixed
-
Minor
-
Lustre 2.5.0
-
8065
Description
In the current implementation of CLIO, cl_lock is a data structure that records which parts of a file are protected by a DLM lock. The current cl_lock design was intended to reduce the overhead of rebuilding in-memory data structures so that better performance could be achieved - particularly for small IO. In practice, the complexity of the cl_lock implementation has meant that the anticipated performance improvements could not be realized. Beyond the unrealized performance goal, that complexity has also been the source of a number of bugs and presents a high barrier to entry for new developers.
The cl_lock cannot be removed altogether as it provides a mechanism to communicate the extent of a specific DLM lock for a specific IO. The new design of cl_lock will continue to fulfil this fundamental role: passing information between layers (from llite to OSC) so the DLM module understands the requirements for a given IO.
cl_lock operations can currently propagate in both directions: FTTB (from top to bottom) and FBTT (from bottom to top). This bidirectional design introduces the possibility of deadlock, so an additional deadlock avoidance mechanism (closure) was also introduced. As a result, the cl_lock implementation is extremely complex, difficult to understand, and hard to maintain.
A cacheless cl_lock will be designed to replace the current implementation. In the new implementation, the DLM lock will still be maintained below the OSC layer. However, the cl_lock data structure will be rebuilt in memory before each IO starts and destroyed immediately after the IO is done.
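The per-IO lifecycle described above can be illustrated with a minimal sketch. This is not the actual CLIO code: every name here (`sketch_cl_lock`, `sketch_lock_build`, `sketch_io_run`, and so on) is hypothetical, and the DLM enqueue that would happen below the OSC layer is only indicated by a comment.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical lock modes; not the real CLIO enum. */
enum sketch_lock_mode { SKETCH_READ, SKETCH_WRITE };

/* Hypothetical per-IO descriptor: the extent and mode one IO needs. */
struct sketch_cl_lock {
    uint64_t start;               /* first byte covered */
    uint64_t end;                 /* last byte covered (inclusive) */
    enum sketch_lock_mode mode;
};

/* Number of descriptors currently alive; used to show nothing is cached. */
static int sketch_live_locks;

/* Build the descriptor just before the IO starts. */
static struct sketch_cl_lock *sketch_lock_build(uint64_t start, uint64_t end,
                                                enum sketch_lock_mode mode)
{
    struct sketch_cl_lock *lock;

    assert(start <= end);
    lock = malloc(sizeof(*lock));
    if (lock == NULL)
        return NULL;
    lock->start = start;
    lock->end = end;
    lock->mode = mode;
    sketch_live_locks++;
    return lock;
}

/* Destroy the descriptor as soon as the IO is done. */
static void sketch_lock_destroy(struct sketch_cl_lock *lock)
{
    if (lock == NULL)
        return;
    sketch_live_locks--;
    free(lock);
}

/* Callback standing in for the actual read/write/fault/setattr work. */
typedef int (*sketch_io_fn)(const struct sketch_cl_lock *lock, void *arg);

/* Run one IO: build the lock, do the work, destroy the lock. */
static int sketch_io_run(uint64_t pos, uint64_t count,
                         enum sketch_lock_mode mode,
                         sketch_io_fn body, void *arg)
{
    struct sketch_cl_lock *lock;
    int rc;

    if (count == 0)
        return 0;
    lock = sketch_lock_build(pos, pos + count - 1, mode);
    if (lock == NULL)
        return -1;
    /* A real implementation would match or enqueue the DLM lock here. */
    rc = body(lock, arg);
    sketch_lock_destroy(lock);    /* nothing survives past the IO */
    return rc;
}

/* Example IO body: records the extent it was asked to cover. */
static int sketch_record_extent(const struct sketch_cl_lock *lock, void *arg)
{
    uint64_t *out = arg;

    out[0] = lock->start;
    out[1] = lock->end;
    return 0;
}
```

Calling `sketch_io_run(4096, 8192, SKETCH_WRITE, sketch_record_extent, seen)` with a two-element `uint64_t seen[2]` array would pass the extent 4096 to 12287 down to the body and leave `sketch_live_locks` at zero afterwards, which is the point of the cacheless approach: the descriptor only exists for the duration of one IO.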
The new cl_lock will
New cl_lock Functional Coverage
File I/O
Almost all file IOs requesting cl_lock will be affected by this design:
Write
Read
Fault
Setattr
Once the cacheless cl_lock is introduced, the operations above will use the new cl_lock interfaces to request a cl_lock.
DLM lock cancellation callback
When a DLM extent lock is cancelled, it is no longer used by any active IO, so it is not necessary to notify the upper CLIO layers to remove the DLM lock. This design therefore removes the FBTT operations: cancellation just writes back the data covered by the DLM lock and destroys the lock.
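As a rough illustration of the simplified cancellation path, the sketch below writes back the dirty pages that fall inside the lock's extent and then drops the lock, with no call into any upper layer. The structures and the function `sketch_dlm_cancel` are hypothetical stand-ins, not the real DLM or OSC interfaces, and "writing back" is modelled simply by clearing a dirty flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cached page: only its index and dirty state matter here. */
struct sketch_page {
    uint64_t index;
    bool dirty;
};

/* Hypothetical DLM extent lock covering an inclusive range of pages. */
struct sketch_dlm_lock {
    uint64_t first_page;
    uint64_t last_page;
    bool granted;
};

/*
 * Cancellation callback sketch: flush covered dirty pages, then destroy
 * the lock. Returns the number of pages written back. Note that there is
 * no bottom-to-top (FBTT) notification anywhere in this path.
 */
static size_t sketch_dlm_cancel(struct sketch_dlm_lock *lock,
                                struct sketch_page *pages, size_t npages)
{
    size_t flushed = 0;
    size_t i;

    assert(lock != NULL && lock->granted);

    for (i = 0; i < npages; i++) {
        struct sketch_page *page = &pages[i];

        /* Skip pages outside the extent protected by this lock. */
        if (page->index < lock->first_page || page->index > lock->last_page)
            continue;
        if (page->dirty) {
            /* Stand-in for sending the data to the server. */
            page->dirty = false;
            flushed++;
        }
    }

    lock->granted = false;        /* stand-in for destroying the lock */
    return flushed;
}
```

With pages 0, 1 and 5 dirty, page 2 clean, and a lock covering pages 0 to 2, the function would write back two pages and leave page 5 untouched, since it is outside the cancelled extent.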
Anticipated Post Implementation Effect
+ A cleaner, simpler design that will make the code easier to understand and maintain.
+ Rebuilding the cl_lock data structure in memory for each IO may have a negative impact on performance under certain workloads.