[LU-3259] cl_lock refactoring Created: 01/May/13  Updated: 08/Nov/16  Resolved: 17/Jun/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.0
Fix Version/s: Lustre 2.7.0, Lustre 2.8.0

Type: Improvement Priority: Minor
Reporter: Jinshan Xiong (Inactive) Assignee: Jinshan Xiong (Inactive)
Resolution: Fixed Votes: 0
Labels: clio

Attachments: PNG File cl_lock_simp.png    
Issue Links:
Blocker
is blocking LU-5880 CLIO Simplification Resolved
Duplicate
is duplicated by LU-7104 ASSERTION( osc == oap->oap_obj ) failed Resolved
Related
is related to LU-90 Simplify cl_lock Resolved
Rank (Obsolete): 8065

 Description   

In the current implementation of CLIO, cl_lock is a data structure that records which parts of a file are protected by a DLM lock. The current cl_lock design was intended to reduce the overhead of rebuilding in-memory data structures so that better performance could be achieved, particularly for small IO. In practice, the complexity of the cl_lock implementation has meant the anticipated performance improvements could not be realized. Beyond the unmet performance goal, that complexity has also been the source of a number of bugs and presents a high barrier to entry for new developers.

The cl_lock cannot be removed altogether as it provides a mechanism to communicate the extent of a specific DLM lock for a specific IO. The new design of cl_lock will continue to fulfil this fundamental role: passing information between layers (from llite to OSC) so the DLM module understands the requirements for a given IO.

cl_lock operations can travel in both directions: FTTB (from top to bottom) and FBTT (from bottom to top). This bidirectional design introduces the possibility of deadlock, so an additional deadlock avoidance mechanism (closure) was also introduced. The result is a cl_lock implementation that is extremely complex, hard to maintain, and difficult to understand.

A cacheless cl_lock will be designed to replace the current implementation. In the new implementation, DLM locks will still be maintained below the OSC layer. However, the cl_lock data structure will be rebuilt in memory before each IO starts and destroyed immediately after the IO is done.
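
The build-per-IO lifecycle described above can be illustrated with a small user-space sketch. It is not Lustre code: every name here (sketch_cl_lock, sketch_lock_build, sketch_io_run, sketch_extent_len, sketch_live_locks) is hypothetical, and the descriptor carries only an extent and a mode as a stand-in for whatever the real cl_lock passes down the layers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-IO cl_lock: an extent plus a mode. */
enum sketch_mode { SKETCH_READ, SKETCH_WRITE };

struct sketch_cl_lock {
	uint64_t start;          /* first byte covered */
	uint64_t end;            /* last byte covered, inclusive */
	enum sketch_mode mode;
};

/* Number of descriptors currently allocated; used to show nothing is cached. */
static int sketch_live_locks;

/* Build a fresh descriptor for one IO. Returns NULL on a bad extent or OOM. */
static struct sketch_cl_lock *sketch_lock_build(uint64_t start, uint64_t end,
						enum sketch_mode mode)
{
	struct sketch_cl_lock *lk;

	if (end < start)
		return NULL;
	lk = malloc(sizeof(*lk));
	if (lk == NULL)
		return NULL;
	lk->start = start;
	lk->end = end;
	lk->mode = mode;
	sketch_live_locks++;
	return lk;
}

/* Destroy the descriptor as soon as the IO is done. */
static void sketch_lock_destroy(struct sketch_cl_lock *lk)
{
	if (lk == NULL)
		return;
	sketch_live_locks--;
	free(lk);
}

typedef int (*sketch_io_fn)(const struct sketch_cl_lock *lk, void *arg);

/* One IO: build the descriptor, run the IO body, destroy the descriptor. */
static int sketch_io_run(uint64_t start, uint64_t end, enum sketch_mode mode,
			 sketch_io_fn fn, void *arg)
{
	struct sketch_cl_lock *lk = sketch_lock_build(start, end, mode);
	int rc;

	if (lk == NULL)
		return -1;
	rc = fn(lk, arg);
	sketch_lock_destroy(lk);
	return rc;
}

/* Example IO body: report how many bytes the descriptor covers. */
static int sketch_extent_len(const struct sketch_cl_lock *lk, void *arg)
{
	*(uint64_t *)arg = lk->end - lk->start + 1;
	return 0;
}
```

In this sketch no descriptor outlives the call to sketch_io_run(), which is the property the cacheless design relies on; the cost is one allocation and one free per IO.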

The new cl_lock will
New cl_lock Functional Coverage
File I/O

Almost all file IOs requesting cl_lock will be affected by this design:

Write

Read

Fault

Setattr

Once the cacheless cl_lock is introduced, the operations above will use the new cl_lock interfaces to request a cl_lock.
DLM lock cancellation callback

When a DLM extent lock is cancelled, it will never be used by any active IOs, so it is not necessary to notify the upper CLIO layers to remove the DLM lock. This design will therefore remove the FBTT operations: cancellation just writes back the data covered by the DLM lock and destroys the lock.
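
A minimal sketch of that cancellation path, under stated assumptions: the page and lock structures below are hypothetical stand-ins, "writeback" is modelled as clearing a dirty flag, and none of the names (sketch_page, sketch_dlm_lock, sketch_dlm_cancel) come from Lustre. The point it shows is that the callback touches only the data under the extent and the lock itself, with no call back up into the upper layers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cached page: its file offset and whether it holds unwritten data. */
struct sketch_page {
	uint64_t offset;
	bool dirty;
};

/* Hypothetical DLM extent lock covering [start, end], inclusive. */
struct sketch_dlm_lock {
	uint64_t start;
	uint64_t end;
	bool granted;
};

/*
 * Cancel an extent lock: write back dirty pages inside the extent, then
 * destroy the lock. Returns the number of pages written back.
 * No upper-layer (bottom-to-top) notification is made.
 */
static size_t sketch_dlm_cancel(struct sketch_dlm_lock *lock,
				struct sketch_page *pages, size_t npages)
{
	size_t flushed = 0;
	size_t i;

	for (i = 0; i < npages; i++) {
		struct sketch_page *p = &pages[i];

		/* Pages outside the extent are not protected by this lock. */
		if (p->offset < lock->start || p->offset > lock->end)
			continue;
		if (p->dirty) {
			p->dirty = false;   /* stand-in for real writeback */
			flushed++;
		}
	}
	lock->granted = false;          /* stand-in for destroying the lock */
	return flushed;
}
```

Dirty pages outside the cancelled extent are left alone, since they are covered by some other lock.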
Anticipated Post Implementation Effect

+ A cleaner, simpler design that will make the code easier to understand and maintain.

+ Rebuilding the cl_lock data structure in memory for each IO may have a negative impact on the performance of the new design under certain workloads.



 Comments   
Comment by John Hammond [ 18/Feb/15 ]

landed: http://review.whamcloud.com/#/c/10858/ LU-3259 clio: cl_lock simplification
unlanded: http://review.whamcloud.com/#/c/10859/ LU-3259 clio: Revise read ahead implementation
unlanded: http://review.whamcloud.com/#/c/11013/ LU-3259 clio: get rid of cl_req

Comment by Gerrit Updater [ 12/Mar/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/10859/
Subject: LU-3259 clio: Revise read ahead implementation
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: db47990bbf7c59474f4c5ca28f96fdf424275516

Comment by James A Simmons [ 17/Jun/15 ]

Any work left for this?

Comment by Jinshan Xiong (Inactive) [ 17/Jun/15 ]

no

Comment by James A Simmons [ 17/Jun/15 ]

We should close this ticket

Generated at Sat Feb 10 01:32:20 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.