[LU-90] Simplify cl_lock - Whamcloud Community JIRA

Details

Type: Improvement
Resolution: Duplicate
Priority: Minor
Fix Version/s: Lustre 2.1.0
Affects Version/s: Lustre 2.0.0
Labels:
- clio

Rank (Obsolete):
10405

Description

This task is based on a discussion in Beijing about simplifying cl_lock.

The proposal to simplify cl_lock
================================

We have discussed the scheme to simplify cl_lock many times, but we have never
written it down. To fix that, I'm recording my idea for simplifying cl_lock
here.

1. Problems
First of all, we all agree that the current implementation of cl_lock is far
too complex. This is because:

- cl_lock has two-level caches;
- a top lock may have several sublocks, and a sublock may be shared by
  multiple top locks;
- an IO lock is actually composed of one top lock and several sublocks;
- we have to hold the mutexes of both the top lock and the sublocks to finish
  some operations; and finally,
- both llite and osc can initiate an operation to update the lock.

The above difficulties make cl_lock hard to understand, deadlock prone and
hard to maintain. They also affect performance, because we have to grab an
unknown number of mutexes to finish an operation. On top of that, we had to
invent cl_lock_closure to address the deadlock issue.

Life would be a bit easier if we could revise the lock model to mitigate those
issues.

2. Scheme
Here is my proposal to fix this problem:

- remove the top level cache, that is to say, make it a pass-through cache;
- revise the bottom-to-top lock operations, so that only top-to-bottom
  operations remain.

If we can reach the above targets, we can simplify the lock model a lot,
because:

- there are no deadlock concerns any more, so we can remove cl_lock_closure;
- the number of mutexes to be held to finish an operation is bounded: it is
  (N + 1) at most, where N = stripe count;
- we can remove hundreds of lines of code related to cl_lock;
- the code will become easier to understand.

2.1 remove top level cache
After removing the top level cache, each IO has to request new top locks
unconditionally. This can be done by adding a new enqueue bit, say
CEF_NOCACHE, to cl_lock_descr. After an IO is done, we cancel and delete these
top locks voluntarily. While doing IO we have to hold the user count
(->cll_users) to prevent the sublock from being cancelled, so this has a
benefit: if a sublock can be cancelled, it cannot have any top lock stacked
upon it.

2.2 remove bottom-to-top lock callback methods
Currently, we have the following operations initiated from osc:

->clo_weigh: this operation is used to determine which locks can be
    cancelled early. ccc_lock_weigh checks whether the object has mmap
    regions; if it does, we aren't keen on cancelling this lock. We can
    invent a new mechanism to address this issue. For example, we can have
    a ->cll_points field in cl_lock, and if the lock comes from an mmap
    region, we assign it more points. The early cancel logic then scans the
    namespaces and decreases the points of each lock by one; when the points
    reach zero, the lock can be cancelled. We could also build a good LRU
    algorithm on top of this scheme.
->clo_closure: now that there are no deadlock concerns, we don't need this
    method.
->clo_modify: since the top lock can't be cached, there is no point in
    modifying the descr of the top lock.
->clo_delete: we can be absolutely sure that when the lock is being deleted,
    there is no top lock stacked upon it.
->clo_state: we still need this method because we need a way to notify top
    locks that a sublock is ready to be operated on. However, we can
    implement this function in a new way: we already maintain a parent lock
    list at lovsub_lock; the list is private data of the sublock, so we can
    access it under the protection of the sublock's mutex. We can then visit
    each parent lock in the list and just wake up the waiting processes by
    calling wakeup(parent->cll_wq). We don't need to hold the parent's mutex
    at all.
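The ->cll_points idea described under ->clo_weigh can be sketched roughly as
below. This is only an illustration under assumptions: the struct is a
stand-in for cl_lock, the starting values (1 and 4) and all sketch_* / aging_*
names are hypothetical, and a real scan would walk the ldlm namespaces rather
than an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical starting values; the proposal does not specify them. */
enum {
	SKETCH_POINTS_DEFAULT = 1,
	SKETCH_POINTS_MMAP    = 4,
};

/* Stand-in for cl_lock, reduced to the fields this sketch needs. */
struct aging_lock {
	int cll_points;   /* the proposed ->cll_points field */
	int cancelable;   /* set once the points reach zero */
};

/* Assign initial points: locks covering mmap regions start higher. */
static void aging_lock_init(struct aging_lock *lock, int from_mmap)
{
	assert(lock != NULL);
	lock->cll_points = from_mmap ? SKETCH_POINTS_MMAP
				     : SKETCH_POINTS_DEFAULT;
	lock->cancelable = 0;
}

/*
 * One early-cancel scan: decrease the points of each lock by one and mark
 * the locks that reach zero. Returns the number of locks newly marked.
 */
static size_t aging_early_cancel_scan(struct aging_lock *locks, size_t nr)
{
	size_t marked = 0;
	size_t i;

	assert(locks != NULL || nr == 0);
	for (i = 0; i < nr; i++) {
		if (locks[i].cancelable)
			continue;
		if (--locks[i].cll_points <= 0) {
			locks[i].cll_points = 0;
			locks[i].cancelable = 1;
			marked++;
		}
	}
	return marked;
}
```

With these assumed values, an ordinary lock becomes cancelable on the first
scan while an mmap lock survives three scans and is marked on the fourth.
Resetting the points whenever a lock is used would give the LRU-like behaviour
mentioned above.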

2.3 other updates
We need to revise some code in lov_lock.c to cope with the lack of
notification from bottom to top. This does not seem difficult.

2.4 Pros and cons
In this scheme, we still use most of the ideas in the current implementation;
we just remove some hard-to-understand code, which keeps the modification
under control.
However, the code might still be difficult, and a new engineer will still need
a lot of time to grok cl_lock. This is unavoidable.

3. Discussions
3.1 What if we made the sublock a pass-through cache as well?
In short: minimal update. The most difficult parts are the two-level cache and
the two-direction path to update the lock; we just need to address those. To
make the sublock a pass-through cache as well, we would have to rework the dlm
callbacks and the page finding routines. This is not worth it, because they
have been working well so far.

Issue Links

is related to

LU-3259 cl_lock refactoring

Resolved

Activity

James A Simmons added a comment - 12/Dec/14 5:22 PM

Same as ~~LU-3259~~. Needs to be closed.

Jinshan Xiong (Inactive) added a comment - 21/Feb/11 9:59 PM - edited

SUMMARY OF CHANGES
==================
0. Summary
For sub objects, there is a bit, cl_object_header::coh_lock_cacheable, that is
set. When a cl_lock is created, this bit is checked: if it is set, we try to
match a lock from the cache; otherwise, a new lock is created.
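
A rough sketch of the match-or-create decision driven by coh_lock_cacheable is
given below. It is an illustration under assumptions only: the fixed-size slot
array, the extent-covers matching rule and all cache_* / sketch_* names are
hypothetical stand-ins, not the real cl_object_header or cl_lock_descr.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_CACHE_SLOTS 8

/* Stand-in for cl_lock_descr: an inclusive extent. */
struct cache_descr {
	unsigned long start;
	unsigned long end;
};

struct cache_lock {
	struct cache_descr descr;
	int in_use;               /* slot holds a live lock */
};

/* Stand-in for cl_object_header carrying the cacheable bit. */
struct cache_object {
	int coh_lock_cacheable;
	struct cache_lock locks[SKETCH_CACHE_SLOTS];
	unsigned int nr_created;  /* locks created so far, for illustration */
};

/* A cached lock matches when its extent covers the requested one. */
static int cache_covers(const struct cache_descr *have,
			const struct cache_descr *need)
{
	return have->start <= need->start && have->end >= need->end;
}

/*
 * Find a lock for @need. If the object is cacheable, try to match an
 * existing lock first; otherwise always create a new one. Returns NULL
 * only when every slot is already in use.
 */
static struct cache_lock *cache_lock_find(struct cache_object *obj,
					  const struct cache_descr *need)
{
	struct cache_lock *free_slot = NULL;
	size_t i;

	assert(obj != NULL && need != NULL);
	for (i = 0; i < SKETCH_CACHE_SLOTS; i++) {
		struct cache_lock *lock = &obj->locks[i];

		if (!lock->in_use) {
			if (free_slot == NULL)
				free_slot = lock;
			continue;
		}
		/* Only cacheable objects may reuse an existing lock. */
		if (obj->coh_lock_cacheable &&
		    cache_covers(&lock->descr, need))
			return lock;
	}
	if (free_slot == NULL)
		return NULL;
	free_slot->descr = *need;
	free_slot->in_use = 1;
	obj->nr_created++;
	return free_slot;
}
```

On a cacheable (sub) object, a second request covered by an existing extent
returns the same lock; on an uncacheable (top) object, every request creates a
fresh lock.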

1. Retained methods of cl_lock_operations

->clo_enqueue: Yes
->clo_wait: Yes
->clo_unuse: No
->clo_use: No
->clo_fits_into: Yes
->clo_cancel: Yes, but the semantics will be changed so that it is the
opposite operation of enqueue; historically, enqueue and cancel were
mutually opposite operations.
->clo_closure: No
->clo_modify: No
->clo_delete: Yes
->clo_fini: Yes
->clo_state: Yes
->clo_weigh: No

2. enqueue and cancel
The semantics of cl_lock_cancel are changed to unref the ldlm lock. The ldlm
lock may be cancelled at any time after cl_lock_cancel is called.
Besides enqueuing new ldlm locks, clo_enqueue also handles cached ldlm locks.
In osc_lock_enqueue, for a cl_lock in CLS_CACHE state, it will call
ldlm_lock_addref_try to add a refcount.
When a lock is being held, i.e. cll_holds is not zero, it cannot be
cancelled. When unholding a lock with cl_lock_unhold, the following policy
will be applied:

        if (--lock->cll_holds == 0) {
                cl_lock_cancel();
                if (lock is uncacheable)
                        destroy it;
                else if (lock is in CLS_HELD)
                        change the state to CLS_CACHE;
                else
                        destroy it;
        }
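
The policy above could look roughly like this in compilable C. It is a sketch
under assumptions, not the actual patch: hold_lock stands in for cl_lock, the
ldlm reference is modelled as a plain counter, "destroy it" is modelled as a
move to a placeholder FREEING state, and every hold_* / SKETCH_* name is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Reduced state set. HELD and CACHE mirror CLS_HELD and CLS_CACHE from
 * the policy; NEW and FREEING are placeholders for this sketch.
 */
enum hold_state {
	SKETCH_CLS_NEW,
	SKETCH_CLS_HELD,
	SKETCH_CLS_CACHE,
	SKETCH_CLS_FREEING,
};

/* Stand-in for cl_lock. */
struct hold_lock {
	int cll_holds;          /* number of holders */
	int cacheable;          /* may this lock stay in the cache? */
	int ldlm_refs;          /* models the reference on the ldlm lock */
	enum hold_state state;
};

/* New cancel semantics: only drop the reference on the ldlm lock. */
static void hold_lock_cancel(struct hold_lock *lock)
{
	if (lock->ldlm_refs > 0)
		lock->ldlm_refs--;
}

/* Drop one hold; when the last hold goes away, apply the policy. */
static void hold_lock_unhold(struct hold_lock *lock)
{
	assert(lock != NULL && lock->cll_holds > 0);
	if (--lock->cll_holds != 0)
		return;

	hold_lock_cancel(lock);
	if (lock->cacheable && lock->state == SKETCH_CLS_HELD)
		lock->state = SKETCH_CLS_CACHE;    /* keep it for reuse */
	else
		lock->state = SKETCH_CLS_FREEING;  /* "destroy it" */
}
```

The two "destroy it" branches collapse into one else here: a lock goes to the
cache only if it is both cacheable and currently in the HELD state.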

3. No cl_use/unuse, cl_lock_release, etc. any more
4. Other changes
To support cl_lock_peek, a special enq flag, CEF_PEEK, will be defined. If
cl_lock_request is called with this flag, no new sublocks will be created;
only cached sublocks will be used.
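
One possible reading of the CEF_PEEK behaviour is sketched below. The flag
value, the per-stripe "cached" marker and the rule that a peek fails as soon
as one stripe has no cached sublock are all assumptions made for illustration;
peek_* and SKETCH_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit value; only the flag name is given above. */
#define SKETCH_CEF_PEEK    0x1u
#define SKETCH_MAX_STRIPES 4

/* Stand-in for a sublock slot, one per stripe. */
struct peek_sublock {
	int cached;           /* a sublock already exists in the cache */
};

/* Stand-in for a top lock spanning several stripes. */
struct peek_top {
	struct peek_sublock sub[SKETCH_MAX_STRIPES];
	size_t nr_stripes;
	size_t nr_created;    /* sublocks created so far, for illustration */
};

/*
 * Request the sublocks of @top. With SKETCH_CEF_PEEK set, nothing is
 * created: the request succeeds (0) only if every stripe already has a
 * cached sublock, and fails (-1) otherwise. Without the flag, missing
 * sublocks are created.
 */
static int peek_lock_request(struct peek_top *top, unsigned int enqflags)
{
	size_t i;

	assert(top != NULL && top->nr_stripes <= SKETCH_MAX_STRIPES);
	if (enqflags & SKETCH_CEF_PEEK) {
		for (i = 0; i < top->nr_stripes; i++)
			if (!top->sub[i].cached)
				return -1;
		return 0;
	}
	for (i = 0; i < top->nr_stripes; i++) {
		if (!top->sub[i].cached) {
			top->sub[i].cached = 1;
			top->nr_created++;
		}
	}
	return 0;
}
```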
