[LU-11075] suspicious code in ldlm_prepare_lru_list() Created: 07/Jun/18  Updated: 05/Aug/20  Resolved: 01/Sep/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.12.0

Type: Bug
Priority: Minor
Reporter: John Hammond
Assignee: John Hammond
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Related
is related to LU-11092 NMI watchdog: BUG: soft lockup - CPU#... Open
is related to LU-10537 softlockup in ldlm_prepare_lru_list() Resolved
Severity: 3

Description

Two pieces of code in ldlm_prepare_lru_list() look suspicious and should be fixed. Each is marked with an XXX comment in the excerpt below:

static int ldlm_prepare_lru_list(struct ldlm_namespace *ns,
                                 struct list_head *cancels, int count, int max,
                                 enum ldlm_lru_flags lru_flags)
...
                        if (!ldlm_is_canceling(lock) || /* XXX '||' should be '&&' */
                            !ldlm_is_converting(lock))
                                break;
...
                if (result == LDLM_POLICY_SKIP_LOCK) {
                        lu_ref_del(&lock->l_reference, __func__, current);
                        LDLM_LOCK_RELEASE(lock); /* XXX release should follow if block. */
                        if (no_wait) {
                                spin_lock(&ns->ns_lock);
                                if (!list_empty(&lock->l_lru) &&
                                    lock->l_lru.prev == ns->ns_last_pos)
                                        ns->ns_last_pos = &lock->l_lru;
                                spin_unlock(&ns->ns_lock);
                        }
                        continue;
                }
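
To make the first XXX concrete, here is a minimal, self-contained sketch of how the two forms of the condition differ. It assumes the intent of the loop is to stop at the first lock that is neither being canceled nor being converted. struct demo_lock and the helper names are hypothetical stand-ins, not Lustre code. By De Morgan's law, `!A || !B` is `!(A && B)`, so the condition as written stops on every lock except one that has both flags set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two lock flags tested in the loop. */
struct demo_lock {
	bool canceling;
	bool converting;
};

/* Condition as quoted in the description (uses '||'). */
static bool stops_as_written(const struct demo_lock *lk)
{
	return !lk->canceling || !lk->converting;
}

/* Condition with the suggested '&&'. */
static bool stops_as_suggested(const struct demo_lock *lk)
{
	return !lk->canceling && !lk->converting;
}

/* Walks all four flag combinations; returns how many of them differ. */
static int check_flag_combinations(void)
{
	int differ = 0;

	for (int c = 0; c <= 1; c++) {
		for (int v = 0; v <= 1; v++) {
			struct demo_lock lk = { .canceling = c, .converting = v };

			/* Wherever '&&' stops the scan, '||' stops it too. */
			assert(!stops_as_suggested(&lk) || stops_as_written(&lk));
			if (stops_as_written(&lk) != stops_as_suggested(&lk))
				differ++;
		}
	}
	return differ;
}
```

Under that assumption, the two forms agree only when both flags are clear or both are set. A lock with exactly one of the flags set ends the scan with `||` but is passed over with `&&`.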

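The second XXX is an ordering problem: the reference is dropped before the if (no_wait) block, which still reads lock->l_lru. A toy sketch of why the release should come last, assuming the dropped reference could be the final one. struct demo_obj, demo_put() and both skip_* helpers are hypothetical, and the object is only flagged as freed (never actually freed) so that the demo itself stays well defined.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical refcounted object standing in for struct ldlm_lock. */
struct demo_obj {
	int refcount;
	bool on_lru;
	bool freed;	/* set instead of calling free(), so it can be inspected */
};

/* Drops one reference; marks the object freed when the count hits zero. */
static void demo_put(struct demo_obj *o)
{
	assert(o->refcount > 0);
	if (--o->refcount == 0)
		o->freed = true;
}

/* Order as quoted: release first, then touch the object.
 * Returns true if the object was already freed when it was accessed. */
static bool skip_release_first(struct demo_obj *o)
{
	demo_put(o);
	bool stale = o->freed;	/* in real code: a potential use-after-free */
	(void)o->on_lru;
	return stale;
}

/* Suggested order: finish all accesses, then release. */
static bool skip_release_last(struct demo_obj *o)
{
	bool stale = o->freed;
	(void)o->on_lru;	/* last access to the object */
	demo_put(o);
	return stale;
}
```

With a single remaining reference, the first ordering touches an object that has already been marked freed, while the second does not.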

Comments
Comment by Gerrit Updater [ 07/Jun/18 ]

John L. Hammond (john.hammond@intel.com) uploaded a new patch: https://review.whamcloud.com/32660
Subject: LU-11075 ldlm: correct logic in ldlm_prepare_lru_list()
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8fe6f74a2a7fcb2b8ac1af4e00b21d04ed7f5951

Comment by Andreas Dilger [ 18/Aug/18 ]

Could this be the root cause of LU-11092 and numerous other LDLM LRU overflow problems?

Comment by John Hammond [ 18/Aug/18 ]

It's possible.

Comment by Gerrit Updater [ 01/Sep/18 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/32660/
Subject: LU-11075 ldlm: correct logic in ldlm_prepare_lru_list()
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: aecafb57d5b60ea430ce9ef13783eb74ad4f6936

Comment by Peter Jones [ 01/Sep/18 ]

Landed for 2.12
