[LU-1682] Memory allocation failure in mgc_enqueue() tends to cause LBUG and refcount issue Created: 27/Jul/12  Updated: 08/Feb/18  Resolved: 08/Feb/18

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.0, Lustre 2.3.0, Lustre 1.8.8
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Hiroya Nozaki Assignee: Jinshan Xiong (Inactive)
Resolution: Won't Fix Votes: 0
Labels: patch
Environment:

Originally observed with FEFS (based on Lustre-1.8.5) in the RIKEN K computer environment
MDSx1, OSSx2592, OSTx5184, Clientx84672


Severity: 3
Rank (Obsolete): 4517

 Description   

Hi, I believe I found some bugs in mgc_enqueue().
I'm sure about the 1.8.x case because I've actually run into it often,
but I'm not sure about the 2.x case, since that is based only on my
source-code inspection, not on actually hitting the problem.

So could someone verify whether it is also true in the 2.x case?
An overview of each bug follows.

1)
mgc_blocking_ast() is never called when mgc_enqueue() fails to allocate
memory for a new ldlm_lock for the cld, because no ldlm_lock exists to
hold the callback functions after the allocation failure.
As a result the cld is never released, which causes the MGC obd_device
to remain forever, even after the umount command is executed.

IMHO, in order to fix the problem we should call config_log_put() when
this allocation failure happens. But the caller has no way to detect
just this failure, because ldlm_cli_enqueue() simply returns -ENOMEM,
and several other functions on that path can return -ENOMEM as well.

Locally, I've fixed it by adding one more argument (my case is FEFS,
based on Lustre-1.8.5), but I don't think that is a good approach.

So I'd like to discuss the right way to fix it.
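
To make the leak concrete, here is a simplified pseudo-code sketch of the 1.8-era flow (the names come from mgc_request.c, but the call sites are paraphrased rather than copied from the actual code):

config_log_get(cld);              /* reference taken for the config lock */
rc = mgc_enqueue(exp, ..., cld);  /* -> ldlm_cli_enqueue()               */
/* If the ldlm_lock allocation inside ldlm_cli_enqueue() fails, no lock
 * ever holds mgc_blocking_ast() as its blocking callback.  That AST is
 * where this reference would be dropped via config_log_put(), so the
 * cld -- and with it the MGC obd_device -- is never released. */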

2)
When mgc_enqueue() fails to allocate memory for a new ptlrpc_request
for the cld, failed_lock_cleanup() is called, and the call path follows
the list below:

failed_lock_cleanup()
    ldlm_lock_decref_internal()
        ldlm_handle_bl_callback()
            mgc_blocking_ast()
                ldlm_cli_cancel()
                    ldlm_cli_cancel_local()

Execution then reaches ldlm_cli_cancel_local() and takes the else
branch below, because lock->l_conn_export has not been set yet.
But there it is caught by the LBUG, because this is a client-side
lock:

static int ldlm_cli_cancel_local(struct ldlm_lock *lock)
{
        int rc = LDLM_FL_LOCAL_ONLY;
        ENTRY;

        if (lock->l_conn_export) { <---- NULL in this case
                ....
        } else {
                if (ns_is_client(ldlm_lock_to_ns(lock))) { <---- a client-side lock, so this is true
                        LDLM_ERROR(lock, "Trying to cancel local lock"); <---- caught here
                        LBUG();
                }
                LDLM_DEBUG(lock, "server-side local cancel");
                ldlm_lock_cancel(lock);
                ldlm_reprocess_all(lock->l_resource);
        }

        RETURN(rc);
} 

So, I believe ldlm_cli_enqueue() should be modified as in the pseudo code below:

+        lock->l_conn_export = exp;
+        lock->l_export = NULL;
+        lock->l_blocking_ast = einfo->ei_cb_bl;

        if (reqp == NULL || *reqp == NULL) {
                req = ptlrpc_request_alloc_pack(class_exp2cliimp(exp),
                                                &RQF_LDLM_ENQUEUE,
                                                LUSTRE_DLM_VERSION,
                                                LDLM_ENQUEUE);
                if (req == NULL) {
                        failed_lock_cleanup(ns, lock, einfo->ei_mode);
                        LDLM_LOCK_RELEASE(lock);
                        RETURN(-ENOMEM);
                }
                req_passed_in = 0;
                if (reqp)
                        *reqp = req;
        } else {
                int len;

                req = *reqp;
                len = req_capsule_get_size(&req->rq_pill, &RMF_DLM_REQ,
                                           RCL_CLIENT);
                LASSERTF(len >= sizeof(*body), "buflen[%d] = %d, not %d\n",
                         DLM_LOCKREQ_OFF, len, (int)sizeof(*body));
        }

-        lock->l_conn_export = exp;
-        lock->l_export = NULL;
-        lock->l_blocking_ast = einfo->ei_cb_bl;
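
With this ordering, a lock torn down in the allocation-failure path already carries l_conn_export and l_blocking_ast, so the failed_lock_cleanup() chain shown above ends in the l_conn_export branch of ldlm_cli_cancel_local() instead of tripping the ns_is_client() LBUG.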


 Comments   
Comment by Oleg Drokin [ 27/Jul/12 ]

Thanks for uncovering this problem.

I see that the code in 2.x is different from 1.8: in 2.x the cld get is done after the mgc_enqueue() call.
In that case what needs to be done is for mgc_process_log() to check that lockh is still 0 (there is a function to check that the lock handle is valid) and, if so, not do the get.

In 1.8 you can check the lockh after ldlm_cli_enqueue() in mgc_enqueue(), and if it is still 0 (again, using the validity-check function), drop the cld refcount.

I think this approach should help for issue #1.
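
For illustration, a minimal sketch of the 1.8-side check (assuming lustre_handle_is_used() as the handle-validity helper; the enqueue arguments are elided and the actual patch may differ):

rc = ldlm_cli_enqueue(...);            /* arguments elided */
if (!lustre_handle_is_used(lockh)) {
        /* enqueue failed before an ldlm_lock existed, so
         * mgc_blocking_ast() will never run; drop the cld
         * reference it would otherwise have released */
        config_log_put(cld);
}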

For issue #2, 2.x is unaffected in this particular case because we allocate the request before calling ldlm_cli_enqueue.

But otherwise the bug is real and could hit in some other scenarios. As such please submit a separate patch just following your pseudocode for master (and b1_8 too).

Thanks again!

Comment by Hiroya Nozaki [ 27/Jul/12 ]

Hi, Oleg. It's been a while and thank you for your quick response!!

1)
Oh, sorry. Apparently I had been reading somewhat old code, Lustre-2.1.1 or so.
As you said, config_log_get() is called only in the case where
ldlm_cli_enqueue() returns successfully.

In 1.8 you can check the lockh after ldlm_cli_enqueue() in mgc_enqueue(),
and if it is still 0 (again, using the validity-check function),
drop the cld refcount.

I got it, that sounds good. I'll try it now. Thanks a lot!!

2)

But otherwise the bug is real and could hit in some other scenarios.
As such please submit a separate patch just following your pseudocode
for master (and b1_8 too).

OK, so I'll create and submit the new patch soon.

Comment by Oleg Drokin [ 27/Jul/12 ]

Oh, sorry. Apparently I had been reading somewhat old code, Lustre-2.1.1 or so.
As you said, config_log_get() is called only in the case where
ldlm_cli_enqueue() returns successfully.

Sorry, what I mean is that while config_log_get() is called after mgc_enqueue() in master, it is still called regardless of success
(you can see that the call is there in both branches of that if), so the bug is there and it still needs to be fixed, just in a different place.

Comment by Hiroya Nozaki [ 27/Jul/12 ]

Sorry, what I mean is that while config_log_get() is called after mgc_enqueue() in master, it is still called regardless of success

Wow, that's true. It seems I jumped to conclusions ...

OK, so what I'm supposed to do is fix the #1 bug in both the 1.8
and 2.x cases, right? If so, I'll work on it from now on.

Comment by Oleg Drokin [ 27/Jul/12 ]

Yes, it would be great if you can submit patches for both issues for both branches.

Comment by Hiroya Nozaki [ 27/Jul/12 ]

For the #2 bug, b1_8 branch.
http://review.whamcloud.com/#change,3487

Comment by Hiroya Nozaki [ 27/Jul/12 ]

For the #2 bug, master branch.
http://review.whamcloud.com/#change,3488

Comment by Hiroya Nozaki [ 27/Jul/12 ]

For the #1 bug, b1_8 branch.
http://review.whamcloud.com/#change,3489

Comment by Hiroya Nozaki [ 27/Jul/12 ]

For the #1 bug, master branch.
http://review.whamcloud.com/#change,3490

I especially want you all to review this patch,
because I'm not very experienced with Lustre-2.x code yet.

Comment by Hiroya Nozaki [ 07/Aug/12 ]

I've now been investigating a strange dump of FEFS based on Lustre-1.8.5.
In it, there are two config_llog_data entries on config_llog_list, and each cld has
cld_refcount=1 and cld_lostlock=1, while rq_state is 0xf (RQ_RUNNING|RQ_NOW|RQ_LATER|RQ_STOP)
but there is no mgc_requeue_thread.

Do you know of any similar phenomena? Anything would be helpful for me to make progress.
By the way, we haven't added anything to the mgc_enqueue() logic.

Comment by Jinshan Xiong (Inactive) [ 08/Feb/18 ]

close old tickets
