Lustre / LU-2807

lockup in server completion ast -> lu_object_find_at

Details

    • Bug
    • Resolution: Duplicate
    • Blocker
    • None
    • Lustre 2.4.0
    • 3
    • 6794

    Description

      While running racer, I repeatedly hit a problem where, on completion AST, the callback gets stuck looking up an object.
      Alex thinks it is a not-fully-fixed race against object deletion of some sort.
      The stack trace looks like this:

      [175924.328073] INFO: task ptlrpc_hr01_003:16414 blocked for more than 120 seconds.
      [175924.328610] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [175924.329108] ptlrpc_hr01_0 D 0000000000000006  3952 16414      2 0x00000000
      [175924.329432]  ffff880076a19920 0000000000000046 0000000000000040 0000000000000286
      [175924.329950]  ffff880076a198a0 0000000000000286 0000000000000286 ffffc9000376b040
      [175924.330457]  ffff8800573a67b8 ffff880076a19fd8 000000000000fba8 ffff8800573a67b8
      [175924.330950] Call Trace:
      [175924.331191]  [<ffffffffa0743c36>] ? htable_lookup+0x1a6/0x1c0 [obdclass]
      [175924.331505]  [<ffffffffa041e79e>] cfs_waitq_wait+0xe/0x10 [libcfs]
      [175924.331807]  [<ffffffffa0744243>] lu_object_find_at+0xb3/0x360 [obdclass]
      [175924.332104]  [<ffffffff81057d60>] ? default_wake_function+0x0/0x20
      [175924.332403]  [<ffffffffa07413df>] ? keys_fill+0x6f/0x190 [obdclass]
      [175924.332746]  [<ffffffffa0744506>] lu_object_find+0x16/0x20 [obdclass]
      [175924.333035]  [<ffffffffa0549ea6>] mdt_object_find+0x56/0x170 [mdt]
      [175924.333398]  [<ffffffffa0586e63>] mdt_lvbo_fill+0x2f3/0x800 [mdt]
      [175924.333715]  [<ffffffffa0845c1a>] ldlm_server_completion_ast+0x18a/0x640 [ptlrpc]
      [175924.334204]  [<ffffffffa0845a90>] ? ldlm_server_completion_ast+0x0/0x640 [ptlrpc]
      [175924.334655]  [<ffffffffa081bbdc>] ldlm_work_cp_ast_lock+0xcc/0x200 [ptlrpc]
      [175924.334976]  [<ffffffffa085c18f>] ptlrpc_set_wait+0x6f/0x880 [ptlrpc]
      [175924.335264]  [<ffffffff81090154>] ? __init_waitqueue_head+0x24/0x40
      [175924.335559]  [<ffffffffa041e8a5>] ? cfs_waitq_init+0x15/0x20 [libcfs]
      [175924.335867]  [<ffffffffa085876e>] ? ptlrpc_prep_set+0x11e/0x300 [ptlrpc]
      [175924.336134]  [<ffffffffa081bb10>] ? ldlm_work_cp_ast_lock+0x0/0x200 [ptlrpc]
      [175924.336444]  [<ffffffffa081e19b>] ldlm_run_ast_work+0x1db/0x460 [ptlrpc]
      [175924.336767]  [<ffffffffa081eda4>] ldlm_reprocess_all+0x114/0x300 [ptlrpc]
      [175924.337067]  [<ffffffffa08372e3>] ldlm_cli_cancel_local+0x2b3/0x470 [ptlrpc]
      [175924.337445]  [<ffffffffa083bbab>] ldlm_cli_cancel+0x5b/0x360 [ptlrpc]
      [175924.337719]  [<ffffffffa083bf42>] ldlm_blocking_ast_nocheck+0x92/0x320 [ptlrpc]
      [175924.338177]  [<ffffffffa0819070>] ? lock_res_and_lock+0x30/0x50 [ptlrpc]
      [175924.338464]  [<ffffffffa0549d40>] mdt_blocking_ast+0x190/0x2a0 [mdt]
      [175924.338759]  [<ffffffffa042e401>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      [175924.339051]  [<ffffffff814faf3e>] ? _spin_unlock+0xe/0x10
      [175924.339339]  [<ffffffffa083f950>] ldlm_handle_bl_callback+0x130/0x400 [ptlrpc]
      [175924.339814]  [<ffffffffa0820cc6>] ldlm_lock_decref_internal+0x426/0xc80 [ptlrpc]
      [175924.340282]  [<ffffffff814faf3e>] ? _spin_unlock+0xe/0x10
      [175924.340614]  [<ffffffffa0712217>] ? class_handle2object+0x97/0x170 [obdclass]
      [175924.341175]  [<ffffffffa0821f49>] ldlm_lock_decref+0x39/0x90 [ptlrpc]
      [175924.341527]  [<ffffffffa087112b>] ptlrpc_hr_main+0x39b/0x760 [ptlrpc]
      [175924.341824]  [<ffffffff81057d60>] ? default_wake_function+0x0/0x20
      [175924.342141]  [<ffffffffa0870d90>] ? ptlrpc_hr_main+0x0/0x760 [ptlrpc]
      [175924.342444]  [<ffffffff8100c14a>] child_rip+0xa/0x20
      [175924.342734]  [<ffffffffa0870d90>] ? ptlrpc_hr_main+0x0/0x760 [ptlrpc]
      [175924.343068]  [<ffffffffa0870d90>] ? ptlrpc_hr_main+0x0/0x760 [ptlrpc]
      [175924.343376]  [<ffffffff8100c140>] ? child_rip+0x0/0x20
      

          Activity

            bfaccini Bruno Faccini (Inactive) added a comment - No, I did not mean that the problem is in layout-swap/LVB, only that it was brought back to the forefront by it. We already agreed that it is an old, known race between unlink and getattr. I just wanted to point out that you now think the best place to handle and fix this is mdt_lvbo_fill(), which is where I had pointed out the extra lookup that causes the hang.

            jay Jinshan Xiong (Inactive) added a comment - "So finally, you changed your mind and will fix it on the LVB/layout-swap side as we were discussing before?"

            This is not an issue with layout-swap. Maybe I missed something in our previous conversation.

            bzzz Alex Zhuravlev added a comment - Frankly, I can't say this is a very nice solution, and I don't think one more RPC to fetch the LOV after data restore is such a big problem.

            bfaccini Bruno Faccini (Inactive) added a comment - So you finally changed your mind and will fix it on the LVB/layout-swap side, as we were discussing before?

            jay Jinshan Xiong (Inactive) added a comment - I'm going to fix this issue by finding a field in ldlm_lock, say l_tree_node, to store the mdt_object when the request is an intent operation that finds the object first and then requests the DLM lock. Then mdt_lvbo_fill() only calls mdt_object_find() if that field is NULL.
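A minimal user-space sketch of the caching idea in the comment above, not the actual patch: the intent path stores the object it already found in the lock, and the LVB fill path falls back to a lookup only when nothing is cached. All names here (lk_object, lk_lock, cached_obj, lk_object_find, lk_intent_enqueue, lk_lvbo_fill) are hypothetical stand-ins; the real ldlm_lock and mdt_object structures are not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real Lustre structures. */
struct lk_object {
    unsigned long fid;
};

struct lk_lock {
    struct lk_object *cached_obj; /* set by the intent path, otherwise NULL */
};

/* Counts how often the (potentially blocking) lookup was taken. */
static int lk_find_calls;

/* Stand-in for the hash-table lookup that may block on a dying object. */
static struct lk_object *lk_object_find(struct lk_object *backing)
{
    lk_find_calls++;
    return backing;
}

/* Intent path: the object is found first, then remembered in the lock. */
static void lk_intent_enqueue(struct lk_lock *lock, struct lk_object *backing)
{
    assert(lock != NULL && backing != NULL);
    lock->cached_obj = lk_object_find(backing);
}

/* LVB fill: reuse the cached object; look it up only if none is cached. */
static struct lk_object *lk_lvbo_fill(struct lk_lock *lock,
                                      struct lk_object *backing)
{
    assert(lock != NULL && backing != NULL);
    if (lock->cached_obj != NULL)
        return lock->cached_obj;
    return lk_object_find(backing);
}
```

In this sketch a lock enqueued through the intent path never reaches the second lookup, which is the lookup that hangs in the stack trace; a lock without a cached object still takes the old path.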
            jay Jinshan Xiong (Inactive) added a comment - patch is at: http://review.whamcloud.com/5911

            jay Jinshan Xiong (Inactive) added a comment - I mean we can declare a new function, say mdt_object_lookup(), which looks up the hash table and makes sure the object exists in the cache. mdt_object_lookup() also calls lu_object_find(), but with a new flag in lu_object_conf, say LOC_F_LOOKUP. With this flag, lu_object_find() looks up the hash table only and, of course, returns -ENOENT if the object is dying.

            This assumes that the object must already be referenced by someone. For a getattr intent request this is true, but we need to check the other code paths to make sure.
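The lookup-only behaviour proposed above can be sketched in plain C as follows. This is an illustration under stated assumptions, not Lustre code: ht_obj, ht_site, ht_object_find and HT_F_LOOKUP are hypothetical analogues of the object cache and the proposed LOC_F_LOOKUP flag. The sketch returns -EAGAIN where the real lookup would sleep waiting for a dying object, and it has no allocation path for objects that are not cached.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define HT_SIZE 8
#define HT_F_LOOKUP 0x1u /* hypothetical analogue of the proposed LOC_F_LOOKUP */

struct ht_obj {
    unsigned long fid;
    int dying;            /* set once the object has been unlinked */
    int refcount;
    struct ht_obj *next;  /* bucket chain */
};

struct ht_site {
    struct ht_obj *buckets[HT_SIZE];
};

static void ht_site_insert(struct ht_site *site, struct ht_obj *obj)
{
    unsigned long b = obj->fid % HT_SIZE;

    obj->next = site->buckets[b];
    site->buckets[b] = obj;
}

/*
 * Returns 0 and a referenced object in *out, or a negative errno.
 * With HT_F_LOOKUP a dying object yields -ENOENT immediately.
 * Without it, -EAGAIN marks the point where a real lookup would wait.
 */
static int ht_object_find(struct ht_site *site, unsigned long fid,
                          unsigned int flags, struct ht_obj **out)
{
    struct ht_obj *obj;

    assert(site != NULL && out != NULL);
    *out = NULL;
    for (obj = site->buckets[fid % HT_SIZE]; obj != NULL; obj = obj->next) {
        if (obj->fid != fid)
            continue;
        if (obj->dying)
            return (flags & HT_F_LOOKUP) ? -ENOENT : -EAGAIN;
        obj->refcount++;
        *out = obj;
        return 0;
    }
    /* Not cached. This toy cannot allocate, so both modes report -ENOENT. */
    return -ENOENT;
}
```

The design point is that the caller chooses between waiting and failing fast: a completion AST that must not block would pass the lookup-only flag and propagate -ENOENT to the getattr request.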

            bfaccini Bruno Faccini (Inactive) added a comment - Thanks, Jinshan. So in your view the problem was not introduced by the LVB/layout-swap changes, only highlighted by them.

            And the fix you suggest is to give getattr a means to detect that an unlink occurred and the object is dying, using a new lu_object_lookup() method right after it acquires the inodebits DLM lock, and to return ENOENT if the object is dying?

            jay Jinshan Xiong (Inactive) added a comment - I think this is a race between unlink and getattr. Let's make up a test case for this race, say:
            1. client1's unlink reaches the MDT;
            2. before unlink enqueues its lock, client2 sends a getattr intent request;
            3. unlink acquires the inodebits DLM lock;
            4. before unlink releases the lock, getattr tries to acquire the lock and is blocked;
            5. unlink finishes and releases the lock, and getattr's completion_ast is invoked;
            6. the problem should then be reproduced.

            If this is the case, we can work out a lu_object_lookup() that returns -ENOENT if the object has already been killed or does not exist; -ENOENT should then be returned to the getattr intent request too.

            bfaccini Bruno Faccini (Inactive) added a comment - The object lookup that is done for the LVB and causes the deadlock is the result of the recent integration of patches for layout-swap, mainly from LU-1876.

            I'll ask Jinshan whether this scenario sounds familiar to him and whether he has an idea of how to fix it without breaking layout-swap.
            bfaccini Bruno Faccini (Inactive) added a comment (edited) - The deadlock comes from the fact that thread 18000, which owns the last object reference, is waiting for lock completion, which is itself handled by thread 15574. But 15574 cannot finish the completion for this lock because, while parsing its l_cp_ast list, it is already running completion for another lock on the same object. The object was marked dying in the meantime, and because of the LVB an object lookup became necessary; that lookup waits for the object to die completely, which can never happen.

            I am currently reviewing the relevant source code to propose a fix.
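The circular wait described in the comment above can be written down as a tiny wait-for graph. This is only an illustration: wf_in_cycle and WF_NONE are hypothetical names, and the two indices stand for threads 18000 and 15574 from the comment. Index 0 waits on index 1 for lock completion, and index 1 waits on index 0 to drop the last object reference.

```c
#include <assert.h>
#include <stddef.h>

#define WF_NONE (-1) /* this party is not waiting on anyone */

/*
 * waits_on[i] is the index of the party that i is blocked on, or WF_NONE.
 * Returns 1 if following the edges from `start` leads back to `start`.
 */
static int wf_in_cycle(const int *waits_on, size_t n, int start)
{
    int cur;
    size_t steps;

    assert(waits_on != NULL && start >= 0 && (size_t)start < n);
    cur = waits_on[start];
    for (steps = 0; steps < n && cur != WF_NONE; steps++) {
        assert(cur >= 0 && (size_t)cur < n);
        if (cur == start)
            return 1;
        cur = waits_on[cur];
    }
    return 0;
}
```

With the edges { 1, 0 } both parties are on a cycle, which matches the hang. If the lookup in the completion path failed fast instead of waiting, the second edge would become WF_NONE and the cycle would disappear.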

            People

              jay Jinshan Xiong (Inactive)
              green Oleg Drokin