
MDT thread stuck: mdd_object_find -> lu_object_find_at

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Affects Version: Lustre 2.4.0
    • Fix Version: Lustre 2.4.0
    • Severity: 3
    • 5846

    Description

      I think this is a duplicate of LU-1640, but just in case it's not, I'm opening this as a separate ticket.

      I found the following message on the console of our Grove MDS today, after upgrading to 2.3.57-1chaos-1surya1 (2.3.57-1chaos plus a couple of changes for LU-2109) and rebooting a few times.

                                                                               
      2012-12-13 13:25:25 LNet: Service thread pid 33078 was inactive for 450.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
      2012-12-13 13:25:25 Pid: 33078, comm: mdt01_020                                    
      2012-12-13 13:25:25                                                                
      2012-12-13 13:25:25 Call Trace:                                                    
      2012-12-13 13:25:25  [<ffffffffa05b07de>] cfs_waitq_wait+0xe/0x10 [libcfs]         
      2012-12-13 13:25:25  [<ffffffffa0764fe3>] lu_object_find_at+0xb3/0x460 [obdclass]
      2012-12-13 13:25:25  [<ffffffff8105ea30>] ? default_wake_function+0x0/0x20         
      2012-12-13 13:25:25  [<ffffffff8126c735>] ? _atomic_dec_and_lock+0x55/0x80         
      2012-12-13 13:25:25  [<ffffffffa07653cf>] lu_object_find_slice+0x1f/0x80 [obdclass]
      2012-12-13 13:25:25  [<ffffffffa0c27310>] mdd_object_find+0x10/0x70 [mdd]          
      2012-12-13 13:25:25  [<ffffffffa0c2b89f>] mdd_path+0x35f/0x1060 [mdd]              
      2012-12-13 13:25:25  [<ffffffffa08f30f1>] ? lustre_pack_reply_v2+0x1e1/0x280 [ptlrpc]
      2012-12-13 13:25:25  [<ffffffffa0c8dfd7>] mdt_get_info+0x567/0xbb0 [mdt]           
      2012-12-13 13:25:25  [<ffffffffa0c8941d>] ? mdt_unpack_req_pack_rep+0x4d/0x4c0 [mdt]
      2012-12-13 13:25:25  [<ffffffffa0c91782>] mdt_handle_common+0x932/0x1760 [mdt]  
      2012-12-13 13:25:25  [<ffffffffa0c92685>] mdt_regular_handle+0x15/0x20 [mdt]       
      2012-12-13 13:25:25  [<ffffffffa09036ac>] ptlrpc_server_handle_request+0x41c/0xe00 [ptlrpc]
      2012-12-13 13:25:25  [<ffffffffa05b06be>] ? cfs_timer_arm+0xe/0x10 [libcfs]        
      2012-12-13 13:25:25  [<ffffffffa05c214f>] ? lc_watchdog_touch+0x6f/0x180 [libcfs]
      2012-12-13 13:25:25  [<ffffffffa08faa79>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
      2012-12-13 13:25:25  [<ffffffff81051ba3>] ? __wake_up+0x53/0x70                    
      2012-12-13 13:25:25  [<ffffffffa0904c45>] ptlrpc_main+0xbb5/0x1970 [ptlrpc]        
      2012-12-13 13:25:25  [<ffffffffa0904090>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]        
      2012-12-13 13:25:25  [<ffffffff8100c14a>] child_rip+0xa/0x20                       
      2012-12-13 13:25:25  [<ffffffffa0904090>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]        
      2012-12-13 13:25:25  [<ffffffffa0904090>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]        
      2012-12-13 13:25:25  [<ffffffff8100c140>] ? child_rip+0x0/0x20                     
      2012-12-13 13:25:25                                                                
      2012-12-13 13:25:25 LustreError: dumping log to /tmp/lustre-log.1355433925.33078
      

      Even nearly an hour later, the thread still appears to be stuck; I continue to see the following message every few minutes:

                                                                               
      2012-12-13 14:22:12 INFO: task mdt01_020:33078 blocked for more than 120 seconds.
      2012-12-13 14:22:12 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      2012-12-13 14:22:12 mdt01_020     D 0000000000000004     0 33078      2 0x00000000
      2012-12-13 14:22:12  ffff880f444d7ae0 0000000000000046 0000000000000000 ffff880f444d7b60
      2012-12-13 14:22:12  ffff880f444d7a90 ffffc9007e22e02c 0000000000000246 0000000000000246
      2012-12-13 14:22:12  ffff880f444d5ab8 ffff880f444d7fd8 000000000000f4e8 ffff880f444d5ab8
      2012-12-13 14:22:12 Call Trace:                                                    
      2012-12-13 14:22:12  [<ffffffffa05b07de>] cfs_waitq_wait+0xe/0x10 [libcfs]         
      2012-12-13 14:22:12  [<ffffffffa0764fe3>] lu_object_find_at+0xb3/0x460 [obdclass]
      2012-12-13 14:22:12  [<ffffffff8105ea30>] ? default_wake_function+0x0/0x20         
      2012-12-13 14:22:12  [<ffffffff8126c735>] ? _atomic_dec_and_lock+0x55/0x80         
      2012-12-13 14:22:12  [<ffffffffa07653cf>] lu_object_find_slice+0x1f/0x80 [obdclass]
      2012-12-13 14:22:12  [<ffffffffa0c27310>] mdd_object_find+0x10/0x70 [mdd]          
      2012-12-13 14:22:12  [<ffffffffa0c2b89f>] mdd_path+0x35f/0x1060 [mdd]              
      2012-12-13 14:22:12  [<ffffffffa08f30f1>] ? lustre_pack_reply_v2+0x1e1/0x280 [ptlrpc]
      2012-12-13 14:22:12  [<ffffffffa0c8dfd7>] mdt_get_info+0x567/0xbb0 [mdt]           
      2012-12-13 14:22:12  [<ffffffffa0c8941d>] ? mdt_unpack_req_pack_rep+0x4d/0x4c0 [mdt]
      2012-12-13 14:22:12  [<ffffffffa0c91782>] mdt_handle_common+0x932/0x1760 [mdt]  
      2012-12-13 14:22:12  [<ffffffffa0c92685>] mdt_regular_handle+0x15/0x20 [mdt]       
      2012-12-13 14:22:12  [<ffffffffa09036ac>] ptlrpc_server_handle_request+0x41c/0xe00 [ptlrpc]
      2012-12-13 14:22:12  [<ffffffffa05b06be>] ? cfs_timer_arm+0xe/0x10 [libcfs]        
      2012-12-13 14:22:12  [<ffffffffa05c214f>] ? lc_watchdog_touch+0x6f/0x180 [libcfs]
      2012-12-13 14:22:12  [<ffffffffa08faa79>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
      2012-12-13 14:22:12  [<ffffffff81051ba3>] ? __wake_up+0x53/0x70                    
      2012-12-13 14:22:12  [<ffffffffa0904c45>] ptlrpc_main+0xbb5/0x1970 [ptlrpc]        
      2012-12-13 14:22:12  [<ffffffffa0904090>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]        
      2012-12-13 14:22:12  [<ffffffff8100c14a>] child_rip+0xa/0x20                       
      2012-12-13 14:22:12  [<ffffffffa0904090>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]        
      2012-12-13 14:22:12  [<ffffffffa0904090>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]        
      2012-12-13 14:22:12  [<ffffffff8100c140>] ? child_rip+0x0/0x20                     
      

      I've attached the lustre log file dumped by the LNet message above:

                                                                               
      2012-12-13 13:25:25 LustreError: dumping log to /tmp/lustre-log.1355433925.33078
      

      Attachments

        Issue Links

          Activity

            [LU-2492] MDT thread stuck: mdd_object_find -> lu_object_find_at

            prakash Prakash Surya (Inactive) added a comment -

            Awesome! Thanks Bobijam.
            bobijam Zhenyu Xu added a comment -

            landed on master for 2.4.0

            bobijam Zhenyu Xu added a comment - - edited

            patch tracking at http://review.whamcloud.com/3439

            commit message
            LU-2492 obdclass: lu_object_find_at() waits forever
            
            In lu_object_put(), cfs_hash_bd_dec_and_lock() can observe a refcount
            that includes the reference taken by a concurrent htable_lookup(). If
            the thread on this code path is the last real holder of the object,
            it should go on to free it, but because htable_lookup() has bumped
            the refcount, that last holder misses its chance to do so. From then
            on, htable_lookup() always finds the zero-referenced object in the
            hash table, nobody ever removes it from the table or frees it, and
            lu_object_find_at() keeps waiting forever.
            
            In this case, we do the unhash and free in htable_lookup() itself.
            
            
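            [Editor's note] To make the failure mode above concrete, here is a minimal sketch of the pattern the commit message describes, not the actual lu_object/cfs_hash code: all names (struct cobj, struct bucket, cache_lookup) are hypothetical, and the locking is simplified to a plain pthread mutex. The point it illustrates is that when a lookup finds a zero-referenced entry still sitting in the hash table, the lookup itself unhashes and frees it and reports a miss, rather than waiting for a holder that no longer exists.

            /*
             * Conceptual sketch only -- hypothetical names, simplified locking.
             * A reference-counted object cache with per-bucket chains: if a
             * lookup finds an entry whose refcount has already dropped to zero
             * (its last holder is gone), the lookup unhashes and frees it under
             * the bucket lock and treats it as a miss, instead of waiting for a
             * free that no remaining thread will ever perform.
             */
            #include <pthread.h>
            #include <stdlib.h>

            struct cobj {
                    struct cobj *next;      /* hash-bucket chain */
                    int          key;
                    int          refcount;  /* protected by the bucket lock */
                    /* ... payload ... */
            };

            struct bucket {
                    pthread_mutex_t lock;
                    struct cobj    *head;
            };

            /* Return the matching object with an extra reference, or NULL on a
             * miss.  Each key has at most one entry in the bucket. */
            static struct cobj *cache_lookup(struct bucket *b, int key)
            {
                    struct cobj **pp, *o;

                    pthread_mutex_lock(&b->lock);
                    for (pp = &b->head; (o = *pp) != NULL; pp = &o->next) {
                            if (o->key != key)
                                    continue;
                            if (o->refcount == 0) {
                                    /* Stale entry: unhash and free it here,
                                     * then report a miss so the caller
                                     * allocates a fresh object. */
                                    *pp = o->next;
                                    pthread_mutex_unlock(&b->lock);
                                    free(o);
                                    return NULL;
                            }
                            o->refcount++;  /* hand a reference to the caller */
                            pthread_mutex_unlock(&b->lock);
                            return o;
                    }
                    pthread_mutex_unlock(&b->lock);
                    return NULL;
            }

            The actual change lives in obdclass's htable_lookup() (see the Gerrit review linked above); the real code works with the cfs_hash bucket locks rather than a single mutex, but the overall shape is the same.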
            pjones Peter Jones added a comment -

            Bobijam

            Could you please comment as to whether this issue is a duplicate of LU-1640?

            Thanks

            Peter


            People

              Assignee: bobijam Zhenyu Xu
              Reporter: prakash Prakash Surya (Inactive)
              Votes: 0
              Watchers: 7
