[LU-2492] MDT thread stuck: mdd_object_find -> lu_object_find_at Created: 13/Dec/12  Updated: 19/Mar/14  Resolved: 08/Jan/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.4.0

Type: Bug Priority: Blocker
Reporter: Prakash Surya (Inactive) Assignee: Zhenyu Xu
Resolution: Fixed Votes: 0
Labels: MB, sequoia

Attachments: File lustre-log.1355433925.33078    
Issue Links:
Duplicate
duplicates LU-3870 deadlock with journal Closed
is duplicated by LU-1640 Test failure on test suite lustre-rsy... Resolved
Severity: 3
Rank (Obsolete): 5846

 Description   

I think this is a duplicate of LU-1640, but just in case it's not, I'm opening it as a separate ticket.

I found the following message on the console of our Grove MDS today after upgrading to 2.3.57-1chaos-1surya1 (2.3.57-1chaos plus a couple of changes for LU-2109) and rebooting a few times.

                                                                         
2012-12-13 13:25:25 LNet: Service thread pid 33078 was inactive for 450.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
2012-12-13 13:25:25 Pid: 33078, comm: mdt01_020                                    
2012-12-13 13:25:25                                                                
2012-12-13 13:25:25 Call Trace:                                                    
2012-12-13 13:25:25  [<ffffffffa05b07de>] cfs_waitq_wait+0xe/0x10 [libcfs]         
2012-12-13 13:25:25  [<ffffffffa0764fe3>] lu_object_find_at+0xb3/0x460 [obdclass]
2012-12-13 13:25:25  [<ffffffff8105ea30>] ? default_wake_function+0x0/0x20         
2012-12-13 13:25:25  [<ffffffff8126c735>] ? _atomic_dec_and_lock+0x55/0x80         
2012-12-13 13:25:25  [<ffffffffa07653cf>] lu_object_find_slice+0x1f/0x80 [obdclass]
2012-12-13 13:25:25  [<ffffffffa0c27310>] mdd_object_find+0x10/0x70 [mdd]          
2012-12-13 13:25:25  [<ffffffffa0c2b89f>] mdd_path+0x35f/0x1060 [mdd]              
2012-12-13 13:25:25  [<ffffffffa08f30f1>] ? lustre_pack_reply_v2+0x1e1/0x280 [ptlrpc]
2012-12-13 13:25:25  [<ffffffffa0c8dfd7>] mdt_get_info+0x567/0xbb0 [mdt]           
2012-12-13 13:25:25  [<ffffffffa0c8941d>] ? mdt_unpack_req_pack_rep+0x4d/0x4c0 [mdt]
2012-12-13 13:25:25  [<ffffffffa0c91782>] mdt_handle_common+0x932/0x1760 [mdt]  
2012-12-13 13:25:25  [<ffffffffa0c92685>] mdt_regular_handle+0x15/0x20 [mdt]       
2012-12-13 13:25:25  [<ffffffffa09036ac>] ptlrpc_server_handle_request+0x41c/0xe00 [ptlrpc]
2012-12-13 13:25:25  [<ffffffffa05b06be>] ? cfs_timer_arm+0xe/0x10 [libcfs]        
2012-12-13 13:25:25  [<ffffffffa05c214f>] ? lc_watchdog_touch+0x6f/0x180 [libcfs]
2012-12-13 13:25:25  [<ffffffffa08faa79>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
2012-12-13 13:25:25  [<ffffffff81051ba3>] ? __wake_up+0x53/0x70                    
2012-12-13 13:25:25  [<ffffffffa0904c45>] ptlrpc_main+0xbb5/0x1970 [ptlrpc]        
2012-12-13 13:25:25  [<ffffffffa0904090>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]        
2012-12-13 13:25:25  [<ffffffff8100c14a>] child_rip+0xa/0x20                       
2012-12-13 13:25:25  [<ffffffffa0904090>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]        
2012-12-13 13:25:25  [<ffffffffa0904090>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]        
2012-12-13 13:25:25  [<ffffffff8100c140>] ? child_rip+0x0/0x20                     
2012-12-13 13:25:25                                                                
2012-12-13 13:25:25 LustreError: dumping log to /tmp/lustre-log.1355433925.33078

Even nearly an hour later, the thread still appears to be stuck. I keep seeing the following message every few minutes:

                                                                         
2012-12-13 14:22:12 INFO: task mdt01_020:33078 blocked for more than 120 seconds.
2012-12-13 14:22:12 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2012-12-13 14:22:12 mdt01_020     D 0000000000000004     0 33078      2 0x00000000
2012-12-13 14:22:12  ffff880f444d7ae0 0000000000000046 0000000000000000 ffff880f444d7b60
2012-12-13 14:22:12  ffff880f444d7a90 ffffc9007e22e02c 0000000000000246 0000000000000246
2012-12-13 14:22:12  ffff880f444d5ab8 ffff880f444d7fd8 000000000000f4e8 ffff880f444d5ab8
2012-12-13 14:22:12 Call Trace:                                                    
2012-12-13 14:22:12  [<ffffffffa05b07de>] cfs_waitq_wait+0xe/0x10 [libcfs]         
2012-12-13 14:22:12  [<ffffffffa0764fe3>] lu_object_find_at+0xb3/0x460 [obdclass]
2012-12-13 14:22:12  [<ffffffff8105ea30>] ? default_wake_function+0x0/0x20         
2012-12-13 14:22:12  [<ffffffff8126c735>] ? _atomic_dec_and_lock+0x55/0x80         
2012-12-13 14:22:12  [<ffffffffa07653cf>] lu_object_find_slice+0x1f/0x80 [obdclass]
2012-12-13 14:22:12  [<ffffffffa0c27310>] mdd_object_find+0x10/0x70 [mdd]          
2012-12-13 14:22:12  [<ffffffffa0c2b89f>] mdd_path+0x35f/0x1060 [mdd]              
2012-12-13 14:22:12  [<ffffffffa08f30f1>] ? lustre_pack_reply_v2+0x1e1/0x280 [ptlrpc]
2012-12-13 14:22:12  [<ffffffffa0c8dfd7>] mdt_get_info+0x567/0xbb0 [mdt]           
2012-12-13 14:22:12  [<ffffffffa0c8941d>] ? mdt_unpack_req_pack_rep+0x4d/0x4c0 [mdt]
2012-12-13 14:22:12  [<ffffffffa0c91782>] mdt_handle_common+0x932/0x1760 [mdt]  
2012-12-13 14:22:12  [<ffffffffa0c92685>] mdt_regular_handle+0x15/0x20 [mdt]       
2012-12-13 14:22:12  [<ffffffffa09036ac>] ptlrpc_server_handle_request+0x41c/0xe00 [ptlrpc]
2012-12-13 14:22:12  [<ffffffffa05b06be>] ? cfs_timer_arm+0xe/0x10 [libcfs]        
2012-12-13 14:22:12  [<ffffffffa05c214f>] ? lc_watchdog_touch+0x6f/0x180 [libcfs]
2012-12-13 14:22:12  [<ffffffffa08faa79>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
2012-12-13 14:22:12  [<ffffffff81051ba3>] ? __wake_up+0x53/0x70                    
2012-12-13 14:22:12  [<ffffffffa0904c45>] ptlrpc_main+0xbb5/0x1970 [ptlrpc]        
2012-12-13 14:22:12  [<ffffffffa0904090>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]        
2012-12-13 14:22:12  [<ffffffff8100c14a>] child_rip+0xa/0x20                       
2012-12-13 14:22:12  [<ffffffffa0904090>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]        
2012-12-13 14:22:12  [<ffffffffa0904090>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]        
2012-12-13 14:22:12  [<ffffffff8100c140>] ? child_rip+0x0/0x20                     

I've attached the lustre log file dumped by the LNet message above:

                                                                         
2012-12-13 13:25:25 LustreError: dumping log to /tmp/lustre-log.1355433925.33078
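
For context on where the wait happens (based on the stack trace above and the fix described in the comments below): the service thread looked up an object in the LU object cache, found a copy that is already marked dying, and is waiting for that copy to be freed before it can set up a fresh one. Below is a minimal, self-contained sketch of that wait-then-retry shape, using hypothetical simplified types and a toy one-entry cache rather than Lustre's real lu_object/cfs_hash code.

/*
 * Hypothetical stand-in types -- NOT Lustre's lu_object code, just the
 * "wait for the dying cached copy to go away, then retry" pattern that
 * the stack trace above suggests.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct cached_obj {
        int  refcount;  /* references held by users of the object */
        bool dying;     /* marked for removal from the cache */
};

static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  obj_freed  = PTHREAD_COND_INITIALIZER; /* signalled when an object is freed */
static struct cached_obj *cache_slot;                         /* toy one-entry "hash table" */

/* Roughly the loop the stuck mdt01_020 thread is executing in lu_object_find_at(). */
static struct cached_obj *find_obj(void)
{
        pthread_mutex_lock(&cache_lock);
        for (;;) {
                struct cached_obj *obj = cache_slot;

                if (obj == NULL || !obj->dying) {
                        /* Cache miss or healthy object: the caller can proceed. */
                        if (obj != NULL)
                                obj->refcount++;
                        pthread_mutex_unlock(&cache_lock);
                        return obj;
                }
                /*
                 * A dying copy is still cached; wait for whoever holds the
                 * last reference to free it, then retry the lookup.  If that
                 * free never happens (the bug in this ticket), the wait never
                 * ends, which is what the watchdog is reporting.
                 */
                pthread_cond_wait(&obj_freed, &cache_lock);
        }
}

If lu_object_find_at() really is blocked like this, only freeing (or reaping) the dying object will ever wake the thread, which matches it still being stuck an hour later.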


 Comments   
Comment by Peter Jones [ 13/Dec/12 ]

Bobijam

Could you please comment as to whether this issue is a duplicate of LU-1640?

Thanks

Peter

Comment by Zhenyu Xu [ 17/Dec/12 ]

Patch tracked at http://review.whamcloud.com/3439

Commit message:
LU-2492 obdclass: lu_object_find_at() waits forever

In lu_object_put(), cfs_hash_bd_dec_and_lock() can end up counting a reference that was just added by htable_lookup(). So if the thread running this code path is the last object holder, it should have gone on to the free path, but because htable_lookup() has bumped the refcount, the last holder misses its chance to free the object. htable_lookup() will then always find the zero-referenced object in the hash table, nobody will ever remove it from the hash table or free it, and lu_object_find_at() keeps waiting forever.

To handle this case, do the unhash and free in htable_lookup() itself.
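
To make the stranded state concrete, here is a minimal sketch of the reaping that this fix moves into the lookup path. The types and helper below are hypothetical stand-ins with the bucket locking omitted; the real change is in htable_lookup() and works on cfs_hash buckets, not this toy cache. The idea is only that a dying object whose refcount has already reached zero will never be freed by lu_object_put() any more, so the lookup reaps it on the spot and reports a miss, letting lu_object_find_at() callers allocate a fresh object instead of waiting forever.

#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical simplified object; not the real lu_object structures. */
struct toy_obj {
        int  refcount;   /* users currently holding the object */
        bool dying;      /* object should leave the cache once unused */
        bool hashed;     /* object is still reachable through the hash */
};

/* Free the object; in the real code this is also where waiters get woken up. */
static void toy_obj_free(struct toy_obj *obj)
{
        obj->hashed = false;            /* drop it from the hash table */
        free(obj);
}

/* Lookup path; the caller is assumed to hold the bucket lock, as htable_lookup() does. */
static struct toy_obj *toy_lookup(struct toy_obj *obj)
{
        if (obj == NULL)
                return NULL;            /* plain cache miss */

        if (obj->dying && obj->refcount == 0) {
                toy_obj_free(obj);      /* nobody else will free it any more */
                return NULL;            /* treat it as a miss; caller allocates anew */
        }

        obj->refcount++;                /* normal hit: take a reference */
        return obj;
}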

Comment by Zhenyu Xu [ 08/Jan/13 ]

Landed on master for 2.4.0.

Comment by Prakash Surya (Inactive) [ 08/Jan/13 ]

Awesome! Thanks Bobijam.
