[LU-2683] Client deadlock in cl_lock_mutex_get Created: 25/Jan/13 Updated: 28/Nov/13 Resolved: 04/Mar/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.4.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Christopher Morrone | Assignee: | Jinshan Xiong (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | HB, mn1, sequoia, topsequoia | ||
| Environment: | Sequoia ppc64 client, 2.3.58-5chaos, servers are x86_64 running 2.3.58-6chaos |
| Attachments: |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 6270 |
| Description |
|
With 2.3.58-5chaos ppc64 clients, we are seeing Lustre hang with many threads waiting on a mutex under cl_lock_mutex_get(). In the attached file "RA0-ID-J03.log.txt", you can see the sysrq-t and sysrq-l output. The node is responsive, but Lustre is doing nothing because all ptlrpcd threads are stuck in the same path:
2013-01-25 11:51:10.788870 {DefaultControlEventListener} [mmcs]{131}.0.0: ptlrpcd_0 D 0000000000000000 0 3268 2 0x00000000
2013-01-25 11:51:10.788922 {DefaultControlEventListener} [mmcs]{131}.0.0: Call Trace:
2013-01-25 11:51:10.788974 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621ee70] [c00000000068f010] svc_rdma_ops+0xda18/0x1a900 (unreliable)
2013-01-25 11:51:10.789026 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f040] [c000000000009b2c] .__switch_to+0xc4/0x100
2013-01-25 11:51:10.789078 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f0d0] [c00000000042a418] .schedule+0x7d4/0x944
2013-01-25 11:51:10.789129 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f380] [c00000000042b4a8] .__mutex_lock_slowpath+0x208/0x390
2013-01-25 11:51:10.789181 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f480] [c00000000042bf44] .mutex_lock+0x38/0x58
2013-01-25 11:51:10.789234 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f500] [8000000002501cf8] .cl_lock_mutex_get+0xc8/0x110 [obdclass]
2013-01-25 11:51:10.789285 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f590] [800000000514d124] .lovsub_parent_lock+0x94/0x260 [lov]
2013-01-25 11:51:10.789337 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f640] [800000000514de70] .lovsub_lock_state+0xd0/0x300 [lov]
2013-01-25 11:51:10.789389 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f710] [80000000024fe344] .cl_lock_state_signal+0xd4/0x2c0 [obdclass]
2013-01-25 11:51:10.789441 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f7d0] [80000000024ff448] .cl_lock_signal+0xa8/0x260 [obdclass]
2013-01-25 11:51:10.789492 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f890] [80000000046adb24] .osc_lock_upcall+0x194/0x810 [osc]
2013-01-25 11:51:10.789544 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621f970] [8000000004688a7c] .osc_enqueue_fini+0xfc/0x3f0 [osc]
2013-01-25 11:51:10.789596 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621fa60] [8000000004697364] .osc_enqueue_interpret+0x104/0x240 [osc]
2013-01-25 11:51:10.789647 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621fb40] [8000000003b7a308] .ptlrpc_check_set+0x3c8/0x4e50 [ptlrpc]
2013-01-25 11:51:10.789700 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621fd20] [8000000003bcffec] .ptlrpcd_check+0x66c/0x870 [ptlrpc]
2013-01-25 11:51:10.789751 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621fe40] [8000000003bd054c] .ptlrpcd+0x35c/0x510 [ptlrpc]
2013-01-25 11:51:10.789802 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c621ff90] [c00000000001b9a0] .kernel_thread+0x54/0x70
|
| Comments |
| Comment by Christopher Morrone [ 25/Jan/13 ] |
|
My suspicion is that this thread completes the deadlock loop:
2013-01-25 11:51:10.831871 {DefaultControlEventListener} [mmcs]{131}.0.0: sysiod S 00000fff873f7148 0 6794 3077 0x00000000
2013-01-25 11:51:10.831961 {DefaultControlEventListener} [mmcs]{131}.0.0: Call Trace:
2013-01-25 11:51:10.832047 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c39363b0] [c00000000068baa0] svc_rdma_ops+0xa4a8/0x1a900 (unreliable)
2013-01-25 11:51:10.832180 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3936580] [c000000000009b2c] .__switch_to+0xc4/0x100
2013-01-25 11:51:10.832290 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3936610] [c00000000042a418] .schedule+0x7d4/0x944
2013-01-25 11:51:10.832379 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c39368c0] [8000000000ae2090] .cfs_waitq_wait+0x10/0x30 [libcfs]
2013-01-25 11:51:10.832469 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3936930] [80000000046c4930] .osc_extent_wait+0x610/0x790 [osc]
2013-01-25 11:51:10.832605 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3936b30] [80000000046c5174] .osc_cache_wait_range+0x6c4/0xc90 [osc]
2013-01-25 11:51:10.832702 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3936d70] [80000000046cc9c4] .osc_cache_writeback_range+0xe54/0x151c [osc]
2013-01-25 11:51:10.832789 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3936fa0] [80000000046aa1d0] .osc_lock_flush+0xe0/0x4a0 [osc]
2013-01-25 11:51:10.832890 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3937090] [80000000046aa6cc] .osc_lock_cancel+0x13c/0x2c0 [osc]
2013-01-25 11:51:10.833011 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3937160] [80000000024fde10] .cl_lock_cancel0+0xd0/0x2b0 [obdclass]
2013-01-25 11:51:10.833090 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3937220] [8000000002500d78] .cl_lock_hold_release+0x258/0x450 [obdclass]
2013-01-25 11:51:10.833167 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c39372f0] [8000000002503a7c] .cl_lock_unhold+0x8c/0x270 [obdclass]
2013-01-25 11:51:10.833243 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c39373c0] [800000000513e5b4] .lov_sublock_release+0x244/0x370 [lov]
2013-01-25 11:51:10.833323 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3937480] [8000000005141f68] .lov_lock_enqueue+0x388/0xb20 [lov]
2013-01-25 11:51:10.833431 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c39375c0] [8000000002505108] .cl_enqueue_try+0x1d8/0x510 [obdclass]
2013-01-25 11:51:10.833511 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c39376d0] [8000000002507108] .cl_enqueue_locked+0xa8/0x2c0 [obdclass]
2013-01-25 11:51:10.833578 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3937780] [8000000002507630] .cl_lock_request+0xe0/0x370 [obdclass]
2013-01-25 11:51:10.833652 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3937850] [800000000695efb4] .cl_glimpse_lock+0x2b4/0x640 [lustre]
2013-01-25 11:51:10.833736 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3937940] [800000000695f538] .cl_glimpse_size0+0x1f8/0x270 [lustre]
2013-01-25 11:51:10.833826 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3937a10] [80000000068f1510] .ll_inode_revalidate_it+0x220/0x2c0 [lustre]
2013-01-25 11:51:10.833931 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3937ad0] [80000000068f15f0] .ll_getattr_it+0x40/0x180 [lustre]
2013-01-25 11:51:10.834017 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3937b70] [80000000068f1774] .ll_getattr+0x44/0x60 [lustre]
2013-01-25 11:51:10.834118 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3937c20] [c0000000000d57d8] .vfs_getattr+0x38/0x60
2013-01-25 11:51:10.834248 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3937c90] [c0000000000d5e4c] .vfs_fstatat+0x78/0xa8
2013-01-25 11:51:10.834349 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3937d30] [c0000000000d5f00] .SyS_newfstatat+0x2c/0x58
2013-01-25 11:51:10.834436 {DefaultControlEventListener} [mmcs]{131}.0.0: [c0000003c3937e30] [c000000000000580] syscall_exit+0x0/0x2c
It seems to grab a lock and then sleep in cfs_waitq_wait(). But the locking is fairly complex here (as acknowledged in the comment for lov_lock_enqueue()), and this code is new to me. |
| Comment by Christopher Morrone [ 25/Jan/13 ] |
|
Oh, and don't bother requesting additional info about who is holding the lock using crash. This is ppc64 and crash does not work. The most I can provide is Lustre logs and proc info. If we need more info, the best route is a debugging patch. |
| Comment by Jinshan Xiong (Inactive) [ 25/Jan/13 ] |
|
This is the same problem we have seen in |
| Comment by Jinshan Xiong (Inactive) [ 29/Jan/13 ] |
|
Hi Chris, can you please show me the code snippet at lov_lock_enqueue+0x3f0? |
| Comment by Prakash Surya (Inactive) [ 29/Jan/13 ] |
(gdb) l *lov_lock_enqueue+0x3f0
0x51fd0 is in lov_lock_enqueue (/builddir/build/BUILD/lustre-2.3.58/lustre/lov/lov_lock.c:562).
557 /builddir/build/BUILD/lustre-2.3.58/lustre/lov/lov_lock.c: No such file or directory.
in /builddir/build/BUILD/lustre-2.3.58/lustre/lov/lov_lock.c
560 if (!IS_ERR(sublock))
561 cl_lock_mutex_get(env, sublock);
562 cl_lock_mutex_get(env, parent);
563
564 if (!IS_ERR(sublock)) {
Out of curiosity, where did you get that offset into lov_lock_enqueue from? I don't see that in either stack posted in this ticket. |
| Comment by Jinshan Xiong (Inactive) [ 29/Jan/13 ] |
|
From the log you guys posted in this ticket:
2013-01-25 11:51:10.827995 {DefaultControlEventListener} [mmcs]{131}.0.0: sysiod D 00000fffa78b7148 0 6707 3077 0x00000002
2013-01-25 11:51:10.828047 {DefaultControlEventListener} [mmcs]{131}.0.0: Call Trace:
2013-01-25 11:51:10.828100 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da6d60] [c00000000068b6e0] svc_rdma_ops+0xa0e8/0x1a900 (unreliable)
2013-01-25 11:51:10.828151 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da6f30] [c000000000009b2c] .__switch_to+0xc4/0x100
2013-01-25 11:51:10.828203 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da6fc0] [c00000000042a418] .schedule+0x7d4/0x944
2013-01-25 11:51:10.828255 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da7270] [c00000000042b4a8] .__mutex_lock_slowpath+0x208/0x390
2013-01-25 11:51:10.828316 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da7370] [c00000000042bf44] .mutex_lock+0x38/0x58
2013-01-25 11:51:10.828407 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da73f0] [8000000002501cf8] .cl_lock_mutex_get+0xc8/0x110 [obdclass]
2013-01-25 11:51:10.828496 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da7480] [8000000005141fd0] .lov_lock_enqueue+0x3f0/0xb20 [lov]
2013-01-25 11:51:10.828583 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da75c0] [8000000002505108] .cl_enqueue_try+0x1d8/0x510 [obdclass]
2013-01-25 11:51:10.828700 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da76d0] [8000000002507108] .cl_enqueue_locked+0xa8/0x2c0 [obdclass]
2013-01-25 11:51:10.828807 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da7780] [8000000002507630] .cl_lock_request+0xe0/0x370 [obdclass]
2013-01-25 11:51:10.828887 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da7850] [800000000695efb4] .cl_glimpse_lock+0x2b4/0x640 [lustre]
2013-01-25 11:51:10.828962 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da7940] [800000000695f538] .cl_glimpse_size0+0x1f8/0x270 [lustre]
2013-01-25 11:51:10.829052 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da7a10] [80000000068f1510] .ll_inode_revalidate_it+0x220/0x2c0 [lustre]
2013-01-25 11:51:10.829122 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da7ad0] [80000000068f15f0] .ll_getattr_it+0x40/0x180 [lustre]
2013-01-25 11:51:10.829178 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da7b70] [80000000068f1774] .ll_getattr+0x44/0x60 [lustre]
2013-01-25 11:51:10.829230 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da7c20] [c0000000000d57d8] .vfs_getattr+0x38/0x60
2013-01-25 11:51:10.829282 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da7c90] [c0000000000d5e4c] .vfs_fstatat+0x78/0xa8
2013-01-25 11:51:10.829334 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da7d30] [c0000000000d5f00] .SyS_newfstatat+0x2c/0x58
2013-01-25 11:51:10.829386 {DefaultControlEventListener} [mmcs]{131}.0.0: [c000000362da7e30] [c000000000000580] syscall_exit+0x0/0x2c
2013-01-25 11:51:10.829437 {DefaultControlEventListener} [mmcs]{131}.0.0: sysiod S 00000fff9db9bf74 0 6563 3077 0x00000000
|
| Comment by Peter Jones [ 04/Feb/13 ] |
|
It seems that there is a patch for this one |
| Comment by Prakash Surya (Inactive) [ 04/Feb/13 ] |
|
Peter, is this the patch you're referring to: http://review.whamcloud.com/5208 ? |
| Comment by Christopher Morrone [ 04/Feb/13 ] |
|
That is the one. |
| Comment by Prakash Surya (Inactive) [ 12/Feb/13 ] |
|
That patch doesn't fix the problem for us. We still see the hang in production with it applied. |
| Comment by Jinshan Xiong (Inactive) [ 13/Feb/13 ] |
|
Hmm... it should solve the problem. Did you get the exact same stack trace as before? |
| Comment by Prakash Surya (Inactive) [ 13/Feb/13 ] |
|
I don't have sysrq-l information from the node because it crashed shortly after due to
2013-02-12 16:46:00 INFO: task ptlrpcd_10:3486 blocked for more than 120 seconds.
2013-02-12 16:46:00 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2013-02-12 16:46:00 ptlrpcd_10 D 0000000000000000 0 3486 2 0x00008000
2013-02-12 16:46:00 Call Trace:
2013-02-12 16:46:00 [c000000f321a2de0] [c000000f321a2e90] 0xc000000f321a2e90 (unreliable)
2013-02-12 16:46:00 [c000000f321a2fb0] [c0000000000142d8] .__switch_to+0xf8/0x1d0
2013-02-12 16:46:00 [c000000f321a3040] [c0000000005bb0a8] .schedule+0x3f8/0xd30
2013-02-12 16:46:00 [c000000f321a3340] [c0000000005bcb4c] .__mutex_lock_slowpath+0x1bc/0x2d0
2013-02-12 16:46:00 [c000000f321a3420] [c0000000005bd42c] .mutex_lock+0x5c/0x60
2013-02-12 16:46:00 [c000000f321a34a0] [d00000000f2923e8] .cl_lock_mutex_get+0xc8/0x110 [obdclass]
2013-02-12 16:46:00 [c000000f321a3530] [d000000011d2d294] .lovsub_parent_lock+0x94/0x260 [lov]
2013-02-12 16:46:00 [c000000f321a35e0] [d000000011d2f058] .lovsub_lock_modify+0xc8/0x2c0 [lov]
2013-02-12 16:46:00 [c000000f321a36b0] [d00000000f2914f4] .cl_lock_modify+0x114/0x490 [obdclass]
2013-02-12 16:46:00 [c000000f321a37a0] [d0000000111ec3a4] .osc_lock_granted+0x184/0x3d0 [osc]
2013-02-12 16:46:00 [c000000f321a3870] [d0000000111ee1c4] .osc_lock_upcall+0x4c4/0x840 [osc]
2013-02-12 16:46:00 [c000000f321a3950] [d0000000111c8b3c] .osc_enqueue_fini+0xfc/0x3f0 [osc]
2013-02-12 16:46:00 [c000000f321a3a40] [d0000000111d7504] .osc_enqueue_interpret+0x104/0x240 [osc]
2013-02-12 16:46:00 [c000000f321a3b20] [d00000001097a724] .ptlrpc_check_set+0x364/0x4f40 [ptlrpc]
2013-02-12 16:46:00 [c000000f321a3cc0] [d0000000109d127c] .ptlrpcd_check+0x66c/0x8a0 [ptlrpc]
2013-02-12 16:46:00 [c000000f321a3de0] [d0000000109d182c] .ptlrpcd+0x37c/0x530 [ptlrpc]
2013-02-12 16:46:00 [c000000f321a3f90] [c000000000032fd4] .kernel_thread+0x54/0x70
2013-02-12 16:46:00 INFO: task ptlrpcd_11:3487 blocked for more than 120 seconds.
2013-02-12 16:46:00 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2013-02-12 16:46:00 ptlrpcd_11 D 0000000000000000 0 3487 2 0x00008000
2013-02-12 16:46:00 Call Trace:
2013-02-12 16:46:00 [c000000f4fefade0] [c000000f4fefae90] 0xc000000f4fefae90 (unreliable)
2013-02-12 16:46:00 [c000000f4fefafb0] [c0000000000142d8] .__switch_to+0xf8/0x1d0
2013-02-12 16:46:00 [c000000f4fefb040] [c0000000005bb0a8] .schedule+0x3f8/0xd30
2013-02-12 16:46:00 [c000000f4fefb340] [c0000000005bcb4c] .__mutex_lock_slowpath+0x1bc/0x2d0
2013-02-12 16:46:00 [c000000f4fefb420] [c0000000005bd42c] .mutex_lock+0x5c/0x60
2013-02-12 16:46:00 [c000000f4fefb4a0] [d00000000f2923e8] .cl_lock_mutex_get+0xc8/0x110 [obdclass]
2013-02-12 16:46:00 [c000000f4fefb530] [d000000011d2d294] .lovsub_parent_lock+0x94/0x260 [lov]
2013-02-12 16:46:00 [c000000f4fefb5e0] [d000000011d2f058] .lovsub_lock_modify+0xc8/0x2c0 [lov]
2013-02-12 16:46:00 [c000000f4fefb6b0] [d00000000f2914f4] .cl_lock_modify+0x114/0x490 [obdclass]
2013-02-12 16:46:00 [c000000f4fefb7a0] [d0000000111ec3a4] .osc_lock_granted+0x184/0x3d0 [osc]
2013-02-12 16:46:00 [c000000f4fefb870] [d0000000111ee1c4] .osc_lock_upcall+0x4c4/0x840 [osc]
2013-02-12 16:46:00 [c000000f4fefb950] [d0000000111c8b3c] .osc_enqueue_fini+0xfc/0x3f0 [osc]
2013-02-12 16:46:00 [c000000f4fefba40] [d0000000111d7504] .osc_enqueue_interpret+0x104/0x240 [osc]
2013-02-12 16:46:00 [c000000f4fefbb20] [d00000001097a724] .ptlrpc_check_set+0x364/0x4f40 [ptlrpc]
2013-02-12 16:46:00 [c000000f4fefbcc0] [d0000000109d127c] .ptlrpcd_check+0x66c/0x8a0 [ptlrpc]
2013-02-12 16:46:00 [c000000f4fefbde0] [d0000000109d182c] .ptlrpcd+0x37c/0x530 [ptlrpc]
2013-02-12 16:46:00 [c000000f4fefbf90] [c000000000032fd4] .kernel_thread+0x54/0x70
2013-02-12 16:46:00 INFO: task ldlm_poold:3516 blocked for more than 120 seconds.
2013-02-12 16:46:00 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2013-02-12 16:46:00 ldlm_poold D 0000000000000000 0 3516 2 0x00008000
2013-02-12 16:46:00 Call Trace:
2013-02-12 16:46:00 [c000000f545b31b0] [c000000f545b3260] 0xc000000f545b3260 (unreliable)
2013-02-12 16:46:00 [c000000f545b3380] [c0000000000142d8] .__switch_to+0xf8/0x1d0
2013-02-12 16:46:00 [c000000f545b3410] [c0000000005bb0a8] .schedule+0x3f8/0xd30
2013-02-12 16:46:00 [c000000f545b3710] [c0000000005bc388] .schedule_timeout+0x258/0x340
2013-02-12 16:46:00 [c000000f545b3830] [c0000000005bbf88] .wait_for_common+0x198/0x210
2013-02-12 16:46:00 [c000000f545b3910] [d0000000109597c8] .__ldlm_bl_to_thread+0x1f8/0x280 [ptlrpc]
2013-02-12 16:46:00 [c000000f545b39c0] [d000000010959c2c] .ldlm_bl_to_thread+0x3dc/0x720 [ptlrpc]
2013-02-12 16:46:00 [c000000f545b3b70] [d00000001094d488] .ldlm_cancel_lru+0xb8/0x2c0 [ptlrpc]
2013-02-12 16:46:00 [c000000f545b3c40] [d000000010967010] .ldlm_cli_pool_recalc+0x320/0x3e0 [ptlrpc]
2013-02-12 16:46:00 [c000000f545b3ce0] [d0000000109671fc] .ldlm_pool_recalc+0x12c/0x180 [ptlrpc]
2013-02-12 16:46:00 [c000000f545b3d80] [d000000010969198] .ldlm_pools_recalc+0x1c8/0x480 [ptlrpc]
2013-02-12 16:46:00 [c000000f545b3e70] [d000000010969550] .ldlm_pools_thread_main+0x100/0x490 [ptlrpc]
2013-02-12 16:46:00 [c000000f545b3f90] [c000000000032fd4] .kernel_thread+0x54/0x70
2013-02-12 16:46:00 INFO: task bgxlC_r:5444 blocked for more than 120 seconds.
2013-02-12 16:46:00 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2013-02-12 16:46:00 bgxlC_r D 000000000fedd308 0 5444 1 0x00008012
2013-02-12 16:46:00 Call Trace:
2013-02-12 16:46:00 [c000000afa5c6b90] [c000000afa5c6c40] 0xc000000afa5c6c40 (unreliable)
2013-02-12 16:46:00 [c000000afa5c6d60] [c0000000000142d8] .__switch_to+0xf8/0x1d0
2013-02-12 16:46:00 [c000000afa5c6df0] [c0000000005bb0a8] .schedule+0x3f8/0xd30
2013-02-12 16:46:00 [c000000afa5c70f0] [c0000000005bc388] .schedule_timeout+0x258/0x340
2013-02-12 16:46:00 [c000000afa5c7210] [c0000000005bbf88] .wait_for_common+0x198/0x210
2013-02-12 16:46:00 [c000000afa5c72f0] [d0000000111f0878] .osc_io_setattr_end+0xd8/0x1e0 [osc]
2013-02-12 16:46:00 [c000000afa5c73b0] [d00000000f2987fc] .cl_io_end+0xcc/0x2b0 [obdclass]
2013-02-12 16:46:00 [c000000afa5c7460] [d000000011d25950] .lov_io_end_wrapper+0x180/0x1f0 [lov]
2013-02-12 16:46:00 [c000000afa5c7500] [d000000011d23ed4] .lov_io_call+0xe4/0x230 [lov]
2013-02-12 16:46:00 [c000000afa5c75b0] [d000000011d24484] .lov_io_end+0x54/0x110 [lov]
2013-02-12 16:46:00 [c000000afa5c7640] [d00000000f2987fc] .cl_io_end+0xcc/0x2b0 [obdclass]
2013-02-12 16:46:00 [c000000afa5c76f0] [d00000000f2a22a4] .cl_io_loop+0x1a4/0x2c0 [obdclass]
2013-02-12 16:46:00 [c000000afa5c77a0] [d0000000136a197c] .cl_setattr_ost+0x25c/0x390 [lustre]
2013-02-12 16:46:00 [c000000afa5c7880] [d000000013659030] .ll_setattr_raw+0xc10/0x1930 [lustre]
2013-02-12 16:46:00 [c000000afa5c79a0] [c0000000001e3814] .notify_change+0x1f4/0x400
2013-02-12 16:46:00 [c000000afa5c7a60] [c0000000001c1d28] .do_truncate+0x78/0xd0
2013-02-12 16:46:00 [c000000afa5c7b40] [c0000000001d6140] .do_filp_open+0x4f0/0xdc0
2013-02-12 16:46:00 [c000000afa5c7d10] [c0000000001c0724] .do_sys_open+0x84/0x180
2013-02-12 16:46:00 [c000000afa5c7dc0] [c0000000002209b4] .compat_sys_open+0x24/0x40
2013-02-12 16:46:00 [c000000afa5c7e30] [c000000000008564] syscall_exit+0x0/0x40
The stacks of ptlrpcd_10 and ptlrpcd_11 look similar to what's in the description, so I thought it was the same bug.
Do you think this is something different? |
| Comment by Jinshan Xiong (Inactive) [ 13/Feb/13 ] |
|
I need more information. Let me work out a patch for |
| Comment by Jinshan Xiong (Inactive) [ 13/Feb/13 ] |
|
Did you find anything was printed in osc_extent_wait()? |
| Comment by Prakash Surya (Inactive) [ 13/Feb/13 ] |
|
Like these messages? 2013-02-11 13:29:13.788428 {DefaultControlEventListener} [mmcs]{613}.12.0: LustreError: 31967:0:(osc_cache.c:896:osc_extent_wait()) extent c0000001c1fc2fe8@{[26 -> 28/31], [3|0|+|rpc|wihY|c0000001b0756f60], [196608|3|+|-|c0000001cdfaa6e0|16|c00000037ab0f300]} ls1-OST005a-osc-c0000003ec0c5800: wait ext to 0 timedout, recovery in progress?
2013-02-11 13:30:26.835353 {DefaultControlEventListener} [mmcs]{613}.0.3: LustreError: 40667:0:(osc_cache.c:896:osc_extent_wait()) extent c0000001cdacee38@{[2 -> 4/15], [3|0|+|rpc|wiY|c0000001b0759480], [196608|3|+|-|c0000001cdfa9600|16|c0000003c54483a0]} ls1-OST007b-osc-c0000003ec0c5800: wait ext to 0 timedout, recovery in progress?
2013-02-11 13:30:58.855528 {DefaultControlEventListener} [mmcs]{613}.3.0: LustreError: 32002:0:(osc_cache.c:896:osc_extent_wait()) extent c0000001cdacfca8@{[2 -> 4/15], [3|0|+|rpc|wihY|c0000001b0759d80], [196608|3|+|-|c0000001cdfa99c0|16|c0000003742646e0]} ls1-OST0083-osc-c0000003ec0c5800: wait ext to 0 timedout, recovery in progress?
2013-02-11 13:30:58.856226 {DefaultControlEventListener} [mmcs]{613}.3.2: LustreError: 3958:0:(osc_cache.c:896:osc_extent_wait()) extent c0000001cedc8d18@{[42 -> 44/47], [3|0|+|rpc|wihY|c0000002e43ed020], [196608|3|+|-|c0000001c0c13c00|16|c0000003c87a1e60]} ls1-OST014d-osc-c0000003ec0c5800: wait ext to 0 timedout, recovery in progress?
2013-02-11 13:31:44.273199 {DefaultControlEventListener} [mmcs]{613}.12.0: LustreError: 31983:0:(osc_cache.c:896:osc_extent_wait()) extent c0000001c9b17230@{[2 -> 4/15], [3|0|+|rpc|wihY|c0000001b075a680], [196608|3|+|-|c0000002e1471d10|16|c0000003c80ad4a0]} ls1-OST008b-osc-c0000003ec0c5800: wait ext to 0 timedout, recovery in progress?
|
| Comment by Jinshan Xiong (Inactive) [ 13/Feb/13 ] |
|
Why were there so many of them? Was one OST in recovery at that time? It would be helpful to dump the log, with DLMTRACE, CACHE and VFSTRACE enabled, when you first see this kind of message. That can easily be done with a script. |
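A capture script along the lines suggested above might look like the following sketch. The debug flag names and `lctl dk` (debug_kernel) are standard lctl usage; the trigger string is taken from the osc_extent_wait messages in this ticket, and the buffer size and output path are assumptions:

```shell
#!/bin/sh
# Enable the requested Lustre debug flags and grow the trace buffer.
lctl set_param debug="+dlmtrace +cache +vfstrace"
lctl set_param debug_mb=512

# Wait until the first osc_extent_wait timeout appears on the console,
# then dump the in-kernel debug log for attachment to the ticket.
while ! dmesg | grep -q 'osc_extent_wait.*timedout'; do
        sleep 5
done
lctl dk > /tmp/lustre-debug.$(hostname).log
```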
| Comment by Prakash Surya (Inactive) [ 13/Feb/13 ] |
|
Well in total, I see that message 12 times for this particular occurrence (from Feb. 11th not running with 5208). AFAIK, all of the OSTs are healthy but when the IONs hit this issue, some of the OSTs will report as "inactive" when running "lfs check servers" on the client. That is how the admins discover IONs in this state, and then manually reboot them. With that said, I may have jumped the gun in my previous comments here and here. What I reported in those comments happened on a login node and wasn't actually a deadlock as far as I can tell. I think those were just "slow" threads (cause unknown) as they eventually cleared up (but then hit So unless you think differently |
| Comment by Jinshan Xiong (Inactive) [ 13/Feb/13 ] |
|
I see, thanks for the explanation. |
| Comment by Jinshan Xiong (Inactive) [ 15/Feb/13 ] |
|
Bruno has claimed that this patch can at least fix the original problem, so let's land this patch for now. |
| Comment by Christopher Morrone [ 15/Feb/13 ] |
|
We have about a day of testing with patch set 2 of change 5208, and so far we have had no occurrences of the client deadlock. Probably too soon to be definitive, but it is promising. |
| Comment by Christopher Morrone [ 19/Feb/13 ] |
|
With change 5208, we have seen no repeats of the problem for four days. It looks like the patch has fixed the problem. Thanks! |
| Comment by Peter Jones [ 04/Mar/13 ] |
|
Landed for 2.4 |
| Comment by Bruno Faccini (Inactive) [ 16/Nov/13 ] |
|
As part of
PID: 22597  TASK: ffff8805969c2ae0  CPU: 4  COMMAND: "LU-4112-reprodu"
 #0 [ffff88059324d748] schedule at ffffffff8150de92
 #1 [ffff88059324d810] __mutex_lock_slowpath at ffffffff8150f53e <<<--- cll_guard is owned by 22615
 #2 [ffff88059324d880] mutex_lock at ffffffff8150f3db
 #3 [ffff88059324d8a0] cl_lock_mutex_get at ffffffffa05a3a47 [obdclass]
 #4 [ffff88059324d8d0] osc_ldlm_completion_ast at ffffffffa09aa4e2 [osc]
 #5 [ffff88059324d930] ldlm_lock_match at ffffffffa06aca4c [ptlrpc]
 #6 [ffff88059324da00] osc_enqueue_base at ffffffffa098f2d9 [osc]
 #7 [ffff88059324dab0] osc_lock_enqueue at ffffffffa09a863e [osc]
 #8 [ffff88059324db40] cl_enqueue_try at ffffffffa05a54bc [obdclass]
 #9 [ffff88059324db90] lov_lock_enqueue at ffffffffa0a2529b [lov]
 #10 [ffff88059324dc20] cl_enqueue_try at ffffffffa05a54bc [obdclass]
 #11 [ffff88059324dc70] cl_enqueue_locked at ffffffffa05a68ad [obdclass]
 #12 [ffff88059324dcb0] cl_lock_request at ffffffffa05a753e [obdclass]
 #13 [ffff88059324dd10] cl_io_lock at ffffffffa05ac84c [obdclass]
 #14 [ffff88059324dd70] cl_io_loop at ffffffffa05aca9a [obdclass]
 #15 [ffff88059324ddb0] ll_file_io_generic at ffffffffa0ab8047 [lustre]
 #16 [ffff88059324de20] ll_file_aio_read at ffffffffa0ab8339 [lustre]
 #17 [ffff88059324de80] ll_file_read at ffffffffa0ab87f9 [lustre]
 #18 [ffff88059324def0] vfs_read at ffffffff81181805
 #19 [ffff88059324df30] sys_read at ffffffff81181941
 #20 [ffff88059324df80] system_call_fastpath at ffffffff8100b072
    RIP: 000000312d60e54d  RSP: 00007fffcf09d4a8  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffffffff8100b072  RCX: 0000000000000000
    RDX: 000000000000c000  RSI: 0000000001135450  RDI: 0000000000000018
    RBP: 00007fffcf09dd70  R8: 0000000000000000  R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000293  R12: 00007fffcf09ef70
    R13: 0000000000400a50  R14: 0000000000401843  R15: 0000000000400a50
    ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b
PID: 22615  TASK: ffff880c33976040  CPU: 1  COMMAND: "LU-4112-reprodu"
 #0 [ffff880b9f2b9a08] schedule at ffffffff8150de92
 #1 [ffff880b9f2b9ad0] __mutex_lock_slowpath at ffffffff8150f53e <<<--- cll_guard is owned by 22597
 #2 [ffff880b9f2b9b40] mutex_lock at ffffffff8150f3db
 #3 [ffff880b9f2b9b60] cl_lock_mutex_get at ffffffffa05a3a47 [obdclass]
 #4 [ffff880b9f2b9b90] lov_lock_enqueue at ffffffffa0a25370 [lov]
 #5 [ffff880b9f2b9c20] cl_enqueue_try at ffffffffa05a54bc [obdclass]
 #6 [ffff880b9f2b9c70] cl_enqueue_locked at ffffffffa05a68ad [obdclass]
 #7 [ffff880b9f2b9cb0] cl_lock_request at ffffffffa05a753e [obdclass]
 #8 [ffff880b9f2b9d10] cl_io_lock at ffffffffa05ac84c [obdclass]
 #9 [ffff880b9f2b9d70] cl_io_loop at ffffffffa05aca9a [obdclass]
 #10 [ffff880b9f2b9db0] ll_file_io_generic at ffffffffa0ab8047 [lustre]
 #11 [ffff880b9f2b9e20] ll_file_aio_read at ffffffffa0ab8339 [lustre]
 #12 [ffff880b9f2b9e80] ll_file_read at ffffffffa0ab87f9 [lustre]
 #13 [ffff880b9f2b9ef0] vfs_read at ffffffff81181805
 #14 [ffff880b9f2b9f30] sys_read at ffffffff81181941
 #15 [ffff880b9f2b9f80] system_call_fastpath at ffffffff8100b072
    RIP: 000000312d60e54d  RSP: 00007fff91890748  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffffffff8100b072  RCX: 0000000000000000
    RDX: 000000000000c000  RSI: 0000000001a50210  RDI: 0000000000000017
    RBP: 00007fff91891010  R8: 0000000000000000  R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000293  R12: 00007fff91892210
    R13: 0000000000400a50  R14: 0000000000401843  R15: 0000000000400a50
    ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b
thus I back-ported change #5208 patch-set #2 in b2_1 tree at http://review.whamcloud.com/8300. |