[LU-8936] Client LBUGs with cl_object.c:735:cl_env_attach() ASSERTION( rc == 0 ) in process ldlm_bl_02 Created: 14/Dec/16  Updated: 10/Aug/17  Resolved: 07/Jan/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Blocker
Reporter: Olaf Faaland Assignee: Jinshan Xiong (Inactive)
Resolution: Fixed Votes: 0
Labels: llnl, llnlfixready
Environment:

lustre-2.8.0_5.chaos-2.ch6.x86_64
kernel 3.10.0-514.0.0.1chaos.ch6.x86_64


Issue Links:
Duplicate
duplicates LU-8509 drop_caches hangs in cl_inode_fini() Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The console first reports this:

LustreError: 7526:0:(cl_object.c:735:cl_env_attach()) ASSERTION( rc == 0 ) failed:
LustreError: 7526:0:(cl_object.c:735:cl_env_attach()) LBUG
Pid: 7526, comm: ldlm_bl_02

Call Trace:
 libcfs_debug_dumpstack+0x53/0x80 [libcfs]
 lbug_with_loc+0x45/0xc0 [libcfs]
 cl_env_percpu_get+0xc2/0xd0 [obdclass]
 ll_invalidatepage+0x41/0x170 [lustre]
 vvp_page_discard+0xbd/0x160 [lustre]
 cl_page_invoid+0x68/0x170 [obdclass]
 cl_page_discard+0x13/0x20 [obdclass]
 discard_cb+0x67/0x190 [osc]
 osc_page_gang_lookup+0x1e0/0x320 [osc]
 ? discard_cb+0x0/0x190 [osc]
 osc_lock_discard_pages+0x119/0x22d [osc]
 ? discard_cb+0x0/0x190 [osc]
 osc_lock_flush+0x89/0x280 [osc]
 osc_ldlm_blocking_ast+0x2e3/0x3a0 [osc]
 ldlm_cancel_callback+0x8a/0x2e0 [ptlrpc]
 ? dequeue_entity+0x11c/0x5d0
 ldlm_cli_cancel_local+0xa0/0x420 [ptlrpc]
 ldlm_cli_cancel+0xab/0x3d0 [ptlrpc]
 osc_ldlm_blocking_ast+0x17a/0x3a0 [osc]
 ? __schedule+0x3b8/0x9c0
 ldlm_handle_bl_callback+0xcf/0x410 [ptlrpc]
 ldlm_bl_thread_main+0x531/0x700 [ptlrpc]
 ? default_wake_function+0x0/0x20
 ? ldlm_bl_thread_main+0x0/0x700 [ptlrpc]
 kthread+0xcf/0xe0
 ? kthread+0x0/0xe0
 ret_from_fork+0x58/0x90
 ? kthread+0x0/0xe0

which is then followed by:

BUG: sleeping function called from invalid context at mm/slub.c:941
in_atomic(): 1, irqs_disabled(): 0, pid: 7526, name: ldlm_bl_02
CPU: 23 PID: 7526 Comm: ldlm_bl_02 Tainted: G           OE  ------------   3.10.0-514.0.0.1chaos.ch6.x86_64 #1
Call Trace:
 dump_stack+0x19/0x1b
 __might_sleep+0xd9/0x100
 kmem_cache_alloc_trace+0x4a/0x250
 ? call_usermodehelper_setup+0x3f/0xa0
 call_usermodehelper_setup+0x3f/0xa0
 call_usermodehelper+0x31/0x60
 libcfs_run_upcall+0x9e/0x3b0 [libcfs]
 ? snprintf+0x49/0x70
 libcfs_run_lbug_upcall+0x7d/0x100 [libcfs]
 lbug_with_loc+0x57/0xc0 [libcfs]
 cl_env_percpu_get+0xc2/0xd0 [obdclass]
 ll_invalidatepage+0x41/0x170 [lustre]
 vvp_page_discard+0xbd/0x160 [lustre]
 cl_page_invoid+0x68/0x170 [obdclass]
 cl_page_discard+0x13/0x20 [obdclass]
 discard_cb+0x67/0x190 [osc]
 osc_page_gang_lookup+0x1e0/0x320 [osc]
 ? check_and_discard_cb+0x150/0x150 [osc]
 osc_lock_discard_pages+0x119/0x22d [osc]
 ? check_and_discard_cb+0x150/0x150 [osc]
 osc_lock_flush+0x89/0x280 [osc]
 osc_ldlm_blocking_ast+0x2e3/0x3a0 [osc]
 ldlm_cancel_callback+0x8a/0x2e0 [ptlrpc]
 ? dequeue_entity+0x11c/0x5d0
 ldlm_cli_cancel_local+0xa0/0x420 [ptlrpc]
 ldlm_cli_cancel+0xab/0x3d0 [ptlrpc]
 osc_ldlm_blocking_ast+0x17a/0x3a0 [osc]
 ? __schedule+0x3b8/0x9c0
 ldlm_handle_bl_callback+0xcf/0x410 [ptlrpc]
 ldlm_bl_thread_main+0x531/0x700 [ptlrpc]
 ? wake_up_state+0x20/0x20
 ? ldlm_handle_bl_callback+0x410/0x410 [ptlrpc]
 kthread+0xcf/0xe0
 ? kthread_create_on_node+0x140/0x140
 ret_from_fork+0x58/0x90
 ? kthread_create_on_node+0x140/0x140

I'm not certain whether there is a particular workload that triggers this. We've been running concurrent mdtest and ior jobs, using remote directories but not striped directories.

The frequency is high; running on 300 clients for about 2 hours triggered this bug on roughly a third of the nodes.



 Comments   
Comment by Andreas Dilger [ 14/Dec/16 ]

The "sleeping function called from invalid context" warning is just fallout from trying to dump the kernel logs while a spinlock is held.
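
For anyone puzzled by the second splat: with CONFIG_DEBUG_ATOMIC_SLEEP the kernel warns whenever a function that may sleep (here the GFP_KERNEL allocation inside call_usermodehelper_setup(), reached from the LBUG upcall) runs in atomic context. A minimal, illustrative reproducer of that class of warning (hypothetical demo code, not from Lustre):

/* Illustrative only: allocating with GFP_KERNEL (which may sleep) while
 * holding a spinlock puts us in atomic context, so with
 * CONFIG_DEBUG_ATOMIC_SLEEP the kernel prints
 * "BUG: sleeping function called from invalid context".
 */
#include <linux/slab.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(demo_lock);

static void demo_sleep_in_atomic(void)
{
	void *buf;

	spin_lock(&demo_lock);           /* enter atomic context */
	buf = kmalloc(128, GFP_KERNEL);  /* may sleep -> triggers the warning */
	kfree(buf);
	spin_unlock(&demo_lock);
}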

Jinshan, can you please look at the initial problem?

Comment by Jinshan Xiong (Inactive) [ 14/Dec/16 ]

It seems the problematic code is located here:

ll_releasepage() {
        ...
        cookie = cl_env_reenter();
        env = cl_env_percpu_get();
        LASSERT(!IS_ERR(env));
        ...
}

Here it tries to replace the cl_env of the current process with the percpu env. Somehow the old env was not removed, so reattaching the env hit this assertion.

Did you manage to dump the log? If so, I can provide a debug patch to help understand the problem.

Also, please show me the source code at ll_invalidatepage+0x41 so I can make sure I'm looking at the right code.
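
To make the failure mode concrete, here is a heavily simplified userspace model of the bookkeeping involved (illustrative only; the real code lives in lustre/obdclass/cl_object.c and is considerably more elaborate):

/* Heavily simplified model; NOT the real Lustre code.  Each thread may have
 * at most one cl_env attached.  cl_env_percpu_get() attaches a per-CPU env to
 * the calling thread; if an env is still attached because nothing detached it
 * first (e.g. via cl_env_reenter()), the attach returns non-zero and the
 * LASSERT(rc == 0) in cl_env_attach() fires.
 */
#include <assert.h>
#include <stddef.h>

struct cl_env { int unused; };

static __thread struct cl_env *thread_env;       /* per-thread "attached" slot */

static int model_cl_env_attach(struct cl_env *env)
{
	if (thread_env != NULL)
		return -1;                       /* already attached: rc != 0 */
	thread_env = env;
	return 0;
}

static struct cl_env *model_cl_env_percpu_get(struct cl_env *percpu_env)
{
	int rc = model_cl_env_attach(percpu_env);

	assert(rc == 0);                         /* mirrors LASSERT(rc == 0) */
	return percpu_env;
}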

Comment by Ned Bass [ 14/Dec/16 ]

Unfortunately, we don't have kdump working on this cluster, which has made it hard to debug this problem.

(gdb) l *(ll_invalidatepage+0x41)
0x53121 is in ll_invalidatepage (/usr/src/debug/lustre-2.8.0_5.chaos/lustre/llite/rw26.c:106).
101     #else
102             if (offset == 0) {
103     #endif
104                     /* See the comment in ll_releasepage() */
105                     env = cl_env_percpu_get();
106                     LASSERT(!IS_ERR(env));
107
108                     inode = vmpage->mapping->host;
109                     obj = ll_i2info(inode)->lli_clob;
110                     if (obj != NULL) {
(gdb) 
Comment by Jinshan Xiong (Inactive) [ 14/Dec/16 ]

Ned, can you please point me to the code repository you're using?

Comment by Olaf Faaland [ 14/Dec/16 ]

Jinshan,
See the lustre-release-fe-llnl repo on your Gerrit server.

Comment by Jinshan Xiong (Inactive) [ 14/Dec/16 ]

This is a problem introduced by LU-8509. We can fix it by reverting part of the patch.

Comment by Gerrit Updater [ 14/Dec/16 ]

Jinshan Xiong (jinshan.xiong@intel.com) uploaded a new patch: https://review.whamcloud.com/24351
Subject: LU-8936 llite: use percpu env correctly in ll_invalidatepage
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 607744846d42d4f1199aa5172d7bd4ee3f8f044f

Comment by Jinshan Xiong (Inactive) [ 20/Dec/16 ]

In LU-4257, I made a patch (commit 4533271) to get rid of the cl_env hash table on the client side; since then, cl_envs are no longer attached to processes. That made it possible to use cl_env_percpu_get() in ll_invalidatepage(). However, the LLNL FE branch doesn't include that fix from LU-4257 but does include LU-8509, which causes the problem in this ticket because an env is already attached to the process when it calls ll_invalidatepage().

To make it work, we can either land the corresponding patch from LU-4257 on the LLNL FE branch, or revert part of the change introduced by LU-8509. For simplicity, I chose to revert it.
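
For context, the revert essentially means going back to a refcounted per-thread env in ll_invalidatepage() instead of attaching the per-CPU one. A rough sketch of that pattern (not the literal content of patch 24351; cl_env_get()/cl_env_put() are the usual refcounted env accessors in this code base, and the function name below is made up):

/* Rough sketch only, NOT the literal patch.  Taking a refcounted env avoids
 * the attach step entirely, so it cannot collide with an env that is already
 * attached to the thread.
 */
static void invalidatepage_env_sketch(struct page *vmpage)
{
	struct lu_env *env;
	int refcheck;

	env = cl_env_get(&refcheck);
	if (IS_ERR(env))
		return;

	/* ... look up the cl_object/cl_page behind vmpage and discard it ... */

	cl_env_put(env, &refcheck);
}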

Please push patch 24351 to the LLNL FE branch.

Comment by Olaf Faaland [ 20/Dec/16 ]

Jinshan,
Thanks for the explanation. We landed patch 24351 in our patch stack and the high-frequency LBUGs have stopped.

Is there any reason for us not to just revert the change from LU-8509 in its entirety? It looks to me like a few bits were not reverted by your patch, but they have no effect: things like initializing a variable that gets set before it is read anyway.

I looked at LU-4257. For the benefit of anyone else watching this ticket, LU-4257 has 5 changes that collectively alter several hundred lines of code but produce very large performance improvements, particularly for small-file I/O.

Comment by Jinshan Xiong (Inactive) [ 20/Dec/16 ]

The rest of LU-8509 fixed an actual problem:

@@ -721,6 +721,10 @@ out:
                        unlock_page(vmpage);
                        put_page(vmpage);
                }
+               if (!IS_ERR_OR_NULL(page)) {
+                       lu_ref_del(&page->cp_reference, "cl_io", io);
+                       cl_page_put(env, page);
+               }

Here a cl_page was leaked in some scenarios.
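
To see the shape of the leak, it helps to view the hunk as restoring a balanced get/put pair on the common exit path. A condensed, illustrative sketch (the surrounding function is elided; sketch_find_page() is a hypothetical stand-in for the real cl_page lookup):

/* Condensed sketch only; identifiers follow the hunk above and the real
 * function body is elided.  Two points: every exit path must drop the lu_ref
 * and the reference taken when the cl_page was obtained, and 'page' must
 * start out NULL so the cleanup is safe on paths that never obtained a page.
 */

/* hypothetical helper standing in for the real cl_page lookup */
struct cl_page *sketch_find_page(const struct lu_env *env,
				 struct cl_object *obj, struct page *vmpage);

static void release_path_sketch(const struct lu_env *env, struct cl_io *io,
				struct cl_object *obj, struct page *vmpage)
{
	struct cl_page *page = NULL;	/* matters when the lookup never runs */

	page = sketch_find_page(env, obj, vmpage);	/* may return NULL/ERR_PTR */
	if (IS_ERR_OR_NULL(page))
		goto out;

	lu_ref_add(&page->cp_reference, "cl_io", io);

	/* ... the interesting work; error paths also jump to 'out' ... */

out:
	if (vmpage != NULL) {
		unlock_page(vmpage);
		put_page(vmpage);
	}
	if (!IS_ERR_OR_NULL(page)) {
		lu_ref_del(&page->cp_reference, "cl_io", io);
		cl_page_put(env, page);		/* balances the lookup's reference */
	}
}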

Yes, the major benefit of LU-4257 is the support for fast read, which can significantly improve small-I/O read performance.

Comment by Olaf Faaland [ 20/Dec/16 ]

Jinshan,

I saw that change, but it looks to me like the only way to reach out: with page non-NULL is if page has already been tested with IS_ERR(). But even if I'm correct, a future change could alter that, so I guess the code makes sense either way.

And I now see that the initialization of page does matter in some cases, such as when ll_cl_find() returns NULL.

So OK, this seems right. Thank you.

Comment by Olaf Faaland [ 20/Dec/16 ]

Jinshan,

If we chose instead to pull in LU-4257 and keep LU-8509, are there any other changes you are aware of that we would need?

Comment by Christopher Morrone [ 20/Dec/16 ]

Pulling in something like LU-4257 is not the sort of thing we should generally be doing on our stable branch.

Comment by Jinshan Xiong (Inactive) [ 20/Dec/16 ]

That's true. This is why I proposed to revert part of LU-8509 instead of porting the rest of LU-4257.

Comment by Olaf Faaland [ 21/Dec/16 ]

OK.
Thanks for updating the commit message on your patch. Shouldn't it be pushed to Gerrit against b2_8_fe in the fe project instead of master?

Comment by Peter Jones [ 21/Dec/16 ]

Talking to Jinshan, it sounds like LU-4257 has landed on 2.8 FE, so this fix would not be needed once you rebase onto the latest 2.8 FE branch.

Comment by Olaf Faaland [ 21/Dec/16 ]

Peter,

Maybe I'm missing something, but here's what I see.

LU-4257 has several associated changes. The relevant one that is needed for LU-8509, I think, is:

4533271 LU-4257 obdclass: Get rid of cl_env hash table

I don't see that commit in 2.8 FE:

[faaland1@hefe branch:2.9.0) lustre] $git lg wc28fe/b2_8_fe | grep LU-4257
* 541afe2 LU-4257 llite: fix up iov_iter implementation
* 96f482b LU-4257 clio: replace semaphore with mutex
Comment by Jinshan Xiong (Inactive) [ 21/Dec/16 ]

But 2.8 FE doesn't have LU-8509 either.

You need either both LU-8509 and commit 4533271, or neither of them.

Comment by Jinshan Xiong (Inactive) [ 21/Dec/16 ]

I just remembered that 2.8 FE doesn't need this patch, but obviously I gave the wrong reason; sorry for the confusion.

Comment by Jinshan Xiong (Inactive) [ 21/Dec/16 ]

That being said, we should land part of LU-8509 to b2_8_fe to fix the bug: https://review.whamcloud.com/24484

Comment by Olaf Faaland [ 21/Dec/16 ]

That being said, we should land part of LU-8509 to b2_8_fe to fix the bug: https://review.whamcloud.com/24484

Agreed.
