[LU-4327] tgt_ses_info()) ASSERTION( env->le_ses != ((void *)0) ) failed Created: 29/Nov/13 Updated: 06/Jan/14 Resolved: 26/Dec/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.6.0 |
| Fix Version/s: | Lustre 2.6.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Oleg Drokin | Assignee: | Mikhail Pershin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 11829 |
| Description |
|
I hit this soon after landing unified target support, running racer.
<0>[46139.810353] LustreError: 31320:0:(lu_target.h:129:tgt_ses_info()) ASSERTION( env->le_ses != ((void *)0) ) failed:
<0>[46139.811605] LustreError: 31320:0:(lu_target.h:129:tgt_ses_info()) LBUG
<0>[46139.812210] Kernel panic - not syncing: LBUG in interrupt.
<0>[46139.812211]
<4>[46139.813195] Pid: 31320, comm: ll_ost00_007 Not tainted 2.6.32-rhe6.4-debug 2 #1
<4>[46139.815359] Call Trace:
<4>[46139.815359] [<ffffffff814fade7>] ? panic+0xa7/0x16f
<4>[46139.815359] [<ffffffffa0abbeed>] ? lbug_with_loc+0x8d/0xb0 [libcfs]
<4>[46139.815359] [<ffffffffa14367d4>] ? tgt_punch_hpreq_lock_match+0x104/0x110 [ptlrpc]
<4>[46139.815359] [<ffffffffa13c03f8>] ? ldlm_server_blocking_ast+0x1e8/0x880 [ptlrpc]
<4>[46139.815359] [<ffffffffa1434f5b>] ? tgt_blocking_ast+0x7b/0x5e0 [ptlrpc]
<4>[46139.815359] [<ffffffffa0ac7685>] ? libcfs_nid2str+0x155/0x160 [libcfs]
<4>[46139.815359] [<ffffffffa1393e8d>] ? ldlm_work_bl_ast_lock+0xdd/0x290 [ptlrpc]
<4>[46139.815359] [<ffffffffa13d453f>] ? ptlrpc_set_wait+0x6f/0x830 [ptlrpc]
<4>[46139.815359] [<ffffffffa13d0ea8>] ? ptlrpc_prep_set+0x38/0x300 [ptlrpc]
<4>[46139.815359] [<ffffffff81094e64>] ? __init_waitqueue_head+0x24/0x40
<4>[46139.815359] [<ffffffffa13d0f8f>] ? ptlrpc_prep_set+0x11f/0x300 [ptlrpc]
<4>[46139.815359] [<ffffffffa1393db0>] ? ldlm_work_bl_ast_lock+0x0/0x290 [ptlrpc]
<4>[46139.815359] [<ffffffffa1396e3b>] ? ldlm_run_ast_work+0x1bb/0x440 [ptlrpc]
<4>[46139.815359] [<ffffffffa13ada6f>] ? ldlm_process_extent_lock+0x1af/0xaa0 [ptlrpc]
<4>[46139.815359] [<ffffffffa13963cc>] ? ldlm_lock_enqueue+0x38c/0x860 [ptlrpc]
<4>[46139.815359] [<ffffffffa13bef1f>] ? ldlm_handle_enqueue0+0x4ef/0x10b0 [ptlrpc]
<4>[46139.815359] [<ffffffffa1438012>] ? tgt_enqueue+0x62/0x1d0 [ptlrpc]
<4>[46139.815359] [<ffffffffa143c2c4>] ? tgt_request_handle+0x224/0x9f0 [ptlrpc]
<4>[46139.815359] [<ffffffffa13efdd3>] ? ptlrpc_main+0xcd3/0x1940 [ptlrpc]
<4>[46139.815359] [<ffffffffa13ef100>] ? ptlrpc_main+0x0/0x1940 [ptlrpc]
<4>[46139.815359] [<ffffffff81094726>] ? kthread+0x96/0xa0
<4>[46139.815359] [<ffffffff8100c10a>] ? child_rip+0xa/0x20
<4>[46139.815359] [<ffffffff81094690>] ? kthread+0x0/0xa0
<4>[46139.815359] [<ffffffff8100c100>] ? child_rip+0x0/0x20
Code branch in my tree: master-20131128
This might be slightly related to LU-2246 too, which failed with the same assertion, though via a totally different path. |
| Comments |
| Comment by Mikhail Pershin [ 30/Nov/13 ] |
|
That doesn't look like the master branch: there is no tgt_punch_hpreq_lock_match() in master now. It comes from a patch that has not landed yet: http://review.whamcloud.com/#/c/7383/26 |
| Comment by Mikhail Pershin [ 01/Dec/13 ] |
|
Well, hpreq_lock_match() may be called for requests in the exp_hp_rpcs list, but a request is put on that list early, before processing starts, so it might have no thread and no corresponding lu_env yet. The patch was refreshed so that hpreq_lock_match() does not use the thread environment but takes everything from the request itself. |
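To illustrate the point above, here is a minimal sketch in plain C, not the actual Lustre code. All struct and function names (sketch_request, env_available(), lock_match_from_request()) are hypothetical stand-ins, and the extent-overlap test is only an assumed example of what a lock-match callback might check. It shows why a request that is queued but not yet picked up by a service thread has no usable environment, and how a match routine that reads only from the request avoids depending on one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT the real Lustre structures. */
struct sketch_env {
    void *le_ses;                     /* session context, set up per service thread */
};

struct sketch_thread {
    struct sketch_env *t_env;         /* environment owned by the thread */
};

struct sketch_request {
    struct sketch_thread *rq_thread;  /* NULL while the request is only queued */
    uint64_t rq_start;                /* extent the request touches (assumed) */
    uint64_t rq_end;
};

struct sketch_lock {
    uint64_t l_start;
    uint64_t l_end;
};

/*
 * Fragile precondition: a match routine that reaches through the thread
 * environment needs all of these pointers to be set. For a request that
 * was queued early and has no thread yet, this returns false; the real
 * code hit an assertion at the equivalent point.
 */
static bool env_available(const struct sketch_request *req)
{
    return req->rq_thread != NULL &&
           req->rq_thread->t_env != NULL &&
           req->rq_thread->t_env->le_ses != NULL;
}

/*
 * Robust variant: everything needed for the match is read from the
 * request itself, so it works whether or not a thread is attached.
 * The overlap test is an illustrative assumption.
 */
static bool lock_match_from_request(const struct sketch_request *req,
                                    const struct sketch_lock *lock)
{
    assert(req != NULL && lock != NULL);
    return req->rq_start <= lock->l_end && req->rq_end >= lock->l_start;
}
```

With a queued request (rq_thread == NULL), env_available() reports that no environment exists, while lock_match_from_request() still gives a usable answer.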
| Comment by Oleg Drokin [ 26/Dec/13 ] |
|
The problem was in a patch that had not landed yet; that patch has since been changed to fix this. |
| Comment by nasf (Inactive) [ 06/Jan/14 ] |
|
Another failure instance: https://maloo.whamcloud.com/test_sets/fe5a3d08-76ab-11e3-9ce8-52540035b04c |