Details
- Bug
- Resolution: Unresolved
- Critical
- None
- Lustre 2.13.0
- None
- 3
- 9223372036854775807
Description
It looks like patch aa82cc8361 ("obdclass: put all service's env on the list") introduced one more regression.
The sanity test sometimes hits a panic.
[84431.019155] Lustre: DEBUG MARKER: == sanity test 134a: Server reclaims locks when reaching lock_reclaim_threshold ====================== 12:42:06 (1564134126)
[84431.770714] Lustre: *** cfs_fail_loc=327, val=0***
[84431.787957] LustreError: 10728:0:(ofd_internal.h:410:ofd_info()) ASSERTION( info ) failed:
[84431.788969] LustreError: 10728:0:(ofd_internal.h:410:ofd_info()) LBUG
[84431.790039] Pid: 10728, comm: mdt00_005 3.10.0-neo7.4.x86_64 #0 SMP Thu Nov 15 06:30:59 EST 2018
[84431.791221] Call Trace:
[84431.792331] [<ffffffff810434f2>] save_stack_trace_tsk+0x22/0x40
[84431.793382] [<ffffffffc06c47ec>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[84431.794333] [<ffffffffc06c489c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[84431.795260] [<ffffffffc1394e71>] ofd_exit+0x0/0x18f [ofd]
[84431.796169] [<ffffffffc1393a3b>] ofd_lvbo_update+0xd5b/0xe60 [ofd]
[84431.797248] [<ffffffffc0b1d6b5>] ldlm_handle_ast_error+0x475/0x860 [ptlrpc]
[84431.798655] [<ffffffffc0b1f32a>] ldlm_cb_interpret+0x19a/0x750 [ptlrpc]
[84431.800193] [<ffffffffc0b3a954>] ptlrpc_check_set.part.22+0x494/0x1e90 [ptlrpc]
[84431.801494] [<ffffffffc0b3c3ab>] ptlrpc_check_set+0x5b/0xe0 [ptlrpc]
[84431.802831] [<ffffffffc0b3c774>] ptlrpc_set_wait+0x344/0x7c0 [ptlrpc]
[84431.803902] [<ffffffffc0af8475>] ldlm_run_ast_work+0xd5/0x3a0 [ptlrpc]
[84431.805074] [<ffffffffc0b31235>] ldlm_reclaim_full+0x425/0x7a0 [ptlrpc]
[84431.806060] [<ffffffffc0b2243b>] ldlm_handle_enqueue0+0x13b/0x1650 [ptlrpc]
[84431.807019] [<ffffffffc0bae3a2>] tgt_enqueue+0x62/0x210 [ptlrpc]
[84431.807903] [<ffffffffc0bb70a8>] tgt_request_handle+0x998/0x1610 [ptlrpc]
[84431.808731] [<ffffffffc0b56966>] ptlrpc_server_handle_request+0x266/0xb30 [ptlrpc]
[84431.809561] [<ffffffffc0b5afc0>] ptlrpc_main+0xd20/0x1cf0 [ptlrpc]
[84431.810355] [<ffffffff810ce2df>] kthread+0xef/0x100
[84431.811077] [<ffffffff8178bedd>] ret_from_fork+0x5d/0xb0
[84431.811755] [<ffffffffffffffff>] 0xffffffffffffffff
Issue Links
- is related to: LU-12570 sanity test 134a crash with SSK in use (Resolved)
Based on discussion with Oleg, it looks like there are two ways to fix this.
1) Fast: just limit the flush to the same ldlm namespace type that the request originated from.
2) Long, but better.
The current patch needs to interact with LRU resize so that the hard limit cannot be set above the SLV; otherwise a stupid situation arises.
This tunable can be set lower than the LRU resize limit, and in that case it blocks LRU resize from working at all.
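
To illustrate option 1, here is a minimal sketch in plain C of what "limit the flush to the same namespace type" could look like. It is not Lustre code: `ns_model`, `ns_type` and `reclaim_same_type()` are hypothetical stand-ins invented for this example, and the real reclaim path (`ldlm_reclaim_full()` in the trace above) works on actual ldlm namespaces and locks. The idea is only that a thread serving an MDT request (here `mdt00_005`) would skip namespaces of another type, so it would never end up in callbacks such as `ofd_lvbo_update()` that appear to expect a different thread environment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not the real Lustre types. */
enum ns_type { NS_MDT, NS_OST };

struct ns_model {
    enum ns_type type;   /* which kind of target owns this namespace */
    int granted_locks;   /* locks currently granted in this namespace */
};

/*
 * Reclaim up to `want` locks, visiting only namespaces whose type matches
 * the type the request originated from. Returns the number reclaimed.
 */
static int reclaim_same_type(struct ns_model *list, size_t n,
                             enum ns_type origin, int want)
{
    int reclaimed = 0;

    assert(list != NULL || n == 0);

    for (size_t i = 0; i < n && reclaimed < want; i++) {
        int take;

        /* Skip foreign namespaces: their callbacks may need another env. */
        if (list[i].type != origin)
            continue;

        take = list[i].granted_locks;
        if (take > want - reclaimed)
            take = want - reclaimed;

        list[i].granted_locks -= take;
        reclaimed += take;
    }

    return reclaimed;
}
```

With this filter a flush triggered from an MDT request leaves OST namespaces untouched, which is why it is the fast fix; it does not address the interaction with LRU resize described under option 2.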