[LU-2275] Open request leak Created: 04/Nov/12 Updated: 18/Feb/13 Resolved: 18/Feb/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.4.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Oleg Drokin | Assignee: | Oleg Drokin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | LB |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 5435 |
| Description |
|
It appears that we have some sort of mdc request leak on the client. The easiest way I have found to trigger it is to run racer in a loop; eventually cleanup fails because the mdc is still referenced. I finally traced it to open requests (I believe they are open requests because rq_replay_cb is set to mdc_replay_open) coming from rm, so I guess those are actually opendir requests.

(gdb) p *(struct ptlrpc_request *)0xffff880099b38bf0
$1 = {rq_type = 4711, rq_status = 301, rq_list = {next = 0xffff880099b38bf8,
prev = 0xffff880099b38bf8}, rq_timed_list = {next = 0xffff880099b38c08,
prev = 0xffff880099b38c08}, rq_history_list = {next = 0xffff880099b38c18,
prev = 0xffff880099b38c18}, rq_exp_list = {next = 0xffff880099b38c28,
prev = 0xffff880099b38c28}, rq_ops = 0x0, rq_svc_thread = 0x0,
rq_history_seq = 0, rq_at_index = 0, rq_lock = {raw_lock = {slock = 0},
magic = 3735899821, owner_cpu = 4294967295, owner = 0xffffffffffffffff},
rq_intr = 0, rq_replied = 1, rq_err = 0, rq_timedout = 0, rq_resend = 0,
rq_restart = 0, rq_replay = 1, rq_no_resend = 0, rq_waiting = 0,
rq_receiving_reply = 0, rq_no_delay = 0, rq_net_err = 0, rq_wait_ctx = 0,
rq_early = 0, rq_must_unlink = 0, rq_fake = 0, rq_memalloc = 0,
rq_packed_final = 0, rq_hp = 0, rq_at_linked = 0, rq_reply_truncate = 0,
rq_committed = 0, rq_invalid_rqset = 0, rq_generation_set = 0,
rq_no_retry_einprogress = 0, rq_nr_resend = 0, rq_phase = 3955285508,
rq_next_phase = 3955285510, rq_refcount = {counter = 1},
rq_request_portal = 12, rq_reply_portal = 10, rq_nob_received = 544,
rq_reqlen = 576, rq_replen = 544, rq_reqmsg = 0xffff8800dd315bf0,
rq_repmsg = 0xffff88009a7d58b0, rq_transno = 0, rq_xid = 1417687094104231,
rq_replay_list = {next = 0xffff880099b38cb8, prev = 0xffff880099b38cb8},
rq_cli_ctx = 0xffffffffa12b5fc0, rq_svc_ctx = 0x0, rq_ctx_chain = {
next = 0xffff880099b38cd8, prev = 0xffff880099b38cd8}, rq_flvr = {
sf_rpc = 0, sf_flags = 0, u_rpc = {<No data fields>}, u_bulk = {hash = {
hash_alg = 0 '\000'}}}, rq_sp_from = LUSTRE_SP_CLI, rq_ctx_init = 0,
rq_ctx_fini = 0, rq_bulk_read = 0, rq_bulk_write = 0, rq_auth_gss = 0,
rq_auth_remote = 0, rq_auth_usr_root = 0, rq_auth_usr_mdt = 0,
rq_auth_usr_ost = 0, rq_pack_udesc = 0, rq_pack_bulk = 0, rq_no_reply = 0,
rq_pill_init = 1, rq_auth_uid = 0, rq_auth_mapped_uid = 0,
rq_user_desc = 0x0, rq_reqbuf = 0xffff8800dd315bf0,
rq_repbuf = 0xffff88009a7d57f0 "", rq_repdata = 0xffff88009a7d58b0,
rq_clrbuf = 0x0, rq_reqbuf_len = 1024, rq_reqdata_len = 576,
rq_repbuf_len = 2048, rq_repdata_len = 544, rq_clrbuf_len = 0,
rq_clrdata_len = 0, rq_reply_off = 192, rq_req_swab_mask = 0,
rq_rep_swab_mask = 0, rq_import_generation = 1,
rq_send_state = LUSTRE_IMP_FULL, rq_early_count = 0, rq_req_md_h = {
cookie = 91348489}, rq_req_cbid = {
cbid_fn = 0xffffffffa11f07f0 <request_out_callback>,
cbid_arg = 0xffff880099b38bf0}, rq_delay_limit = 0,
rq_queued_time = 4296236347, rq_arrival_time = {tv_sec = 1352013601,
tv_usec = 266507}, rq_reply_state = 0x0, rq_rqbd = 0x0, rq_reply_md_h = {
cookie = 91348481}, rq_reply_waitq = {lock = {raw_lock = {slock = 0},
magic = 3735899821, owner_cpu = 4294967295, owner = 0xffffffffffffffff},
task_list = {next = 0xffff880099b38dc8, prev = 0xffff880099b38dc8}},
rq_reply_cbid = {cbid_fn = 0xffffffffa11f1390 <reply_in_callback>,
cbid_arg = 0xffff880099b38bf0}, rq_self = 0, rq_peer = {nid = 0, pid = 0},
rq_export = 0x0, rq_import = 0xffff8801e78dd7f0,
rq_replay_cb = 0xffffffffa13cfc20 <mdc_replay_open>, rq_commit_cb = 0,
rq_cb_data = 0x0, rq_bulk = 0x0, rq_sent = 1352013601,
rq_real_sent = 1352013601, rq_deadline = 1352013645, rq_reply_deadline = 0,
rq_bulk_deadline = 0, rq_timeout = 43, rq_set_waitq = {lock = {raw_lock = {
slock = 0}, magic = 3735899821, owner_cpu = 4294967295,
owner = 0xffffffffffffffff}, task_list = {next = 0xffff880099b38e78,
prev = 0xffff880099b38e78}}, rq_set_chain = {next = 0xffff880099b38e88,
prev = 0xffff880099b38e88}, rq_set = 0x0, rq_interpret_reply = 0,
rq_async_args = {pointer_arg = {0x0 <repeats 11 times>}, space = {0, 0, 0,
0, 0, 0, 0}}, rq_pool = 0x0, rq_session = {lc_tags = 0, lc_state = 0,
lc_thread = 0x0, lc_value = 0x0, lc_remember = {next = 0x0, prev = 0x0},
lc_version = 0, lc_cookie = 0}, rq_recov_session = {lc_tags = 0,
lc_state = 0, lc_thread = 0x0, lc_value = 0x0, lc_remember = {next = 0x0,
prev = 0x0}, lc_version = 0, lc_cookie = 0}, rq_pill = {
rc_req = 0xffff880099b38bf0, rc_fmt = 0xffffffffa128e5e0,
rc_loc = RCL_CLIENT, rc_area = {{184, 104, 8, 136, 0, 0, 2, 72,
4294967295}, {184, 112, 216, 56, 260, 120, 120, 4294967295,
4294967295}}}}
This is already post-cleanup:

[root@rhel6 tests]# ../utils/lctl dl
 17 ST mdc lustre-MDT0000-mdc-ffff8801fbd8abf0 0e1d968f-d661-3c7e-0982-22c28b7e3db2 2
 24 ST mdc lustre-MDT0000-mdc-ffff8802349e9bf0 d165ded5-d083-219f-81f9-a68115593bdc 2
[root@rhel6 tests]# ls /proc/fs/lustre/mdc/
num_refs
[root@rhel6 tests]# |
| Comments |
| Comment by Oleg Drokin [ 04/Nov/12 ] |
|
The leaked requests were allocated on the intent_file_open path:

[<ffffffffa11dc0e7>] ptlrpc_request_alloc_internal+0xd7/0x330 [ptlrpc]
[<ffffffffa11dc363>] ptlrpc_request_alloc+0x13/0x20 [ptlrpc]
[<ffffffffa13e34ba>] mdc_enqueue+0x2fa/0x15b0 [mdc]
[<ffffffffa13e4993>] mdc_intent_lock+0x223/0x5a8 [mdc]
[<ffffffffa139f8f0>] lmv_intent_open+0x2d0/0x10d0 [lmv]
[<ffffffffa13a09a3>] lmv_intent_lock+0x2b3/0x380 [lmv]
[<ffffffffa0858fa6>] ll_intent_file_open+0x196/0x8d0 [lustre]
[<ffffffffa085a140>] ll_file_open+0x230/0xc50 [lustre]
[<ffffffffa08421db>] ll_dir_open+0xdb/0xf0 [lustre]
[<ffffffff8117ae9a>] __dentry_open+0x10a/0x3e0
[<ffffffff8117b284>] nameidata_to_filp+0x54/0x70
[<ffffffff8118e5b0>] do_filp_open+0x6c0/0xd90
[<ffffffff8117ac49>] do_sys_open+0x69/0x140
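
For context (my annotation, not part of the original comment): each request allocated at the top of this trace is supposed to be balanced by a ptlrpc_req_finished() once the open intent is done with it. The fragment below is only a hedged sketch of that pairing; RQF_LDLM_INTENT_OPEN, class_exp2cliimp() and the simplified error handling are my recollection of the surrounding mdc code, not an excerpt from it.

/* Hedged illustration of the alloc/finish pairing that the leak breaks. */
static int open_request_lifetime_sketch(struct obd_export *exp)
{
        struct ptlrpc_request *req;

        /* corresponds to the ptlrpc_request_alloc() frame in the trace above */
        req = ptlrpc_request_alloc(class_exp2cliimp(exp), &RQF_LDLM_INTENT_OPEN);
        if (req == NULL)
                return -ENOMEM;

        /* ...pack the open intent, send it, process the reply; a successful
         * open additionally pins the request via rq_replay/mdc_replay_open
         * until the file is closed or the open transaction commits... */

        /* every allocation must eventually be balanced by this call; the
         * leaked requests in this ticket never reach it */
        ptlrpc_req_finished(req);
        return 0;
}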
|
| Comment by Oleg Drokin [ 04/Nov/12 ] |
|
The problem seems to stem from the fact that the server returns a reply that is both lookup-positive and lookup-negative. The enqueue is also aborted, yet an open lock is returned, which is doubly contradictory. The client is not prepared for such a turn of events and leaks the request, since it does not expect to hold an open reference when the lookup was negative.

00000002:00100000:3.0:1352078051.109700:0:2913:0:(mdc_locks.c:523:mdc_finish_enqueue()) @@@ op: 1 disposition: 200002f, status: 0 req@ffff8801688bebf0 x1417754181011236/t0(0) o101->lustre-MDT0000-mdc-ffff8801d5fa9bf0@0@lo:12/10 lens 576/544 e 0 to 0 dl 1352078082 ref 1 fl Complete:RP/0/0 rc 301/301

Why this is happening I have no idea yet, but I am still digging.
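
As a side note (my addition): the disposition 200002f from that log line can be decoded with the DISP_* bits. The values in this small standalone decoder are reproduced from memory of lustre/include/lustre/lustre_idl.h and should be verified against the tree in use; it is a sketch, not an authoritative reference.

#include <stdio.h>

/* DISP_* bits as I recall them from lustre_idl.h; verify before relying on them. */
#define DISP_IT_EXECD      0x00000001
#define DISP_LOOKUP_EXECD  0x00000002
#define DISP_LOOKUP_NEG    0x00000004
#define DISP_LOOKUP_POS    0x00000008
#define DISP_OPEN_CREATE   0x00000010
#define DISP_OPEN_OPEN     0x00000020
#define DISP_OPEN_LOCK     0x02000000

int main(void)
{
        unsigned int disp = 0x200002f;  /* from the mdc_finish_enqueue() line above */

        printf("lookup executed: %d\n", !!(disp & DISP_LOOKUP_EXECD));
        printf("lookup negative: %d\n", !!(disp & DISP_LOOKUP_NEG));
        printf("lookup positive: %d\n", !!(disp & DISP_LOOKUP_POS));
        printf("open performed:  %d\n", !!(disp & DISP_OPEN_OPEN));
        printf("open lock given: %d\n", !!(disp & DISP_OPEN_LOCK));

        /* With these values, 0x200002f reports the lookup as both negative and
         * positive while an open and an open lock were granted, even though the
         * enqueue itself was aborted (rc 301, ELDLM_LOCK_ABORTED) -- exactly the
         * contradiction described above. */
        return 0;
}
|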
| Comment by Oleg Drokin [ 05/Nov/12 ] |
|
I think I got to the root of it. A number of races in mdt_finish_open and mdt_open_by_fid_lock can lead to a situation where DISP_LOOKUP_POS and DISP_OPEN_OPEN are set, but later code decides that it lost some sort of race and still returns a negative lookup result. In the end the client receives the above-mentioned reply and leaks the request.

Patch to fix the leak on the client when a nonsensical reply is encountered: http://review.whamcloud.com/4458
Patch to try to eliminate the races that lead to sending such nonsensical replies: http://review.whamcloud.com/4459
Misc patch to clean up the error exit path in class_newdev() that I noticed earlier while reviewing the mdc refcount code: http://review.whamcloud.com/4460
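
In spirit, the client-side fix adds a guard against such replies in the mdc enqueue-finishing path. The sketch below is only my illustration of that idea (it assumes the Lustre client headers for struct ptlrpc_request and ptlrpc_req_finished(), plus the DISP_* values noted earlier); it is not the patch that landed from review 4458.

/* Hypothetical sketch, not the landed patch: if the reply claims the lookup
 * was negative, the request must not stay pinned for open replay, otherwise
 * the reference held for mdc_replay_open() is never dropped and the request
 * (and, through rq_import, the mdc device) leaks. */
static void mdc_drop_nonsense_open_ref(struct ptlrpc_request *req, __u32 disp)
{
        if (!(disp & DISP_LOOKUP_NEG))
                return;

        if (disp & (DISP_OPEN_OPEN | DISP_OPEN_LOCK)) {
                /* contradictory reply: stop treating this as a replayable open
                 * (the real code would take req->rq_lock around this) and drop
                 * the reference so the request can actually be freed */
                req->rq_replay = 0;
                ptlrpc_req_finished(req);
        }
}
|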
| Comment by Peter Jones [ 18/Feb/13 ] |
|
All three patches have landed for 2.4 |