[LU-1972] Test failure on test suite replay-single, subtest test_53a Created: 18/Sep/12 Updated: 29/May/17 Resolved: 29/May/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 4218 |
| Description |
|
This issue was created by maloo for Li Wei <liwei@whamcloud.com>. This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/0b17713c-015a-11e2-bc4e-52540035b04c. The sub-test test_53a failed with the following error:
Info required for matching: replay-single 53a |
| Comments |
| Comment by Li Wei (Inactive) [ 18/Sep/12 ] |
|
It seems OBD_FAIL_MDS_CLOSE_NET is defined but never used anywhere in the code? |
| Comment by Ian Colle (Inactive) [ 28/Sep/12 ] |
|
https://maloo.whamcloud.com/test_sets/b5b17ebc-0939-11e2-a95c-52540035b04c |
| Comment by Andreas Dilger [ 12/Oct/12 ] |
|
Li Wei, it seems that OBD_FAIL_MDS_CLOSE_NET is in fact used, but the reference to it is built in an extremely confusing manner, by token pasting inside a macro:
#define DEF_HNDL(prefix, base, suffix, flags, opc, fn, fmt) \
[prefix ## _ ## opc - prefix ## _ ## base] = { \
.mh_name = #opc, \
.mh_fail_id = OBD_FAIL_ ## prefix ## _ ## opc ## suffix, \
.mh_opc = prefix ## _ ## opc, \
.mh_flags = flags, \
.mh_act = fn, \
.mh_fmt = fmt \
}
#define DEF_MDT_HNDL_F(flags, name, fn) \
DEF_HNDL(MDS, GETATTR, _NET, flags, name, fn, &RQF_MDS_ ## name)
static struct mdt_handler mdt_mds_ops[] = {
:
:
DEF_MDT_HNDL_F(HABEO_CORPUS, CLOSE, mdt_close),
static int mdt_req_handle(struct mdt_thread_info *info,
struct mdt_handler *h, struct ptlrpc_request *req)
{
:
:
/*
* Checking for various OBD_FAIL_$PREF_$OPC_NET codes. _Do_ not try
* to put same checks into handlers like mdt_close(), mdt_reint(),
* etc., without talking to mdt authors first. Checking same thing
* there again is useless and returning 0 error without packing reply
* is buggy! Handlers either pack reply or return error.
*
* We return 0 here and do not send any reply in order to emulate
* network failure. Do not send any reply in case any of NET related
* fail_id has occurred.
*/
if (OBD_FAIL_CHECK_ORSET(h->mh_fail_id, OBD_FAIL_ONCE))
RETURN(0);
and the only reason I found it was by coincidence, because mdt_close() is mentioned in the above comment... Searching for OBD_FAIL_MDS_CLOSE_NET, MDS_CLOSE, or RQF_MDS_CLOSE produced nothing. I found the mdt_close() function and had started searching for that in the code when I found the comment. Grrr, this makes the MDS code nearly impossible to follow (as if it wasn't already)...
I've submitted http://review.whamcloud.com/4260 to address the terrible coding style. This may make it easier to understand the code, but does nothing to actually fix the problem reported here. |
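For anyone else trying to follow the macro expansion described above, here is a reduced sketch of the same token-pasting pattern. It is not the Lustre code: the struct, the field names (h_name, h_fail_id, h_opc), the numeric values, and the fail_loc / fail_check_once() / req_handle() helpers are all hypothetical stand-ins for mdt_handler, OBD_FAIL_CHECK_ORSET() and mdt_req_handle(). It only illustrates why a text search for OBD_FAIL_MDS_CLOSE_NET finds no use site, and how a fail-once check can drop a request without a reply.

```c
#include <assert.h>

/* Illustrative values only; not copied from the Lustre headers. */
enum { MDS_GETATTR = 33, MDS_CLOSE = 35 };
enum { OBD_FAIL_MDS_GETATTR_NET = 0x102, OBD_FAIL_MDS_CLOSE_NET = 0x115 };

/* Simplified, hypothetical stand-in for struct mdt_handler. */
struct handler {
    const char *h_name;
    int         h_fail_id;
    int         h_opc;
};

/* Same token-pasting shape as DEF_HNDL, reduced to three fields. */
#define DEF_HNDL(prefix, base, suffix, op)                       \
    [prefix ## _ ## op - prefix ## _ ## base] = {                \
        .h_name    = #op,                                        \
        .h_fail_id = OBD_FAIL_ ## prefix ## _ ## op ## suffix,   \
        .h_opc     = prefix ## _ ## op,                          \
    }
#define DEF_MDT_HNDL(op) DEF_HNDL(MDS, GETATTR, _NET, op)

const struct handler mds_ops[] = {
    DEF_MDT_HNDL(GETATTR),
    /* The preprocessor builds OBD_FAIL_MDS_CLOSE_NET here, so the
     * identifier never appears literally at the use site. */
    DEF_MDT_HNDL(CLOSE),
};

/* Hypothetical: the fail id armed by a test, 0 when nothing is armed. */
int fail_loc;

/* Fire at most once for a matching id, then disarm. */
int fail_check_once(int id)
{
    if (fail_loc == 0 || fail_loc != id)
        return 0;
    fail_loc = 0;
    return 1;
}

/* Returns 1 if a reply would be sent, 0 if the request is dropped
 * silently to emulate a network failure. */
int req_handle(int opc)
{
    assert(opc >= MDS_GETATTR && opc <= MDS_CLOSE);
    const struct handler *h = &mds_ops[opc - MDS_GETATTR];

    if (fail_check_once(h->h_fail_id))
        return 0;
    return 1;
}
```

In this sketch, arming fail_loc with OBD_FAIL_MDS_CLOSE_NET makes the next CLOSE request go unanswered while other opcodes are unaffected, and a second CLOSE is handled normally. Running the file through the preprocessor only (for example cc -E) is one way to see the pasted identifier.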
| Comment by Jian Yu [ 22/Dec/12 ] |
|
Lustre Client: 2.1.4 RC2
The same issue occurred: https://maloo.whamcloud.com/test_sets/16e08334-4bea-11e2-a817-52540035b04c |
| Comment by Sarah Liu [ 14/Jan/13 ] |
|
Another instance seen with a 2.3.0 server and a 2.4 client: |
| Comment by Andreas Dilger [ 29/May/17 ] |
|
Close old ticket. |