[LU-4945] req_capsule_get: Wrong buffer for field `name' (5 of 6) in format `LDLM_INTENT_GETATTR': 3 vs. 0 (client) Created: 23/Apr/14 Updated: 01/May/14 Resolved: 01/May/14 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.6.0 |
| Fix Version/s: | Lustre 2.6.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Di Wang | Assignee: | Di Wang |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
| Epic/Theme: | dne |
| Severity: | 3 |
| Rank (Obsolete): | 13685 |
| Description |
|
I found this problem when I tried to run racer with MDSCOUNT=4:
LustreError: 8391:0:(pack_generic.c:815:lustre_msg_string()) can't unpack short string in msg ffffc90017980658 buffer[5] len 3: strlen 0 |
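The log line reports a 3-byte buffer whose contents have string length 0. Below is a minimal user-space sketch of the kind of check that would produce such a message, assuming the receiver expects a NUL-terminated string that exactly fills its buffer. The helper name `check_full_string` and its return codes are hypothetical; this is not the Lustre implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not the Lustre code: accept a buffer only if it
 * holds a NUL-terminated string that fills it exactly, i.e. the string
 * length equals the buffer length minus one.
 * Returns 0 on success, -1 if there is no NUL inside the buffer,
 * -2 if the string is shorter than the buffer implies.
 */
static int check_full_string(const char *buf, size_t blen)
{
	const char *nul;
	size_t slen;

	assert(buf != NULL);
	if (blen == 0)
		return -1;

	/* Bounded length: never read past blen bytes. */
	nul = memchr(buf, '\0', blen);
	if (nul == NULL)
		return -1;

	slen = (size_t)(nul - buf);
	if (slen != blen - 1)      /* e.g. len 3 but strlen 0 */
		return -2;
	return 0;
}
```

With a 3-byte buffer of zero bytes, this sketch returns -2, which corresponds to the "len 3: strlen 0" case in the log.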
| Comments |
| Comment by Andreas Dilger [ 24/Apr/14 ] |
|
Di, how serious is this bug, and what problem would be visible to the client? Is the source of this bug obvious, and could Lai create the patch? |
| Comment by Di Wang [ 24/Apr/14 ] |
|
Andreas: this bug looks pretty serious to me, and it seems related to the new readdir change. I am investigating it right now; no obvious clue yet. Thanks. |
| Comment by Di Wang [ 24/Apr/14 ] |
|
Hmm, it turns out that using the dir_ent pointer to locate the next entry is not very safe without holding the ldlm lock.
mdc_read_entry()
{
	....
	/* If op_data->op_ent != NULL(see ll_dir_entry_next), try to get
	 * next ent directly */
	if (likely(op_data->op_ent != NULL)) {
		ent = lu_dirent_next(op_data->op_ent);
		if (likely(ent != NULL))
			GOTO(out, rc);
	} else {
	.....
So we either find a new way to resolve the hash conflict or hold the ldlm lock during iteration. I will cook a patch. |
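One way to picture the hazard is the single-threaded user-space sketch below; it is not the actual mdc code. A cached entry pointer keeps pointing into a page after the page has been reused, so a name length sampled earlier no longer matches the bytes that get copied. All names here (`fake_dirent`, `pack_name_unchecked`, `pack_after_page_reuse`) are hypothetical, and page reuse is simulated with `memset`; in the real client the page would be changed by another thread while the ldlm lock is not held. Whether this is exactly the sequence behind the logged error is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a directory entry stored inside a cached page. */
struct fake_dirent {
	size_t namelen;
	char   name[16];
};

/* Copy namelen bytes plus a terminating NUL, trusting the caller's length. */
static void pack_name_unchecked(char *dst, const char *name, size_t namelen)
{
	memcpy(dst, name, namelen);
	dst[namelen] = '\0';
}

/*
 * Simulate the race in one thread: the length is sampled, the page is
 * recycled, and only then are the name bytes copied.
 * Returns the string length actually found in the packed buffer.
 */
static size_t pack_after_page_reuse(void)
{
	struct fake_dirent page_ent = { .namelen = 2, .name = "ab" };
	struct fake_dirent *ent = &page_ent;    /* cached pointer into the page */
	char wire[16];
	size_t namelen = ent->namelen;          /* sampled while still valid */

	assert(namelen + 1 <= sizeof(wire));
	memset(&page_ent, 0, sizeof(page_ent)); /* page reused, no lock held */
	pack_name_unchecked(wire, ent->name, namelen);

	return strlen(wire);  /* packed namelen + 1 bytes, but the string is empty */
}
```

In this simulation the packed buffer is sized for a two-character name (3 bytes with the NUL) yet contains an empty string, the same shape as the "len 3: strlen 0" report.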
| Comment by John Hammond [ 25/Apr/14 ] |
|
This is a good excuse/opportunity to kill all the uses of LOGL0() to pack names:
+static void mdc_pack_name(struct ptlrpc_request *req,
+			  const struct req_msg_field *field,
+			  const char *name, size_t name_len)
+{
+	char *buf;
+	size_t buf_size;
+
+	buf = req_capsule_client_get(&req->rq_pill, field);
+	buf_size = req_capsule_get_size(&req->rq_pill, field, RCL_CLIENT);
+
+	LASSERT(buf != NULL &&
+		buf_size == name_len + 1 &&
+		name != NULL &&
+		name_len != 0 &&
+		strnlen(name, name_len) == name_len &&
+		name[name_len] == '\0');
+
+	strlcpy(buf, name, buf_size);
+
+	LASSERT(strlen(buf) == name_len);
+}
+
...
-	tmp = req_capsule_client_get(&req->rq_pill, &RMF_NAME);
-	LOGL0(op_data->op_name, op_data->op_namelen, tmp);
+	mdc_pack_name(req, &RMF_NAME, op_data->op_name, op_data->op_namelen);
|
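The checks in the proposed helper can be exercised outside the kernel. The following is a rough user-space sketch, not the patch itself: it returns an error code where the patch uses LASSERT, uses `memcpy` in place of `strlcpy`, and takes the destination buffer and its size directly instead of looking them up through the request capsule. The name `pack_name_checked` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space analogue of the mdc_pack_name() idea: refuse to
 * pack unless the buffer is exactly name_len + 1 bytes and name really is
 * a NUL-terminated string of length name_len.
 * Returns 0 on success, -1 if any check fails (buf is left untouched).
 */
static int pack_name_checked(char *buf, size_t buf_size,
			     const char *name, size_t name_len)
{
	if (buf == NULL || name == NULL || name_len == 0)
		return -1;
	if (buf_size != name_len + 1)
		return -1;
	/* No NUL may appear before name_len, and one must sit at name_len. */
	if (memchr(name, '\0', name_len) != NULL || name[name_len] != '\0')
		return -1;

	memcpy(buf, name, name_len + 1);
	assert(strlen(buf) == name_len);
	return 0;
}
```

Called with a name whose bytes have been zeroed while the caller still believes the length is 2, this sketch rejects the request on the client side instead of sending a buffer that the receiver would fail to unpack.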
| Comment by Di Wang [ 25/Apr/14 ] |
|
Sorry, John, I did not include your changes in this patch. I will try to add them later. |
| Comment by Jodi Levi (Inactive) [ 30/Apr/14 ] |
|
Changes merged into http://review.whamcloud.com/#/c/9191/ |
| Comment by Andreas Dilger [ 30/Apr/14 ] |
|
Fix was merged into http://review.whamcloud.com/9191 under |
| Comment by Di Wang [ 01/May/14 ] |
|
The patch is already merged to the fix of |
| Comment by Di Wang [ 01/May/14 ] |
|
John: Sorry, could you please create a new ticket for your suggestion? Thanks. |
| Comment by John Hammond [ 01/May/14 ] |
|
Done. |