[LU-4370] lu_object.h:853:lu_object_attr()) ASSERTION( ((o)->lo_header->loh_attr & LOHA_EXISTS) != 0 ) failed: Created: 09/Dec/13 Updated: 07/Jan/14 Resolved: 07/Jan/14 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.6.0 |
| Fix Version/s: | Lustre 2.6.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | John Hammond | Assignee: | Di Wang |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | dne, mdt | ||
| Severity: | 3 |
| Rank (Obsolete): | 11968 |
| Description |
|
Running racer on a single node with MDSCOUNT=2 I see this:

crash> bt
PID: 6987 TASK: ffff8801ccbbe040 CPU: 1 COMMAND: "mdt00_008"
 #0 [ffff8801c619d808] machine_kexec at ffffffff81035d6b
 #1 [ffff8801c619d868] crash_kexec at ffffffff810c0e22
 #2 [ffff8801c619d938] panic at ffffffff8150f01f
 #3 [ffff8801c619d9b8] lbug_with_loc at ffffffffa02a9eeb [libcfs]
 #4 [ffff8801c619d9d8] mdt_getattr_internal at ffffffffa0ba07b4 [mdt]
 #5 [ffff8801c619da68] mdt_getattr_name_lock at ffffffffa0ba1bc6 [mdt]
 #6 [ffff8801c619db18] mdt_intent_getattr at ffffffffa0ba2883 [mdt]
 #7 [ffff8801c619db78] mdt_intent_policy at ffffffffa0b91979 [mdt]
 #8 [ffff8801c619dbd8] ldlm_lock_enqueue at ffffffffa062f509 [ptlrpc]
 #9 [ffff8801c619dc38] ldlm_handle_enqueue0 at ffffffffa0658c4f [ptlrpc]
#10 [ffff8801c619dca8] tgt_enqueue at ffffffffa06d2562 [ptlrpc]
#11 [ffff8801c619dcc8] tgt_handle_request0 at ffffffffa06d4f5a [ptlrpc]
#12 [ffff8801c619dd58] tgt_request_handle at ffffffffa06d653a [ptlrpc]
#13 [ffff8801c619dda8] ptlrpc_main at ffffffffa068a295 [ptlrpc]
#14 [ffff8801c619dee8] kthread at ffffffff81096a36
#15 [ffff8801c619df48] kernel_thread at ffffffff8100c0ca

This is from:

if (info->mti_cross_ref) {
        ...
        if (rc == 0) {
                /* Finally, we can get attr for child. */
                mdt_set_capainfo(info, 0, mdt_object_fid(child),
                                 BYPASS_CAPA);
                rc = mdt_getattr_internal(info, child, 0);
                if (unlikely(rc != 0))
                        mdt_object_unlock(info, child, lhc, 1);
        }
        RETURN(rc);
}
Above we are checking that parent (which is really child) exists, but only if lname is non-NULL. There are several more assertions in mdt_getattr_name_lock(), mdt_getattr_internal(), and mdt_raw_lookup() that depend solely on the politeness of clients. These should be collected and replaced with error handling. |
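As a rough illustration of replacing such an assertion with error handling, here is a minimal standalone sketch. It is not Lustre code and not the change from review 8371: the `sketch_*` types, the `SKETCH_LOHA_EXISTS` flag value, and the choice of `-ENOENT` are all assumptions made for the example. The point is only that an existence check driven by client input returns an error to the caller instead of calling LBUG.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real object types; only the fields
 * needed for this sketch are modeled, and the flag value is assumed. */
#define SKETCH_LOHA_EXISTS 0x1u

struct sketch_header {
        unsigned int loh_attr;
};

struct sketch_object {
        struct sketch_header *lo_header;
};

/* Non-asserting existence test: returns 1 if the object exists, else 0. */
int sketch_object_exists(const struct sketch_object *o)
{
        return o != NULL && o->lo_header != NULL &&
               (o->lo_header->loh_attr & SKETCH_LOHA_EXISTS) != 0;
}

/* Fetch attributes, returning -ENOENT for a missing object rather than
 * asserting, so a misbehaving client cannot crash the server. */
int sketch_getattr(const struct sketch_object *o, unsigned int *attr)
{
        if (!sketch_object_exists(o))
                return -ENOENT;

        *attr = o->lo_header->loh_attr;
        return 0;
}
```

A caller would then propagate the non-zero return code in the same way the `rc != 0` path in the excerpt above already does.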
| Comments |
| Comment by Peter Jones [ 10/Dec/13 ] |
|
Di,

Could you please look into this one?

Thanks,
Peter |
| Comment by Di Wang [ 10/Dec/13 ] |
|
John, could you please tell me which line hit this assertion failure? I also tried racer with MDSCOUNT=2, and it passes for me with these 2 patches: http://review.whamcloud.com/#/c/8370/. You could try with these 2 patches if you are interested. Thanks |
| Comment by John Hammond [ 10/Dec/13 ] |
|
Di, It's from the lu_object_attr() call in mdt_getattr_internal(). Change 8371 will address this. |
| Comment by John Hammond [ 07/Jan/14 ] |
|
Fixed by http://review.whamcloud.com/#/c/8371/. |