[LU-7075] trigger scrub when running racer + migration Created: 01/Sep/15 Updated: 23/Mar/17 Resolved: 23/Mar/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Di Wang | Assignee: | Di Wang |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Description |
|
The console message also includes some debug information I added.

Lustre: lustre-MDT0000: trigger OI scrub by RPC for [0x200000405:0xcc9:0x0], rc = 0 [2]
LustreError: 7795:0:(osd_handler.c:460:osd_check_lma()) lma fid [0x200000405:0xed7:0x0] obj fid [0x200000405:0xcc9:0x0]
LustreError: 60792:0:(osd_handler.c:642:osd_fid_lookup()) trigger scrub with show -78
Lustre: lustre-MDT0000-o: trigger OI scrub by RPC for [0x200000405:0xcc9:0x0], rc = 0 [1] incoming 6
LustreError: 60792:0:(osd_handler.c:591:osd_fid_lookup()) LBUG
Pid: 60792, comm: mdt01_022

Call Trace:
[<ffffffffa05cf875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa05cfe77>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa0f3c4a6>] osd_object_init+0x1416/0x1420 [osd_ldiskfs]
[<ffffffffa0717e5e>] ? dt_object_init+0xe/0x10 [obdclass]
[<ffffffffa0715848>] lu_object_alloc+0xd8/0x320 [obdclass]
[<ffffffffa0716c31>] lu_object_find_try+0x151/0x260 [obdclass]
[<ffffffffa0716df1>] lu_object_find_at+0xb1/0xe0 [obdclass]
[<ffffffffa0b004a5>] ? lod_index_lookup+0x25/0x30 [lod]
[<ffffffffa101fc0c>] ? __mdd_lookup+0x28c/0x450 [mdd]
[<ffffffffa0716e36>] lu_object_find+0x16/0x20 [obdclass]
[<ffffffffa10847f6>] mdt_object_find+0x56/0x170 [mdt]
[<ffffffffa1098a77>] mdt_getattr_name_lock+0xf87/0x1910 [mdt]
[<ffffffffa109d009>] ? old_init_ucred+0x1b9/0x390 [mdt]
[<ffffffffa1099922>] mdt_intent_getattr+0x292/0x470 [mdt]
[<ffffffffa108b224>] mdt_intent_policy+0x494/0xc40 [mdt]
[<ffffffffa08f0267>] ldlm_lock_enqueue+0x127/0x8e0 [ptlrpc]
[<ffffffffa091cfa7>] ldlm_handle_enqueue0+0x807/0x15b0 [ptlrpc]
[<ffffffffa0994e41>] ? tgt_lookup_reply+0x31/0x190 [ptlrpc]
[<ffffffffa09a7b11>] tgt_enqueue+0x61/0x230 [ptlrpc]
[<ffffffffa09a88ec>] tgt_request_handle+0xa4c/0x1290 [ptlrpc]
[<ffffffffa09505b1>] ptlrpc_main+0xe41/0x1910 [ptlrpc]
[<ffffffffa094f770>] ? ptlrpc_main+0x0/0x1910 [ptlrpc]
[<ffffffff8109e66e>] kthread+0x9e/0xc0
[<ffffffff8100c20a>] child_rip+0xa/0x20
[<ffffffff8109e5d0>] ? kthread+0x0/0xc0
[<ffffffff8100c200>] ? child_rip+0x0/0x20 |
| Comments |
| Comment by Di Wang [ 04/Sep/15 ] |
|
Hmm, after I disable the OIC cache (OSD ID cache), this problem goes away. So it might be related to the OIC cache. |
| Comment by Andreas Dilger [ 15/Sep/15 ] |
|
Di, is there another bug that is fixing the OSD cache problem? |
| Comment by Di Wang [ 15/Sep/15 ] |
|
Andreas: No, there are no other bugs. I will discuss with Fan Yong to see whether this is an OIC bug or a migration problem. |
| Comment by Peter Jones [ 24/Sep/15 ] |
|
Is this ticket a duplicate of |
| Comment by Di Wang [ 24/Sep/15 ] |
|
No, it should not be duplicate of |
| Comment by Di Wang [ 09/Sep/16 ] |
|
This has not happened for a very long time, so let's close it for now. |