[LU-2556] osp_sync_interpret()) ASSERTION( d->opd_syn_rpc_in_progress > 0 ) failed Created: 31/Dec/12 Updated: 26/Mar/13 Resolved: 26/Mar/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.4.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Oleg Drokin | Assignee: | Alex Zhuravlev |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | MB | ||
| Severity: | 3 |
| Rank (Obsolete): | 5980 |
| Description |
|
Hit this while running recovery-small (test 29b) in a loop:
[386997.184707] Lustre: Failing over lustre-MDT0000
[386997.191570] LustreError: 11-0: an error occurred while communicating with 0@lo. The mds_close operation failed with -19
[386997.192109] LustreError: Skipped 15 previous similar messages
[386997.502568] LustreError: 31761:0:(osp_sync.c:393:osp_sync_interpret()) ASSERTION( d->opd_syn_rpc_in_progress > 0 ) failed:
[386997.503167] LustreError: 31761:0:(osp_sync.c:393:osp_sync_interpret()) LBUG
[386997.503450] Pid: 31761, comm: ptlrpcd_0
[386997.503663]
[386997.503664] Call Trace:
[386997.504041] [<ffffffffa0aea915>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[386997.504325] [<ffffffffa0aeaf27>] lbug_with_loc+0x47/0xb0 [libcfs]
[386997.504595] [<ffffffffa09870f8>] osp_sync_interpret+0x4a8/0x560 [osp]
[386997.504915] [<ffffffffa11ffc16>] ptlrpc_check_set+0x2b6/0x1db0 [ptlrpc]
[386997.505238] [<ffffffffa1231b6b>] ptlrpcd_check+0x55b/0x590 [ptlrpc]
[386997.505538] [<ffffffffa12320bb>] ptlrpcd+0x22b/0x3a0 [ptlrpc]
[386997.505809] [<ffffffff81057d60>] ? default_wake_function+0x0/0x20
[386997.506099] [<ffffffffa1231e90>] ? ptlrpcd+0x0/0x3a0 [ptlrpc]
[386997.506364] [<ffffffff8100c14a>] child_rip+0xa/0x20
[386997.506635] [<ffffffffa1231e90>] ? ptlrpcd+0x0/0x3a0 [ptlrpc]
[386997.506921] [<ffffffffa1231e90>] ? ptlrpcd+0x0/0x3a0 [ptlrpc]
[386997.507185] [<ffffffff8100c140>] ? child_rip+0x0/0x20
[386997.507433]
[386997.511844] Kernel panic - not syncing: LBUG
Crashdump is in /exports/crashdumps/192.168.10.217-2012-12-31-11\:44\:36/ |
| Comments |
| Comment by Oleg Drokin [ 06/Feb/13 ] |
|
Just hit it again, this time in replay-dual test 23d |
| Comment by Jodi Levi (Inactive) [ 06/Feb/13 ] |
|
Alex, Oleg indicated you are looking into this one, so assigning to you. |
| Comment by Alex Zhuravlev [ 06/Feb/13 ] |
|
The request completed with rq_status = -5 (-EIO), and that case seems to be handled improperly in the OSP code at present. |
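For readers unfamiliar with the assertion: it guards an in-progress counter that must be raised once when a request is sent and dropped exactly once when that request finishes, whether it succeeded or failed. The ticket does not say how the error case was mishandled, so the following is only a minimal sketch of that invariant, not the OSP code and not the fix that landed. The names `sync_dev`, `sync_send`, `sync_interpret` and `sync_commit` are hypothetical, and `assert` stands in for the Lustre assertion.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-device sync state (not struct osp_device). */
struct sync_dev {
	int rpc_in_progress;	/* requests sent and not yet retired */
};

/* Account for a request at send time. */
static void sync_send(struct sync_dev *d)
{
	d->rpc_in_progress++;
}

/* Retire a request whose result has been committed. */
static void sync_commit(struct sync_dev *d)
{
	assert(d->rpc_in_progress > 0);
	d->rpc_in_progress--;
}

/*
 * Completion callback, run once per request.
 * On error (for example rq_status == -EIO) there is nothing to commit,
 * so the slot is released here and the function returns false.
 * On success the slot is kept until sync_commit() runs, and the
 * function returns true. Either way the counter drops exactly once
 * per request.
 */
static bool sync_interpret(struct sync_dev *d, int rq_status)
{
	assert(d->rpc_in_progress > 0);
	if (rq_status != 0) {
		d->rpc_in_progress--;
		return false;
	}
	return true;
}
```

In this sketch, calling `sync_commit()` for a request that already failed in `sync_interpret()`, or running `sync_interpret()` twice for the same request, would trip the same kind of assertion as the one in the trace above.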
| Comment by Alex Zhuravlev [ 18/Feb/13 ] |
| Comment by Peter Jones [ 26/Mar/13 ] |
|
Landed for 2.4 |