[LU-2293] Assertion triggered in osp_sync_thread Created: 06/Nov/12 Updated: 07/Nov/12 Resolved: 06/Nov/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Prakash Surya (Inactive) | Assignee: | Alex Zhuravlev |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | topsequoia | ||
| Severity: | 3 |
| Rank (Obsolete): | 5484 |
| Description |
|
Triggered this assertion bringing up the MDS after a version change:
LustreError: 33030:0:(osp_sync.c:584:osp_sync_process_record()) processed all old entries: 0x3e03:1
LustreError: 33030:0:(osp_sync.c:584:osp_sync_process_record()) Skipped 28 previous similar messages
LustreError: 33027:0:(llog_cat.c:187:llog_cat_id2handle()) lstest-OST01c8-osc-MDT0000: error opening log id 0x5a5a5a5a5a5a5a5a:5a5a5a5a: rc = -2
LustreError: 33027:0:(llog_cat.c:513:llog_cat_cancel_records()) Cannot find log 0x5a5a5a5a5a5a5a5a
LustreError: 33027:0:(llog_cat.c:552:llog_cat_cancel_records()) lstest-OST01c8-osc-MDT0000: fail to cancel 0 of 1 llog-records: rc = -2
LustreError: 33027:0:(osp_sync.c:714:osp_sync_process_committed()) @@@ lstest-OST01c8-osc-MDT0000: can't cancel record: -2 req@ffff880f7bd3b800 x1417921862573367/t0(0) o6->lstest-OST01c8-osc-MDT0000@172.20.3.56@o2ib500:28/4 lens 664/400 e 0 to 0 dl 1352235775 ref 1 fl Complete:R/0/0 rc 0/-2
LustreError: 33027:0:(llog_cat.c:187:llog_cat_id2handle()) lstest-OST01c8-osc-MDT0000: error opening log id 0x5a5a5a5a5a5a5a5a:5a5a5a5a: rc = -2
LustreError: 33027:0:(llog_cat.c:513:llog_cat_cancel_records()) Cannot find log 0x5a5a5a5a5a5a5a5a
LustreError: 33027:0:(llog_cat.c:552:llog_cat_cancel_records()) lstest-OST01c8-osc-MDT0000: fail to cancel 0 of 1 llog-records: rc = -2
LustreError: 33027:0:(osp_sync.c:714:osp_sync_process_committed()) @@@ lstest-OST01c8-osc-MDT0000: can't cancel record: -2 req@ffff880fd2d16000 x1417921862573368/t0(0) o6->lstest-OST01c8-osc-MDT0000@172.20.3.56@o2ib500:28/4 lens 664/400 e 0 to 0 dl 1352235775 ref 1 fl Complete:R/0/0 rc 0/-2
LustreError: 33027:0:(osp_sync.c:866:osp_sync_thread()) ASSERTION( rc == 0 || rc == LLOG_PROC_BREAK ) failed: 0 changes, 7 in progress, 7 in flight: -22
LustreError: 33027:0:(osp_sync.c:866:osp_sync_thread()) LBUG
Pid: 33027, comm: osp-syn-456
Call Trace:
[<ffffffffa05ae965>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa05aef77>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa1006440>] osp_sync_thread+0x630/0x700 [osp]
[<ffffffffa1005e10>] ? osp_sync_thread+0x0/0x700 [osp]
[<ffffffff8100c14a>] child_rip+0xa/0x20
[<ffffffffa1005e10>] ? osp_sync_thread+0x0/0x700 [osp]
[<ffffffffa1005e10>] ? osp_sync_thread+0x0/0x700 [osp]
[<ffffffff8100c140>] ? child_rip+0x0/0x20
Kernel panic - not syncing: LBUG
Pid: 33027, comm: osp-syn-456 Tainted: P W ---------------- 2.6.32-220.23.1.2chaos.ch5.x86_64 #1
Call Trace:
[<ffffffff814eea92>] ? panic+0x78/0x143
[<ffffffffa05aefcb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
[<ffffffffa1006440>] ? osp_sync_thread+0x630/0x700 [osp]
[<ffffffffa1005e10>] ? osp_sync_thread+0x0/0x700 [osp]
[<ffffffff8100c14a>] ? child_rip+0xa/0x20
[<ffffffffa1005e10>] ? osp_sync_thread+0x0/0x700 [osp]
[<ffffffffa1005e10>] ? osp_sync_thread+0x0/0x700 [osp]
[<ffffffff8100c140>] ? child_rip+0x0/0x20

Lustre Version:
Lustre: Lustre: Build Version: 2.3.54-2chaos-2chaos--PRISTINE-2.6.32-220.23.1.2chaos.ch5.x86_64 |
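For readers triaging similar consoles, the return codes and the log id in the messages above can be decoded mechanically. The sketch below is not part of Lustre; the helper names `decode_rc` and `is_poisoned_logid` are hypothetical. It maps `rc = -2` to its Linux errno name (ENOENT; the `-22` in the assertion message is EINVAL) and flags a log id made up entirely of `0x5a` bytes. Assuming this build fills freed memory with `0x5a`, as Lustre's free macros normally do, such an id suggests the log handle was read after being freed.

```python
import errno
import re

# Matches the "rc = -N" suffix used by the llog messages in this ticket.
RC_RE = re.compile(r"rc = (-?\d+)")
# Matches a log id printed as "0x<hex>:<hex>" after the words "log id".
LOGID_RE = re.compile(r"log id 0x([0-9a-f]+):([0-9a-f]+)")


def decode_rc(line):
    """Return (rc, errno name) for a line, or None if it has no "rc = N"."""
    m = RC_RE.search(line)
    if m is None:
        return None
    rc = int(m.group(1))
    return rc, errno.errorcode.get(-rc, "unknown")


def is_poisoned_logid(line, poison="5a"):
    """True if both halves of the log id are only the repeated poison byte.

    The 0x5a default is an assumption about how this build poisons freed memory.
    """
    m = LOGID_RE.search(line)
    if m is None:
        return False
    return all(re.fullmatch("(?:%s)+" % poison, part) for part in m.groups())


sample = ("LustreError: 33027:0:(llog_cat.c:187:llog_cat_id2handle()) "
          "lstest-OST01c8-osc-MDT0000: error opening log id "
          "0x5a5a5a5a5a5a5a5a:5a5a5a5a: rc = -2")
print(decode_rc(sample), is_poisoned_logid(sample))
```

Run against the `llog_cat_id2handle()` line from this report, the sketch reports ENOENT and a poisoned id.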
| Comments |
| Comment by Prakash Surya (Inactive) [ 06/Nov/12 ] |
|
Looks like a duplicate of |
| Comment by Peter Jones [ 06/Nov/12 ] |
|
Alex, could you please assign someone to this one? Peter |
| Comment by Peter Jones [ 06/Nov/12 ] |
|
Ah. Our comments crossed |
| Comment by Li Wei (Inactive) [ 06/Nov/12 ] |
|
Prakash, I went to https://github.com/chaos/lustre and did not find a 2.3.54-2chaos-2chaos tag in the branch/tag drop-down list. Was I looking in the wrong place? |
| Comment by Prakash Surya (Inactive) [ 07/Nov/12 ] |
|
Sorry, it looks like we have not pushed that tag to github yet. Looking at what's there, this branch is what is tagged as 2.3.54-2chaos: https://github.com/chaos/lustre/commits/2.3.54-llnl

I did not have your two patches from

commit 0748ca16b672798ca213b8582979ae5481de19d2
Author: Li Wei <wei.g.li@intel.com>
Date: Fri Nov 2 15:21:01 2012 +0800

    LU-2109 llog: Diagnostic patch

    To hunt down those who free log handles that are still being
    processed.

    Change-Id: Ib65c5fb8881cfeeb5cbf5b891ae235b97dde5e82
    Signed-off-by: Li Wei <wei.g.li@intel.com>

commit d0f28b8d78ec86041c79d77e9f423e48f9812c6e
Author: Li Wei <wei.g.li@intel.com>
Date: Thu Nov 1 21:47:57 2012 +0800

    LU-2109 osp: Tell more when unable to cancel log records

    This is to debug LU-2109, but I think it may be useful to be landed to
    master.

    Change-Id: I7b487271608eb7ecbd9869c6e44643a463f08416
    Signed-off-by: Li Wei <wei.g.li@intel.com>

But I've pulled them in since hitting this, so they should be there the next time it occurs. |