[LU-13102] llog_cat_id2handle()) error opening log id rc = -2 osp_sync_thread()) ASSERTION( thread->t_flags != SVC_RUNNING ) Created: 26/Dec/19 Updated: 20/May/20 Resolved: 05/Mar/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.13.0 |
| Fix Version/s: | Lustre 2.14.0, Lustre 2.12.5 |
| Type: | Bug | Priority: | Major |
| Reporter: | Alexander Boyko | Assignee: | Alexander Boyko |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch | ||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
During massive IO at cluster we could reproduce the next issue. Nov 24 12:25:15 snx11281n003 kernel: LustreError: 12929:0:(llog_cat.c:404:llog_cat_id2handle()) snx11281-OST0000-osc-MDT0001: error opening log id [0x22903:0x1:0x0]:0: rc = -2 Nov 24 12:25:15 snx11281n003 kernel: LustreError: 12929:0:(osp_sync.c:1293:osp_sync_thread()) ASSERTION( thread->t_flags != SVC_RUNNING ) failed: 3 changes, 0 in progress, 0 in flight Nov 24 12:25:15 snx11281n003 kernel: LustreError: 12929:0:(osp_sync.c:1293:osp_sync_thread()) LBUG Nov 24 12:25:15 snx11281n003 kernel: Pid: 12929, comm: osp-syn-0-1 3.10.0-957.1.3957.1.3.x3.3.36.x86_64 #1 SMP Tue Nov 19 09:47:32 PST 2019 Nov 24 12:25:15 snx11281n003 kernel: Call Trace: Nov 24 12:25:15 snx11281n003 kernel: [<ffffffffc0a427cc>] libcfs_call_trace+0x8c/0xc0 [libcfs] Nov 24 12:25:15 snx11281n003 kernel: [<ffffffffc0a4287c>] lbug_with_loc+0x4c/0xa0 [libcfs] Nov 24 12:25:15 snx11281n003 kernel: [<ffffffffc18b5a1b>] osp_sync_thread+0xa1b/0xa70 [osp] Nov 24 12:25:15 snx11281n003 kernel: [<ffffffffa72c1c31>] kthread+0xd1/0xe0 Nov 24 12:25:15 snx11281n003 kernel: [<ffffffffa7976c1d>] ret_from_fork_nospec_begin+0x7/0x21 Nov 24 12:25:15 snx11281n003 kernel: [<ffffffffffffffff>] 0xffffffffffffffff And with additional debug. llog_cat_set_first_idx()) catlog [0x6:0x1:0x0] first idx 34730, last_idx 34731 osp_sync_process_queues()) 1 changes, 0 in progress, 0 in flight llog_process_thread()) stop processing plain 0x76941:1:0 index 64767 count 1 llog_process_thread()) index: 34731, lh_last_idx: 34730 synced_idx: 34730 lgh_last_idx: 34731 llog_cat_process_common()) processing log [0x2780f:0x1:0x0]:0 at index 34731 of catalog [0x6:0x1:0x0] llog_cat_id2handle()) snx11281-OST0001-osc-MDT0001: error opening log id [0x2780f:0x1:0x0]:0:rc = -2 RC |
| Comments |
| Comment by Gerrit Updater [ 26/Dec/19 ] |
|
Alexandr Boyko (c17825@cray.com) uploaded a new patch: https://review.whamcloud.com/37102 |
| Comment by Jian Yu [ 21/Jan/20 ] |
|
+1 on master branch: https://testing.whamcloud.com/test_sets/1160632e-3bb7-11ea-adca-52540065bddc |
| Comment by Jian Yu [ 28/Jan/20 ] |
|
More failures on master branch: |
| Comment by Gerrit Updater [ 05/Mar/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37102/ |
| Comment by Peter Jones [ 05/Mar/20 ] |
|
Landed for 2.14 |
| Comment by Gerrit Updater [ 08/May/20 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38541 |
| Comment by Gerrit Updater [ 20/May/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38541/ |