[LU-9699] osp_obd_connect()) ASSERTION( osp->opd_connects == 1 ) failed Created: 21/Jun/17 Updated: 24/Sep/22 Resolved: 22/Sep/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.15.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | VIKRAM BABASO JADHAV (Inactive) | Assignee: | VIKRAM BABASO JADHAV (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Running writeconf on an MDT with an index greater than 0000 causes "add mdc" to be added to the $FSNAME-client config and "add osp" to be added to the $FSNAME-MDTXXXX configs. However, the configs may already contain these directives. Duplicating the OSP device triggers the assertion failure, whereas duplicating the MDC just returns -EEXIST.

One possible solution is to check the configs for duplicates before writing to them. However, we sometimes want to change the NIDs that are part of "add mdc" and "add osp". Another solution is to mark the previous entries with SKIP flags, and this patch implements that approach. Because the clients and the MDTs receive the updated log after the config lock is revoked and apply its newer entries, we still have to handle OSP duplication, but this is only an issue immediately after writeconf processing.

[1904009.530445] LDISKFS-fs (md0): mounted filesystem with ordered data mode. quota=on. Opts:
[1904010.544738] LustreError: 11-0: snx11117-MDT0000-osp-MDT0001: Communicating with 10.9.100.10@o2ib3, operation mds_connect failed with -114.
[1904010.814980] Lustre: snx11117-MDD0001: changelog on
[1904019.177269] LustreError: 84835:0:(genops.c:345:class_newdev()) Device snx11117-MDT0002-osp-MDT0001 already exists at 7, won't add
[1904019.189880] LustreError: 84835:0:(obd_config.c:368:class_attach()) Cannot create device snx11117-MDT0002-osp-MDT0001 of type osp : -17
[1904019.202925] LustreError: 84835:0:(obd_config.c:1610:class_config_llog_handler()) MGC10.9.100.9@o2ib3: cfg command failed: rc = -17
[1904019.215616] Lustre: cmd=cf001 0:snx11117-MDT0002-osp-MDT0001 1:osp 2:snx11117-MDT0001-mdtlov_UUID
[1904019.215617]
[1904019.228104] LustreError: 84588:0:(mgc_request.c:517:do_requeue()) failed processing log: -17
[1904036.493105] LustreError: 85373:0:(obd_config.c:464:class_setup()) Device 7 already setup (type osp)
[1904036.503095] LustreError: 85373:0:(obd_config.c:1610:class_config_llog_handler()) MGC10.9.100.9@o2ib3: cfg command failed: rc = -17
[1904036.515779] Lustre: cmd=cf003 0:snx11117-MDT0002-osp-MDT0001 1:snx11117-MDT0002_UUID 2:10.9.100.16@o2ib3
[1904036.515780]
[1904036.528886] LustreError: 84588:0:(mgc_request.c:517:do_requeue()) failed processing log: -17
[1904044.588103] LustreError: 85579:0:(osp_dev.c:1175:osp_obd_connect()) ASSERTION( osp->opd_connects == 1 ) failed:
[1904044.599237] LustreError: 85579:0:(osp_dev.c:1175:osp_obd_connect()) LBUG
[1904044.606507] Pid: 85579, comm: llog_process_th
[1904044.611432]
[1904044.611433] Call Trace:
[1904044.616506] [<ffffffffa07dd895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[1904044.624041] [<ffffffffa07dde97>] lbug_with_loc+0x47/0xb0 [libcfs]
[1904044.630798] [<ffffffffa0919f2c>] osp_obd_connect+0x3bc/0x420 [osp]
[1904044.645834] [<ffffffffa06fb7c1>] lod_add_device+0x8d1/0x1e00 [lod]
[1904044.652674] [<ffffffffa06f4259>] lod_process_config+0xb89/0x1720 [lod]
[1904044.667268] [<ffffffffa099a370>] class_process_config+0x1900/0x1ac0 [obdclass]
[1904044.682914] [<ffffffffa099b664>] class_config_llog_handler+0xa34/0x18b0 [obdclass]
[1904044.697415] [<ffffffffa095ecf9>] llog_process_thread+0xaa9/0xe80 [obdclass]
[1904044.705421] [<ffffffffa095f115>] llog_process_thread_daemonize+0x45/0x70 [obdclass]
[1904044.722898] [<ffffffff8109ac66>] kthread+0x96/0xa0
[1904044.728341] [<ffffffff8100c20a>] child_rip+0xa/0x20
[1904044.745110]
[1904044.747419] Kernel panic - not syncing: LBUG |
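The SKIP-flag approach described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code from the patch: `ConfigRecord`, `append_with_skip` and `replay` are hypothetical names, the config log is modelled as a plain Python list, and the second NID (10.9.100.17@o2ib3) is invented for illustration. It shows why marking the older "add osp" entry as skipped lets a new entry with a changed NID be appended without a replay ever seeing two live records for the same device.

```python
from dataclasses import dataclass


@dataclass
class ConfigRecord:
    """Hypothetical stand-in for one config log entry (not a Lustre structure)."""
    command: str        # e.g. "add osp" or "add mdc"
    device: str         # name of the device the record sets up
    nid: str            # server NID carried by the record
    skip: bool = False  # True means a replay must ignore this record


def append_with_skip(log, record):
    """Mark earlier records for the same command and device as skipped, then append."""
    for old in log:
        if old.command == record.command and old.device == record.device:
            old.skip = True
    log.append(record)


def replay(log):
    """Return {device: nid} for live records; a duplicate live device is an error."""
    devices = {}
    for rec in log:
        if rec.skip:
            continue
        if rec.device in devices:
            raise RuntimeError(f"device {rec.device} already exists")
        devices[rec.device] = rec.nid
    return devices


# Device name taken from the log above; the second NID is a made-up example.
name = "snx11117-MDT0002-osp-MDT0001"
log = []
append_with_skip(log, ConfigRecord("add osp", name, "10.9.100.16@o2ib3"))
append_with_skip(log, ConfigRecord("add osp", name, "10.9.100.17@o2ib3"))
devices = replay(log)
```

In this model the first record stays in the log but is flagged as skipped, so a full replay sets the device up once, with the newer NID. The model does not cover the case the description calls out: a consumer that has already applied the old entry and then receives only the newer entries still has to cope with the duplicate device.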
| Comments |
| Comment by Gerrit Updater [ 21/Jun/17 ] |
|
jadhav.vikram (jadhav.vikram@seagate.com) uploaded a new patch: https://review.whamcloud.com/27753 |
| Comment by VIKRAM BABASO JADHAV (Inactive) [ 13/Jul/17 ] |
|
Patch https://review.whamcloud.com/27753 was abandoned, so please close this ticket. https://review.whamcloud.com/#/c/28026/ has been created under SEA-428. |
| Comment by Peter Jones [ 09/Mar/18 ] |
|
It seems as if this patch has been resurrected |
| Comment by John Hammond [ 22/Mar/18 ] |
|
The issue description should be updated to say how to reproduce this. |
| Comment by Gerrit Updater [ 14/Sep/21 ] |
|
"Mike Pershin <mpershin@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44912 |
| Comment by Gerrit Updater [ 22/Sep/21 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/27753/ |
| Comment by Peter Jones [ 22/Sep/21 ] |
|
Landed for 2.15 |