Lustre / LU-9699

osp_obd_connect() ASSERTION( osp->opd_connects == 1 ) failed



    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.15.0


      Running writeconf on an MDT with an index greater than 0000 adds an "add mdc" directive to the $FSNAME-client config and "add osp" directives to the $FSNAME-MDTXXXX configs.

      However, the configs may already contain these directives. A duplicated OSP device triggers the assertion failure shown in the log below, whereas a duplicated MDC simply returns -EEXIST.
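      The difference between the two failure modes can be illustrated with a toy model. This is a sketch, not Lustre code: `DeviceTable`, `attach`, and `connect` are hypothetical names, and the idea that a connect counter is incremented and then required to equal 1 is an assumption inferred only from the assertion text `osp->opd_connects == 1`.

```python
import errno


class DeviceTable:
    """Toy registry of named devices (hypothetical, not Lustre's obd table)."""

    def __init__(self):
        self.connects = {}  # device name -> number of connect calls

    def attach(self, name):
        # A duplicate name is rejected with -EEXIST, like "rc = -17" in the log.
        if name in self.connects:
            return -errno.EEXIST
        self.connects[name] = 0
        return 0

    def connect(self, name):
        # Assumption: each connect bumps a counter that must then equal 1.
        self.connects[name] += 1
        assert self.connects[name] == 1, "connects == 1 failed"


table = DeviceTable()
name = "snx11117-MDT0002-osp-MDT0001"
first = table.attach(name)    # new device is registered
table.connect(name)           # first connect passes the check
second = table.attach(name)   # duplicate attach is rejected, nothing crashes
try:
    table.connect(name)       # duplicate directive reaches connect again
    crashed = False
except AssertionError:
    crashed = True
```

      In this model the duplicate attach is harmless because it is refused with an error code, while the duplicate connect trips a hard assertion, which mirrors the sequence of -17 errors followed by the LBUG in the log.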

      One possible solution is to check the configs for duplicates before writing to them. However, we sometimes need to change the NIDs that are part of "add mdc" and "add osp", and a duplicate check would suppress such updates.
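      A minimal sketch of the duplicate-check approach and its drawback follows. The `Record` layout, the `append_unless_duplicate` helper, the device name, and the NID values are all hypothetical and only illustrate the idea; they do not reflect Lustre's llog format.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One toy config-log entry (hypothetical layout)."""
    directive: str  # e.g. "add osp"
    device: str
    nid: str


def append_unless_duplicate(log, rec):
    """Append rec only if no entry with the same directive and device exists."""
    for old in log:
        if (old.directive, old.device) == (rec.directive, rec.device):
            return False  # duplicate suppressed, along with any new NID
    log.append(rec)
    return True


log = [Record("add osp", "fs-MDT0000-osp-MDT0001", "10.0.0.1@o2ib")]
# Same device with a new NID: the duplicate check drops the update.
added = append_unless_duplicate(
    log, Record("add osp", "fs-MDT0000-osp-MDT0001", "10.0.0.2@o2ib"))
```

      The log still holds only the old NID afterwards, which is why a plain duplicate check is not sufficient when NIDs need to change.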

      Another solution is to mark the previous entries with SKIP flags, which is the approach this patch implements. Once the config lock is revoked, the clients and the MDTs receive the updated log and apply its newer entries, so OSP duplication still has to be handled, but this is only an issue immediately after writeconf processing.
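      The SKIP-flag idea can be sketched with the same kind of toy log. The `skip` field stands in for a SKIP flag on an entry; `Record`, `rewrite_entry`, `replay`, the device name, and the NID values are hypothetical names for illustration, not the patch's actual code.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One toy config-log entry (hypothetical layout)."""
    directive: str
    device: str
    nid: str
    skip: bool = False  # stands in for a SKIP flag on the entry


def rewrite_entry(log, rec):
    """Mark older entries for the same device as skipped, then append rec."""
    for old in log:
        if (old.directive, old.device) == (rec.directive, rec.device):
            old.skip = True
    log.append(rec)


def replay(log):
    """Return the entries a fresh reader of the log would apply."""
    return [r for r in log if not r.skip]


log = [Record("add osp", "fs-MDT0000-osp-MDT0001", "10.0.0.1@o2ib")]
rewrite_entry(log, Record("add osp", "fs-MDT0000-osp-MDT0001", "10.0.0.2@o2ib"))
active = replay(log)  # only the entry carrying the new NID remains active
```

      A fresh reader sees a single active entry with the new NID. A reader that already applied the first entry and then picks up only the newer entries still sees a second "add osp" for a device it already has, which is the remaining case the description says must be handled.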

      [1904009.530445] LDISKFS-fs (md0): mounted filesystem with ordered data mode. quota=on. Opts:
      [1904010.544738] LustreError: 11-0: snx11117-MDT0000-osp-MDT0001: Communicating with, operation mds_connect failed with -114.
      [1904010.814980] Lustre: snx11117-MDD0001: changelog on
      [1904019.177269] LustreError: 84835:0:(genops.c:345:class_newdev()) Device snx11117-MDT0002-osp-MDT0001 already exists at 7, won't add
      [1904019.189880] LustreError: 84835:0:(obd_config.c:368:class_attach()) Cannot create device snx11117-MDT0002-osp-MDT0001 of type osp : -17
      [1904019.202925] LustreError: 84835:0:(obd_config.c:1610:class_config_llog_handler()) MGC10.9.100.9@o2ib3: cfg command failed: rc = -17
      [1904019.215616] Lustre:    cmd=cf001 0:snx11117-MDT0002-osp-MDT0001  1:osp  2:snx11117-MDT0001-mdtlov_UUID
      [1904019.228104] LustreError: 84588:0:(mgc_request.c:517:do_requeue()) failed processing log: -17
      [1904036.493105] LustreError: 85373:0:(obd_config.c:464:class_setup()) Device 7 already setup (type osp)
      [1904036.503095] LustreError: 85373:0:(obd_config.c:1610:class_config_llog_handler()) MGC10.9.100.9@o2ib3: cfg command failed: rc = -17
      [1904036.515779] Lustre:    cmd=cf003 0:snx11117-MDT0002-osp-MDT0001  1:snx11117-MDT0002_UUID  2:
      [1904036.528886] LustreError: 84588:0:(mgc_request.c:517:do_requeue()) failed processing log: -17
      [1904044.588103] LustreError: 85579:0:(osp_dev.c:1175:osp_obd_connect()) ASSERTION( osp->opd_connects == 1 ) failed:
      [1904044.599237] LustreError: 85579:0:(osp_dev.c:1175:osp_obd_connect()) LBUG
      [1904044.606507] Pid: 85579, comm: llog_process_th
      [1904044.611433] Call Trace:
      [1904044.616506]  [<ffffffffa07dd895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      [1904044.624041]  [<ffffffffa07dde97>] lbug_with_loc+0x47/0xb0 [libcfs]
      [1904044.630798]  [<ffffffffa0919f2c>] osp_obd_connect+0x3bc/0x420 [osp]
      [1904044.645834]  [<ffffffffa06fb7c1>] lod_add_device+0x8d1/0x1e00 [lod]
      [1904044.652674]  [<ffffffffa06f4259>] lod_process_config+0xb89/0x1720 [lod]
      [1904044.667268]  [<ffffffffa099a370>] class_process_config+0x1900/0x1ac0 [obdclass]
      [1904044.682914]  [<ffffffffa099b664>] class_config_llog_handler+0xa34/0x18b0 [obdclass]
      [1904044.697415]  [<ffffffffa095ecf9>] llog_process_thread+0xaa9/0xe80 [obdclass]
      [1904044.705421]  [<ffffffffa095f115>] llog_process_thread_daemonize+0x45/0x70 [obdclass]
      [1904044.722898]  [<ffffffff8109ac66>] kthread+0x96/0xa0
      [1904044.728341]  [<ffffffff8100c20a>] child_rip+0xa/0x20
      [1904044.747419] Kernel panic - not syncing: LBUG


              jadhav.vikram VIKRAM BABASO JADHAV (Inactive)