Lustre / LU-2718

Unable to re-mount OST (-17)

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Minor

    Description

      Seen in a 2.3.58 build.

      The error messages here are similar to LU-2110, but these occur when mounting a target right after a umount. This is easily reproducible in my test environment.

      2013-01-30 22:11:47,707 [1494] shell.run: ['mount', '-t', 'lustre', '/dev/sdf', '/mnt/targets/scratch-OST0009']
      2013-01-30 22:11:47,948 [1494] ERR> mount.lustre: mount /dev/xvdj at /mnt/targets/scratch-OST0009 failed: File exists
      

      Messages:

      LustreError: 1634:0:(genops.c:318:class_newdev()) Device scratch-MDT0000-osp-OST0009 already exists at 4, won't add
      LustreError: 1634:0:(obd_config.c:374:class_attach()) Cannot create device scratch-MDT0000-osp-OST0009 of type osp : -17
      LustreError: 1634:0:(obd_mount.c:373:lustre_start_simple()) scratch-MDT0000-osp-OST0009 attach error -17
      LustreError: 1634:0:(obd_mount.c:1137:lustre_osp_setup()) scratch-MDT0000-osp-OST0009: setup up failed: rc -17
      LustreError: 15c-8: MGC10.197.11.79@tcp: The configuration from log 'scratch-client' failed (-17). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
      LustreError: 1598:0:(obd_mount.c:1865:server_start_targets()) scratch-OST0009: failed to start OSP: -17
      LustreError: 1598:0:(obd_mount.c:2400:server_fill_super()) Unable to start targets: -17
      LustreError: 1598:0:(obd_mount.c:1365:lustre_disconnect_osp()) Can't find osp-on-ost scratch-MDT0000-osp-OST0009
      LustreError: 1598:0:(obd_mount.c:2114:server_put_super()) scratch-OST0009: failed to disconnect osp-on-ost (rc=-2)!
      Lustre: Failing over scratch-OST0009
      LustreError: 1598:0:(obd_mount.c:1420:lustre_stop_osp()) Can not find osp-on-ost scratch-MDT0000-osp-OST0009
      LustreError: 1598:0:(obd_mount.c:2159:server_put_super()) scratch-OST0009: Fail to stop osp-on-ost!
      Lustre: server umount scratch-OST0009 complete
      LustreError: 1598:0:(obd_mount.c:2988:lustre_fill_super()) Unable to mount  (-17)
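
The return codes in these messages are negated errno values. As a reading aid, here is a minimal sketch that decodes them with Python's standard `errno` module; the helper name `describe_rc` is hypothetical and not part of Lustre. It shows that -17 is EEXIST (matching the "File exists" text from mount.lustre) and that the -2 from the later disconnect attempt is ENOENT.

```python
import errno
import os


def describe_rc(rc):
    """Map a negated errno return code to (symbolic name, message).

    Hypothetical helper for reading the log above; not a Lustre API.
    """
    code = abs(rc)
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


if __name__ == "__main__":
    # -17 comes from class_attach(); -2 comes from the failed osp disconnect.
    for rc in (-17, -2):
        name, text = describe_rc(rc)
        print(f"rc={rc}: {name} ({text})")
```

Read this way, the sequence is: attaching the osp device fails because a device of that name is still registered (EEXIST), and the subsequent teardown cannot find an osp device belonging to this mount attempt (ENOENT).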
      

      From reading the debug log (attached), the osp device is not cleaned up until a few seconds after the initial umount completes. In the meantime, there were two unsuccessful mount attempts. After the osp device was cleaned up, I was able to mount.

      Would it be possible to block umount on this device cleanup?
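
Until umount waits for that cleanup, a caller could work around the race by retrying the mount. The following is only a sketch of that idea, not a fix for the underlying problem. The function name `mount_with_retry`, the attempt count, and the delay are assumptions for illustration. It detects the condition by matching the "File exists" text that mount.lustre printed above, because the exit status of mount.lustre in this case is not shown in the report; that match also assumes an English locale.

```python
import subprocess
import time


def mount_with_retry(device, mountpoint, attempts=5, delay=2.0,
                     run=subprocess.run, sleep=time.sleep):
    """Retry a Lustre mount while it fails with "File exists".

    Hypothetical workaround sketch. `run` and `sleep` are injectable so
    the logic can be exercised without a real Lustre target.
    Returns the number of the attempt that succeeded.
    """
    cmd = ["mount", "-t", "lustre", device, mountpoint]
    for attempt in range(1, attempts + 1):
        result = run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return attempt
        # Only the EEXIST case is expected to clear up on its own.
        transient = "File exists" in (result.stderr or "")
        if not transient or attempt == attempts:
            raise RuntimeError(
                f"mount failed after {attempt} attempt(s): {result.stderr}"
            )
        sleep(delay)
```

For example, `mount_with_retry('/dev/sdf', '/mnt/targets/scratch-OST0009')` would retry for roughly eight seconds with the defaults, which covers the few-second window described above. Any other mount error is raised immediately rather than retried.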

          People

            wc-triage WC Triage
            rread Robert Read (Inactive)