Details
Bug
Resolution: Cannot Reproduce
Minor
3
6611
Description
Seen in a 2.3.58 build.
The error messages here are similar to LU-2110, but these occur when mounting a target right after a umount. This is easily reproducible in my test environment.
2013-01-30 22:11:47,707 [1494] shell.run: ['mount', '-t', 'lustre', '/dev/sdf', '/mnt/targets/scratch-OST0009']
2013-01-30 22:11:47,948 [1494] ERR> mount.lustre: mount /dev/xvdj at /mnt/targets/scratch-OST0009 failed: File exists
Messages:
LustreError: 1634:0:(genops.c:318:class_newdev()) Device scratch-MDT0000-osp-OST0009 already exists at 4, won't add
LustreError: 1634:0:(obd_config.c:374:class_attach()) Cannot create device scratch-MDT0000-osp-OST0009 of type osp : -17
LustreError: 1634:0:(obd_mount.c:373:lustre_start_simple()) scratch-MDT0000-osp-OST0009 attach error -17
LustreError: 1634:0:(obd_mount.c:1137:lustre_osp_setup()) scratch-MDT0000-osp-OST0009: setup up failed: rc -17
LustreError: 15c-8: MGC10.197.11.79@tcp: The configuration from log 'scratch-client' failed (-17). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
LustreError: 1598:0:(obd_mount.c:1865:server_start_targets()) scratch-OST0009: failed to start OSP: -17
LustreError: 1598:0:(obd_mount.c:2400:server_fill_super()) Unable to start targets: -17
LustreError: 1598:0:(obd_mount.c:1365:lustre_disconnect_osp()) Can't find osp-on-ost scratch-MDT0000-osp-OST0009
LustreError: 1598:0:(obd_mount.c:2114:server_put_super()) scratch-OST0009: failed to disconnect osp-on-ost (rc=-2)!
Lustre: Failing over scratch-OST0009
LustreError: 1598:0:(obd_mount.c:1420:lustre_stop_osp()) Can not find osp-on-ost scratch-MDT0000-osp-OST0009
LustreError: 1598:0:(obd_mount.c:2159:server_put_super()) scratch-OST0009: Fail to stop osp-on-ost!
Lustre: server umount scratch-OST0009 complete
LustreError: 1598:0:(obd_mount.c:2988:lustre_fill_super()) Unable to mount (-17)
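For readers unfamiliar with the numeric return codes in the messages above, they are negated Linux errno values. The following is a small sketch (the helper name `describe_rc` is made up for illustration) that maps them to their symbolic names using the Python standard library; -17 corresponds to EEXIST, which matches the "File exists" text reported by mount.lustre, and -2 corresponds to ENOENT.

```python
import errno
import os


def describe_rc(rc: int) -> str:
    """Map a negative kernel-style return code to its errno name and message."""
    code = abs(rc)
    # errno.errorcode maps numeric values to names such as 'EEXIST'.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


if __name__ == "__main__":
    # The two return codes that appear in the log excerpt.
    for rc in (-17, -2):
        print(rc, describe_rc(rc))
```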
From reading the debug log (attached), the osp device is not cleaned up until a few seconds after the initial umount completes. In the meantime, there were two unsuccessful mount attempts. Once the osp device had been cleaned up, I was able to mount.
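Until the cleanup ordering is addressed, a test harness could work around the window by retrying the mount. The sketch below is only an illustration under assumptions, not something taken from the attached logs: the function name `mount_with_retry`, the attempt count, and the delay are hypothetical, and it assumes the failure can be recognized by the "File exists" text that mount.lustre prints on stderr.

```python
import subprocess
import time


def mount_with_retry(device, mountpoint, attempts=5, delay=2.0,
                     run=subprocess.run, sleep=time.sleep):
    """Retry a Lustre mount while it fails with 'File exists'.

    Returns the number of the attempt that succeeded. `run` and `sleep`
    are injectable so the logic can be exercised without a real target.
    """
    cmd = ["mount", "-t", "lustre", device, mountpoint]
    for attempt in range(1, attempts + 1):
        result = run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return attempt
        # Only retry the specific failure seen while the old osp device lingers.
        if "File exists" not in result.stderr:
            raise RuntimeError(f"mount failed: {result.stderr.strip()}")
        if attempt < attempts:
            sleep(delay)
    raise TimeoutError(f"mount still failing after {attempts} attempts")
```

A caller would use something like `mount_with_retry('/dev/sdf', '/mnt/targets/scratch-OST0009')`. This only hides the race from the caller; it does not change when the osp device is released.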
Would it be possible to block umount on this device cleanup?