Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-4606

Lustre hard codes libzfs.so.1 in lustre/utils/mount_utils_zfs.c

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.6.0, Lustre 2.5.3
    • Lustre 2.5.0, Lustre 2.6.0, Lustre 2.4.2
    • RHEL 6.x Lustre 2.4.2 with ZFS
    • 3
    • 12600

    Description

      Lustre hard codes libzfs.so.1 in mount_utils_zfs.c
      ZoL git tree (after v0.6.2 tag) has bumped the so version to 2:0:0 so
      libzfs is now libzfs.so.2.

      This breaks user space utilities.
      Linking libzfs.so.2.0.0 to libzfs.so.1 works so there doesn't seem to be
      any ABI breakage.

      Attachments

        Issue Links

          Activity

            [LU-4606] Lustre hard codes libzfs.so.1 in lustre/utils/mount_utils_zfs.c

            Patch landed to master, Opened LU-5096 to address Brian's issues with custom libzfs location that I missed.

            utopiabound Nathaniel Clark added a comment - Patch landed to master, Opened LU-5096 to address Brian's issues with custom libzfs location that I missed.

            This splits zfs functionality out of mount/mkfs into a loadable module and puts that module in lustre-osd-zfs

            http://review.whamcloud.com/10193

            utopiabound Nathaniel Clark added a comment - This splits zfs functionality out of mount/mkfs into a loadable module and puts that module in lustre-osd-zfs http://review.whamcloud.com/10193

            LLNL completely agrees with the previous commenters in this ticket that argue for the removal of the hardcoded zfs library versions in dlopen() calls in lustre. I commented to that effect in my review of patch 8697.

            The issue was raised that rpm will add a dependency on zfs if you compile with ZFS support. Frankly, I don't see the problem. If you don't want a dependency on zfs, don't compile against zfs. Same with ldiskfs.

            But ok, fine, I'll play along. So you want to make both OSD rpms available, but neither required. Then obviously we need to have any direct dependencies on the contents of those OSD rpms contained in those OSD rpms, not present in more general shared RPMs.

            So how about fixing that issue? Modularize the mount command. Each OSD rpm offers a dynamically loadable component for the main mount command. The mount command can then load the osd-specific shared library as needed, and only that osd shared library has dependencies on the backend filesystem. At rpm build time, the zfs osd rpm would have the dependencies on zfs, and the ldiskfs osd rpm would have the dependencies on ldiskfs. No leakage of dependencies into the more general rpms.

            I'm just throwing this out off the top of my head. Other options are possible.

            Granted, our current hard-code-the-library-string strategy was easy the first time, but it also half-assed and causing maintainance problems. Lets take the time to do it right this time around.

            morrone Christopher Morrone (Inactive) added a comment - LLNL completely agrees with the previous commenters in this ticket that argue for the removal of the hardcoded zfs library versions in dlopen() calls in lustre. I commented to that effect in my review of patch 8697 . The issue was raised that rpm will add a dependency on zfs if you compile with ZFS support. Frankly, I don't see the problem. If you don't want a dependency on zfs, don't compile against zfs. Same with ldiskfs. But ok, fine, I'll play along. So you want to make both OSD rpms available, but neither required. Then obviously we need to have any direct dependencies on the contents of those OSD rpms contained in those OSD rpms, not present in more general shared RPMs. So how about fixing that issue? Modularize the mount command. Each OSD rpm offers a dynamically loadable component for the main mount command. The mount command can then load the osd-specific shared library as needed, and only that osd shared library has dependencies on the backend filesystem. At rpm build time, the zfs osd rpm would have the dependencies on zfs, and the ldiskfs osd rpm would have the dependencies on ldiskfs. No leakage of dependencies into the more general rpms. I'm just throwing this out off the top of my head. Other options are possible. Granted, our current hard-code-the-library-string strategy was easy the first time, but it also half-assed and causing maintainance problems. Lets take the time to do it right this time around.

            Does sanity/65ia consistently reproduce this issue?

            behlendorf Brian Behlendorf added a comment - Does sanity/65ia consistently reproduce this issue?

            The ZFS bug need not hold up fixing this bug correctly.

            morrone Christopher Morrone (Inactive) added a comment - The ZFS bug need not hold up fixing this bug correctly.

            Ran with lastest masters of lustre, zfs and spl http://review.whamcloud.com/#/c/8979/6 but ran into ZFS bug 1891 in sanity/65ia

            utopiabound Nathaniel Clark added a comment - Ran with lastest masters of lustre, zfs and spl http://review.whamcloud.com/#/c/8979/6 but ran into ZFS bug 1891 in sanity/65ia

            It seemed that the dynamic linker would only try to resolve symbols when they're first used. From ld.so(8):
            LD_BIND_NOW
            If present, causes the dynamic linker to resolve all symbols at program startup instead of when they are first referenced.

            If true, then it might work if we just delay that zfs_init() call until when the 1st real zfs operation is needed, e.g. in zfs_make_lustre(). Then we'd be able to get rid of the dlopen() without causing headache to ldiskfs systems. I'm no expert on the dynamic linker, but it seems worthwhile to give it a shot.

            isaac Isaac Huang (Inactive) added a comment - It seemed that the dynamic linker would only try to resolve symbols when they're first used. From ld.so(8): LD_BIND_NOW If present, causes the dynamic linker to resolve all symbols at program startup instead of when they are first referenced. If true, then it might work if we just delay that zfs_init() call until when the 1st real zfs operation is needed, e.g. in zfs_make_lustre(). Then we'd be able to get rid of the dlopen() without causing headache to ldiskfs systems. I'm no expert on the dynamic linker, but it seems worthwhile to give it a shot.

            The reason that mkfs.lustre is using dlopen() instead of linking to the ZFS library directly is so that the same user tools can be used on systems with or without ZFS packages installed.

            See ORI-425: http://review.whamcloud.com/1740 and http://review.whamcloud.com/1742

            adilger Andreas Dilger added a comment - The reason that mkfs.lustre is using dlopen() instead of linking to the ZFS library directly is so that the same user tools can be used on systems with or without ZFS packages installed. See ORI-425: http://review.whamcloud.com/1740 and http://review.whamcloud.com/1742
            ryao Richard Yao added a comment -

            ZFSOnLinux imported a change to its internal library API from Illumos, which required a SONAME bump. It would be advisable to evaluate a switch to the new libzfs_core library, which is meant to provide a public library API with a stable interface.

            ryao Richard Yao added a comment - ZFSOnLinux imported a change to its internal library API from Illumos, which required a SONAME bump. It would be advisable to evaluate a switch to the new libzfs_core library, which is meant to provide a public library API with a stable interface.

            Its better to add configure time check for libzfs version and link it directly to mkfs.lustre instead of dlopening it

            alexxy Alexey Shvetsov (Inactive) added a comment - Its better to add configure time check for libzfs version and link it directly to mkfs.lustre instead of dlopening it

            People

              utopiabound Nathaniel Clark
              aalba6675 Anthony Alba
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: