[LU-16443] "mount -t lustre ..." Does not work. (Backend fs=ZFS) Created: 04/Jan/23  Updated: 04/Jan/23  Resolved: 04/Jan/23

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Arshad Hussain Assignee: WC Triage
Resolution: Not a Bug Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This is a single-node (client + server) VM and the backend FS is ZFS. 'mkfs' works. However, mounting fails when called via 'mount -t lustre ...'. It works perfectly when called through 'mount.lustre'. I am not sure if this is the general case, but it seems to be affecting me.

What works
mkfs followed by mount.lustre:

$ mkfs -t lustre --reformat --index 0 --backfstype=zfs --fsname=lustre --mgsnode=192.168.50.117@tcp --mgs --mdt gpool/metadata /dev/loop0

$ mount.lustre -o localrecov gpool/metadata /mnt/zfsmdt
 
$ mount | grep lustre
gpool/metadata on /mnt/zfsmdt type lustre (rw,seclabel,svname=lustre-MDT0000,mgs,osd=osd-zfs)
 

What does not work

With the same mkfs as above, mounting via 'mount -t lustre' instead of 'mount.lustre' fails:

$ mount -t lustre -o localrecov gpool/metadata /mnt/zfsmdt
mount.lustre: gpool/metadata has not been formatted with mkfs.lustre or the backend filesystem type is not supported by this tool
 

Where exactly it is failing

The failure is in load_backfs_module() in lustre/utils/mount_utils.c:

handle = dlopen(filename, RTLD_LAZY); /* where filename is /usr/lib/lustre/mount_osd_zfs.so */
libzfs.so.4: cannot open shared object file: No such file or directory (dlerror() output)
 

The library cannot be loaded, so the backfs_ops callbacks are all NULL, which makes the mount fail.

In the mount.lustre case, the dlopen() call succeeds and the backend callbacks (zfs_init(), is_lustre(), etc.) are properly registered.
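
One way to confirm this kind of failure without stepping through mount_utils.c is to ask ldd which dependencies of the plugin the loader cannot resolve. The following is a minimal sketch and not part of Lustre: the helper names unresolved_libs and check_plugin are hypothetical, and the plugin path is the one quoted above. It should be run in the same environment as the failing command, since the result depends on the library search path in effect.

```shell
#!/bin/sh
# Hypothetical helper: read ldd output on stdin and print the names of
# libraries that ldd reports as "not found".
unresolved_libs() {
    awk '/=> not found/ { print $1 }'
}

# Hypothetical helper: list the unresolved dependencies of one shared object.
check_plugin() {
    ldd "$1" 2>&1 | unresolved_libs
}

# Plugin path taken from the report; the check is skipped if the file is absent.
plugin=/usr/lib/lustre/mount_osd_zfs.so
if [ -e "$plugin" ]; then
    check_plugin "$plugin"
fi
```

An empty result means every dependency was resolved. A line naming libzfs.so.4 would correspond to the dlerror() message shown above.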

 

 

Other information:

 

$ uname -r
3.10.0-1160.15.2.el7.x86_64
$ cat /etc/redhat-release 
CentOS Linux release 7.9.2009 (Core)
 

The ZFS libraries are installed in a non-standard path:

 

# pkg-config --cflags libzfs
-I/root/zfs/zfs_git_lustre_build/zfsbins/include/libzfs -I/root/zfs/zfs_git_lustre_build/zfsbins/include/libspl -I/usr/include/blkid -I/usr/include/uuid  
# pkg-config --libs libzfs
-L/root/zfs/zfs_git_lustre_build/zfsbins/lib -lzfs -lzfs_core -lnvpair 
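
The directory that the dynamic loader needs to know about is the -L entry in the flags above. As a rough sketch, it can be extracted with a small filter. The helper name libdirs_from_flags is hypothetical, and the filter assumes that the directory names contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: read linker flags on stdin and print each -L directory
# on its own line. Assumes directory names contain no spaces.
libdirs_from_flags() {
    tr -s ' ' '\n' | sed -n 's/^-L//p'
}

# Flags copied from the "pkg-config --libs libzfs" output in this report.
flags='-L/root/zfs/zfs_git_lustre_build/zfsbins/lib -lzfs -lzfs_core -lnvpair'
printf '%s\n' "$flags" | libdirs_from_flags
```

On a live system the same filter could be fed from "pkg-config --libs-only-L libzfs" instead of the copied string.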

 



 Comments   
Comment by Andreas Dilger [ 04/Jan/23 ]

It sounds like library path resolution is the source of the problem. Having the ZFS libraries in a non-standard path is probably preventing the dlopen() from succeeding. You could try adding "LD_LIBRARY_PATH=/root/...." before the command to see if it helps, and/or add that directory to /etc/ld.so.conf (then run ldconfig).
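
For the persistent option, a possible approach on a distribution whose /etc/ld.so.conf includes the /etc/ld.so.conf.d directory (as CentOS 7 does) is to add a drop-in file and rebuild the linker cache. This is only a sketch: the helper name register_libdir and the drop-in file name zfs-local.conf are assumptions made for illustration, and the library directory in the usage comment is the one from the pkg-config output in the description.

```shell
#!/bin/sh
# Hypothetical helper: write a library directory into an ld.so.conf.d drop-in.
#   $1 = directory containing the shared libraries
#   $2 = drop-in file to write (the default file name is an assumption)
register_libdir() {
    libdir=$1
    conf=${2:-/etc/ld.so.conf.d/zfs-local.conf}
    printf '%s\n' "$libdir" > "$conf"
}

# Intended use, as root (left commented out because it changes system state):
#   register_libdir /root/zfs/zfs_git_lustre_build/zfsbins/lib
#   ldconfig                     # rebuild the dynamic linker cache
#   ldconfig -p | grep libzfs    # check that libzfs is now listed
```

Unlike LD_LIBRARY_PATH, this setting does not depend on the environment of the shell that invokes mount.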

Comment by Arshad Hussain [ 04/Jan/23 ]

It was failing due to an incorrect library path. Setting LD_LIBRARY_PATH (as Andreas pointed out) makes it work.

 

 

$ export LD_LIBRARY_PATH=/root/zfs/zfs_git_lustre_build/zfsbins/lib/
$ mount -t lustre -o localrecov gpool/metadata /mnt/zfsmdt
$ mount | grep lustre
gpool/metadata on /mnt/zfsmdt type lustre (rw,seclabel,svname=lustre-MDT0000,mgs,osd=osd-zfs)

 

Comment by Arshad Hussain [ 04/Jan/23 ]

Not a bug. The library path was either set incorrectly or not set at all.

Generated at Sat Feb 10 03:27:04 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.