[LU-5444] Lustre For MPSS3.3 Created: 01/Aug/14  Updated: 20/Aug/14  Resolved: 20/Aug/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.2
Fix Version/s: None

Type: Bug Priority: Critical
Reporter: Atul Yadav Assignee: Dmitry Eremin (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

MPSS 3.3
Lustre 2.5.2


Epic/Theme: Lustre-2.5.2, MPSS-3.3
Severity: 3
Rank (Obsolete): 15160

 Description   

Dear Team,

In our setup we are integrating Lustre with Intel Phi.
After following the steps below, we are able to build the Lustre MIC RPMs.
However, installing the modules package fails with a dependency error.

rpm -ivh /root/phi/mpss-3.3/mpss-sdk-k1om-3.3-1.x86_64.rpm
tar -xvjf /root/phi/mpss-3.3/src/linux-2.6.38+mpss3.3.tar.bz2
rpm2cpio mpss-3.3/k1om/kernel-dev-2.6.38+mpss3.3-1.knightscorner.rpm | cpio -idmv
cp boot/Module.symvers-2.6.38.8+mpss3.3 linux-2.6.38+mpss3.3/Module.symvers
cp boot/config-2.6.38.8+mpss3.3 linux-2.6.38+mpss3.3/.config
make ARCH=k1om silentoldconfig modules_prepare
source /opt/mpss/3.3/environment-setup-k1om-mpss-linux
export LD=k1om-mpss-linux-ld
./configure --with-linux=/root/phi/linux-2.6.38+mpss3.3 --disable-server --disable-tests --disable-doc --with-o2ib=/usr/src/ofa_kernel/default --host=k1om-mpss-linux --build=x86_64-pc-linux
Checking for unpackaged file(s): /usr/lib/rpm/check-files /root/rpmbuild/BUILDROOT/lustre-client-mic-2.5.2-2.6.38.8+mpss3.3.x86_64
Wrote: /root/rpmbuild/SRPMS/lustre-client-mic-2.5.2-2.6.38.8+mpss3.3.src.rpm
Wrote: /root/rpmbuild/RPMS/x86_64/lustre-client-mic-2.5.2-2.6.38.8+mpss3.3.x86_64.rpm
Wrote: /root/rpmbuild/RPMS/x86_64/lustre-client-mic-modules-2.5.2-2.6.38.8+mpss3.3.x86_64.rpm
Wrote: /root/rpmbuild/RPMS/x86_64/lustre-client-mic-source-2.5.2-2.6.38.8+mpss3.3.x86_64.rpm
Wrote: /root/rpmbuild/RPMS/x86_64/lustre-iokit-2.5.2-2.6.38.8+mpss3.3.x86_64.rpm
Wrote: /root/rpmbuild/RPMS/x86_64/lustre-client-mic-debuginfo-2.5.2-2.6.38.8+mpss3.3.x86_64.rpm
Error
[root@phi2 ~]# rpm -ivh /root/rpmbuild/RPMS/x86_64/lustre-client-mic-modules-2.5.2-2.6.38.8+mpss3.3.x86_64.rpm
error: Failed dependencies:
kernel = 2.6.38.8+mpss3.3 is needed by lustre-client-mic-modules-2.5.2-2.6.38.8+mpss3.3.x86_64
[root@phi2 ~]#

We need your help to proceed further.

Thank You
Atul Yadav



 Comments   
Comment by Atul Yadav [ 04/Aug/14 ]

Hello Team,
Is there any update on this request?

Comment by Dmitry Eremin (Inactive) [ 04/Aug/14 ]

You need one more step before configure, and the configure arguments should change as shown below:

$ rpm2cpio /root/phi/mpss-3.3/ofed/modules/ofed-driver-devel-2.6.32-220.el6.x86_64-3.3-1.x86_64.rpm | cpio -idm
$ ./configure --with-linux=/root/phi/linux-2.6.38+mpss3.3  --disable-server --disable-tests --disable-doc \
  --with-o2ib=/root/phi/usr/src/ofed-driver --host=k1om-mpss-linux --build=x86_64-pc-linux

You can avoid the incorrect dependency error by adding the --nodeps option to rpm.

# rpm -ivh --nodeps /root/rpmbuild/RPMS/x86_64/lustre-client-mic-modules-2.5.2-2.6.38.8+mpss3.3.x86_64.rpm

This is a known issue and will be fixed in the next release.
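
Before falling back on --nodeps, it can help to confirm which dependency the package actually declares. The following is a minimal sketch, not part of the original instructions: `kernel_deps` is a hypothetical helper name, and the package path is the one quoted in the report. `rpm -qp --requires` lists the capabilities a package file requires; the helper keeps only the entries that name `kernel`.

```shell
#!/bin/sh
# kernel_deps (hypothetical helper): read a requirements list on stdin,
# one capability per line, and print only the entries naming "kernel".
kernel_deps() {
    grep -E '^kernel([[:space:]]|$)' || true
}

# Package path taken from the report; adjust it for your own build.
pkg=/root/rpmbuild/RPMS/x86_64/lustre-client-mic-modules-2.5.2-2.6.38.8+mpss3.3.x86_64.rpm

# Query the package only if it exists, so the snippet is safe to run anywhere.
if [ -f "$pkg" ]; then
    rpm -qp --requires "$pkg" | kernel_deps
fi
```

If the only entry printed is the `kernel = 2.6.38.8+mpss3.3` requirement from the error message, then --nodeps is skipping just that one check.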

Comment by Atul Yadav [ 05/Aug/14 ]

Dear team,

Following your suggestions, we were able to build the Lustre MIC modules without any problem.
However, we now get an error when mounting Lustre.
The full details are below.
yum install libselinux-devel -y
rpm -ivh /root/phi/mpss-3.3/modules/mpss-modules-dev-2.6.32-431.el6.x86_64-3.3-1.x86_64.rpm
rpm -ivh /root/phi/mpss-3.3/modules/mpss-modules-2.6.32-431.el6.x86_64-3.3-1.x86_64.rpm
rpm -Uvh mpss-3.3/ofed/ofed-ibpd-3.3-r0.glibc2.12.2.x86_64.rpm
rpmbuild --rebuild --define "MOFED 1" mpss-3.3/src/dapl-2.0.42.2-1.glibc2.12.2.src.rpm mpss-3.3/src/libibscif-1.0.0-1.fc13.src.rpm mpss-3.3/src/ofed-driver-3.3-1.src.rpm
rpm -ivh /root/phi/mpss-3.3/mpss-sdk-k1om-3.3-1.x86_64.rpm
tar -xvjf /root/phi/mpss-3.3/src/linux-2.6.38+mpss3.3.tar.bz2
rpm2cpio mpss-3.3/k1om/kernel-dev-2.6.38+mpss3.3-1.knightscorner.rpm | cpio -idmv
rpm2cpio /root/phi/mpss-3.3/ofed/modules/ofed-driver-devel-2.6.32-220.el6.x86_64-3.3-1.x86_64.rpm | cpio -idmv
cp boot/Module.symvers-2.6.38.8+mpss3.3 linux-2.6.38+mpss3.3/Module.symvers
cp boot/config-2.6.38.8+mpss3.3 linux-2.6.38+mpss3.3/.config
cd linux-2.6.38+mpss3.3/
make ARCH=k1om silentoldconfig modules_prepare
source /opt/mpss/3.3/environment-setup-k1om-mpss-linux
export LD=k1om-mpss-linux-ld
cd /root/phi/lustre-release
git checkout 2.5.2
sh autogen.sh
./configure --with-linux=/root/phi/linux-2.6.38+mpss3.3 --disable-server --disable-tests --disable-doc --with-o2ib=/usr/src/ofed-driver --host=k1om-mpss-linux --build=x86_64-pc-linux
rpm -ivvh /root/rpmbuild/RPMS/x86_64/lustre-client-mic-modules-2.5.2-2.6.38.8+mpss3.3.x86_64.rpm --nodeps
rpm -ivh /root/rpmbuild/RPMS/x86_64/lustre-client-mic-2.5.2-2.6.38.8+mpss3.3.x86_64.rpm --nodeps

mic0# modprobe -v lnet
insmod /lib/modules/2.6.38.8+mpss3.3/extra/kernel/net/lustre/libcfs.ko
insmod /lib/modules/2.6.38.8+mpss3.3/extra/kernel/net/lustre/lnet.ko networks=o2ib0(ib0),o2ib1(ib1) forwarding="enabled"
mic0# modprobe -v lustre
insmod /lib/modules/2.6.38.8+mpss3.3/extra/kernel/fs/lustre/lvfs.ko
insmod /lib/modules/2.6.38.8+mpss3.3/extra/kernel/fs/lustre/obdclass.ko
insmod /lib/modules/2.6.38.8+mpss3.3/extra/kernel/fs/lustre/ptlrpc.ko
insmod /lib/modules/2.6.38.8+mpss3.3/extra/kernel/fs/lustre/fid.ko
insmod /lib/modules/2.6.38.8+mpss3.3/extra/kernel/fs/lustre/mdc.ko
insmod /lib/modules/2.6.38.8+mpss3.3/extra/kernel/fs/lustre/osc.ko
insmod /lib/modules/2.6.38.8+mpss3.3/extra/kernel/fs/lustre/lov.ko
insmod /lib/modules/2.6.38.8+mpss3.3/extra/kernel/fs/lustre/lustre.ko

mic0# mount.lustre 192.168.3.101@o2ib:/lustre /lustre

Error On MGS node
Aug 5 23:59:32 IO1 kernel: LNetError: 2139:0:(o2iblnd_cb.c:2267:kiblnd_passive_connect()) Can't accept 192.168.2.88@o2ib on 192.168.2.101@o2ib (ib1:1:192.168.3.101): bad dst nid 192.168.3.101@o2ib
Aug 5 23:59:57 IO1 kernel: LNetError: 2139:0:(o2iblnd_cb.c:2267:kiblnd_passive_connect()) Can't accept 192.168.2.88@o2ib on 192.168.2.101@o2ib (ib1:1:192.168.3.101): bad dst nid 192.168.3.101@o2ib
Aug 6 00:00:22 IO1 kernel: LNetError: 2139:0:(o2iblnd_cb.c:2267:kiblnd_passive_connect()) Can't accept 192.168.2.88@o2ib on 192.168.2.101@o2ib (ib1:1:192.168.3.101): bad dst nid 192.168.3.101@o2ib
Aug 6 00:11:01 IO1 kernel: LNetError: 2139:0:(o2iblnd_cb.c:2923:kiblnd_cm_callback()) 192.168.2.88@o2ib: REJECTED 28
Aug 6 00:11:26 IO1 kernel: LNetError: 2139:0:(o2iblnd_cb.c:2923:kiblnd_cm_callback()) 192.168.3.88@o2ib1: REJECTED 28

MDS
[root@IO1 ~]# lctl list_nids
192.168.2.101@o2ib
192.168.3.101@o2ib1

MIC0#
192.168.2.88@o2ib
192.168.3.88@o2ib1

We need assistance to resolve these errors.

Thank You
Atul Yadav

Comment by Dmitry Eremin (Inactive) [ 06/Aug/14 ]

Most probably you need to swap the IP addresses of ib0 and ib1. First, check that the servers are reachable from the MIC card with a plain ping 192.168.2.101 and ping 192.168.3.101. Then make sure that the network name assigned to each IB interface is the same as on the server. For example, if the name o2ib0 corresponds to 192.168.2.88 on the MIC, the same name o2ib0 should correspond to 192.168.2.101 on the server.
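
The naming check described above can be sketched as a small script. This is only an illustration under stated assumptions: `net_for_subnet` is a hypothetical helper, the two subnets are assumed to be /24 networks, and the NID lists are the ones quoted in the report, in the one-NID-per-line form printed by `lctl list_nids`.

```shell
#!/bin/sh
# net_for_subnet (hypothetical helper): given a /24 prefix such as
# "192.168.2" and a NID list on stdin (one NID per line), print the
# LNet network label used by the NID on that subnet.
net_for_subnet() {
    prefix=$1
    awk -F'@' -v p="$prefix." 'index($1, p) == 1 { print $2 }'
}

# NID lists copied from the report.
server_nids='192.168.2.101@o2ib
192.168.3.101@o2ib1'
client_nids='192.168.2.88@o2ib
192.168.3.88@o2ib1'

# Compare the network label that each side uses on each subnet.
for subnet in 192.168.2 192.168.3; do
    s=$(printf '%s\n' "$server_nids" | net_for_subnet "$subnet")
    c=$(printf '%s\n' "$client_nids" | net_for_subnet "$subnet")
    if [ "$s" = "$c" ]; then
        echo "$subnet: server and client both use $s"
    else
        echo "$subnet: MISMATCH (server=$s client=$c)"
    fi
done
```

With the lists quoted in the report, both subnets report matching labels; a MISMATCH line would indicate the swapped naming described above.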

Comment by Atul Yadav [ 10/Aug/14 ]

Dear Team,

Even after swapping the IP addresses, it is still not working.

We are able to ping the IB addresses, but we are not able to ping through lctl.

Thank You
Atul Yadav

Comment by Dmitry Eremin (Inactive) [ 12/Aug/14 ]

According to your logs, the Lustre client works, but the Lustre configuration is incorrect. Please check /etc/modprobe.d/lustre.conf on the servers and clients for the correct configuration. It seems you have an issue with the gateway configuration between the two networks.
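
To compare the configuration across nodes, one option is to print the LNet networks setting from each node's file. The sketch below assumes the usual modprobe form, an `options lnet networks=...` line, with a value that contains no spaces; `lnet_networks` is a hypothetical helper, not a Lustre tool.

```shell
#!/bin/sh
# lnet_networks (hypothetical helper): print the value of the "networks"
# parameter from the "options lnet" line of a modprobe configuration file.
# Assumes the value contains no whitespace; surrounding quotes are optional.
lnet_networks() {
    sed -n 's/^[[:space:]]*options[[:space:]]\{1,\}lnet[[:space:]].*networks="\{0,1\}\([^"[:space:]]*\)"\{0,1\}.*/\1/p' "$1"
}

# Print the setting on this node if the file is present and readable.
conf=/etc/modprobe.d/lustre.conf
if [ -r "$conf" ]; then
    echo "lnet networks: $(lnet_networks "$conf")"
fi
```

Running this on the server and on the MIC card and comparing the two lines shows at a glance whether each network name is bound to the expected interface on both sides.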

Comment by Dmitry Eremin (Inactive) [ 12/Aug/14 ]

Maybe you just need to change the mount command to the following:
# mount.lustre 192.168.3.101@o2ib1:/lustre /lustre
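
The server log reports "bad dst nid 192.168.3.101@o2ib", while lctl list_nids on the server lists that address as 192.168.3.101@o2ib1. A quick way to check a mount target against the server's NID list is sketched below; `nid_listed` is a hypothetical helper, and the NID list is copied from the report.

```shell
#!/bin/sh
# nid_listed (hypothetical helper): succeed if the NID given as $1 appears
# as a whole line in the NID list read from stdin.
nid_listed() {
    grep -Fxq -- "$1"
}

# Server NIDs as printed by `lctl list_nids` in the report.
server_nids='192.168.2.101@o2ib
192.168.3.101@o2ib1'

# Check the NID from the original mount command, then the corrected one.
for nid in 192.168.3.101@o2ib 192.168.3.101@o2ib1; do
    if printf '%s\n' "$server_nids" | nid_listed "$nid"; then
        echo "$nid: exported by the server"
    else
        echo "$nid: not in the server's NID list"
    fi
done
```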

Comment by Dmitry Eremin (Inactive) [ 18/Aug/14 ]

Just for info: I have updated the instructions on my wiki page. It now contains detailed steps for building Lustre with MPSS 3.3.
