[LU-6763] redefinition of sk_sleep when using external OFED and CentOS 6.5 Created: 24/Jun/15  Updated: 18/Jul/16  Resolved: 10/Jul/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.8.0

Type: Bug Priority: Minor
Reporter: Justin Miller Assignee: Amir Shehata (Inactive)
Resolution: Fixed Votes: 0
Labels: patch
Environment:

Lustre master on CentOS 6.5 with 2.6.32-431.el6 kernel and OFED 3.12-1


Issue Links:
Related
is related to LU-6769 Mellanox backport header (kthread.h) ... Resolved
is related to LU-8401 modprobe: ERROR: could not insert 'ln... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

While building Lustre master on CentOS 6.5 with 2.6.32-431.el6 kernel and OFED 3.12-1 we get a build error:

[ 212s] In file included from /home/abuild/rpmbuild/BUILD/cray-lustre/lnet/klnds/o2iblnd/o2iblnd.h:82,
[ 212s] from /home/abuild/rpmbuild/BUILD/cray-lustre/lnet/klnds/o2iblnd/o2iblnd.c:42:
[ 212s] /home/abuild/rpmbuild/BUILD/cray-lustre/lnet/include/lnet/lib-lnet.h:708: error: redefinition of 'sk_sleep'
[ 212s] /usr/src/kernel-modules-ofed/x86_64/cray_ari_s_cos/include/linux/compat-2.6.35.h:41: note: previous definition of 'sk_sleep' was here

The kernel is not providing sk_sleep:
[ 146s] checking if Linux kernel has 'sk_sleep'... no

So it appears that the compat header from OFED defines sk_sleep, then later lnet/include/lnet/lib-lnet.h checks if the kernel provides sk_sleep and defines sk_sleep again if the kernel doesn't.
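
The collision can be reproduced outside the kernel with a small mock. This is only a sketch under stated assumptions: `wait_queue_head_t` and `struct sock` are stand-ins rather than the kernel definitions, `COMPAT_PROVIDES_SK_SLEEP` is a hypothetical marker macro, and the fallback body assumes the pre-2.6.35 layout where the wait queue is reached through a `sk_sleep` field. It shows one way the second definition could be guarded so that only one `sk_sleep` is ever compiled.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for kernel types; NOT the real definitions. */
typedef struct { int dummy; } wait_queue_head_t;
struct sock { wait_queue_head_t *sk_sleep; };

/* What the OFED compat header effectively does on an old kernel.
 * COMPAT_PROVIDES_SK_SLEEP is a hypothetical marker for this sketch. */
#define COMPAT_PROVIDES_SK_SLEEP 1
static inline wait_queue_head_t *sk_sleep(struct sock *sk)
{
        return sk->sk_sleep;
}

/* What lib-lnet.h does when configure reports that the kernel has no
 * sk_sleep(). Without the second condition this block would be a
 * redefinition, which is the error in the log above. */
#if !defined(HAVE_SK_SLEEP) && !defined(COMPAT_PROVIDES_SK_SLEEP)
static inline wait_queue_head_t *sk_sleep(struct sock *sk)
{
        return sk->sk_sleep;
}
#endif

/* Returns 1 when the single surviving sk_sleep() resolves the queue. */
int sk_sleep_demo(void)
{
        wait_queue_head_t wq = { 0 };
        struct sock sk = { &wq };

        return sk_sleep(&sk) == &wq;
}
```

Dropping the `COMPAT_PROVIDES_SK_SLEEP` condition from the `#if` makes the compiler reject the file with a redefinition error, which mirrors the failure reported here.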



 Comments   
Comment by James A Simmons [ 24/Jun/15 ]

One of those funny corner cases. The solution is to move the sk_sleep test out of libcfs and into lustre-lnet.m4. There we can use HAVE_COMPACT to include compat-2.6.h, which will contain the OFED version of sk_sleep.

Comment by Gerrit Updater [ 24/Jun/15 ]

James Simmons (uja.ornl@yahoo.com) uploaded a new patch: http://review.whamcloud.com/15386
Subject: LU-6763 lnet: test for sk_sleep presence in compact-2.6.h
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 7f8c7d7904ebad8afbcbd64fc147f7ea7abe23da

Comment by James A Simmons [ 24/Jun/15 ]

Justin, can you try the patch at http://review.whamcloud.com/#/c/15386?

Comment by Chris Horn [ 24/Jun/15 ]

Hi James,

I tried the patch and got this:

[  116s] /home/abuild/rpmbuild/BUILD/cray-lustre/lnet/lnet/lib-socket.c: In function 'lnet_sock_accept':
[  116s] /home/abuild/rpmbuild/BUILD/cray-lustre/lnet/lnet/lib-socket.c:575: error: implicit declaration of function 'sk_sleep'
[  116s] cc1: warnings being treated as errors
[  116s] /home/abuild/rpmbuild/BUILD/cray-lustre/lnet/lnet/lib-socket.c:575: error: passing argument 1 of 'add_wait_queue' makes pointer from integer without a cast
[  116s] /usr/src/linux-2.6.32-431.el6_1.0000.8785/include/linux/wait.h:122: note: expected 'struct wait_queue_head_t *' but argument is of type 'int'
[  116s] /home/abuild/rpmbuild/BUILD/cray-lustre/lnet/lnet/lib-socket.c:584: error: passing argument 1 of 'remove_wait_queue' makes pointer from integer without a cast
[  116s] /usr/src/linux-2.6.32-431.el6_1.0000.8785/include/linux/wait.h:124: note: expected 'struct wait_queue_head_t *' but argument is of type 'int'
[  116s] make[8]: *** [/home/abuild/rpmbuild/BUILD/cray-lustre/lnet/lnet/lib-socket.o] Error 1

Comment by Chris Horn [ 24/Jun/15 ]

FWIW, we're trying to build commit 0b868add80281c085ce1b297d1cb078deaab802a + your patch from this ticket. I'll try a later master commit.

Comment by James A Simmons [ 24/Jun/15 ]

Ah, I see what I missed. lib-socket.c needs to include compat-2.6.h as well.

Comment by James A Simmons [ 24/Jun/15 ]

Give it a try now.

Comment by Chris Horn [ 24/Jun/15 ]

One thing after another. I'm not sure why o2iblnd is able to compile (o2iblnd.h also includes compat-2.6.h) but not lib-socket.c...

[ 223s] /home/abuild/rpmbuild/BUILD/cray-lustre/lnet/lnet/lib-socket.c:46:30: error: linux/compat-2.6.h: No such file or directory

Comment by Chris Horn [ 24/Jun/15 ]

I think the answer lies in the Makefile.in's. lnet/klnds/o2iblnd/Makefile.in has:

# Need to make sure that an external OFED source pool overrides
# any in-kernel OFED sources
NOSTDINC_FLAGS += @EXTRA_OFED_INCLUDE@

I think we'd need something similar in the lnet/lnet/Makefile.in

Edit - I'll add that automake and friends are sort of black magic to me, so there might be a different or better way of fixing this.

Comment by James A Simmons [ 24/Jun/15 ]

Added automagic stuff. Try it again.

Comment by James A Simmons [ 25/Jun/15 ]

Sorry, I needed to update the patch. I discovered that Mellanox doesn't just provide a compatibility layer; it actually replaces core Linux network headers. To handle this, compat-2.6.h has to come first in every file that includes sock.h. I found that just placing compat-2.6.h in lib-types.h caused other nightmares, so the best solution was to remove sock.h from lib-types.h and place compat-2.6.h at the top of every file that uses sock.h.
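
Why include order matters can be illustrated with a self-contained mock. The ticket does not say how the Mellanox headers take over, so this sketch assumes the common include-guard technique: the replacement header claims the stock header's guard macro, and whichever header is seen first wins. All names here (`MOCK_NET_SOCK_H`, `MOCK_SOCK_PROVIDER`, the `struct sock` fields) are hypothetical.

```c
#include <assert.h>

/* Stand-in for the OFED replacement header. It must be seen first. */
#ifndef MOCK_NET_SOCK_H
#define MOCK_NET_SOCK_H           /* claims the stock header's guard */
#define MOCK_SOCK_PROVIDER 2      /* 2 = replacement definitions */
struct sock { int sk_wq; };
#endif

/* Stand-in for the stock sock.h. Its guard is already defined, so the
 * preprocessor skips this whole section. */
#ifndef MOCK_NET_SOCK_H
#define MOCK_NET_SOCK_H
#define MOCK_SOCK_PROVIDER 1      /* 1 = stock definitions */
struct sock { int sk_sleep; };
#endif

/* Reports which set of definitions the translation unit ended up with. */
int sock_provider(void)
{
        return MOCK_SOCK_PROVIDER;
}
```

Swapping the two sections makes the stock definitions win and the replacement silently vanish, which is presumably the kind of mismatch that putting the compat header at the top of each file avoids.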

Comment by Peter Jones [ 03/Jul/15 ]

Amir

Could you please take care of this patch?

Thanks

Peter

Comment by Gerrit Updater [ 10/Jul/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15386/
Subject: LU-6763 lnet: test for sk_sleep presence in compact-2.6.h
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 87fe2c045ff07cadb3c2034618254a6acfe53180

Comment by Peter Jones [ 10/Jul/15 ]

Landed for 2.8
