[LU-6763] redefinition of sk_sleep when using external OFED and CentOS 6.5 Created: 24/Jun/15 Updated: 18/Jul/16 Resolved: 10/Jul/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Justin Miller | Assignee: | Amir Shehata (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch | ||
| Environment: | Lustre master on CentOS 6.5 with 2.6.32-431.el6 kernel and OFED 3.12-1 |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
While building Lustre master on CentOS 6.5 with the 2.6.32-431.el6 kernel and OFED 3.12-1, we get a build error:
[ 212s] In file included from /home/abuild/rpmbuild/BUILD/cray-lustre/lnet/klnds/o2iblnd/o2iblnd.h:82,
The kernel is not providing sk_sleep:
So it appears that the compat header from OFED defines sk_sleep, and lnet/include/lnet/lib-lnet.h later checks whether the kernel provides sk_sleep and defines it again if the kernel doesn't. |
| Comments |
| Comment by James A Simmons [ 24/Jun/15 ] |
|
One of those funny corner cases. The solution is to move the sk_sleep test out of libcfs and into lustre-lnet.m4. There we can use HAVE_COMPACT to include compat-2.6.h, which will contain the OFED version of sk_sleep. |
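The collision described above, and the idea of letting the feature probe see the OFED header, can be modeled in userspace. This is only a sketch under stated assumptions: every `mock_*` / `MOCK_*` name is hypothetical, the types are stand-ins for the kernel's, and it is not the patch that landed.

```c
#include <stddef.h>

/* Userspace stand-ins for kernel types; all mock_* names are hypothetical. */
typedef struct mock_wait_queue_head {
	int nwaiters;
} mock_wait_queue_head_t;

struct mock_sock {
	mock_wait_queue_head_t *sk_sleep;	/* wait queue kept as a plain field */
};

/* --- Stand-in for what the external compat header contributes --- */
static inline mock_wait_queue_head_t *mock_sk_sleep(struct mock_sock *sk)
{
	return sk->sk_sleep;
}

/*
 * Assumption: the configure-time probe compiles its test program with the
 * compat header included, so it reports the helper as available.
 * A probe that looked only at the stock kernel headers would leave this
 * macro undefined, and the fallback below would define mock_sk_sleep()
 * a second time, which is the redefinition error from the description.
 */
#define MOCK_HAVE_SK_SLEEP 1

/* --- Stand-in for the fallback in the consumer's own header --- */
#ifndef MOCK_HAVE_SK_SLEEP
static inline mock_wait_queue_head_t *mock_sk_sleep(struct mock_sock *sk)
{
	return sk->sk_sleep;
}
#endif

/* Small caller so the helper is exercised the way socket code would use it. */
static inline int mock_waiters(struct mock_sock *sk)
{
	return mock_sk_sleep(sk)->nwaiters;
}
```

Removing the `#define MOCK_HAVE_SK_SLEEP 1` line makes the compiler reject the file with a redefinition of `mock_sk_sleep`, which mirrors the reported failure.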
| Comment by Gerrit Updater [ 24/Jun/15 ] |
|
James Simmons (uja.ornl@yahoo.com) uploaded a new patch: http://review.whamcloud.com/15386 |
| Comment by James A Simmons [ 24/Jun/15 ] |
|
Justin, can you try the patch at http://review.whamcloud.com/#/c/15386? |
| Comment by Chris Horn [ 24/Jun/15 ] |
|
Hi James, I tried the patch and got this:
[ 116s] /home/abuild/rpmbuild/BUILD/cray-lustre/lnet/lnet/lib-socket.c: In function 'lnet_sock_accept':
[ 116s] /home/abuild/rpmbuild/BUILD/cray-lustre/lnet/lnet/lib-socket.c:575: error: implicit declaration of function 'sk_sleep'
[ 116s] cc1: warnings being treated as errors
[ 116s] /home/abuild/rpmbuild/BUILD/cray-lustre/lnet/lnet/lib-socket.c:575: error: passing argument 1 of 'add_wait_queue' makes pointer from integer without a cast
[ 116s] /usr/src/linux-2.6.32-431.el6_1.0000.8785/include/linux/wait.h:122: note: expected 'struct wait_queue_head_t *' but argument is of type 'int'
[ 116s] /home/abuild/rpmbuild/BUILD/cray-lustre/lnet/lnet/lib-socket.c:584: error: passing argument 1 of 'remove_wait_queue' makes pointer from integer without a cast
[ 116s] /usr/src/linux-2.6.32-431.el6_1.0000.8785/include/linux/wait.h:124: note: expected 'struct wait_queue_head_t *' but argument is of type 'int'
[ 116s] make[8]: *** [/home/abuild/rpmbuild/BUILD/cray-lustre/lnet/lnet/lib-socket.o] Error 1 |
| Comment by Chris Horn [ 24/Jun/15 ] |
|
FWIW, we're trying to build commit 0b868add80281c085ce1b297d1cb078deaab802a + your patch from this ticket. I'll try a later master commit. |
| Comment by James A Simmons [ 24/Jun/15 ] |
|
Ah, I see what I missed. lib-socket.c needs to include compat-2.6.h as well. |
| Comment by James A Simmons [ 24/Jun/15 ] |
|
Give it a try now. |
| Comment by Chris Horn [ 24/Jun/15 ] |
|
One thing after another. I'm not sure why o2iblnd is able to compile (o2iblnd.h also includes compat-2.6.h) but not lib-socket.c...
[ 223s] /home/abuild/rpmbuild/BUILD/cray-lustre/lnet/lnet/lib-socket.c:46:30: error: linux/compat-2.6.h: No such file or directory |
| Comment by Chris Horn [ 24/Jun/15 ] |
|
I think the answer lies in the Makefile.in's. lnet/klnds/o2iblnd/Makefile.in has:
# Need to make sure that an external OFED source pool overrides
# any in-kernel OFED sources
NOSTDINC_FLAGS += @EXTRA_OFED_INCLUDE@
I think we'd need something similar in the lnet/lnet/Makefile.in
Edit - I'll add that automake and friends are sort of black magic to me, so there might be a different or better way of fixing this. |
| Comment by James A Simmons [ 24/Jun/15 ] |
|
Added automagic stuff. Try it again. |
| Comment by James A Simmons [ 25/Jun/15 ] |
|
Sorry, I needed to update the patch. I discovered that Mellanox doesn't just provide a compatibility layer; it actually replaces core Linux network headers. To handle this, compat-2.6.h has to come first in every file that includes sock.h. I found that just placing compat-2.6.h in lib-types.h caused other nightmares, so the best solution was to remove sock.h from lib-types.h and place compat-2.6.h at the top of every file that uses sock.h. |
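Why the position of the compat include matters can be shown with a toy model. This is a sketch, assuming a replacement header takes effect by claiming the stock header's include guard; it is not a description of how Mellanox OFED actually arranges its headers, and all `MOCK_*` / `mock_*` names are hypothetical.

```c
#include <string.h>

/*
 * 1. Stand-in for the compat header, placed first in the file.
 *    It supplies its own socket definitions and claims the guard.
 */
#define MOCK_SOCK_H 1
#define MOCK_SOCK_PROVIDER "external-ofed"

/*
 * 2. Stand-in for the stock socket header, reached afterwards.
 *    The guard is already set, so its body is skipped.
 *    If this block came first instead, block 1 would redefine
 *    MOCK_SOCK_PROVIDER with a different value and the compiler
 *    would flag the conflicting redefinition.
 */
#ifndef MOCK_SOCK_H
#define MOCK_SOCK_H 1
#define MOCK_SOCK_PROVIDER "in-kernel"
#endif

/* Reports which set of definitions the rest of the file ends up seeing. */
static inline int mock_uses_external_ofed(void)
{
	return strcmp(MOCK_SOCK_PROVIDER, "external-ofed") == 0;
}
```

With the compat stand-in first, the rest of the file sees only the external definitions. Swapping the two blocks reproduces the kind of conflict the patch avoids by putting the compat include at the top of each file.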
| Comment by Peter Jones [ 03/Jul/15 ] |
|
Amir, could you please take care of this patch? Thanks, Peter |
| Comment by Gerrit Updater [ 10/Jul/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15386/ |
| Comment by Peter Jones [ 10/Jul/15 ] |
|
Landed for 2.8 |