[LU-5628] Dealing with kernels that have lustre enabled already Created: 15/Sep/14 Updated: 23/Nov/17 Resolved: 20/Sep/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major |
| Reporter: | Oleg Drokin | Assignee: | Dmitry Eremin (Inactive) |
| Resolution: | Duplicate | Votes: | 1 |
| Labels: | usk | ||
| Issue Links: |
|
| Rank (Obsolete): | 15743 |
| Description |
|
As the number of kernels that include lustre (from the staging tree at the moment) grows, and more distributions ship them, we need to do something about all the problems this creates for us. Currently we cannot build our external lustre against such a kernel due to a clash in config defines, e.g.:
make[1]: Entering directory `/home/green/bk/x86'
CC [M] /home/green/git/lustre-current/libcfs/libcfs/linux/linux-tracefile.o
In file included from <command-line>:0:0:
/home/green/git/lustre-current/config.h:26:0: error: "CONFIG_LNET_MAX_PAYLOAD" redefined [-Werror]
#define CONFIG_LNET_MAX_PAYLOAD LNET_MTU
^
In file included from /home/green/bk/linux/include/linux/kconfig.h:4:0,
from <command-line>:0:
include/generated/autoconf.h:1571:0: note: this is the location of the previous definition
#define CONFIG_LNET_MAX_PAYLOAD 1048576
^
cc1: all warnings being treated as errors
Once lustre is moved out of the staging tree, another problem will be added: clashing symbols from the lustre includes in the kernel tree (currently hidden in the secluded staging location, so not an immediate problem).

Once the config symbol clash is resolved, the next problem is the clash in module names between in-kernel lustre and out-of-kernel lustre. Because the in-kernel implementation is mostly geared towards clients and also lacks our debugging aids and such, these modules are not really interchangeable and we need to do something about that too. Possibly consider renaming our out-of-tree modules? This will become a problem once distributions start to enable lustre by default in their kernels (so not a big problem yet either).

Finally, there are bound to be symbol clashes between in-kernel and out-of-kernel lustre modules, so I suspect we need to do something about that too, but I am not sure what so far. A wrapper to change the names a bit? |
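The config define clash described above can be checked for before starting a build. The following is a minimal sketch, assuming the kernel's include/generated/autoconf.h and the out-of-tree config.h are available as text. The function names and the sample header contents are hypothetical, and replacement lists are compared as plain text, which only approximates the compiler's token-by-token comparison.

```python
import re

# Matches "#define NAME replacement" lines. Function-like macros are not
# handled specially; this is a simplification for illustration.
DEFINE_RE = re.compile(
    r"^[ \t]*#[ \t]*define[ \t]+(\w+)[ \t]*(.*?)[ \t]*$", re.MULTILINE
)


def parse_defines(header_text):
    """Return {macro_name: replacement_text} for each #define in the text."""
    return {name: value for name, value in DEFINE_RE.findall(header_text)}


def find_clashes(kernel_autoconf, external_config):
    """Return macros defined in both headers with different replacements.

    The result maps each macro name to (kernel_value, external_value).
    """
    kernel = parse_defines(kernel_autoconf)
    external = parse_defines(external_config)
    return {
        name: (kernel[name], external[name])
        for name in sorted(kernel.keys() & external.keys())
        if kernel[name] != external[name]
    }


if __name__ == "__main__":
    # Illustrative header contents modelled on the error output above.
    kernel_autoconf = "#define CONFIG_LNET_MAX_PAYLOAD 1048576\n"
    external_config = (
        "#define CONFIG_LNET_MAX_PAYLOAD LNET_MTU\n"
        "#define HAVE_EXAMPLE_FEATURE 1\n"  # hypothetical, no clash
    )
    for name, (k, e) in find_clashes(kernel_autoconf, external_config).items():
        print(f"{name}: kernel={k!r} external={e!r}")
```

Any macro this reports would trigger the same "redefined" error as CONFIG_LNET_MAX_PAYLOAD when warnings are treated as errors.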
| Comments |
| Comment by Robert Read (Inactive) [ 13/Oct/14 ] |
|
We'll also need to enable distributions to be able to package the client utilities so users can actually use the modules. |
| Comment by Minh Diep [ 14/Jan/15 ] |
|
also saw this /root/inkernel/debian/tmp/modules-deb/usr_src/modules/lustre/libcfs/include/libcfs/linux/linux-mem.h: In function ‘set_shrinker’: |
| Comment by James A Simmons [ 14/Jan/15 ] |
|
Those errors are due to procfs api changes upstream which should be resolved by the patches from |
| Comment by Minh Diep [ 16/Jan/15 ] |
|
I am testing on Ubuntu 14.04 |
| Comment by James Beal [ 22/Apr/15 ] |
|
Any news? I am seeing this with lustre 2.7 on Ubuntu 14.04. |
| Comment by James A Simmons [ 23/Apr/15 ] |
|
I'm also working on Ubuntu 14.04 and just pushed some patches to make the intel branch of lustre functional. As for making it work with the upstream client that is included, it will require a bit of work which I haven't had the time to do. So basically we have to do something along the lines of OFED. Besides handling CONFIG_LNET_MAX_PAYLOAD we have to modify Module.symvers so that it refers to the correct lustre modules. Currently make debs places the lustre modules in kernel/fs instead of updates. That needs to be fixed first. |
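To see which modules are affected by the kernel/fs versus updates placement, one can list the module file names that appear under both kernel/ and updates/ of a modules directory such as /lib/modules/<version>. This is a minimal sketch under stated assumptions: the function names are hypothetical, only uncompressed .ko files are considered, and whether the updates/ copy actually takes precedence depends on the distribution's depmod configuration.

```python
import os
import tempfile


def module_names(root):
    """Return the set of .ko file names found anywhere under root."""
    names = set()
    for _dirpath, _dirnames, filenames in os.walk(root):
        names.update(f for f in filenames if f.endswith(".ko"))
    return names


def shadowed_modules(modules_dir):
    """Return sorted module names present under both kernel/ and updates/."""
    in_kernel = module_names(os.path.join(modules_dir, "kernel"))
    in_updates = module_names(os.path.join(modules_dir, "updates"))
    return sorted(in_kernel & in_updates)


if __name__ == "__main__":
    # Build a throwaway tree so the sketch runs without a real kernel install.
    with tempfile.TemporaryDirectory() as tmp:
        for rel in (
            "kernel/drivers/staging/lustre/lnet.ko",  # illustrative path
            "updates/lnet.ko",
            "kernel/fs/ext4.ko",
        ):
            path = os.path.join(tmp, rel)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            open(path, "w").close()
        print(shadowed_modules(tmp))
```

An out-of-tree module installed into kernel/fs would not show up in this list, which is one way to spot the packaging problem mentioned above.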
| Comment by James Beal [ 23/Apr/15 ] |
|
Thanks for that. In our use case we use a redhat kernel on our servers with the user space being ubuntu, but for our clients we want to use the real client and the default kernel. We use dkms for our client modules so that works for us. |
| Comment by James A Simmons [ 23/Apr/15 ] |
|
Looking at the module-assistant man pages, it appears that KPKG_DEST_DIR can be used to place the lustre modules into the updates directory. Perhaps that is not the best solution, since I am not a Debian packaging expert by any means. Any Debian packaging gurus here? |
| Comment by Nathaniel Clark [ 17/Aug/15 ] |
|
WORKAROUND:
./configure --disable-server --enable-quota --with-max-payload-mb=1
Then edit config.h to replace ((1)<<20) with 1048576. |
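The config.h edit in this workaround can be scripted. It helps because C permits a macro to be redefined when the replacement is identical, and the kernel's autoconf.h in the error output defines CONFIG_LNET_MAX_PAYLOAD as 1048576. The sketch below assumes the generated line reads `#define CONFIG_LNET_MAX_PAYLOAD ((1)<<20)`; that exact layout and the function name are assumptions, not taken from the Lustre build system.

```python
import re

# Assumed layout of the generated line in config.h (not confirmed by the
# ticket beyond the "((1)<<20)" replacement text).
PAYLOAD_RE = re.compile(
    r"^([ \t]*#[ \t]*define[ \t]+CONFIG_LNET_MAX_PAYLOAD[ \t]+)\(\(1\)<<20\)[ \t]*$",
    re.MULTILINE,
)


def normalise_max_payload(config_text):
    """Replace '((1)<<20)' with the literal 1048576 in the payload define."""
    return PAYLOAD_RE.sub(lambda m: m.group(1) + str(1 << 20), config_text)


if __name__ == "__main__":
    sample = (
        "#define HAVE_EXAMPLE_FEATURE 1\n"  # hypothetical unrelated define
        "#define CONFIG_LNET_MAX_PAYLOAD ((1)<<20)\n"
    )
    print(normalise_max_payload(sample), end="")
```

To apply it to a real tree, read config.h, pass its contents through `normalise_max_payload`, and write the result back after running configure.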
| Comment by Dmitry Eremin (Inactive) [ 14/Sep/15 ] |
|
The patch http://review.whamcloud.com/16418 will also resolve this. The issue is common with |
| Comment by James A Simmons [ 15/Sep/15 ] |
|
Patch http://review.whamcloud.com/16418 will resolve the config.h issues with the upstream kernel, but for OpenSFS/Intel lustre to run instead of the upstream client we need to modify Module.symvers to replace the symbols from the upstream client with those from the master branch, much like we do for the OFED external stacks. |
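The Module.symvers replacement described here can be sketched as a merge in which out-of-tree entries override kernel entries for the same symbol. This is a minimal illustration, assuming the tab-separated layout of CRC, symbol, module, and export type. The function names, CRC values, and module paths in the sample are hypothetical, and this is not the procedure used by OFED or by the Lustre build.

```python
def parse_symvers(text):
    """Return {symbol: full_line} for each entry in Module.symvers text."""
    entries = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 3:  # CRC, symbol, module, [export type, ...]
            entries[fields[1]] = line
    return entries


def module_of(line):
    """Return the module field of one Module.symvers line."""
    return line.split("\t")[2]


def overridden_symbols(kernel_text, external_text):
    """Map each symbol exported by both trees to (kernel_module, external_module)."""
    kernel = parse_symvers(kernel_text)
    external = parse_symvers(external_text)
    return {
        sym: (module_of(kernel[sym]), module_of(external[sym]))
        for sym in sorted(kernel.keys() & external.keys())
    }


def merge_symvers(kernel_text, external_text):
    """Merge the two files, letting external entries replace kernel ones."""
    merged = parse_symvers(kernel_text)
    merged.update(parse_symvers(external_text))
    return "".join(line + "\n" for line in merged.values())


if __name__ == "__main__":
    # Illustrative entries; CRCs and module paths are made up.
    kernel_text = (
        "0x11111111\tLNetPut\tdrivers/staging/lustre/lnet/lnet/lnet\tEXPORT_SYMBOL\n"
        "0x22222222\tprintk\tvmlinux\tEXPORT_SYMBOL\n"
    )
    external_text = "0x33333333\tLNetPut\tlnet/lnet/lnet\tEXPORT_SYMBOL\n"
    print(overridden_symbols(kernel_text, external_text))
    print(merge_symvers(kernel_text, external_text), end="")
```

The overlap report also gives a rough measure of how many exported symbols clash between the in-kernel and out-of-tree modules.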
| Comment by Dmitry Eremin (Inactive) [ 16/Sep/15 ] |
|
Why do we need this? Are we assuming somebody will link with our modules? What is the reason to provide our symbol versions to others? |
| Comment by James A Simmons [ 16/Sep/15 ] |
|
For LNet this is the case. External kernel modules that use LNet exist in the wild, such as DVS from Cray. Have you tried Intel Lustre on a distro with upstream Lustre enabled? I have newer Ubuntu versions on IBM PowerPC, but for some mysterious reason Lustre is disabled there, unlike on other Ubuntu systems. |
| Comment by Robert Read (Inactive) [ 16/Sep/15 ] |
|
Amazon Linux includes upstream Lustre, and I've heard you can install the el6 lustre-client rpm to be able to mount a filesystem. I haven't heard how well it actually works, though. |
| Comment by Dmitry Eremin (Inactive) [ 17/Sep/15 ] |
So, this is an additional ticket to provide Module.symvers for external programs that would like to link with our modules. I need more info about those programs. How do they link now? What API do they use? |
| Comment by James A Simmons [ 18/Sep/15 ] |
|
I know DVS from Cray is closed source, so no one can help much there. Thinking about it, it should be up to the external packages to handle the Module.symvers issue themselves. A better way to handle this is to sync up libcfs/lnet upstream with master. |
| Comment by Dmitry Eremin (Inactive) [ 19/Sep/15 ] |
|
Patch landed to master. |
| Comment by Peter Jones [ 20/Sep/15 ] |
|
Fix tracked under |