[LU-2262] unknown symbols when compiling lustre client without specifying --disable-server in ./configure command Created: 02/Nov/12 Updated: 08/Dec/13 Resolved: 08/Dec/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.3.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Frederik Ferner (Inactive) | Assignee: | Minh Diep |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 5411 |
| Description |
|
While evaluating Lustre 2.3 for our environment, I tried to recompile the Lustre client for a later version of the Red Hat kernel, using the following commands:

    git checkout -b b2_3 remotes/origin/b2_3
    sh autogen.sh
    ./configure --with-linux=/usr/src/kernels/2.6.32-279.11.1.el6.x86_64

I got the warnings below during the build, but the build completed:

    configure: WARNING: Disabling server because complete ext4 source does not exist. If you are building using kernel-devel packages and require ldiskfs server support then ensure that the matching kernel-debuginfo-common and kernel-debuginfo-common-<arch> packages are installed.
    [snip]
    WARNING: /lib/modules/2.6.32-279.11.1.el6.x86_64/updates/kernel/fs/lustre/obdecho.ko needs unknown symbol echo_obd_ops
    WARNING: /lib/modules/2.6.32-279.11.1.el6.x86_64/updates/kernel/fs/lustre/obdecho.ko needs unknown symbol echo_persistent_pages_fini
    WARNING: /lib/modules/2.6.32-279.11.1.el6.x86_64/updates/kernel/fs/lustre/obdecho.ko needs unknown symbol echo_persistent_pages_init
    WARNING: /lib/modules/2.6.32-279.11.1.el6.x86_64/updates/kernel/fs/lustre/ptlrpc.ko needs unknown symbol lut_boot_epoch_update
    WARNING: /lib/modules/2.6.32-279.11.1.el6.x86_64/updates/kernel/fs/lustre/ptlrpc.ko needs unknown symbol lut_mod_exit
    WARNING: /lib/modules/2.6.32-279.11.1.el6.x86_64/updates/kernel/fs/lustre/ptlrpc.ko needs unknown symbol lut_mod_init

Trying to load the modules gave unknown symbol errors:

    Nov 1 16:17:53 cs04r-sc-serv-48 kernel: Lustre: Lustre: Build Version: 2.3.0--PRISTINE-2.6.32-279.11.1.el6.x86_64
    Nov 1 16:17:53 cs04r-sc-serv-48 kernel: ptlrpc: Unknown symbol lut_boot_epoch_update
    Nov 1 16:17:53 cs04r-sc-serv-48 kernel: ptlrpc: Unknown symbol lut_mod_exit
    Nov 1 16:17:53 cs04r-sc-serv-48 kernel: ptlrpc: Unknown symbol lut_mod_init
    Nov 1 16:17:53 cs04r-sc-serv-48 modprobe: FATAL: Error inserting lustre (/lib/modules/2.6.32-279.11.1.el6.x86_64/updates/kernel/fs/lustre/lustre.ko): Unknown symbol in module, or unknown parameter (see dmesg)

Compiling the same code with --disable-server added to the configure options completes without warnings, and loading the module works as expected. The same configure line without the --disable-server option works without problems for compiling the Lustre client RPMs on Lustre 1.8.x. |
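For anyone triaging a similar build, the warnings in the description can be condensed into module/symbol pairs. The following is a minimal sketch, assuming the warning lines have exactly the "needs unknown symbol" form quoted above; the function name list_unknown_symbols is hypothetical and not part of Lustre or any packaged tool.

```shell
# Hypothetical helper (not part of Lustre): reads build output on stdin and
# prints "<module>.ko <symbol>" for every line of the form
#   WARNING: /path/to/<module>.ko needs unknown symbol <symbol>
# Lines that do not match this form are silently ignored.
list_unknown_symbols() {
    sed -n 's|^WARNING: .*/\([^/]*\.ko\) needs unknown symbol \(.*\)$|\1 \2|p'
}
```

Piping a saved build log through the function (for example a file named build.log, which is an assumed name) shows at a glance which modules reference symbols that no built module exports.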
| Comments |
| Comment by Peter Jones [ 02/Nov/12 ] |
|
Glad to see you checking out 2.3 Frederik! Minh could you please look into this one? Thanks! |
| Comment by James A Simmons [ 02/Nov/12 ] |
|
You would need a patch similar to |
| Comment by Minh Diep [ 06/Nov/12 ] |
|
http://review.whamcloud.com/#change,1873 removed the patched-kernel detection, which causes this issue. I am investigating further. |
| Comment by Brian Murrell (Inactive) [ 06/Nov/12 ] |
|
Per our discussion, find out which of the server patches (not including the ldiskfs patches, since in theory we can still patch and build ldiskfs with a patchless server kernel) is likely to be the "last patch standing", and build an autoconf test for the presence of that patch in the kernel {source|headers} that Lustre was given to build for. That test will be your auto-detection for whether or not to build the server. As an aside: there really are not too many patches left for the server, and at least half of them seem generic enough that they ought to go upstream with relative ease, I would think. |
| Comment by Brian Murrell (Inactive) [ 06/Nov/12 ] |
|
Ahh. Andreas and Oleg are watching this one. I wonder if one or both of them would like to advise which is likely to be the last server patch standing. |
| Comment by Andreas Dilger [ 06/Nov/12 ] |
|
The last patch that we require for Lustre is dev_read_only, which is only needed for testing. See However, I don't really think this problem is directly related to patchless kernels, but rather is a defect in the configure code that setting enable_server='no' in build/autoconf/lustre-build-ldiskfs.m4 may be too late to prevent HAVE_SERVER_SUPPORT from being set. Possibly this is already fixed in master? |
| Comment by Minh Diep [ 06/Nov/12 ] |
|
no, it's not fixed in master. |
| Comment by Brian Murrell (Inactive) [ 07/Nov/12 ] |
Yeah, that is actually the problem. However, simply using the "can/will build ldiskfs" yes/no test to decide whether the Lustre servers can be built against the provided kernel source tree doesn't seem sufficient. That test is simply "if $EXT_DIR/dir.c, $EXT_DIR/file.c, and $EXT_DIR/inode.c exist in the kernel source, then we can build ldiskfs", and by extension we are asserting that we can build Lustre servers. But Lustre servers cannot be built unless the kernel source is also patched, right? It seems a further check that the source has been patched needs to be done, doesn't it? |
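The ldiskfs test described in the comment above can be sketched in plain shell. This is a minimal illustration under stated assumptions, not the macro from build/autoconf/lustre-build-ldiskfs.m4: the function name can_build_ldiskfs is hypothetical, and its argument stands in for $EXT_DIR.

```shell
# Hypothetical sketch of the "can we build ldiskfs" test described above.
# $1 plays the role of $EXT_DIR, the ext4 source directory in the kernel tree.
# Returns 0 only if dir.c, file.c and inode.c are all present as regular files.
can_build_ldiskfs() {
    ext_dir=$1
    for f in dir.c file.c inode.c; do
        [ -f "$ext_dir/$f" ] || return 1
    done
    return 0
}
```

Note that the sketch only checks that the three files exist. It says nothing about whether the kernel source is patched, which is exactly the gap the comment raises.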
| Comment by Andreas Dilger [ 07/Nov/12 ] |
|
No, in fact the current Lustre code does not require that the kernel be patched at all. This is required for Lustre testing (the dev_read_only patch in order to simulate server crashes at specific points in the code), but it isn't needed for normal operation. There is a separate check whether there are ldiskfs patches for the kernel in order to enable ldiskfs, but they are not needed if building a ZFS-only server. |
| Comment by Minh Diep [ 07/Nov/12 ] |
|
does this mean we should restore the LUSTRE_KERNEL_VERSION check? |
| Comment by Minh Diep [ 07/Nov/12 ] |
|
or disable the server by default and use --enable-server if we want to build the kernel |
| Comment by Andreas Dilger [ 08/Nov/12 ] |
|
I think we want to continue to default to building the server code, and definitely do not want to introduce version checks. If neither ldiskfs patches are available for the current kernel, nor ZFS libraries, then the server code could be disabled. That said, whether the server modules are built or not, that shouldn't prevent the client modules from being usable. Only the osd-ldiskfs and osd-zfs modules should really need to interact with the backing storage, and everything else should be "patchless" already. The real question is why ptlrpc is referencing symbols only in server modules, but those modules were apparently not built? The |
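The default proposed in the comment above (build the server unless neither ldiskfs patches nor ZFS libraries are available) can be written as a small decision function. This is a sketch under stated assumptions, not the actual Lustre configure logic: the function name decide_enable_server and its yes/no arguments are hypothetical.

```shell
# Hypothetical sketch, not the actual Lustre configure logic.
# $1: "yes" if ldiskfs patches exist for the target kernel, else "no"
# $2: "yes" if ZFS libraries are available, else "no"
# Prints "yes" when the server should be built, "no" otherwise.
decide_enable_server() {
    have_ldiskfs=$1
    have_zfs=$2
    if [ "$have_ldiskfs" = yes ] || [ "$have_zfs" = yes ]; then
        echo yes
    else
        echo no
    fi
}
```

Whichever value this yields, the point made in the comment still applies: the client modules should remain loadable in both cases, so the decision would have to be made before anything like HAVE_SERVER_SUPPORT is set.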
| Comment by Brian Murrell (Inactive) [ 08/Nov/12 ] |
Are there any penalties (i.e. performance, races, etc.) for not using the patches? If the only patch really necessary is for testing, and there are no performance or operational issues with not using any of the other patches (assuming they are not using RAID, or the fusion mpt controller, etc), why are we not advertising to end users that they can build lustre servers against their choice of stock (EL6) vendor kernel (within the limits of supported kernel versions)? This would be huge! |
| Comment by Andreas Dilger [ 10/Nov/12 ] |
|
There are some issues fixed by the patches that are not needed for performance/correctness on most systems:
It would definitely make sense to try and get these into the upstream kernels if we could. |
| Comment by Minh Diep [ 26/Jul/13 ] |
|
fyi, tried it on latest master and no issue |
| Comment by Minh Diep [ 26/Jul/13 ] |
|
hmm... I don't see this issue anymore on b2_3. Frederik, could you double check? - thanks |
| Comment by Minh Diep [ 02/Dec/13 ] |
|
Frederik, Please check if you still see this issue. I will close this in about a week. thanks |