Details
-
Bug
-
Resolution: Fixed
-
Minor
-
Linux arcV 6.7.0-arch3-1 #1 SMP PREEMPT_DYNAMIC Sat, 13 Jan 2024 14:37:14 +0000 x86_64 GNU/Linux (Host)
Container image rockylinux:8
Nvidia GPU on machine - cuda not installed in container
Description
Hello all, I was compiling Lustre in a container and came across what I think is an error in the configure script.
I have an NVIDIA GPU installed on the machine. I am trying to replicate a minimal build environment in a container, so I am not installing CUDA in it. I configured with `--disable-server --disable-tests` and got a CUDA error. No big deal, so I went to disable CUDA support.
I tried both `--without-cuda --without-gds` and `--with-cuda=no --with-gds=no`, since the help page says they are equivalent.
Turns out they are, because both gave me the same error:
```
checking for /no/nv-p2p.h... no
configure: error: CUDA sources don't found. nv-p2p.h don't exit
```
Looking at the code, it seems that disabling these options sets `CUDA_PATH` and `GDS_PATH` to `no`, while the check only tests for an empty value. Since `no` is non-empty, the code proceeds as if both paths were defined and looks for `/no/nv-p2p.h`. From `lnet/autoconf/lustre-lnet.m4`:
```m4
AS_IF([test -n "${CUDA_PATH}" && test -n "${GDS_PATH}"],[
    LB_CHECK_FILE([$CUDA_PATH/nv-p2p.h],
        [
            AC_MSG_RESULT([CUDA path is $CUDA_PATH])
            AC_SUBST(CUDA_PATH)
        ],
        [AC_MSG_ERROR([CUDA sources not found. nv-p2p.h doesn't exist])]
    )
```
Note the `test -n`: it only rejects an empty string, so the value `no` passes. I thought this was worth reporting. If there are no objections, I would love to submit a patch for this; let me know how to proceed!
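
To illustrate the behaviour described above, here is a minimal shell sketch that runs outside of configure. The helper `path_enabled` is hypothetical and not part of the Lustre tree; it only shows one possible way to treat both an empty value and the literal `no` as "disabled". The actual fix may well look different.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not existing Lustre code):
# succeeds only when the value is neither empty nor the literal "no",
# which is what --without-cuda / --with-cuda=no leave in the variable.
path_enabled() {
    case "$1" in
        ""|no) return 1 ;;
        *)     return 0 ;;
    esac
}

# Value observed in the report after passing --without-cuda.
CUDA_PATH=no

# Current style of check: any non-empty string passes, including "no".
if test -n "${CUDA_PATH}"; then
    echo "test -n: would look for ${CUDA_PATH}/nv-p2p.h"
fi

# Stricter check: "no" is rejected before any header lookup happens.
if path_enabled "${CUDA_PATH}"; then
    echo "path_enabled: would look for ${CUDA_PATH}/nv-p2p.h"
else
    echo "path_enabled: CUDA disabled, skipping header check"
fi
```

With `CUDA_PATH=no`, the first check still builds the path `no/nv-p2p.h`, which matches the `/no/nv-p2p.h` lookup in the error output, while the second check skips the lookup.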