Details
Type: Bug
Resolution: Unresolved
Priority: Minor
Fix Version/s: None
Affects Version/s: Lustre 2.10.4
Labels: None
Environment:
Testing on VMs
Client: CentOS 7.5 (3.10.0-862.3.3.el7.x86_64)
2.10.4 downloaded from Lustre.org
Servers: CentOS 7.5 (3.10.0-862.2.3.el7_lustre.x86_64) with 2.10.4
Severity: 3
Rank (Obsolete): 9223372036854775807
Description
Reporting this one just in case. This is not a production system and holds no data; it is only a POC on VMs.
We were testing the latest Lustre version with nodemap and several LNet networks in a virtual cluster used only as a POC, and we hit a crash extremely similar to LU-6991.
This client VM is configured with two LNet networks:
[root@tony-client4 ~]# cat /etc/modprobe.d/lustre.conf
options lnet networks=tcp1(enp0s9),tcp2(enp0s10)
[root@tony-client4 ~]# ip ad show dev enp0s9 | grep inet
inet 172.16.1.34/24 brd 172.16.1.255 scope global noprefixroute enp0s9
inet6 fe80::a00:27ff:fec9:f4cc/64 scope link
[root@tony-client4 ~]# ip ad show dev enp0s10 | grep inet
inet 182.16.1.34/24 brd 182.16.1.255 scope global noprefixroute enp0s10
inet6 fe80::a00:27ff:fe12:2cae/64 scope link
We were testing the robustness of a multi-tenancy configuration by trying to mount the entire filesystem over an LNet network (tcp0) that is not configured on this client:
[root@tony-client4 ~]# mount -t lustre -o network=tcp0 172.16.1.11@tcp1:/tonyfs /lustre/tonyfs/general/
This triggered the crash shown in the attached image, which appears to correspond to LU-6991. Unfortunately, I had not yet configured crashdump on this node.
If this is ever reproduced, I will try to provide the crashdump.
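Until the crash is fixed, a pre-mount check along these lines may help avoid triggering it by accident. This is only a sketch: the helper names `lnet_has_network` and `lnet_networks_from_conf` are hypothetical (not part of Lustre), and it assumes the networks are set with an unquoted `networks=` value in the modprobe file, as on this client, rather than through `ip2nets` or `lnetctl`. It does a plain string comparison, so it does not treat `tcp` and `tcp0` as the same network.

```shell
#!/bin/bash
# Hypothetical helpers (not part of Lustre) for checking a mount's
# "network=" option against the locally configured LNet networks.

# Print the value of networks= from an "options lnet ..." modprobe line.
# Assumes an unquoted value such as: tcp1(enp0s9),tcp2(enp0s10)
lnet_networks_from_conf() {
    sed -n 's/^options[[:space:]]\{1,\}lnet[[:space:]].*networks=\([^[:space:]]*\).*/\1/p' "$1"
}

# Return 0 if network name $1 appears in the comma-separated list $2.
lnet_has_network() {
    local wanted="$1" networks="$2" entry
    local IFS=','
    for entry in $networks; do
        # Drop the "(interface)" suffix so only the network name is compared.
        if [ "${entry%%(*}" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# Example use before mounting (paths and names taken from this report):
#   nets=$(lnet_networks_from_conf /etc/modprobe.d/lustre.conf)
#   if lnet_has_network tcp0 "$nets"; then
#       mount -t lustre -o network=tcp0 172.16.1.11@tcp1:/tonyfs /lustre/tonyfs/general/
#   else
#       echo "tcp0 is not configured on this client; not mounting" >&2
#   fi
```

With the configuration shown above, the check for tcp0 fails and the mount is skipped, while tcp1 and tcp2 pass.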