Details
Bug
Resolution: Unresolved
Minor
2.12 and master
3
Description
Currently, when a client is changed from a multi-rail to a non-multi-rail configuration, it cannot mount the filesystem until that client's peer NID state is removed on the servers. The client starts out multi-rail:
options lnet networks="o2ib10(ib0,ib2)"
[root@s184 ~]# mount -t lustre 10.0.11.90@o2ib10:/cache1 /cache1
[root@s184 ~]# lnetctl net show
net:
- net type: lo
local NI(s):
- nid: 0@lo
status: up
- net type: o2ib10
local NI(s):
- nid: 10.0.10.184@o2ib10
status: up
interfaces:
0: ib0
- nid: 10.2.10.184@o2ib10
status: up
interfaces:
0: ib2
If the client's NID configuration is then changed to a single interface, remounting Lustre on the client fails unless that client's peer state is cleared on all servers.
options lnet networks="o2ib10(ib0)"
[root@s184 ~]# umount -t lustre -a
[root@s184 ~]# lustre_rmmod
[root@s184 ~]# mount -t lustre 10.0.11.90@o2ib10:/cache1 /cache1
mount.lustre: mount 10.0.11.90@o2ib10:/cache1 at /cache1 failed: Input/output error
Is the MGS running?
On the server side, the client's peer state is still multi-rail.
[root@es14k-vm1 ~]# lnetctl peer show
peer:
    - primary nid: 0@lo
      Multi-Rail: False
      peer ni:
        - nid: 0@lo
          state: NA
    - primary nid: 10.0.11.92@o2ib10
      Multi-Rail: True
      peer ni:
        - nid: 10.0.11.92@o2ib10
          state: NA
        - nid: 10.1.11.92@o2ib10
          state: NA
    - primary nid: 10.0.11.91@o2ib10
      Multi-Rail: True
      peer ni:
        - nid: 10.0.11.91@o2ib10
          state: NA
        - nid: 10.1.11.91@o2ib10
          state: NA
    - primary nid: 10.0.11.93@o2ib10
      Multi-Rail: True
      peer ni:
        - nid: 10.0.11.93@o2ib10
          state: NA
        - nid: 10.1.11.93@o2ib10
          state: NA
    - primary nid: 10.0.10.184@o2ib10
      Multi-Rail: True <------ Still Multi-rail
      peer ni:
        - nid: 10.0.10.184@o2ib10
          state: NA
        - nid: 10.2.10.184@o2ib10
          state: NA
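To spot which peers a server still records as multi-rail, the `lnetctl peer show` output can be filtered. The following is only a minimal sketch: the function name `list_multirail_peers` is hypothetical, and it assumes the output keeps the `primary nid:` / `Multi-Rail:` layout shown above.

```shell
# Hypothetical helper (not part of lnetctl): print the primary NID of every
# peer that the `lnetctl peer show` text on stdin reports as Multi-Rail: True.
list_multirail_peers() {
    awk '
        # Remember the most recent primary NID (last field on the line).
        /primary nid:/ { nid = $NF }
        # The value follows the key; anything after it (annotations) is ignored.
        /Multi-Rail:/  { if ($2 == "True") print nid }
    '
}
```

On a server it would be used as `lnetctl peer show | list_multirail_peers`; for the output above, the list would include 10.0.10.184@o2ib10 even though the client is now configured with a single interface.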
A workaround is to remove the client's NID state on all servers and then mount again. That works, but an automated peer state update would be preferable.
[root@es14k-vm1 ~]# clush -g oss lnetctl peer del --prim_nid 10.0.10.184@o2ib10 --nid 10.0.10.184@o2ib10
[root@es14k-vm1 ~]# clush -g oss lnetctl peer del --prim_nid 10.0.10.184@o2ib10 --nid 10.2.10.184@o2ib10
[root@s184 ~]# mount -t lustre 10.0.11.90@o2ib10:/cache1 /cache1
[root@s184 ~]#
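Until the peer state is updated automatically, the manual workaround could be wrapped in a small helper. This is a sketch under stated assumptions: `stale_peer_del_cmds` is a hypothetical name, the `oss` clush group is taken from the commands above, and the function only prints the commands so they can be reviewed before being run.

```shell
# Hypothetical helper: print one `lnetctl peer del` command per stale NID.
# Usage: stale_peer_del_cmds PRIMARY_NID NID [NID...]
stale_peer_del_cmds() {
    prim_nid=$1
    shift
    for nid in "$@"; do
        # Same command form as the workaround; the clush group "oss" comes
        # from this report and may differ on other clusters (assumption).
        printf 'clush -g oss lnetctl peer del --prim_nid %s --nid %s\n' \
            "$prim_nid" "$nid"
    done
}

# Dry run for the client in this report: prints two commands, runs nothing.
stale_peer_del_cmds 10.0.10.184@o2ib10 10.0.10.184@o2ib10 10.2.10.184@o2ib10
```

Printing instead of executing is deliberate: deleting peer entries on live servers is best reviewed first, and the printed lines can be piped to `sh` once checked.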