Details
- Bug
- Resolution: Fixed
- Major
Description
The mount option -o network restricts the client to a given network when accessing servers. It filters NIDs while processing the 'setup' configuration command and skips every 'add_conn' command whose connection UUID does not mention the chosen network. However, the UUID in add_conn is just a name, and the real NIDs attached to it may be on the correct network. Skipping such an add_conn also skips any failover NIDs and leaves the client with no knowledge of the failover nodes.
E.g. in configuration below:
- { index: 68, event: add_uuid, nid: 10.160.44.6@tcp(0x200000aa02c06), node: 10.160.44.6@tcp }
- { index: 69, event: add_uuid, nid: 10.160.44.38@tcp1(0x200010aa02c26), node: 10.160.44.6@tcp }
- { index: 85, event: attach, device: lustre-MDT0002-mdc, type: mdc, UUID: lus27-clilmv_UUID }
- { index: 86, event: setup, device: lustre-MDT0002-mdc, UUID: lustre-MDT0002_UUID, node: 10.160.44.6@tcp }
### here after setup both @tcp and @tcp1 NIDs are filtered and the latter is kept, note that connection UUID used in 'setup' has "@tcp" network and that is not taken into account properly
### here below result of --servicenode option, first there are nids and finally add_conn with then:
- { index: 87, event: add_uuid, nid: 10.160.44.6@tcp(0x200000aa02c06), node: 10.160.44.6@tcp }
- { index: 88, event: add_uuid, nid: 10.160.44.38@tcp1(0x200010aa02c26), node: 10.160.44.6@tcp }
- { index: 103, event: add_conn, device: lustre-MDT0002-mdc, node: 10.160.44.6@tcp }
### second node
- { index: 104, event: add_uuid, nid: 10.160.44.7@tcp(0x200000aa02c07), node: 10.160.44.7@tcp }
- { index: 105, event: add_uuid, nid: 10.160.44.39@tcp1(0x200010aa02c27), node: 10.160.44.7@tcp }
- { index: 120, event: add_conn, device: lustre-MDT0002-mdc, node: 10.160.44.7@tcp }
Each 'add_conn' was configured with a NID on the restricted network "@tcp1", but the 'add_conn' is skipped because its own name does not mention "tcp1". As a result, the client mounts without the second node at address 10.160.44.39@tcp1 and cannot connect to the server during failover.
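The difference between matching 'add_conn' by its name and matching it by the NIDs attached to that name can be modelled in a few lines of Python. This is only an illustrative sketch, not Lustre's config-log code: network_of, RECORDS and connections are hypothetical names, the records are the add_uuid/add_conn entries from the log above with the hex suffix dropped, and the 'setup' step is not modelled.

```python
# Illustrative model only: these names are hypothetical and do not come
# from the Lustre source tree.

def network_of(nid):
    """Return the network part of a NID, e.g. 'tcp1' for '10.0.0.1@tcp1'."""
    return nid.split("@", 1)[1]

# (event, nid, node) tuples for log entries 87-120 above.
RECORDS = [
    ("add_uuid", "10.160.44.6@tcp", "10.160.44.6@tcp"),
    ("add_uuid", "10.160.44.38@tcp1", "10.160.44.6@tcp"),
    ("add_conn", None, "10.160.44.6@tcp"),
    ("add_uuid", "10.160.44.7@tcp", "10.160.44.7@tcp"),
    ("add_uuid", "10.160.44.39@tcp1", "10.160.44.7@tcp"),
    ("add_conn", None, "10.160.44.7@tcp"),
]

def connections(records, net, match_by_name):
    """Return the NIDs a client restricted to `net` would learn via add_conn."""
    nids_by_uuid = {}  # connection name -> NIDs on the restricted network
    conns = []
    for event, nid, node in records:
        if event == "add_uuid":
            # Keep only NIDs that live on the restricted network.
            if network_of(nid) == net:
                nids_by_uuid.setdefault(node, []).append(nid)
        elif event == "add_conn":
            if match_by_name:
                # Behaviour described in this ticket: look at the name only.
                keep = network_of(node) == net
            else:
                # Alternative: keep add_conn if any attached NID survived.
                keep = bool(nids_by_uuid.get(node))
            if keep:
                conns.extend(nids_by_uuid.get(node, []))
    return conns
```

With these records, connections(RECORDS, "tcp1", match_by_name=True) returns an empty list, because both connection names end in "@tcp". With match_by_name=False it returns both 10.160.44.38@tcp1 and 10.160.44.39@tcp1, so the failover node is retained.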
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50187/
Subject: LU-16557 client: -o network needs add_conn processing
Project: fs/lustre-release
Branch: b2_15
Current Patch Set:
Commit: 0543381b2f0ea6e2980315765ad34ae37411d36a