don't skip add_conn with -o network mount option

    Description

      The -o network mount option restricts the network a client uses to access servers. It filters networks while the 'setup' configuration command is processed and skips every 'add_conn' command whose connection UUID does not mention the correct network. However, the UUID in add_conn is just a name, and the real NIDs attached to it may well be on the correct network. Skipping such an add_conn also skips possible failover NIDs and leaves the client with no knowledge of the failover nodes.

      E.g. in configuration below:

      - { index: 68, event: add_uuid, nid: 10.160.44.6@tcp(0x200000aa02c06), node: 10.160.44.6@tcp }
      - { index: 69, event: add_uuid, nid: 10.160.44.38@tcp1(0x200010aa02c26), node: 10.160.44.6@tcp }
      - { index: 85, event: attach, device: lustre-MDT0002-mdc, type: mdc, UUID: lus27-clilmv_UUID }
      - { index: 86, event: setup, device: lustre-MDT0002-mdc, UUID: lustre-MDT0002_UUID, node: 10.160.44.6@tcp }
      
      ### After 'setup', the @tcp and @tcp1 NIDs are filtered and only the latter is kept. Note that the connection UUID used in 'setup' names the "@tcp" network, and that is not taken into account properly.
      
      ### Below is the result of the --servicenode option: first the NIDs are added, then add_conn uses them:
      
      - { index: 87, event: add_uuid, nid: 10.160.44.6@tcp(0x200000aa02c06), node: 10.160.44.6@tcp }
      - { index: 88, event: add_uuid, nid: 10.160.44.38@tcp1(0x200010aa02c26), node: 10.160.44.6@tcp }
      - { index: 103, event: add_conn, device: lustre-MDT0002-mdc, node: 10.160.44.6@tcp }
      
      ### second node
      - { index: 104, event: add_uuid, nid: 10.160.44.7@tcp(0x200000aa02c07), node: 10.160.44.7@tcp }
      - { index: 105, event: add_uuid, nid: 10.160.44.39@tcp1(0x200010aa02c27), node: 10.160.44.7@tcp }
      - { index: 120, event: add_conn, device: lustre-MDT0002-mdc, node: 10.160.44.7@tcp }
      
      

      Each 'add_conn' was configured with a NID on the restricted network "@tcp1", but the 'add_conn' is skipped because its own name does not mention "tcp1". As a result, the client is mounted without the second node at address 10.160.44.39@tcp1 and cannot connect to the server during failover.
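
The effect described above can be reproduced with a small Python model of the configuration log. This is only an illustrative sketch, not Lustre code: the records mirror the add_uuid/add_conn entries from the --servicenode part of the listing, the primary connection made by 'setup' is left out, and the names `nid_net`, `replay` and `match_by_name` are assumptions made for the example.

```python
# Illustrative model only; not the Lustre implementation.
# Records mirror the config log listing: (event, nid, node).
LOG = [
    ("add_uuid", "10.160.44.6@tcp", "10.160.44.6@tcp"),
    ("add_uuid", "10.160.44.38@tcp1", "10.160.44.6@tcp"),
    ("add_conn", None, "10.160.44.6@tcp"),
    ("add_uuid", "10.160.44.7@tcp", "10.160.44.7@tcp"),
    ("add_uuid", "10.160.44.39@tcp1", "10.160.44.7@tcp"),
    ("add_conn", None, "10.160.44.7@tcp"),
]


def nid_net(nid):
    """Return the network part of a NID, e.g. 'tcp1' for '10.160.44.39@tcp1'."""
    return nid.rsplit("@", 1)[1]


def replay(log, net, match_by_name):
    """Return the NIDs on network `net` that the client learns from add_conn.

    match_by_name=True models the reported behaviour: an add_conn is
    skipped when the connection UUID (just a name) does not mention `net`.
    match_by_name=False models the expected behaviour: the NIDs attached
    to the UUID are filtered instead.
    """
    uuid_nids = {}  # connection UUID -> NIDs registered via add_uuid
    known = []
    for event, nid, node in log:
        if event == "add_uuid":
            uuid_nids.setdefault(node, []).append(nid)
        elif event == "add_conn":
            if match_by_name and nid_net(node) != net:
                continue  # whole connection dropped, failover NIDs lost
            known += [n for n in uuid_nids.get(node, []) if nid_net(n) == net]
    return known


print(replay(LOG, "tcp1", match_by_name=True))   # []
print(replay(LOG, "tcp1", match_by_name=False))  # ['10.160.44.38@tcp1', '10.160.44.39@tcp1']
```

With name-based matching both add_conn records are dropped, so 10.160.44.39@tcp1 never reaches the client; filtering the attached NIDs keeps it.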

          Activity

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50187/
            Subject: LU-16557 client: -o network needs add_conn processing
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set:
            Commit: 0543381b2f0ea6e2980315765ad34ae37411d36a


            "Mikhail Pershin <mpershin@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50187
            Subject: LU-16557 client: -o network needs add_conn processing
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set: 1
            Commit: 1f3f62eb8b74e6c29c8b9b62bcfc1884855a15a8

            pjones Peter Jones added a comment -

            Landed for 2.16


            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49986/
            Subject: LU-16557 client: -o network needs add_conn processing
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: c508c9426838f16256223ab0bbd648bfbec25e46


            tappro Mikhail Pershin added a comment -

            https://review.whamcloud.com/#/c/fs/lustre-release/+/49986/ - the proposed patch saves the restricted network info in the import while the 'setup' command is processed, so the restriction can be applied every time import_set_conn() is called. It is therefore applied on 'add_conn' in the same manner, and I assume that this will allow mounting with -o network when LNet Dynamic Discovery is enabled, since that uses the same code to add connections.
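
The idea in this comment, storing the restriction once and applying it in a single connection-adding path, might be modelled roughly as follows. This is a hedged Python sketch, not the patch itself: the `Import` class, its fields and the `set_conn` method are hypothetical stand-ins, with `set_conn` playing the role the comment assigns to import_set_conn().

```python
# Hypothetical model of the approach described in the comment; names are
# illustrative and do not correspond to the actual Lustre structures.
class Import:
    def __init__(self):
        self.restrict_net = None  # network saved at 'setup'; None means no restriction
        self.conns = []           # NIDs accepted as connections

    def setup(self, nids, restrict_net=None):
        # Remember the -o network restriction so later calls can reuse it.
        self.restrict_net = restrict_net
        self.set_conn(nids)

    def add_conn(self, nids):
        # No decision based on the UUID name: use the same filtered path as setup.
        self.set_conn(nids)

    def set_conn(self, nids):
        # Single place where the saved restriction is applied.
        for nid in nids:
            net = nid.rsplit("@", 1)[1]
            if self.restrict_net is not None and net != self.restrict_net:
                continue
            if nid not in self.conns:
                self.conns.append(nid)


imp = Import()
imp.setup(["10.160.44.6@tcp", "10.160.44.38@tcp1"], restrict_net="tcp1")
imp.add_conn(["10.160.44.6@tcp", "10.160.44.38@tcp1"])
imp.add_conn(["10.160.44.7@tcp", "10.160.44.39@tcp1"])
print(imp.conns)  # ['10.160.44.38@tcp1', '10.160.44.39@tcp1']
```

Because every caller goes through the same filtered method, any other code path that adds connections would pick up the restriction as well, which is the property the comment relies on for Dynamic Discovery.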

            People

              tappro Mikhail Pershin