Lustre / LU-16557

don't skip add_conn with -o network mount option


    Description

      The -o network mount option restricts a client to a given network when accessing servers. It filters networks during the 'setup' configure command and skips every 'add_conn' command whose connection UUID does not mention the correct network. However, the UUID in add_conn is just a name, and the real NIDs attached to it may be on the correct network. Skipping such an add_conn also skips possible failover NIDs and leaves the client without knowledge of the failover nodes.

      E.g. in the configuration below:

      - { index: 68, event: add_uuid, nid: 10.160.44.6@tcp(0x200000aa02c06), node: 10.160.44.6@tcp }
      - { index: 69, event: add_uuid, nid: 10.160.44.38@tcp1(0x200010aa02c26), node: 10.160.44.6@tcp }
      - { index: 85, event: attach, device: lustre-MDT0002-mdc, type: mdc, UUID: lus27-clilmv_UUID }
      - { index: 86, event: setup, device: lustre-MDT0002-mdc, UUID: lustre-MDT0002_UUID, node: 10.160.44.6@tcp }
      
      ### after 'setup', the @tcp and @tcp1 NIDs are filtered and only the latter is kept; note that the connection UUID used in 'setup' has the "@tcp" network, and that is not taken into account properly
      
      ### below is the result of the --servicenode option: first the NIDs, then the add_conn that uses them:
      
      - { index: 87, event: add_uuid, nid: 10.160.44.6@tcp(0x200000aa02c06), node: 10.160.44.6@tcp }
      - { index: 88, event: add_uuid, nid: 10.160.44.38@tcp1(0x200010aa02c26), node: 10.160.44.6@tcp }
      - { index: 103, event: add_conn, device: lustre-MDT0002-mdc, node: 10.160.44.6@tcp }
      
      ### second node
      - { index: 104, event: add_uuid, nid: 10.160.44.7@tcp(0x200000aa02c07), node: 10.160.44.7@tcp }
      - { index: 105, event: add_uuid, nid: 10.160.44.39@tcp1(0x200010aa02c27), node: 10.160.44.7@tcp }
      - { index: 120, event: add_conn, device: lustre-MDT0002-mdc, node: 10.160.44.7@tcp }
      
      

      Each 'add_conn' was configured with a NID on the restricted network "@tcp1", but the 'add_conn' is skipped because its own name does not mention "tcp1". As a result, the client is mounted without the second node at address 10.160.44.39@tcp1 and cannot connect to the server during failover.
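
The difference between matching on the connection name and matching on the NIDs attached to it can be illustrated with a small model. This is only a sketch in Python, not Lustre source code: `nid_network`, `replay`, and the `LOG` record layout are hypothetical names made up for illustration, and the NID-based mode is one possible reading of what the issue title asks for, not the actual patch. The records are the add_uuid/add_conn entries 87-120 from the log above, with the numeric NID suffix dropped.

```python
# Illustrative model only; NOT Lustre source code. All names are hypothetical.

def nid_network(name):
    """Return the text after the last '@', e.g. 'tcp1' for '10.160.44.38@tcp1'."""
    return name.rpartition("@")[2]

# (event, nid, node) records taken from the configuration log in this issue.
# For add_conn there is no NID; 'node' is the connection UUID (just a name).
LOG = [
    ("add_uuid", "10.160.44.6@tcp", "10.160.44.6@tcp"),
    ("add_uuid", "10.160.44.38@tcp1", "10.160.44.6@tcp"),
    ("add_conn", None, "10.160.44.6@tcp"),
    ("add_uuid", "10.160.44.7@tcp", "10.160.44.7@tcp"),
    ("add_uuid", "10.160.44.39@tcp1", "10.160.44.7@tcp"),
    ("add_conn", None, "10.160.44.7@tcp"),
]

def replay(log, network, match_by_name):
    """Replay the log and return the add_conn entries that survive filtering.

    match_by_name=True  models the behaviour described in this issue:
                        an add_conn is kept only if its UUID names the network.
    match_by_name=False models an assumed alternative: an add_conn is kept
                        if any NID registered under its UUID is on the network.
    """
    nids_by_uuid = {}
    kept = []
    for event, nid, uuid in log:
        if event == "add_uuid":
            nids_by_uuid.setdefault(uuid, []).append(nid)
        elif event == "add_conn":
            usable = [n for n in nids_by_uuid.get(uuid, [])
                      if nid_network(n) == network]
            if match_by_name:
                keep = nid_network(uuid) == network
            else:
                keep = bool(usable)
            if keep:
                kept.append((uuid, usable))
    return kept

if __name__ == "__main__":
    print("by name:", replay(LOG, "tcp1", match_by_name=True))
    print("by NIDs:", replay(LOG, "tcp1", match_by_name=False))
```

In this model, name-based matching with network "tcp1" drops both add_conn entries, because neither UUID ends in "@tcp1". NID-based matching keeps both, each with its @tcp1 NID, so the failover node at 10.160.44.39@tcp1 stays known to the client.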


            People

              tappro Mikhail Pershin