  1. Lustre
  2. LU-15746

UDSP: udsp_single_net_04 failed with "uneven tx traffic distribution across interfaces"


Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.15.0
    • None
    • 3
    • 9223372036854775807

    Description

      version=2.15.0_RC2_6_g09fe899

      The following is the LUTF output:

      lutf>>> suites['udsp'].scripts['udsp_single_net_04'].run()
      
      nids:  ['10.240.43.102@tcp', '10.240.43.109@tcp', '10.240.43.110@tcp', '10.240.43.117@tcp']
      
      None
      
      [{'net type': 'tcp', 'local NI(s)': [{'nid': '10.240.43.102@tcp', 'statistics': {'send_count': 2, 'recv_count': 2, 'drop_count': 0}, 'udsp info': {'net priority': -1, 'nid priority': 0}, 'sent_stats': {'put': 1, 'get': 1, 'reply': 0, 'ack': 0, 'hello': 0}, 'received_stats': {'put': 0, 'get': 0, 'reply': 1, 'ack': 1, 'hello': 0}, 'dropped_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'health stats': {'fatal_error': 0, 'health value': 1000, 'interrupts': 0, 'dropped': 0, 'aborted': 0, 'no route': 0, 'timeouts': 0, 'error': 0, 'ping_count': 0, 'next_ping': 0}, 'lnd tunables': {'conns_per_peer': 1}, 'dev cpt': -1, 'CPT': '[0]'}, {'nid': '10.240.43.109@tcp', 'statistics': {'send_count': 1, 'recv_count': 1, 'drop_count': 0}, 'udsp info': {'net priority': -1, 'nid priority': 0}, 'sent_stats': {'put': 1, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'received_stats': {'put': 1, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'dropped_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'health stats': {'fatal_error': 0, 'health value': 1000, 'interrupts': 0, 'dropped': 0, 'aborted': 0, 'no route': 0, 'timeouts': 0, 'error': 0, 'ping_count': 0, 'next_ping': 0}, 'lnd tunables': {'conns_per_peer': 1}, 'dev cpt': -1, 'CPT': '[0]'}, {'nid': '10.240.43.110@tcp', 'statistics': {'send_count': 0, 'recv_count': 0, 'drop_count': 0}, 'udsp info': {'net priority': -1, 'nid priority': -1}, 'sent_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'received_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'dropped_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'health stats': {'fatal_error': 0, 'health value': 1000, 'interrupts': 0, 'dropped': 0, 'aborted': 0, 'no route': 0, 'timeouts': 0, 'error': 0, 'ping_count': 0, 'next_ping': 0}, 'lnd tunables': {'conns_per_peer': 1}, 'dev cpt': -1, 'CPT': '[0]'}, {'nid': '10.240.43.117@tcp', 'statistics': {'send_count': 0, 'recv_count': 0, 'drop_count': 0}, 'udsp info': {'net priority': -1, 'nid priority': -1}, 'sent_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'received_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'dropped_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'health stats': {'fatal_error': 0, 'health value': 1000, 'interrupts': 0, 'dropped': 0, 'aborted': 0, 'no route': 0, 'timeouts': 0, 'error': 0, 'ping_count': 0, 'next_ping': 0}, 'lnd tunables': {'conns_per_peer': 1}, 'dev cpt': -1, 'CPT': '[0]'}]}]
      
      [{'net type': 'tcp', 'local NI(s)': [{'nid': '10.240.43.102@tcp', 'statistics': {'send_count': 7, 'recv_count': 7, 'drop_count': 0}, 'udsp info': {'net priority': -1, 'nid priority': 0}, 'sent_stats': {'put': 1, 'get': 6, 'reply': 0, 'ack': 0, 'hello': 0}, 'received_stats': {'put': 0, 'get': 0, 'reply': 6, 'ack': 1, 'hello': 0}, 'dropped_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'health stats': {'fatal_error': 0, 'health value': 1000, 'interrupts': 0, 'dropped': 0, 'aborted': 0, 'no route': 0, 'timeouts': 0, 'error': 0, 'ping_count': 0, 'next_ping': 0}, 'lnd tunables': {'conns_per_peer': 1}, 'dev cpt': -1, 'CPT': '[0]'}, {'nid': '10.240.43.109@tcp', 'statistics': {'send_count': 6, 'recv_count': 6, 'drop_count': 0}, 'udsp info': {'net priority': -1, 'nid priority': 0}, 'sent_stats': {'put': 1, 'get': 5, 'reply': 0, 'ack': 0, 'hello': 0}, 'received_stats': {'put': 1, 'get': 0, 'reply': 5, 'ack': 0, 'hello': 0}, 'dropped_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'health stats': {'fatal_error': 0, 'health value': 1000, 'interrupts': 0, 'dropped': 0, 'aborted': 0, 'no route': 0, 'timeouts': 0, 'error': 0, 'ping_count': 0, 'next_ping': 0}, 'lnd tunables': {'conns_per_peer': 1}, 'dev cpt': -1, 'CPT': '[0]'}, {'nid': '10.240.43.110@tcp', 'statistics': {'send_count': 0, 'recv_count': 0, 'drop_count': 0}, 'udsp info': {'net priority': -1, 'nid priority': -1}, 'sent_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'received_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'dropped_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'health stats': {'fatal_error': 0, 'health value': 1000, 'interrupts': 0, 'dropped': 0, 'aborted': 0, 'no route': 0, 'timeouts': 0, 'error': 0, 'ping_count': 0, 'next_ping': 0}, 'lnd tunables': {'conns_per_peer': 1}, 'dev cpt': -1, 'CPT': '[0]'}, {'nid': '10.240.43.117@tcp', 'statistics': {'send_count': 0, 'recv_count': 0, 'drop_count': 0}, 'udsp info': {'net priority': -1, 'nid priority': -1}, 'sent_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'received_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'dropped_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'health stats': {'fatal_error': 0, 'health value': 1000, 'interrupts': 0, 'dropped': 0, 'aborted': 0, 'no route': 0, 'timeouts': 0, 'error': 0, 'ping_count': 0, 'next_ping': 0}, 'lnd tunables': {'conns_per_peer': 1}, 'dev cpt': -1, 'CPT': '[0]'}]}]
      
      {0: 2, 1: 1} {0: 7, 1: 6}
      
      [{'net type': 'tcp', 'local NI(s)': [{'nid': '10.240.43.102@tcp', 'statistics': {'send_count': 7, 'recv_count': 7, 'drop_count': 0}, 'udsp info': {'net priority': -1, 'nid priority': -1}, 'sent_stats': {'put': 1, 'get': 6, 'reply': 0, 'ack': 0, 'hello': 0}, 'received_stats': {'put': 0, 'get': 0, 'reply': 6, 'ack': 1, 'hello': 0}, 'dropped_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'health stats': {'fatal_error': 0, 'health value': 1000, 'interrupts': 0, 'dropped': 0, 'aborted': 0, 'no route': 0, 'timeouts': 0, 'error': 0, 'ping_count': 0, 'next_ping': 0}, 'lnd tunables': {'conns_per_peer': 1}, 'dev cpt': -1, 'CPT': '[0]'}, {'nid': '10.240.43.109@tcp', 'statistics': {'send_count': 6, 'recv_count': 6, 'drop_count': 0}, 'udsp info': {'net priority': -1, 'nid priority': -1}, 'sent_stats': {'put': 1, 'get': 5, 'reply': 0, 'ack': 0, 'hello': 0}, 'received_stats': {'put': 1, 'get': 0, 'reply': 5, 'ack': 0, 'hello': 0}, 'dropped_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'health stats': {'fatal_error': 0, 'health value': 1000, 'interrupts': 0, 'dropped': 0, 'aborted': 0, 'no route': 0, 'timeouts': 0, 'error': 0, 'ping_count': 0, 'next_ping': 0}, 'lnd tunables': {'conns_per_peer': 1}, 'dev cpt': -1, 'CPT': '[0]'}, {'nid': '10.240.43.110@tcp', 'statistics': {'send_count': 0, 'recv_count': 0, 'drop_count': 0}, 'udsp info': {'net priority': -1, 'nid priority': -1}, 'sent_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'received_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'dropped_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'health stats': {'fatal_error': 0, 'health value': 1000, 'interrupts': 0, 'dropped': 0, 'aborted': 0, 'no route': 0, 'timeouts': 0, 'error': 0, 'ping_count': 0, 'next_ping': 0}, 'lnd tunables': {'conns_per_peer': 1}, 'dev cpt': -1, 'CPT': '[0]'}, {'nid': '10.240.43.117@tcp', 'statistics': {'send_count': 0, 'recv_count': 0, 'drop_count': 0}, 'udsp info': {'net priority': -1, 'nid priority': -1}, 'sent_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'received_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'dropped_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'health stats': {'fatal_error': 0, 'health value': 1000, 'interrupts': 0, 'dropped': 0, 'aborted': 0, 'no route': 0, 'timeouts': 0, 'error': 0, 'ping_count': 0, 'next_ping': 0}, 'lnd tunables': {'conns_per_peer': 1}, 'dev cpt': -1, 'CPT': '[0]'}]}]
      
      [{'net type': 'tcp', 'local NI(s)': [{'nid': '10.240.43.102@tcp', 'statistics': {'send_count': 9, 'recv_count': 9, 'drop_count': 0}, 'udsp info': {'net priority': -1, 'nid priority': -1}, 'sent_stats': {'put': 1, 'get': 8, 'reply': 0, 'ack': 0, 'hello': 0}, 'received_stats': {'put': 0, 'get': 0, 'reply': 8, 'ack': 1, 'hello': 0}, 'dropped_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'health stats': {'fatal_error': 0, 'health value': 1000, 'interrupts': 0, 'dropped': 0, 'aborted': 0, 'no route': 0, 'timeouts': 0, 'error': 0, 'ping_count': 0, 'next_ping': 0}, 'lnd tunables': {'conns_per_peer': 1}, 'dev cpt': -1, 'CPT': '[0]'}, {'nid': '10.240.43.109@tcp', 'statistics': {'send_count': 8, 'recv_count': 8, 'drop_count': 0}, 'udsp info': {'net priority': -1, 'nid priority': -1}, 'sent_stats': {'put': 1, 'get': 7, 'reply': 0, 'ack': 0, 'hello': 0}, 'received_stats': {'put': 1, 'get': 0, 'reply': 7, 'ack': 0, 'hello': 0}, 'dropped_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'health stats': {'fatal_error': 0, 'health value': 1000, 'interrupts': 0, 'dropped': 0, 'aborted': 0, 'no route': 0, 'timeouts': 0, 'error': 0, 'ping_count': 0, 'next_ping': 0}, 'lnd tunables': {'conns_per_peer': 1}, 'dev cpt': -1, 'CPT': '[0]'}, {'nid': '10.240.43.110@tcp', 'statistics': {'send_count': 3, 'recv_count': 3, 'drop_count': 0}, 'udsp info': {'net priority': -1, 'nid priority': -1}, 'sent_stats': {'put': 0, 'get': 3, 'reply': 0, 'ack': 0, 'hello': 0}, 'received_stats': {'put': 0, 'get': 0, 'reply': 3, 'ack': 0, 'hello': 0}, 'dropped_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'health stats': {'fatal_error': 0, 'health value': 1000, 'interrupts': 0, 'dropped': 0, 'aborted': 0, 'no route': 0, 'timeouts': 0, 'error': 0, 'ping_count': 0, 'next_ping': 0}, 'lnd tunables': {'conns_per_peer': 1}, 'dev cpt': -1, 'CPT': '[0]'}, {'nid': '10.240.43.117@tcp', 'statistics': {'send_count': 3, 'recv_count': 3, 'drop_count': 0}, 'udsp info': {'net priority': -1, 'nid priority': -1}, 'sent_stats': {'put': 0, 'get': 3, 'reply': 0, 'ack': 0, 'hello': 0}, 'received_stats': {'put': 0, 'get': 0, 'reply': 3, 'ack': 0, 'hello': 0}, 'dropped_stats': {'put': 0, 'get': 0, 'reply': 0, 'ack': 0, 'hello': 0}, 'health stats': {'fatal_error': 0, 'health value': 1000, 'interrupts': 0, 'dropped': 0, 'aborted': 0, 'no route': 0, 'timeouts': 0, 'error': 0, 'ping_count': 0, 'next_ping': 0}, 'lnd tunables': {'conns_per_peer': 1}, 'dev cpt': -1, 'CPT': '[0]'}]}]
      
      {0: 7, 1: 6, 2: 0, 3: 0} {0: 9, 1: 8, 2: 3, 3: 3}
      
      uneven tx traffic distribution across interfaces
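
      The two summary dicts printed just before the failure match the send_count values in the preceding dumps, indexed by interface. A minimal sketch of the per-interface increase during the second traffic run follows; the names tx_deltas, before and after are illustrative and are not LUTF identifiers.

```python
def tx_deltas(before, after):
    """Return the per-interface increase in send_count between two snapshots."""
    return {idx: after[idx] - before.get(idx, 0) for idx in sorted(after)}


# Values copied from the last summary line in the log above.
before = {0: 7, 1: 6, 2: 0, 3: 0}
after = {0: 9, 1: 8, 2: 3, 3: 3}

deltas = tx_deltas(before, after)
print(deltas)  # {0: 2, 1: 2, 2: 3, 3: 3}
```

      All four interfaces sent traffic in the second run, but not the same amount (2, 2, 3, 3). A strict equality check across interfaces would fail on these numbers; whether the test applies exactly that check is an assumption here.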
      

      According to the UDSP test plan, udsp_single_net_04 does the following:

      Setup: configure a single network with 3 NIDs on it
      Add a UDSP rule that gives two of the interfaces the highest priority
      Start traffic
      Stop traffic
      Verify that the two NIDs with the highest priority were used
      Add a UDSP rule that lowers the priority of both highest-priority NIDs back to the default
      Start traffic
      Stop traffic
      Verify that all NIDs were used
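
      The two "verify" steps above can be sketched against the send_count values from this report. This is a hedged illustration, not the LUTF test code: send_counts, nids_used and make_dump are hypothetical helpers, and make_dump builds a trimmed structure holding only the fields read here.

```python
def send_counts(net_show):
    """Map each local NID to its send_count from a parsed stats dump."""
    return {
        ni['nid']: ni['statistics']['send_count']
        for net in net_show
        for ni in net['local NI(s)']
    }


def nids_used(before, after):
    """Return the NIDs whose send_count grew between two snapshots."""
    return {nid for nid, count in after.items() if count > before.get(nid, 0)}


def make_dump(counts):
    """Build a trimmed dump with only the fields used above (illustrative)."""
    return [{'net type': 'tcp',
             'local NI(s)': [{'nid': nid, 'statistics': {'send_count': c}}
                             for nid, c in counts.items()]}]


nids = ['10.240.43.102@tcp', '10.240.43.109@tcp',
        '10.240.43.110@tcp', '10.240.43.117@tcp']
prioritized = set(nids[:2])  # the two NIDs shown with 'nid priority': 0

# send_count values copied from the four dumps in this report.
snap1 = send_counts(make_dump(dict(zip(nids, [2, 1, 0, 0]))))
snap2 = send_counts(make_dump(dict(zip(nids, [7, 6, 0, 0]))))
snap3 = send_counts(make_dump(dict(zip(nids, [7, 6, 0, 0]))))
snap4 = send_counts(make_dump(dict(zip(nids, [9, 8, 3, 3]))))

phase1_used = nids_used(snap1, snap2)  # traffic with the priority rule in place
phase2_used = nids_used(snap3, snap4)  # traffic after priorities were reset
print(phase1_used == prioritized, phase2_used == set(nids))
```

      On these numbers both "were used" checks pass, which suggests the failure comes from a stricter evenness check rather than from an unused interface. That reading is inferred from the log and has not been confirmed against the test source.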
      

            People

              cbordage Cyril Bordage
              sarah Sarah Liu