Lustre / LU-17475

sanity test_432 fails with "mgs and active mismatch, 10 attempts" with IPv6

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.16.0

    Description

      == sanity test 432: mv dir from outside Lustre =========== 05:03:34 (1706007814)
      On MGS 2601:8c1:c180:2000::cbdd, active = nodemap.active=1
      On el8-mds2 2601:8c1:c180:2000::cbde, active =
      On el8-mds2 2601:8c1:c180:2000::cbde, active =
      On el8-mds2 2601:8c1:c180:2000::cbde, active =
      On el8-mds2 2601:8c1:c180:2000::cbde, active =
      On el8-mds2 2601:8c1:c180:2000::cbde, active =
      On el8-mds2 2601:8c1:c180:2000::cbde, active =
      On el8-mds2 2601:8c1:c180:2000::cbde, active =
      On el8-mds2 2601:8c1:c180:2000::cbde, active =
      On el8-mds2 2601:8c1:c180:2000::cbde, active =
      On el8-mds2 2601:8c1:c180:2000::cbde, active =
      MGS
      nodemap.active=1
      OTHER - IP: 2601:8c1:c180:2000::cbde
      
       sanity test_432: @@@@@@ FAIL: mgs and active  mismatch, 10 attempts
      

      wait_nm_sync() in test-framework.sh passes each node's IP address, rather than its hostname, as the first argument to do_node():

              # wait up to 10 seconds for other servers to sync with mgs
              for i in $(seq 1 10); do
                      for node in $(all_server_nodes); do
                              local node_ip=$(host_nids_address $node $NETTYPE |
                                              cut -d' ' -f1)
      
                              is_sync=true
                              if [ -z "$value" ]; then
                                      [ $node_ip == $mgs_ip ] && continue
                              fi
      
                              out2=$(do_node $node_ip $LCTL get_param $opt \
                                     nodemap.$proc_param 2>/dev/null)
                              echo "On $node ${node_ip}, ${proc_param} = $out2"
                              [ "$out1" != "$out2" ] && is_sync=false && break
                      done
                      $is_sync && break
                      sleep 1
              done
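
      The empty "active =" values in the log are what this loop would print if do_node failed on every attempt, because the call discards stderr with 2>/dev/null and only the (empty) stdout is captured. A minimal sketch of that behaviour, assuming a stand-in function fake_do_node that is hypothetical and only simulates an unreachable target:

```shell
#!/bin/bash
# Hypothetical stand-in for a do_node call whose target cannot be reached.
# It writes its error to stderr and returns non-zero, printing nothing on stdout.
fake_do_node() {
    echo "could not reach $1" >&2
    return 1
}

out1="nodemap.active=1"    # value read on the MGS, as in the log above

# Same capture pattern as wait_nm_sync(): stderr is thrown away, so a
# failure leaves out2 empty instead of surfacing an error message.
out2=$(fake_do_node 2601:8c1:c180:2000::cbde 2>/dev/null) || true

is_sync=true
[ "$out1" != "$out2" ] && is_sync=false

echo "active = $out2 (is_sync=$is_sync)"
```

      With out2 always empty, the comparison fails on each of the 10 iterations and the test reports the mismatch without showing the underlying error.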
      

      If do_node ends up running the command through pdsh (likely?), this will not work with IPv6, because pdsh misinterprets the ':' in an IPv6 address as the end of an rcmd type prefix. From the pdsh documentation:

      A list of hosts may also be preceded by ... "rcmd_type:" to specify an alternate rcmd connection type for these hosts.
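
      One possible way to avoid the problem is to hand do_node a hostname whenever the address contains a ':'. The sketch below is an illustration only: pdsh_safe_target is a hypothetical helper that does not exist in test-framework.sh, and it is not necessarily how the issue was actually fixed.

```shell
#!/bin/bash
# Hypothetical helper (not part of test-framework.sh).
# Prints the name to pass as the remote target: the address itself when it
# contains no ':', otherwise the hostname, so that a pdsh-style parser never
# sees a ':' it could take for an "rcmd_type:" prefix.
pdsh_safe_target() {
    local host=$1
    local addr=$2

    case "$addr" in
    *:*) echo "$host" ;;    # looks like an IPv6 literal: fall back to hostname
    *)   echo "$addr" ;;    # no ':' present: address is safe to use
    esac
}

pdsh_safe_target el8-mds2 2601:8c1:c180:2000::cbde
pdsh_safe_target el8-mds2 192.168.1.5
```

      In wait_nm_sync() the result would replace $node_ip in the do_node call only; the comparison against $mgs_ip could keep using the address.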

          People

            hornc Chris Horn