Lustre / LU-12688

sanity-sec test 31 fails with 'unable to configure NID o2ib999'

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.13.0, Lustre 2.12.3
    • Labels: None
    • Environment: IB network
    • Severity: 3

    Description

      sanity-sec test_31 fails on IB networks: it is unable to configure the o2ib999 network.

      In the client test_log, we see:

      == sanity-sec test 31: client mount option '-o network' ============================================== 04:29:59 (1565756999)
      192.168.5.148@o2ib:/lustre /mnt/lustre lustre rw,flock,user_xattr,lazystatfs 0 0
      CMD: onyx-64vm1.onyx.whamcloud.com grep -c /mnt/lustre' ' /proc/mounts
      Stopping client onyx-64vm1.onyx.whamcloud.com /mnt/lustre (opts:)
      CMD: onyx-64vm1.onyx.whamcloud.com lsof -t /mnt/lustre
      CMD: onyx-64vm1.onyx.whamcloud.com umount  /mnt/lustre 2>&1
      CMD: onyx-64vm4 lctl get_param -n *.MGS*.exports.'192.168.5.145@o2ib'.uuid 2>/dev/null |
      		      grep -q -
      CMD: onyx-64vm3,onyx-64vm4 lctl get_param -n *.lustre*.exports.'192.168.5.145@o2ib'.uuid 		  2>/dev/null | grep -q -
      CMD: onyx-64vm1.onyx.whamcloud.com,onyx-64vm2,onyx-64vm3,onyx-64vm4 /usr/sbin/lnetctl lnet configure && /usr/sbin/lnetctl net add --if 		  $(/usr/sbin/lnetctl net show --net o2ib | awk 'BEGIN{inf=0} 		  {if (inf==1) print $2; fi; inf=0} /interfaces/{inf=1}') 		  --net o2ib999
      onyx-64vm1: add:
      onyx-64vm1:     - net:
      onyx-64vm1:           errno: -100
      onyx-64vm1:           descr: "cannot add network: Network is down"
      onyx-64vm2: add:
      onyx-64vm2:     - net:
      onyx-64vm2:           errno: -100
      onyx-64vm2:           descr: "cannot add network: Network is down"
      onyx-64vm4: add:
      onyx-64vm4:     - net:
      onyx-64vm4:           errno: -100
      onyx-64vm4:           descr: "cannot add network: Network is down"
      onyx-64vm3: add:
      onyx-64vm3:     - net:
      onyx-64vm3:           errno: -100
      onyx-64vm3:           descr: "cannot add network: Network is down"
       sanity-sec test_31: @@@@@@ FAIL: unable to configure NID o2ib999 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:5829:error()
        = /usr/lib64/lustre/tests/sanity-sec.sh:2238:test_31()
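
      The awk fragment in the net add command above selects the interface name from the line that follows an "interfaces" line in the output of lnetctl net show. A minimal Python sketch of the same selection is given here for readability. The SAMPLE text is a hypothetical illustration of that output, not captured from these nodes, and interfaces_after_marker is a name made up for this sketch.

```python
from typing import List

# Hypothetical sample of `lnetctl net show --net o2ib` output, for illustration only.
SAMPLE = """\
net:
    - net type: o2ib
      local NI(s):
        - nid: 192.168.5.145@o2ib
          status: up
          interfaces:
              0: ib0
"""


def interfaces_after_marker(text: str) -> List[str]:
    """Return the second field of each line that follows an 'interfaces' line."""
    found = []
    take_next = False
    for line in text.splitlines():
        fields = line.split()
        # Mirrors awk's `if (inf==1) print $2`; lines with fewer than
        # two fields are skipped here rather than printed as empty.
        if take_next and len(fields) >= 2:
            found.append(fields[1])
        # Mirrors awk's `/interfaces/{inf=1}` after resetting inf to 0.
        take_next = "interfaces" in line
    return found


print(interfaces_after_marker(SAMPLE))
```

      With this sample the selection yields ib0, which matches the expanded command recorded in the client console log (--if ib0 --net o2ib999).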
      

      The console logs for all nodes show very similar output. For example, a client console shows:

      [31452.506785] Lustre: DEBUG MARKER: == sanity-sec test 31: client mount option '-o network' ============================================== 04:29:59 (1565756999)
      [31452.635972] Lustre: DEBUG MARKER: grep -c /mnt/lustre' ' /proc/mounts
      [31452.645394] Lustre: DEBUG MARKER: lsof -t /mnt/lustre
      [31452.773901] Lustre: DEBUG MARKER: umount /mnt/lustre 2>&1
      [31452.817588] Lustre: Unmounted lustre-client
      [31452.818414] Lustre: Skipped 4 previous similar messages
      [31453.822622] Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure && /usr/sbin/lnetctl net add --if 		  ib0 		  --net o2ib999
      [31454.062442] LNetError: 10968:0:(o2iblnd.c:2766:kiblnd_dev_failover()) Failed to bind ib0:192.168.5.145 to device(ffff93f2f7830000): -98
      [31454.064594] LNetError: 10968:0:(o2iblnd.c:3256:kiblnd_startup()) ko2iblnd: Can't initialize device: rc = -98
      [31454.066261] LNetError: 105-4: Error -100 starting up LNI o2ib
      [31454.297769] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanity-sec test_31: @@@@@@ FAIL: unable to configure NID o2ib999 
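
      The two return codes in the console log above are negated Linux errno values. As a reading aid, the sketch below decodes them from a small hard-coded table of Linux errno numbers (the table and the describe helper are written for this note and are not part of Lustre or the test framework).

```python
# Linux errno values relevant to the log above (numbers are Linux-specific).
LINUX_ERRNO = {
    98: ("EADDRINUSE", "Address already in use"),
    100: ("ENETDOWN", "Network is down"),
}


def describe(rc: int) -> str:
    """Turn a negative kernel-style return code into a readable string."""
    name, text = LINUX_ERRNO.get(abs(rc), ("UNKNOWN", "not in table"))
    return f"{rc}: {name} ({text})"


for rc in (-98, -100):
    print(describe(rc))
```

      So kiblnd_dev_failover() reports EADDRINUSE when binding ib0:192.168.5.145, and lnetctl then reports ENETDOWN, which is the "Network is down" text in the test log. The bind failure is consistent with ib0 already being in use by the existing o2ib network, although the logs alone do not confirm that as the root cause.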
      

      We only started seeing this issue recently because autotesting has just started running on IB networks.

      Here are logs for a few failures:
      https://testing.whamcloud.com/test_sets/d9f710b0-b662-11e9-9f36-52540065bddc
      https://testing.whamcloud.com/test_sets/fedcdcb8-bb0b-11e9-97d5-52540065bddc
      https://testing.whamcloud.com/test_sets/44209a52-bc3e-11e9-98c8-52540065bddc

            People

              Assignee: wc-triage WC Triage
              Reporter: jamesanunez James Nunez (Inactive)