  1. Lustre
  2. LU-9527

Interop 2.9.0<->master conf-sanity test_77: start fs2ost failed

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.10.0
    • Labels: None
    • Environment: trevis-40, interop
        EL7, master branch, v2.9.57, b3575 clients
        EL7, ldiskfs, b2_9 branch, v2.9.0, b22 servers
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      https://testing.hpdd.intel.com/test_sessions/2209839b-42d4-4fe6-91f1-96f9ce3c5a69

      LNet connection errors appear to have prevented the mount: the OST could not register with the MGS and the mount timed out (rc = -110).

      From OST console:

      18:17:28:[ 9857.577430] LNetError: 120-3: Refusing connection from 127.0.0.1 for 0.0.0.0@tcp: No matching NI
      18:17:28:[ 9857.579761] LNetError: 3964:0:(socklnd_cb.c:1723:ksocknal_recv_hello()) Error -104 reading HELLO from 127.0.0.1
      18:17:28:[ 9857.582176] LNetError: 11b-b: Connection to 0.0.0.0@tcp at host 0.0.0.0 on port 7988 was reset: is it running a compatible version of Lustre and is 0.0.0.0@tcp one of its NIDs?
      18:17:28:[ 9867.576301] LNetError: 120-3: Refusing connection from 127.0.0.1 for 0.0.0.0@tcp: No matching NI
      18:17:28:[ 9867.578687] LNetError: 3965:0:(socklnd_cb.c:1723:ksocknal_recv_hello()) Error -104 reading HELLO from 127.0.0.1
      18:17:28:[ 9867.581126] LNetError: 11b-b: Connection to 0.0.0.0@tcp at host 0.0.0.0 on port 7988 was reset: is it running a compatible version of Lustre and is 0.0.0.0@tcp one of its NIDs?
      18:17:28:[ 9868.603119] LustreError: 15f-b: test1234-OST0000: cannot register this server with the MGS: rc = -110. Is the MGS running?
      18:17:28:[ 9868.607631] LustreError: 25737:0:(obd_mount_server.c:1844:server_fill_super()) Unable to start targets: -110
      18:17:28:[ 9868.611737] LustreError: 25737:0:(obd_mount_server.c:1558:server_put_super()) no obd test1234-OST0000
      18:17:28:[ 9868.614276] LustreError: 25737:0:(obd_mount_server.c:136:server_deregister_mount()) test1234-OST0000 not registered
      18:17:28:[ 9868.630914] LustreError: 25737:0:(obd_mount.c:1449:lustre_fill_super()) Unable to mount  (-110)
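
      For readers decoding the return codes in the console log above, here is a minimal sketch that maps them to their Linux errno names. The table is hard-coded from the standard Linux errno numbering rather than taken from Lustre itself, and the helper name `describe_rc` is hypothetical.

```python
# Linux errno values that appear in the console log above.
# Assumption: standard Linux numbering (asm-generic/errno.h); other
# platforms number these errors differently.
LINUX_ERRNO = {
    104: ("ECONNRESET", "Connection reset by peer"),
    110: ("ETIMEDOUT", "Connection timed out"),
}


def describe_rc(rc: int) -> str:
    """Translate a negative kernel return code such as -110 into a readable string."""
    name, text = LINUX_ERRNO.get(abs(rc), ("UNKNOWN", "not in this table"))
    return f"rc = {rc}: {name} ({text})"


if __name__ == "__main__":
    for rc in (-104, -110):
        print(describe_rc(rc))
```

      Read this way, -104 on the HELLO exchange is a connection reset, and -110 from the MGS registration is the same "Connection timed out" that mount.lustre reports in the test log.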
      

      From test_log:

      Starting fs2ost:   /dev/lvm-Role_OSS/S1 /mnt/lustre-fs2ost
      CMD: trevis-40vm8 mkdir -p /mnt/lustre-fs2ost; mount -t lustre /dev/lvm-Role_OSS/S1 /mnt/lustre-fs2ost
      trevis-40vm8: mount.lustre: mount /dev/mapper/lvm--Role_OSS-S1 at /mnt/lustre-fs2ost failed: Connection timed out
      Start of /dev/lvm-Role_OSS/S1 on fs2ost failed 110
       conf-sanity test_77: @@@@@@ FAIL: start fs2ost failed 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:4939:error()
        = /usr/lib64/lustre/tests/conf-sanity.sh:5377:test_77()
        = /usr/lib64/lustre/tests/test-framework.sh:5215:run_one()
        = /usr/lib64/lustre/tests/test-framework.sh:5254:run_one_logged()
        = /usr/lib64/lustre/tests/test-framework.sh:5101:run_test()
        = /usr/lib64/lustre/tests/conf-sanity.sh:5384:main()
      

            People

              Assignee: WC Triage (wc-triage)
              Reporter: James Casper (jcasper)