
Very long recovery times for MDTs after MDS restart

Details


    Description

      The effect occurred during soak testing of build '20160713' (see https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160713).
      OSTs have been configured using zfs, MDTs using ldiskfs.
      The environment consists of 4 MDSes with 1 MDT each and 6 OSSes with 4 OSTs each.
      DNE is enabled.
      MDS and OSS nodes are configured in active-active HA configuration.
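
      Each target is formatted with both HA peers as service nodes so that it can be mounted on either node of the pair. A minimal sketch of such a format (the NIDs, indices, device and pool names below are illustrative, not taken from the actual lola configuration):

      # ldiskfs MDT with its primary and failover MDS as service nodes
      mkfs.lustre --mdt --fsname=soaked --index=1 --backfstype=ldiskfs \
          --mgsnode=192.168.1.108@o2ib10 \
          --servicenode=192.168.1.109@o2ib10 --servicenode=192.168.1.108@o2ib10 \
          /dev/mapper/mdt1

      # zfs OST with its OSS HA pair as service nodes
      mkfs.lustre --ost --fsname=soaked --index=7 --backfstype=zfs \
          --mgsnode=192.168.1.108@o2ib10 \
          --servicenode=192.168.1.103@o2ib10 --servicenode=192.168.1.104@o2ib10 \
          ostpool/ost7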

      The effect of very long recovery times occurred after the restart of two MDS nodes.
      lola-9: 89 min 38 s recovery time
      lola-10: 193 min 55 s recovery time

      An OSS failover/failback injected by the framework, while the MDTs of the restarted MDS nodes were in recovery, finished successfully in 48 seconds.

      Sequence of events:

      • 2016-07-19 22:57:52 - lola-10 mds_restart
      • 2016-07-19 23:07:21 - lola-10 , recovery started
      • 2016-07-20 00:41:55 - lola-9 mds_restart
      • 2016-07-20 00:51:55 - lola-9 , recovery started
      • 2016-07-20 00:53:07 - lola-3 oss_failover
      • 2016-07-20 00:58:55 - lola-3 recovery started
      • 2016-07-20 00:59:43 - lola-3 kernel: Lustre: soaked-OST0007: Recovery over after 0:48, of 23 clients 23 recovered and 0 were evicted.
      • 2016-07-20 02:21:35 - lola-9 kernel: Lustre: soaked-MDT0001: Recovery over after 89:38, of 22 clients 21 recovered and 1 was evicted.
      • 2016-07-20 02:21:17 - lola-10 kernel: Lustre: soaked-MDT0002: Recovery over after 193:55, of 22 clients 20 recovered and 2 were evicted.
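
      The recovery completion times quoted above come from the MDS kernel/console messages; a hedged one-liner to pull them from the message logs (log path and node list are assumptions):

      pdsh -w lola-[9,10] 'grep "Recovery over after" /var/log/messages' | dshbak -c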

      Clients were in the following state during recovery:

      [root@lola-16 ~]# pdsh -g clients 'lctl get_param *.*.state | grep -A 2 -E "MDT0001|MDT0002"' | dshbak -c | less -i
      ----------------  
      lola-13
      ----------------  
      mdc.soaked-MDT0001-mdc-ffff88028add7800.state=
      current_state: REPLAY_WAIT
      state_history:
      --
      mdc.soaked-MDT0002-mdc-ffff88028add7800.state=
      current_state: REPLAY_WAIT
      state_history:
      ----------------
      lola-16
      ----------------
      mdc.soaked-MDT0001-mdc-ffff880415a56400.state=
      current_state: REPLAY_WAIT
      state_history:
      --
      mdc.soaked-MDT0002-mdc-ffff880415a56400.state=
      current_state: REPLAY_WAIT
      state_history:
      ----------------
      lola-19
      ----------------
      mdc.soaked-MDT0001-mdc-ffff88021b0fa800.state=
      current_state: REPLAY_WAIT
      state_history:
      --
      mdc.soaked-MDT0002-mdc-ffff88021b0fa800.state=
      current_state: REPLAY_WAIT
      state_history:
      ----------------
      lola-20
      ----------------
      mdc.soaked-MDT0001-mdc-ffff880c9b13f400.state=
      current_state: REPLAY_WAIT
      state_history:
      --
      mdc.soaked-MDT0002-mdc-ffff880c9b13f400.state=
      current_state: REPLAY_WAIT
      state_history:
      ----------------  
      lola-21
      ----------------  
      mdc.soaked-MDT0001-mdc-ffff881033e32400.state=
      current_state: REPLAY_WAIT
      state_history:
      --
      mdc.soaked-MDT0002-mdc-ffff881033e32400.state=
      current_state: REPLAY_WAIT
      state_history:
      ----------------
      ----------------  
      lola-22
      ----------------  
      mdc.soaked-MDT0001-mdc-ffff88082e455000.state=
      current_state: REPLAY_WAIT
      state_history:
      --
      mdc.soaked-MDT0002-mdc-ffff88082e455000.state=
      current_state: REPLAY_WAIT
      state_history:
      ----------------  
      lola-23
      ----------------  
      mdc.soaked-MDT0001-mdc-ffff8808e6af4400.state=
      current_state: REPLAY_WAIT
      state_history:
      --
      mdc.soaked-MDT0002-mdc-ffff8808e6af4400.state=
      current_state: REPLAY_WAIT
      state_history:
      ----------------  
      lola-24
      ----------------  
      mdc.soaked-MDT0001-mdc-ffff88102e910800.state=
      current_state: REPLAY_WAIT
      state_history:
      --
      mdc.soaked-MDT0002-mdc-ffff88102e910800.state=
      current_state: REPLAY_WAIT
      state_history:
      ----------------  
      lola-25
      ----------------  
      mdc.soaked-MDT0001-mdc-ffff88102f38a400.state=
      current_state: REPLAY_WAIT
      state_history:
      --
      mdc.soaked-MDT0002-mdc-ffff88102f38a400.state=
      current_state: REPLAY_WAIT
      state_history:
      ----------------  
      lola-26
      ----------------  
      mdc.soaked-MDT0001-mdc-ffff880f7857fc00.state=
      current_state: REPLAY_WAIT
      state_history:
      --
      mdc.soaked-MDT0002-mdc-ffff880f7857fc00.state=
      current_state: REPLAY_WAIT
      state_history:
      ----------------  
      lola-27
      ----------------  
      mdc.soaked-MDT0001-mdc-ffff88081a0eec00.state=
      current_state: REPLAY_WAIT
      state_history:
      --
      mdc.soaked-MDT0002-mdc-ffff88081a0eec00.state=
      current_state: REPLAY_WAIT
      state_history:
      ----------------  
      lola-29
      ----------------  

      Attached files:
      Console and messages logs of the MDS nodes lola-[9,10]
      Debug kernel logs of the MDS nodes lola-[9,10] and of a single Lustre client node, lola-19


          Activity

            [LU-8428] Very long recovery times for MDTs after MDS restart
            pjones Peter Jones added a comment -

            Di believes this issue is a duplicate of LU-8250

            yong.fan nasf (Inactive) added a comment - - edited

            Don't wait for recovery to complete on secondary; switch back to primary node immediately after MDT mount completes

            How did you do that? Did you umount the MDT just after the MDT mount completed? With or without "-f"? It seems that, from lola-11's point of view, the connection from MDT0002 was not cleaned up, so the new connection from the MDT0002 instance on lola-10 was refused. One possible solution is to allow the new connection to replace the old one. I wonder why the patch http://review.whamcloud.com/#/c/18800/ did not take effect (or had some side effect?) in this case.


            heckes Frank Heckes (Inactive) added a comment -

            The HA configuration reads as follows:

            • lola-8, lola-9 - failover cluster
              • lola-8: MDT0000 (primary resource)
              • lola-9: MDT0001 (primary resource)
            • lola-10,lola-11 - failover cluster
              • lola-10: MDT0002 (primary resource)
              • lola-11: MDT0003 (primary resource)

            MDTs are configured for failover during format. HA is controlled by the soak test framework. For the scenario above the following process flow was executed (a rough shell sketch follows the list):

            • power-cycle of (primary) MDS
            • mount of MDT on secondary MDS
            • Don't wait for recovery to complete on secondary; switch back to primary node immediately after MDT mount completes
            • mount MDT on primary MDS again
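
            In shell terms, the flow is roughly as follows (a hedged sketch; the fencing command, BMC host name, device path and mount point are placeholders, not the framework's actual code):

            # power-cycle the primary MDS (example uses IPMI; the BMC host name and credentials are assumptions)
            ipmitool -H lola-10-bmc -U admin -P secret chassis power cycle
            # mount MDT0002 on the secondary MDS; recovery starts on lola-11
            ssh lola-11 'mount -t lustre /dev/mapper/mdt2 /mnt/soaked-mdt2'
            # do not wait for "Recovery over ..."; fail back immediately
            ssh lola-11 'umount /mnt/soaked-mdt2'
            # once lola-10 has rebooted, mount MDT0002 on the primary again
            ssh lola-10 'mount -t lustre /dev/mapper/mdt2 /mnt/soaked-mdt2'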

            The console and message files for lola-11 aren't available anymore; for capacity reasons, all files older than 6 weeks are deleted from the cluster's shared FS.

            For build '20160916' (see https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160916) this effect has not occurred so far, although it might be eclipsed by LU-8580.


            yong.fan nasf (Inactive) added a comment -

            According to the given log (lustre-log-lola-10-20160720_0136), lola-10 was trying to connect to MDT0003 many times:

            00000020:00000080:23.0:1468994840.970247:0:6167:0:(obd_config.c:1148:class_process_config()) processing cmd: cf003
            00000004:00080000:23.0:1468994840.970280:0:6167:0:(osp_dev.c:1121:osp_init0()) soaked-MDT0003-osp-MDT0002: connect to soaked-MDT0002-osd (soaked-MDT0003-osp-MDT0002)
            00080000:01000000:23.0:1468994840.970287:0:6167:0:(osd_handler.c:6820:osd_obd_connect()) connect #5
            00000020:00000080:23.0:1468994840.970290:0:6167:0:(genops.c:1169:class_connect()) connect: client soaked-MDT0002-osd_UUID, cookie 0x24fc9dac120838ee
            00010000:00080000:23.0:1468994840.970307:0:6167:0:(ldlm_lib.c:112:import_set_conn()) imp ffff8803fc39b800@soaked-MDT0003-osp-MDT0002: add connection 192.168.1.111@o2ib10 at head
            00000020:00000080:23.0:1468994840.971931:0:6167:0:(obd_config.c:552:class_setup()) finished setup of obd soaked-MDT0003-osp-MDT0002 (uuid soaked-MDT0002-mdtlov_UUID)
            00000020:00000080:23.0:1468994840.971939:0:6167:0:(obd_config.c:1148:class_process_config()) processing cmd: cf005
            00000020:00000080:23.0:1468994840.971943:0:6167:0:(obd_config.c:1159:class_process_config()) adding mapping from uuid 192.168.1.111@o2ib10 to nid 0x5000ac0a8016f (192.168.1.111@o2ib10)
            00000020:00000080:23.0:1468994840.971948:0:6167:0:(obd_config.c:1148:class_process_config()) processing cmd: cf00b
            00010000:00080000:23.0:1468994840.971957:0:6167:0:(ldlm_lib.c:95:import_set_conn()) imp ffff8803fc39b800@soaked-MDT0003-osp-MDT0002: found existing conn 192.168.1.111@o2ib10
            00000020:00000080:23.0:1468994840.971962:0:6167:0:(obd_config.c:1148:class_process_config()) processing cmd: cf005
            00000020:00000080:23.0:1468994840.971967:0:6167:0:(obd_config.c:1159:class_process_config()) adding mapping from uuid 192.168.1.110@o2ib10 to nid 0x5000ac0a8016e (192.168.1.110@o2ib10)
            00000020:00000080:23.0:1468994840.971971:0:6167:0:(obd_config.c:1148:class_process_config()) processing cmd: cf00b
            00010000:00080000:23.0:1468994840.971976:0:6167:0:(ldlm_lib.c:112:import_set_conn()) imp ffff8803fc39b800@soaked-MDT0003-osp-MDT0002: add connection 192.168.1.110@o2ib10 at tail
            00000020:00000080:23.0:1468994840.971980:0:6167:0:(obd_config.c:1148:class_process_config()) processing cmd: cf014
            00000004:01000000:23.0:1468994840.971993:0:6167:0:(lod_lov.c:225:lod_add_device()) osp:soaked-MDT0003_UUID idx:3 gen:1
            00000004:01000000:23.0:1468994840.971997:0:6167:0:(osp_dev.c:1411:osp_obd_connect()) connect #0
            00000020:00000080:23.0:1468994840.972004:0:6167:0:(genops.c:1169:class_connect()) connect: client soaked-MDT0002-mdtlov_UUID, cookie 0x24fc9dac120838fc
            00000100:00080000:23.0:1468994840.972006:0:6167:0:(import.c:679:ptlrpc_connect_import()) ffff8803fc39b800 soaked-MDT0003_UUID: changing import state from NEW to CONNECTING
            00000100:00080000:23.0:1468994840.972010:0:6167:0:(import.c:523:import_select_connection()) soaked-MDT0003-osp-MDT0002: connect to NID 192.168.1.111@o2ib10 last attempt 0
            00000100:00080000:23.0:1468994840.972014:0:6167:0:(import.c:601:import_select_connection()) soaked-MDT0003-osp-MDT0002: import ffff8803fc39b800 using connection 192.168.1.111@o2ib10/192.168.1.111@o2ib10
            ...
            00000100:02020000:1.0:1468994840.996289:0:4562:0:(client.c:1266:ptlrpc_check_status()) 11-0: soaked-MDT0003-osp-MDT0002: operation mds_connect to node 192.168.1.111@o2ib10 failed: rc = -114
            00000100:00080000:1.0:1468994840.996295:0:4562:0:(import.c:1290:ptlrpc_connect_interpret()) ffff8803fc39b800 soaked-MDT0003_UUID: changing import state from CONNECTING to DISCONN
            00000100:00080000:1.0:1468994840.996298:0:4562:0:(import.c:1336:ptlrpc_connect_interpret()) recovery of soaked-MDT0003_UUID on 192.168.1.111@o2ib10 failed (-114)
            ...
            00000100:00080000:17.0:1468994866.072408:0:4595:0:(recover.c:62:ptlrpc_initiate_recovery()) soaked-MDT0003_UUID: starting recovery
            00000100:00080000:17.0:1468994866.072410:0:4595:0:(import.c:679:ptlrpc_connect_import()) ffff8803fc39b800 soaked-MDT0003_UUID: changing import state from DISCONN to CONNECTING
            00000100:00080000:17.0:1468994866.072415:0:4595:0:(import.c:523:import_select_connection()) soaked-MDT0003-osp-MDT0002: connect to NID 192.168.1.111@o2ib10 last attempt 4295034989
            00000100:00080000:17.0:1468994866.072418:0:4595:0:(import.c:523:import_select_connection()) soaked-MDT0003-osp-MDT0002: connect to NID 0@lo last attempt 0
            00000100:00080000:17.0:1468994866.072428:0:4595:0:(import.c:593:import_select_connection()) soaked-MDT0003-osp-MDT0002: Connection changing to soaked-MDT0003 (at 0@lo)
            00000100:00080000:17.0:1468994866.072431:0:4595:0:(import.c:601:import_select_connection()) soaked-MDT0003-osp-MDT0002: import ffff8803fc39b800 using connection 192.168.1.110@o2ib10/0@lo
            00000100:00080000:17.0:1468994866.072578:0:4595:0:(pinger.c:221:ptlrpc_pinger_process_import()) soaked-MDT0000-lwp-MDT0002_UUID->soaked-MDT0000_UUID: level DISCONN/3 force 0 force_next 0 deactive 0 pingable 0 suppress 0
            00000100:00080000:17.0:1468994866.072584:0:4595:0:(recover.c:62:ptlrpc_initiate_recovery()) soaked-MDT0000_UUID: starting recovery
            00000100:00080000:17.0:1468994866.072586:0:4595:0:(import.c:679:ptlrpc_connect_import()) ffff880832815000 soaked-MDT0000_UUID: changing import state from DISCONN to CONNECTING
            00000100:00080000:17.0:1468994866.072592:0:4595:0:(import.c:523:import_select_connection()) soaked-MDT0000-lwp-MDT0002: connect to NID 192.168.1.108@o2ib10 last attempt 4295035090
            00000100:00080000:17.0:1468994866.072597:0:4595:0:(import.c:523:import_select_connection()) soaked-MDT0000-lwp-MDT0002: connect to NID 192.168.1.109@o2ib10 last attempt 0
            00000100:00080000:17.0:1468994866.072602:0:4595:0:(import.c:593:import_select_connection()) soaked-MDT0000-lwp-MDT0002: Connection changing to soaked-MDT0000 (at 192.168.1.109@o2ib10)
            00000100:00080000:17.0:1468994866.072607:0:4595:0:(import.c:601:import_select_connection()) soaked-MDT0000-lwp-MDT0002: import ffff880832815000 using connection 192.168.1.109@o2ib10/192.168.1.109@o2ib10
            00010000:02020000:1.0:1468994866.072759:0:6154:0:(ldlm_lib.c:962:target_handle_connect()) 137-5: soaked-MDT0003_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
            00000100:00080000:17.0:1468994866.089880:0:4562:0:(import.c:1290:ptlrpc_connect_interpret()) ffff8803fc39b800 soaked-MDT0003_UUID: changing import state from CONNECTING to DISCONN
            00000100:00080000:17.0:1468994866.089885:0:4562:0:(import.c:1336:ptlrpc_connect_interpret()) recovery of soaked-MDT0003_UUID on 192.168.1.110@o2ib10 failed (-19)
            ...
            00000100:00080000:17.0:1468994891.072459:0:4595:0:(recover.c:62:ptlrpc_initiate_recovery()) soaked-MDT0003_UUID: starting recovery
            00000100:00080000:17.0:1468994891.072461:0:4595:0:(import.c:679:ptlrpc_connect_import()) ffff8803fc39b800 soaked-MDT0003_UUID: changing import state from DISCONN to CONNECTING
            00000100:00080000:17.0:1468994891.072466:0:4595:0:(import.c:523:import_select_connection()) soaked-MDT0003-osp-MDT0002: connect to NID 192.168.1.111@o2ib10 last attempt 4295034989
            00000100:00080000:17.0:1468994891.072470:0:4595:0:(import.c:523:import_select_connection()) soaked-MDT0003-osp-MDT0002: connect to NID 0@lo last attempt 4295060090
            00000100:00080000:17.0:1468994891.072475:0:4595:0:(import.c:567:import_select_connection()) soaked-MDT0003-osp-MDT0002: tried all connections, increasing latency to 6s
            00000100:00080000:17.0:1468994891.072488:0:4595:0:(import.c:593:import_select_connection()) soaked-MDT0003-osp-MDT0002: Connection changing to soaked-MDT0003 (at 192.168.1.111@o2ib10)
            00000100:00080000:17.0:1468994891.072492:0:4595:0:(import.c:601:import_select_connection()) soaked-MDT0003-osp-MDT0002: import ffff8803fc39b800 using connection 192.168.1.111@o2ib10/192.168.1.111@o2ib10
            00000100:02020000:2.0:1468994891.072768:0:4562:0:(client.c:1266:ptlrpc_check_status()) 11-0: soaked-MDT0000-osp-MDT0002: operation mds_connect to node 192.168.1.108@o2ib10 failed: rc = -114
            00000100:00080000:2.0:1468994891.089071:0:4562:0:(import.c:1290:ptlrpc_connect_interpret()) ffff8804256c8000 soaked-MDT0000_UUID: changing import state from CONNECTING to DISCONN
            00000100:00080000:2.0:1468994891.089075:0:4562:0:(import.c:1336:ptlrpc_connect_interpret()) recovery of soaked-MDT0000_UUID on 192.168.1.108@o2ib10 failed (-114)
            00000100:02020000:2.0:1468994891.089104:0:4562:0:(client.c:1266:ptlrpc_check_status()) 11-0: soaked-MDT0003-osp-MDT0002: operation mds_connect to node 192.168.1.111@o2ib10 failed: rc = -114
            00000100:00080000:2.0:1468994891.089108:0:4562:0:(import.c:1290:ptlrpc_connect_interpret()) ffff8803fc39b800 soaked-MDT0003_UUID: changing import state from CONNECTING to DISCONN
            00000100:00080000:2.0:1468994891.089110:0:4562:0:(import.c:1336:ptlrpc_connect_interpret()) recovery of soaked-MDT0003_UUID on 192.168.1.111@o2ib10 failed (-114)
            ...
            

            As shown, at the beginning (timestamp 1468994840), the MDT0002 on lola-10 selected lola-11 (192.168.1.111@o2ib10) for the soaked-MDT0003-osp-MDT0002 connection. Unfortunately, lola-11 returned -114 (EALREADY) to lola-10. That means another instance of soaked-MDT0003-osp-MDT0002 had already connected (or was connecting) to the MDT0003 on lola-11. Because lola-10 and lola-11 are an HA pair, it is quite possible that lola-11 took over lola-10's role and served MDT0002 during the lola-10 failover. Since MDT0002 had been mounted on lola-11, the connection between MDT0002 and MDT0003 had already been established at that time, so it is expected that the MDT0002 on lola-10 got the -114 failure.

            Because lola-10 and lola-11 are an HA pair, on hitting the -114 failure the MDT0002 on lola-10 selected the other configured peer for the soaked-MDT0003-osp-MDT0002 connection, which is lola-10 itself (192.168.1.110@o2ib10, i.e. 0@lo). But MDT0003 was obviously running on lola-11, so the MDT0002 on lola-10 got a -19 (ENODEV) failure.

            Then the MDT0002 had to find another peer for the soaked-MDT0003-osp-MDT0002 connection. There were only two choices, lola-11 or lola-10, so it selected lola-11 again, the same as the first choice, and still hit the -114 failure. It then tried to connect to lola-10 again and failed with -19 again.

            So the MDT0002 on lola-10 tried to connect to MDT0003 repeatedly for a long time (from timestamp 1468994840 to 1469003821, i.e. 8981 seconds) but still could not connect to MDT0003. This connection failure blocked the cross-MDT recovery. From the given logs I cannot find when the soaked-MDT0003-osp-MDT0002 connection was finally established. Because we have no logs from lola-11, we do not know what happened on lola-11, but it is suspected to be related to another MDT0002 instance on lola-11 because of the HA pair configuration. Since the HA is controlled by third-party HA software, such as Heartbeat or Pacemaker, can we check whether it works correctly in the failover case? Can we switch the resource over by force in such a case?
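
            One way to confirm this from the lola-10 side while it is stuck would be to dump the inter-MDT import and recovery state directly; the parameter names below follow the device names in the logs, but their exact availability may vary by release:

            lctl get_param osp.soaked-MDT0003-osp-MDT0002.import
            lctl get_param mdt.soaked-MDT0002.recovery_status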

            Anyway, any additional logs from lola-11 would be helpful. Thanks!


            cliffw Cliff White (Inactive) added a comment -

            We are still seeing the long recovery times, for example:

            lola-10
            ----------------
            mdt.soaked-MDT0002.recovery_status=
            status: COMPLETE
            recovery_start: 1471407102
            recovery_duration: 1644
            completed_clients: 21/21
            replayed_requests: 0
            last_transno: 150323855360
            VBR: ENABLED
            IR: ENABLED
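
            The same status can be collected from all MDS nodes at once with, for example (assuming an 'mds' pdsh group analogous to the 'clients' group used earlier):

            pdsh -g mds 'lctl get_param mdt.*.recovery_status' | dshbak -c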
            

            heckes Frank Heckes (Inactive) added a comment -

            Debug kernel log files with the suffix '_0136' were created during recovery; those with the suffix '_0225' were created after recovery completed.


            People

              yong.fan nasf (Inactive)
              heckes Frank Heckes (Inactive)