Lustre / LU-9867

MDT crashes on failback, attempting to umount

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.11.0
    • Environment: Soak performance cluster
    • Severity: 3

    Description

      The MDT on soak-11 (soaked-MDT0003) is failed over to soak-10.
      Failover completes; on failback, soak-10 attempts to unmount the MDT:

      2017-08-11 07:04:42,819:fsmgmt.fsmgmt:INFO     Failing back soaked-MDT0003 ...
      2017-08-11 07:04:42,819:fsmgmt.fsmgmt:INFO     Unmounting soaked-MDT0003 on soak-10 ...
      

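      soak-10 wedges on the unmount and then crashes. The last message before the crash is logged at 07:06:26, and the node is booting again by 07:09:59: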

      Aug 11 07:04:42 soak-10 kernel: Lustre: Failing over soaked-MDT0003
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22116:0:(osp_precreate.c:619:osp_precreate_send()) soaked-OST0000-osc-MDT0003: can't precreate: rc = -5
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22116:0:(osp_precreate.c:1259:osp_precreate_thread()) soaked-OST0000-osc-MDT0003: cannot precreate objects: rc = -5
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22118:0:(osp_precreate.c:903:osp_precreate_cleanup_orphans()) soaked-OST0001-osc-MDT0003: cannot cleanup orphans: rc = -5
      Aug 11 07:04:48 soak-10 kernel: LustreError: 3751:0:(osp_precreate.c:1311:osp_precreate_ready_condition()) soaked-OST000e-osc-MDT0003: precreate failed opd_pre_status -108
      Aug 11 07:04:48 soak-10 kernel: LustreError: 3457:0:(client.c:1166:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff88039b436f00 x1575403953520176/t0(0) o13->soaked-OST0011-osc-MDT0003@192.168.1.107@o2ib:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
      Aug 11 07:04:48 soak-10 kernel: Lustre: soaked-MDT0003: Not available for connect from 192.168.1.144@o2ib (stopping)
      Aug 11 07:04:48 soak-10 kernel: LustreError: 3460:0:(client.c:1166:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff8803bf7e4200 x1575403953576704/t0(0) o13->soaked-OST0009-osc-MDT0003@192.168.1.105@o2ib:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
      Aug 11 07:04:48 soak-10 kernel: LustreError: 3460:0:(client.c:1166:ptlrpc_import_delay_req()) Skipped 1 previous similar message
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22162:0:(osp_precreate.c:903:osp_precreate_cleanup_orphans()) soaked-OST0017-osc-MDT0003: cannot cleanup orphans: rc = -5
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22162:0:(osp_precreate.c:903:osp_precreate_cleanup_orphans()) Skipped 4 previous similar messages
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1094:ldlm_resource_complain()) mdt-soaked-MDT0003_UUID: namespace resource [0x2c000040c:0x8907:0x0].0x0 (ffff880765a7d2c0) refcount nonzero (2) after lock cleanup; forcing cleanup.
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1676:ldlm_resource_dump()) --- Resource: [0x2c000040c:0x8907:0x0].0x0 (ffff880765a7d2c0) refcount = 3
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1679:ldlm_resource_dump()) Granted locks (in reverse order):
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1682:ldlm_resource_dump()) ### ### ns: mdt-soaked-MDT0003_UUID lock: ffff88067bc04200/0xe70ae3776920e59b lrc: 2/0,1 mode: CW/CW res: [0x2c000040c:0x8907:0x0].0x0 bits 0x2/0x0 rrc: 4 type: IBT flags: 0x40316400000000 nid: local remote: 0x0 expref: -99 pid: 3751 timeout: 0 lvb_type: 0
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1676:ldlm_resource_dump()) --- Resource: [0x2c0000bea:0xb74a:0x0].0x0 (ffff8806abf93140) refcount = 3
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1679:ldlm_resource_dump()) Granted locks (in reverse order):
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1676:ldlm_resource_dump()) --- Resource: [0x2c0000bea:0xd06a:0x0].0x54cb7170 (ffff8803e4f5f5c0) refcount = 17
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1679:ldlm_resource_dump()) Granted locks (in reverse order):
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1697:ldlm_resource_dump()) Waiting locks:
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1699:ldlm_resource_dump()) ### ### ns: mdt-soaked-MDT0003_UUID lock: ffff88038cd75400/0xe70ae377690c01a9 lrc: 2/0,1 mode: --/PW res: [0x2c0000bea:0xd06a:0x0].0x54cb7170 bits 0x2/0x0 rrc: 18 type: IBT flags: 0x40316400000020 nid: local remote: 0x0 expref: -99 pid: 11146 timeout: 0 lvb_type: 0
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1676:ldlm_resource_dump()) --- Resource: [0x2c000040c:0x8907:0x0].0x31 (ffff880765a7d500) refcount = 2
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1679:ldlm_resource_dump()) Granted locks (in reverse order):
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1676:ldlm_resource_dump()) --- Resource: [0x2c0000bea:0xd06a:0x0].0x0 (ffff88069e2c6900) refcount = 33
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1679:ldlm_resource_dump()) Granted locks (in reverse order):
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1676:ldlm_resource_dump()) --- Resource: [0x2c0000bea:0xb74a:0x0].0x7d67b49 (ffff8806abf92900) refcount = 2
      Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1679:ldlm_resource_dump()) Granted locks (in reverse order):
      Aug 11 07:04:48 soak-10 kernel: Lustre: soaked-MDT0003: Not available for connect from 192.168.1.109@o2ib (stopping)
      Aug 11 07:04:48 soak-10 kernel: LustreError: 3459:0:(client.c:1166:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff88035e6e4500 x1575403953642112/t0(0) o13->soaked-OST0013-osc-MDT0003@192.168.1.103@o2ib:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
      Aug 11 07:04:48 soak-10 kernel: LustreError: 3459:0:(client.c:1166:ptlrpc_import_delay_req()) Skipped 16 previous similar messages
      Aug 11 07:04:48 soak-10 kernel: Lustre: soaked-MDT0003: Not available for connect from 192.168.1.122@o2ib (stopping)
      Aug 11 07:04:48 soak-10 kernel: Lustre: soaked-MDT0003: Not available for connect from 192.168.1.104@o2ib (stopping)
      Aug 11 07:04:48 soak-10 kernel: Lustre: Skipped 14 previous similar messages
      Aug 11 07:04:52 soak-10 kernel: Lustre: soaked-MDT0003: Not available for connect from 192.168.1.108@o2ib (stopping)
      Aug 11 07:04:52 soak-10 kernel: Lustre: Skipped 8 previous similar messages
      Aug 11 07:04:56 soak-10 kernel: LustreError: 11-0: soaked-MDT0003-osp-MDT0002: operation obd_ping to node 0@lo failed: rc = -107
      Aug 11 07:04:56 soak-10 kernel: LustreError: Skipped 2 previous similar messages
      Aug 11 07:04:56 soak-10 kernel: Lustre: soaked-MDT0003-osp-MDT0002: Connection to soaked-MDT0003 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
      Aug 11 07:04:56 soak-10 kernel: Lustre: Skipped 3 previous similar messages
      Aug 11 07:05:01 soak-10 kernel: Lustre: soaked-MDT0003: Not available for connect from 192.168.1.102@o2ib (stopping)
      Aug 11 07:05:01 soak-10 kernel: Lustre: Skipped 15 previous similar messages
      Aug 11 07:05:09 soak-10 kernel: LustreError: 0-0: Forced cleanup waiting for mdt-soaked-MDT0003_UUID namespace with 5 resources in use, (rc=-110)
      Aug 11 07:05:18 soak-10 kernel: Lustre: soaked-MDT0003: Not available for connect from 172.16.1.45@o2ib1 (stopping)
      ...
      Aug 11 07:05:59 soak-10 kernel: LustreError: 0-0: Forced cleanup waiting for mdt-soaked-MDT0003_UUID namespace with 5 resources in use, (rc=-110)
      Aug 11 07:06:00 soak-10 kernel: Lustre: soaked-MDT0003: Not available for connect from 192.168.1.117@o2ib (stopping)
      Aug 11 07:06:00 soak-10 kernel: Lustre: Skipped 4 previous similar messages
      Aug 11 07:06:12 soak-10 kernel: LustreError: 3706:0:(lod_qos.c:208:lod_statfs_and_check()) soaked-MDT0003-mdtlov: statfs: rc = -108
      Aug 11 07:06:12 soak-10 kernel: LustreError: 3706:0:(ldlm_lockd.c:1415:ldlm_handle_enqueue0()) ### lock on destroyed export ffff8803806e9400 ns: mdt-soaked-MDT0003_UUID lock: ffff88037df50400/0xe70ae3776a0f6960 lrc: 3/0,0 mode: CR/CR res: [0x2c0000bda:0x338a:0x0].0x0 bits 0x9/0x9 rrc: 2 type: IBT flags: 0x50200000000000 nid: 192.168.1.131@o2ib remote: 0x26342f5db48b4138 expref: 3 pid: 3706 timeout: 0 lvb_type: 0
      Aug 11 07:06:15 soak-10 kernel: LustreError: 3762:0:(lod_qos.c:208:lod_statfs_and_check()) soaked-MDT0003-mdtlov: statfs: rc = -108
      Aug 11 07:06:15 soak-10 kernel: LustreError: 3762:0:(lod_qos.c:208:lod_statfs_and_check()) Skipped 44 previous similar messages
      Aug 11 07:06:23 soak-10 kernel: LustreError: 3751:0:(lod_qos.c:208:lod_statfs_and_check()) soaked-MDT0003-mdtlov: statfs: rc = -108
      Aug 11 07:06:23 soak-10 kernel: LustreError: 3751:0:(lod_qos.c:208:lod_statfs_and_check()) Skipped 46 previous similar messages
      Aug 11 07:06:25 soak-10 kernel: LustreError: 3408:0:(osp_dev.c:1276:osp_device_free()) header@ffff8803ddfc5bd8[0x0, 1, [0x200000007:0x1:0x0] hash exist]{
      Aug 11 07:06:25 soak-10 kernel: LustreError: 3408:0:(osp_dev.c:1276:osp_device_free()) ....mdt@ffff8803ddfc5c28mdt-object@ffff8803ddfc5bd8( , writecount=0)
      Aug 11 07:06:25 soak-10 kernel: LustreError: 3408:0:(osp_dev.c:1276:osp_device_free()) ....mdd@ffff8803a16d3870mdd-object@ffff8803a16d3870(open_count=0, valid=0, cltime=0, flags=0)
      Aug 11 07:06:25 soak-10 kernel: LustreError: 3408:0:(osp_dev.c:1276:osp_device_free()) ....lod@ffff8806a13e4208lod-object@ffff8806a13e4208
      Aug 11 07:06:25 soak-10 kernel: LustreError: 3408:0:(osp_dev.c:1276:osp_device_free()) ....osp@ffff88038baf4910osp-object@ffff88038baf48c0
      Aug 11 07:06:26 soak-10 kernel: LustreError: 3408:0:(osp_dev.c:1276:osp_device_free()) } header@ffff8803ddfc5bd8
      Aug 11 07:10:33 soak-10 rsyslogd: [origin software="rsyslogd" swVersion="7.4.7" x-pid="1691" x-info="http://www.rsyslog.com"] start
      Aug 11 07:09:59 soak-10 kernel: microcode: microcode updated early to revision 0x710, date = 2013-06-17
      Aug 11 07:09:59 soak-10 kernel: Initializing cgroup subsys cpuset
      
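The return codes in the log above are negated Linux errno values. As a reading aid, the sketch below pulls them out of a few message excerpts and names them. The errno table is hard-coded from the standard Linux numbering so the result does not depend on the machine running the script; the helper names (`extract_return_codes`, `describe`) are illustrative and not part of Lustre.

```python
import re

# Linux errno values (asm-generic numbering) for the codes seen in this log.
# Hard-coded so the lookup does not depend on the platform running the script.
LINUX_ERRNO = {
    5: ("EIO", "Input/output error"),
    107: ("ENOTCONN", "Transport endpoint is not connected"),
    108: ("ESHUTDOWN", "Cannot send after transport endpoint shutdown"),
    110: ("ETIMEDOUT", "Connection timed out"),
}

# Matches the three spellings used in the log:
# "rc = -5", "rc=-110" and "opd_pre_status -108".
RC_RE = re.compile(r"(?:rc\s*=\s*|opd_pre_status\s+)(-\d+)")


def extract_return_codes(line):
    """Return every negative return code found in one log line."""
    return [int(value) for value in RC_RE.findall(line)]


def describe(rc):
    """Name a negated errno value, e.g. -5 -> EIO."""
    name, text = LINUX_ERRNO.get(-rc, ("UNKNOWN", "not in table"))
    return f"rc = {rc}: {name} ({text})"


# Message excerpts copied from the log above (timestamps and PIDs trimmed).
SAMPLE_LINES = [
    "soaked-OST0000-osc-MDT0003: can't precreate: rc = -5",
    "soaked-OST000e-osc-MDT0003: precreate failed opd_pre_status -108",
    "soaked-MDT0003-osp-MDT0002: operation obd_ping to node 0@lo failed: rc = -107",
    "Forced cleanup waiting for mdt-soaked-MDT0003_UUID namespace "
    "with 5 resources in use, (rc=-110)",
]

if __name__ == "__main__":
    for line in SAMPLE_LINES:
        for rc in extract_return_codes(line):
            print(describe(rc))
```

Read this way, the repeated `rc=-110` from the forced namespace cleanup is a timeout, which matches the cleanup waiting on resources that are still referenced.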

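The `ldlm_resource_dump()` lines in the log list the namespace resources that still hold references at unmount. A minimal sketch for tabulating them, assuming the `--- Resource: ... refcount = N` line format shown above stays the same; the regular expression and the `resource_refcounts` helper are illustrative, not a Lustre tool.

```python
import re

# Matches lines such as:
# --- Resource: [0x2c000040c:0x8907:0x0].0x0 (ffff880765a7d2c0) refcount = 3
RESOURCE_RE = re.compile(
    r"--- Resource: (?P<res>\[[0-9a-fx:]+\]\.0x[0-9a-f]+) "
    r"\((?P<addr>[0-9a-f]+)\) refcount = (?P<refcount>\d+)"
)


def resource_refcounts(lines):
    """Return (resource, refcount) pairs, highest refcount first."""
    found = []
    for line in lines:
        match = RESOURCE_RE.search(line)
        if match:
            found.append((match.group("res"), int(match.group("refcount"))))
    # sorted() is stable, so equal refcounts keep their order from the dump.
    return sorted(found, key=lambda pair: pair[1], reverse=True)


# Message excerpts copied from the log above (timestamps and PIDs trimmed).
SAMPLE_LINES = [
    "--- Resource: [0x2c000040c:0x8907:0x0].0x0 (ffff880765a7d2c0) refcount = 3",
    "Granted locks (in reverse order):",
    "--- Resource: [0x2c0000bea:0xd06a:0x0].0x54cb7170 (ffff8803e4f5f5c0) refcount = 17",
    "--- Resource: [0x2c0000bea:0xd06a:0x0].0x0 (ffff88069e2c6900) refcount = 33",
]

if __name__ == "__main__":
    for resource, refcount in resource_refcounts(SAMPLE_LINES):
        print(f"{refcount:4d}  {resource}")
```

Sorting by refcount puts the `[0x2c0000bea:0xd06a:0x0]` resources first, which is a reasonable place to start looking for whoever still holds the references.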
          People

            Assignee: Lai Siyao (laisiyao)
            Reporter: Cliff White (cliffw, Inactive)
            Votes: 0
            Watchers: 2
