Details
Bug
Resolution: Unresolved
Major
None
Lustre 2.11.0
Soak performance cluster
3
Description
The soak-11 MDT is failed over to soak-10.
The failover completes, and on failback soak-10 attempts to unmount soaked-MDT0003:
2017-08-11 07:04:42,819:fsmgmt.fsmgmt:INFO Failing back soaked-MDT0003 ...
2017-08-11 07:04:42,819:fsmgmt.fsmgmt:INFO Unmounting soaked-MDT0003 on soak-10 ...
soak-10 wedges on the unmount and then crashes; the console log ends with the node booting again:
Aug 11 07:04:42 soak-10 kernel: Lustre: Failing over soaked-MDT0003
Aug 11 07:04:48 soak-10 kernel: LustreError: 22116:0:(osp_precreate.c:619:osp_precreate_send()) soaked-OST0000-osc-MDT0003: can't precreate: rc = -5
Aug 11 07:04:48 soak-10 kernel: LustreError: 22116:0:(osp_precreate.c:1259:osp_precreate_thread()) soaked-OST0000-osc-MDT0003: cannot precreate objects: rc = -5
Aug 11 07:04:48 soak-10 kernel: LustreError: 22118:0:(osp_precreate.c:903:osp_precreate_cleanup_orphans()) soaked-OST0001-osc-MDT0003: cannot cleanup orphans: rc = -5
Aug 11 07:04:48 soak-10 kernel: LustreError: 3751:0:(osp_precreate.c:1311:osp_precreate_ready_condition()) soaked-OST000e-osc-MDT0003: precreate failed opd_pre_status -108
Aug 11 07:04:48 soak-10 kernel: LustreError: 3457:0:(client.c:1166:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff88039b436f00 x1575403953520176/t0(0) o13->soaked-OST0011-osc-MDT0003@192.168.1.107@o2ib:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
Aug 11 07:04:48 soak-10 kernel: Lustre: soaked-MDT0003: Not available for connect from 192.168.1.144@o2ib (stopping)
Aug 11 07:04:48 soak-10 kernel: LustreError: 3460:0:(client.c:1166:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff8803bf7e4200 x1575403953576704/t0(0) o13->soaked-OST0009-osc-MDT0003@192.168.1.105@o2ib:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
Aug 11 07:04:48 soak-10 kernel: LustreError: 3460:0:(client.c:1166:ptlrpc_import_delay_req()) Skipped 1 previous similar message
Aug 11 07:04:48 soak-10 kernel: LustreError: 22162:0:(osp_precreate.c:903:osp_precreate_cleanup_orphans()) soaked-OST0017-osc-MDT0003: cannot cleanup orphans: rc = -5
Aug 11 07:04:48 soak-10 kernel: LustreError: 22162:0:(osp_precreate.c:903:osp_precreate_cleanup_orphans()) Skipped 4 previous similar messages
Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1094:ldlm_resource_complain()) mdt-soaked-MDT0003_UUID: namespace resource [0x2c000040c:0x8907:0x0].0x0 (ffff880765a7d2c0) refcount nonzero (2) after lock cleanup; forcing cleanup.
Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1676:ldlm_resource_dump()) --- Resource: [0x2c000040c:0x8907:0x0].0x0 (ffff880765a7d2c0) refcount = 3
Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1679:ldlm_resource_dump()) Granted locks (in reverse order):
Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1682:ldlm_resource_dump()) ### ### ns: mdt-soaked-MDT0003_UUID lock: ffff88067bc04200/0xe70ae3776920e59b lrc: 2/0,1 mode: CW/CW res: [0x2c000040c:0x8907:0x0].0x0 bits 0x2/0x0 rrc: 4 type: IBT flags: 0x40316400000000 nid: local remote: 0x0 expref: -99 pid: 3751 timeout: 0 lvb_type: 0
Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1676:ldlm_resource_dump()) --- Resource: [0x2c0000bea:0xb74a:0x0].0x0 (ffff8806abf93140) refcount = 3
Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1679:ldlm_resource_dump()) Granted locks (in reverse order):
Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1676:ldlm_resource_dump()) --- Resource: [0x2c0000bea:0xd06a:0x0].0x54cb7170 (ffff8803e4f5f5c0) refcount = 17
Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1679:ldlm_resource_dump()) Granted locks (in reverse order):
Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1697:ldlm_resource_dump()) Waiting locks:
Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1699:ldlm_resource_dump()) ### ### ns: mdt-soaked-MDT0003_UUID lock: ffff88038cd75400/0xe70ae377690c01a9 lrc: 2/0,1 mode: --/PW res: [0x2c0000bea:0xd06a:0x0].0x54cb7170 bits 0x2/0x0 rrc: 18 type: IBT flags: 0x40316400000020 nid: local remote: 0x0 expref: -99 pid: 11146 timeout: 0 lvb_type: 0
Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1676:ldlm_resource_dump()) --- Resource: [0x2c000040c:0x8907:0x0].0x31 (ffff880765a7d500) refcount = 2
Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1679:ldlm_resource_dump()) Granted locks (in reverse order):
Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1676:ldlm_resource_dump()) --- Resource: [0x2c0000bea:0xd06a:0x0].0x0 (ffff88069e2c6900) refcount = 33
Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1679:ldlm_resource_dump()) Granted locks (in reverse order):
Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1676:ldlm_resource_dump()) --- Resource: [0x2c0000bea:0xb74a:0x0].0x7d67b49 (ffff8806abf92900) refcount = 2
Aug 11 07:04:48 soak-10 kernel: LustreError: 22313:0:(ldlm_resource.c:1679:ldlm_resource_dump()) Granted locks (in reverse order):
Aug 11 07:04:48 soak-10 kernel: Lustre: soaked-MDT0003: Not available for connect from 192.168.1.109@o2ib (stopping)
Aug 11 07:04:48 soak-10 kernel: LustreError: 3459:0:(client.c:1166:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff88035e6e4500 x1575403953642112/t0(0) o13->soaked-OST0013-osc-MDT0003@192.168.1.103@o2ib:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
Aug 11 07:04:48 soak-10 kernel: LustreError: 3459:0:(client.c:1166:ptlrpc_import_delay_req()) Skipped 16 previous similar messages
Aug 11 07:04:48 soak-10 kernel: Lustre: soaked-MDT0003: Not available for connect from 192.168.1.122@o2ib (stopping)
Aug 11 07:04:48 soak-10 kernel: Lustre: soaked-MDT0003: Not available for connect from 192.168.1.104@o2ib (stopping)
Aug 11 07:04:48 soak-10 kernel: Lustre: Skipped 14 previous similar messages
Aug 11 07:04:52 soak-10 kernel: Lustre: soaked-MDT0003: Not available for connect from 192.168.1.108@o2ib (stopping)
Aug 11 07:04:52 soak-10 kernel: Lustre: Skipped 8 previous similar messages
Aug 11 07:04:56 soak-10 kernel: LustreError: 11-0: soaked-MDT0003-osp-MDT0002: operation obd_ping to node 0@lo failed: rc = -107
Aug 11 07:04:56 soak-10 kernel: LustreError: Skipped 2 previous similar messages
Aug 11 07:04:56 soak-10 kernel: Lustre: soaked-MDT0003-osp-MDT0002: Connection to soaked-MDT0003 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Aug 11 07:04:56 soak-10 kernel: Lustre: Skipped 3 previous similar messages
Aug 11 07:05:01 soak-10 kernel: Lustre: soaked-MDT0003: Not available for connect from 192.168.1.102@o2ib (stopping)
Aug 11 07:05:01 soak-10 kernel: Lustre: Skipped 15 previous similar messages
Aug 11 07:05:09 soak-10 kernel: LustreError: 0-0: Forced cleanup waiting for mdt-soaked-MDT0003_UUID namespace with 5 resources in use, (rc=-110)
Aug 11 07:05:18 soak-10 kernel: Lustre: soaked-MDT0003: Not available for connect from 172.16.1.45@o2ib1 (stopping)
...
Aug 11 07:05:59 soak-10 kernel: LustreError: 0-0: Forced cleanup waiting for mdt-soaked-MDT0003_UUID namespace with 5 resources in use, (rc=-110)
Aug 11 07:06:00 soak-10 kernel: Lustre: soaked-MDT0003: Not available for connect from 192.168.1.117@o2ib (stopping)
Aug 11 07:06:00 soak-10 kernel: Lustre: Skipped 4 previous similar messages
Aug 11 07:06:12 soak-10 kernel: LustreError: 3706:0:(lod_qos.c:208:lod_statfs_and_check()) soaked-MDT0003-mdtlov: statfs: rc = -108
Aug 11 07:06:12 soak-10 kernel: LustreError: 3706:0:(ldlm_lockd.c:1415:ldlm_handle_enqueue0()) ### lock on destroyed export ffff8803806e9400 ns: mdt-soaked-MDT0003_UUID lock: ffff88037df50400/0xe70ae3776a0f6960 lrc: 3/0,0 mode: CR/CR res: [0x2c0000bda:0x338a:0x0].0x0 bits 0x9/0x9 rrc: 2 type: IBT flags: 0x50200000000000 nid: 192.168.1.131@o2ib remote: 0x26342f5db48b4138 expref: 3 pid: 3706 timeout: 0 lvb_type: 0
Aug 11 07:06:15 soak-10 kernel: LustreError: 3762:0:(lod_qos.c:208:lod_statfs_and_check()) soaked-MDT0003-mdtlov: statfs: rc = -108
Aug 11 07:06:15 soak-10 kernel: LustreError: 3762:0:(lod_qos.c:208:lod_statfs_and_check()) Skipped 44 previous similar messages
Aug 11 07:06:23 soak-10 kernel: LustreError: 3751:0:(lod_qos.c:208:lod_statfs_and_check()) soaked-MDT0003-mdtlov: statfs: rc = -108
Aug 11 07:06:23 soak-10 kernel: LustreError: 3751:0:(lod_qos.c:208:lod_statfs_and_check()) Skipped 46 previous similar messages
Aug 11 07:06:25 soak-10 kernel: LustreError: 3408:0:(osp_dev.c:1276:osp_device_free()) header@ffff8803ddfc5bd8[0x0, 1, [0x200000007:0x1:0x0] hash exist]{
Aug 11 07:06:25 soak-10 kernel: LustreError: 3408:0:(osp_dev.c:1276:osp_device_free()) ....mdt@ffff8803ddfc5c28mdt-object@ffff8803ddfc5bd8( , writecount=0)
Aug 11 07:06:25 soak-10 kernel: LustreError: 3408:0:(osp_dev.c:1276:osp_device_free()) ....mdd@ffff8803a16d3870mdd-object@ffff8803a16d3870(open_count=0, valid=0, cltime=0, flags=0)
Aug 11 07:06:25 soak-10 kernel: LustreError: 3408:0:(osp_dev.c:1276:osp_device_free()) ....lod@ffff8806a13e4208lod-object@ffff8806a13e4208
Aug 11 07:06:25 soak-10 kernel: LustreError: 3408:0:(osp_dev.c:1276:osp_device_free()) ....osp@ffff88038baf4910osp-object@ffff88038baf48c0
Aug 11 07:06:26 soak-10 kernel: LustreError: 3408:0:(osp_dev.c:1276:osp_device_free()) } header@ffff8803ddfc5bd8
Aug 11 07:10:33 soak-10 rsyslogd: [origin software="rsyslogd" swVersion="7.4.7" x-pid="1691" x-info="http://www.rsyslog.com"] start
Aug 11 07:09:59 soak-10 kernel: microcode: microcode updated early to revision 0x710, date = 2013-06-17
Aug 11 07:09:59 soak-10 kernel: Initializing cgroup subsys cpuset
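The rc values in the log above are negated Linux errno codes: -5 is EIO, -107 is ENOTCONN, -108 is ESHUTDOWN, and -110 is ETIMEDOUT. As a reading aid, here is a minimal sketch that tallies them from console output. The names `LINUX_ERRNO`, `RC_PATTERN`, and `summarize_return_codes` are hypothetical helpers made up for this illustration, not part of Lustre or the soak framework, and the lookup table covers only the codes seen in this report.

```python
import re
from collections import Counter

# Linux errno values that appear in this report (hard-coded so the result
# does not depend on the platform running the script).
LINUX_ERRNO = {5: "EIO", 107: "ENOTCONN", 108: "ESHUTDOWN", 110: "ETIMEDOUT"}

# Matches "rc = -5", "(rc=-110)" and "opd_pre_status -108".
# It deliberately ignores the "rc 0/-1" field of ptlrpc request dumps.
RC_PATTERN = re.compile(r"(?:\brc\s*=\s*|opd_pre_status\s+)-(\d+)")


def summarize_return_codes(log_text):
    """Count negative return codes in a console log, keyed by errno name."""
    counts = Counter()
    for match in RC_PATTERN.finditer(log_text):
        code = int(match.group(1))
        counts[LINUX_ERRNO.get(code, "errno %d" % code)] += 1
    return counts


# A few lines copied from the log above (timestamps and host trimmed).
sample = "\n".join([
    "LustreError: 22116:0:(osp_precreate.c:619:osp_precreate_send()) "
    "soaked-OST0000-osc-MDT0003: can't precreate: rc = -5",
    "LustreError: 3751:0:(osp_precreate.c:1311:osp_precreate_ready_condition()) "
    "soaked-OST000e-osc-MDT0003: precreate failed opd_pre_status -108",
    "LustreError: 11-0: soaked-MDT0003-osp-MDT0002: operation obd_ping "
    "to node 0@lo failed: rc = -107",
    "LustreError: 0-0: Forced cleanup waiting for mdt-soaked-MDT0003_UUID "
    "namespace with 5 resources in use, (rc=-110)",
    "lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1",
])

for name, count in sorted(summarize_return_codes(sample).items()):
    print(name, count)
```

Run against the full console log, a tally like this makes the sequence easier to follow: I/O errors and shutdown errors while the OSP devices are torn down, then repeated timeouts while the forced namespace cleanup waits on resources still in use.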