Details
Type: Bug
Resolution: Not a Bug
Priority: Blocker
Lustre 2.8.0, Lustre 2.9.0
lola
build: 2.8 GA + patches
3
9223372036854775807
Description
The error occurred during soak testing of build '20160324' (see https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160324). DNE is enabled. MDTs were formatted with ldiskfs, OSTs with zfs. MDS and OSS nodes are configured in an HA active-active failover configuration. Each MDS node operated with 1 MDT, while each OSS node ran 4 OSTs.
Nodes lola-4 and lola-5 form an HA cluster.
Event history
- 2016-03-29 07:48:04,307:fsmgmt.fsmgmt:INFO triggering fault oss_failover of node lola-4
  (the node was power-cycled)
- 2016-03-29 07:52:27,876:fsmgmt.fsmgmt:INFO lola-4 is up
- 2016-03-29 07:53:09,557:fsmgmt.fsmgmt:INFO zpool import and mount of lola-4's OSTs complete
- Recovery did not complete, even after recovery_time had counted down to zero
- 2016-03-29 08:03 (approximately): recovery was aborted manually (lctl --device ... abort_recovery)
Attached files:
messages, console and debug logs taken before recovery was aborted (lustre-log-20160329-0759-recovery-stalled) and after it was aborted (lustre-log-20160329-0803-recovery-aborted)
Activity
Resolution | New: Not a Bug [ 6 ] |
Status | Original: In Progress [ 3 ] | New: Resolved [ 5 ] |
Attachment | New: ost-failover-setttings-20160726_0702 [ 22357 ] |
Attachment | New: lustre-log-lola-3-2016-07-25_0859-after-ost-recovery-aborted.bz2 [ 22352 ] |
Attachment | New: lustre-log-lola-30-2016-07-25_0855-ost-recovery-stalled.bz2 [ 22353 ] |
Attachment | New: lustre-log-lola-30-2016-07-25_0859-after-ost-recovery-aborted.bz2 [ 22354 ] |
Attachment | New: lustre-log-lola-8-2016-07-25_0855-ost-recovery-stalled.bz2 [ 22342 ] |
Attachment | New: lustre-log-lola-8-2016-07-25_0859-after-ost-recovery-aborted.bz2 [ 22343 ] |
Attachment | New: lustre-log-lola-3-2016-07-25_0855-ost-recovery-stalled.bz2 [ 22344 ] |
Status | Original: Open [ 1 ] | New: In Progress [ 3 ] |
Attachment | New: messages-lola-5.log-20160722.bz2 [ 22313 ] |
Attachment | New: console-lola-5.log-20160722.bz2 [ 22314 ] |
Attachment | New: lustre-log-lola-5-20160722_0434_ost_recovery_stalled.bz2 [ 22315 ] |
Attachment | New: lustre-log-lola-5-20160722_0438_after_ost_recovery_aborted.bz2 [ 22316 ] |
Assignee | Original: WC Triage [ wc-triage ] | New: nasf [ yong.fan ] |
Link | New: This issue is related to LDEV-398 [ LDEV-398 ] |
Fix Version/s | New: Lustre 2.9.0 [ 11891 ] |
Attachment | New: lola-4-lustre-log-20160415-0420.bz2 [ 21152 ] |