Type: Bug
Resolution: Not a Bug
Priority: Blocker
Fix Version/s: Lustre 2.9.0
Affects Version/s: Lustre 2.8.0, Lustre 2.9.0
Labels:
- soak
Environment:
lola
build: 2.8 GA + patches

Severity:
3
Rank (Obsolete):
9223372036854775807

Description

The error occurred during soak testing of build '20160324' (see https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160324). DNE is enabled. MDTs were formatted with ldiskfs, OSTs with zfs. MDS and OSS nodes are configured in an HA active-active failover configuration. Each MDS node serves 1 MDT, and each OSS node serves 4 OSTs.
Nodes lola-4 and lola-5 form a HA cluster.

Event history

2016-03-29 07:48:04,307:fsmgmt.fsmgmt:INFO triggering fault oss_failover of node lola-4
power cycle node
2016-03-29 07:52:27,876:fsmgmt.fsmgmt:INFO lola-4 is up
2016-03-29 07:53:09,557:fsmgmt.fsmgmt:INFO zpool import and mount of lola-4's OSTs complete
Recovery does not complete even after recovery_time has reached zero
2016-03-29 08:03 (approximately) Aborted recovery manually (lctl --device ... abort_recovery)
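
To make the timeline easier to read, the timestamped fsmgmt lines above can be converted into seconds elapsed since the fault was triggered. The following is a minimal sketch only: the helper names `parse_event` and `elapsed_since_fault` are hypothetical, and it assumes every line starts with a 23-character `YYYY-MM-DD HH:MM:SS,mmm` timestamp followed by an `INFO` message, as in the three lines quoted in the event history.

```python
from datetime import datetime

# Log lines copied verbatim from the event history above.
EVENTS = [
    "2016-03-29 07:48:04,307:fsmgmt.fsmgmt:INFO triggering fault oss_failover of node lola-4",
    "2016-03-29 07:52:27,876:fsmgmt.fsmgmt:INFO lola-4 is up",
    "2016-03-29 07:53:09,557:fsmgmt.fsmgmt:INFO zpool import and mount of lola-4's OSTs complete",
]


def parse_event(line):
    """Split one log line into (timestamp, message). Hypothetical helper."""
    stamp = line[:23]  # assumed fixed-width "YYYY-MM-DD HH:MM:SS,mmm"
    message = line.split("INFO", 1)[1].strip()
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f"), message


def elapsed_since_fault(lines):
    """Return (seconds since the first event, message) for each line."""
    parsed = [parse_event(line) for line in lines]
    start = parsed[0][0]
    return [((ts - start).total_seconds(), msg) for ts, msg in parsed]


if __name__ == "__main__":
    for seconds, message in elapsed_since_fault(EVENTS):
        print(f"+{seconds:8.3f}s  {message}")
```

The manual abort at about 08:03 has no precise timestamp in the report, so it is deliberately left out of the input.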

Attached files:
messages, console and debug log before (lustre-log-20160329-0759-recovery-stalled) and after recovery was aborted (lustre-log-20160329-0803-recovery-aborted)

- console-lola-5.log.bz2 (40 kB, 29/Mar/16 3:59 PM)
- console-lola-5.log-20160722.bz2 (59 kB, 22/Jul/16 12:24 PM)
- lola-4-console-20160415.bz2 (95 kB, 15/Apr/16 12:21 PM)
- lola-4-lustre-log-20160415-0420.bz2 (0.3 kB, 15/Apr/16 12:30 PM)
- lola-4-lustre-log-20160415-0430.bz2 (936 kB, 15/Apr/16 12:23 PM)
- lola-4-messages-20160415.bz2 (215 kB, 15/Apr/16 12:21 PM)
- lustre-log-20160329-0759-recovery-stalled.bz2 (0.3 kB, 30/Mar/16 8:49 AM)
- lustre-log-20160329-0803-recovery-aborted.bz2 (8 kB, 29/Mar/16 3:59 PM)
- lustre-log-lola-30-2016-07-25_0855-ost-recovery-stalled.bz2 (0.3 kB, 26/Jul/16 5:39 AM)
- lustre-log-lola-30-2016-07-25_0859-after-ost-recovery-aborted.bz2 (0.3 kB, 26/Jul/16 5:39 AM)
- lustre-log-lola-3-2016-07-25_0855-ost-recovery-stalled.bz2 (0.3 kB, 25/Jul/16 6:38 PM)
- lustre-log-lola-3-2016-07-25_0859-after-ost-recovery-aborted.bz2 (0.3 kB, 26/Jul/16 5:39 AM)
- lustre-log-lola-5-20160722_0434_ost_recovery_stalled.bz2 (0.3 kB, 22/Jul/16 12:24 PM)
- lustre-log-lola-5-20160722_0438_after_ost_recovery_aborted.bz2 (176 kB, 22/Jul/16 12:24 PM)
- lustre-log-lola-8-2016-07-25_0855-ost-recovery-stalled.bz2 (0.3 kB, 25/Jul/16 6:38 PM)
- lustre-log-lola-8-2016-07-25_0859-after-ost-recovery-aborted.bz2 (0.3 kB, 25/Jul/16 6:38 PM)
- messages-lola-5.log.bz2 (49 kB, 29/Mar/16 3:59 PM)
- messages-lola-5.log-20160722.bz2 (220 kB, 22/Jul/16 12:24 PM)
- ost-failover-setttings-20160726_0702 (16 kB, 26/Jul/16 3:38 PM)

Assignee: nasf (Inactive)
Reporter: Frank Heckes (Inactive)
Votes: 0
Watchers: 6
Created: 29/Mar/16 3:53 PM
Updated: 27/Jul/16 1:12 PM
Resolved: 27/Jul/16 1:12 PM