Details
-
Bug
-
Resolution: Duplicate
-
Blocker
-
None
-
lola
build: tip of master(df6cf859bbb29392064e6ddb701f3357e01b3a13) + patches
-
3
Description
The error occurred during soak testing of build '20151113' (see https://wiki.hpdd.intel.com/pages/viewpage.action?title=Soak+Testing+on+Lola&spaceKey=Releases#SoakTestingonLola-20151113) and had already occurred earlier when testing build '20151109'.
DNE is enabled. OSTs were formatted using ZFS, MDTs using ldiskfs. MDS nodes are configured for HA active-active failover.
At the following moments in time:
date | node | build ID | soak event |
---|---|---|---|
Nov 9 18:10:01 | lola-9 | 20151109 | no fault; only job execution |
Nov 13 14:30:02 | lola-10 | 20151113 | during stopping of soak |
Nov 14 05:35:01 | lola-11 | 20151113 | no fault; only job execution |
Nov 14 05:45:01 | lola-9 | 20151113 | no fault; only job execution |
the OOM killer was invoked on the nodes specified. (All events happened at times when no fault was injected.)
Attached files: console and syslog of nodes affected.
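To locate the OOM events in the attached syslogs, something along these lines may help. This is a minimal sketch only: it assumes classic syslog lines of the form "<date> <host> kernel: <task> invoked oom-killer: ...", and the function name, the sample lines, and the task name `example_task` are hypothetical, not taken from the attached logs.

```python
import re

# Assumed line layout: "<Mon> <day> <HH:MM:SS> <host> kernel: ... <task> invoked oom-killer"
OOM_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) kernel: .*?(?P<task>\S+) invoked oom-killer"
)


def find_oom_events(lines):
    """Return (timestamp, host, task) for every OOM-killer invocation found."""
    events = []
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            # Collapse the padding syslog puts in front of single-digit days.
            stamp = " ".join(match.group("stamp").split())
            events.append((stamp, match.group("host"), match.group("task")))
    return events


# Hypothetical sample lines, for illustration only.
SAMPLE = [
    "Nov  9 18:10:01 lola-9 kernel: example_task invoked oom-killer: "
    "gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0",
    "Nov  9 18:10:02 lola-9 kernel: some unrelated message",
]

if __name__ == "__main__":
    for event in find_oom_events(SAMPLE):
        print(event)
```

For a real log, the lines of the file can be passed in directly, e.g. `find_oom_events(open(path))`, where `path` is the location of one of the attached syslog files.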
Unfortunately, collectl wasn't running to gather performance counters.
The tool has now been enabled on all soak nodes so that memory statistics, especially slab stats, can be gathered during one of the next sessions.
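Independently of collectl, slab usage can also be snapshotted straight from the kernel. The sketch below is an illustration only: it assumes the version 2.1 layout of /proc/slabinfo (name, active_objs, num_objs, objsize, ...) and approximates each cache's footprint as num_objs * objsize. The function name and the sample cache names are hypothetical.

```python
def top_slab_caches(slabinfo_text, n=5):
    """Return the n largest slab caches as (name, approx_bytes) pairs."""
    usage = []
    for line in slabinfo_text.splitlines():
        # Skip blank lines, the version banner, and the column header.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        name = fields[0]
        num_objs = int(fields[2])  # total objects allocated in the cache
        objsize = int(fields[3])   # size of one object in bytes
        usage.append((name, num_objs * objsize))
    usage.sort(key=lambda item: item[1], reverse=True)
    return usage[:n]


# Hypothetical snapshot, for illustration only.
SAMPLE = """slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
cache_a 900 1000 512 8 1 : tunables 54 27 8 : slabdata 125 125 0
cache_b 10 20 4096 1 1 : tunables 24 12 8 : slabdata 20 20 0
cache_c 5000 5000 64 59 1 : tunables 120 60 8 : slabdata 85 85 0
"""

if __name__ == "__main__":
    for name, approx_bytes in top_slab_caches(SAMPLE, n=2):
        print(name, approx_bytes)
```

On a live node the text would come from reading /proc/slabinfo (root is usually required); taking snapshots at intervals during a soak session would show which caches grow before the OOM killer fires.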
Attachments
Issue Links
- is related to: LU-7455 Tracking tickets to make DNE pass soak-test. (Resolved)