[LU-10138] recovery-mds-scale test_failover_ost: test_failover_ost returned 1
Created: 18/Oct/17
Updated: 18/Oct/17

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.11.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: James Casper
Assignee: WC Triage
Resolution: Unresolved
Votes: 0
Labels: None
Environment:

trevis, failover
servers: CentOS7.4, zfs, branch master, v2.10.54, b3652
clients: CentOS7.4, branch master, v2.10.54, b3652


Severity: 3
Rank (Obsolete): 9223372036854775807

Description

https://testing.hpdd.intel.com/test_sessions/46566d21-2975-4c62-8527-438ab3c1663f

A client hit a kernel panic during the second subtest of the failover session. As a result, none of the remaining tests ran.

From client console:

Kernel panic - not syncing: softlockup: hung tasks

dump_stack+0x19/0x1b
panic+0xe8/0x20d
? show_regs+0x5f/0x220
watchdog_timer_fn+0x221/0x230
? watchdog+0x40/0x40
__hrtimer_run_queues+0xd4/0x260
hrtimer_interrupt+0xaf/0x1d0
local_apic_timer_interrupt+0x35/0x60
smp_apic_timer_interrupt+0x3d/0x50
apic_timer_interrupt+0x6d/0x80
? sync_inodes_sb+0x11c/0x1f0
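
For triage, it can help to split each frame of the trace into symbol, offset, and function size, and to flag the frames prefixed with "?", which the kernel unwinder reports as unverified. The following is only a minimal sketch: the helper name parse_trace and the regular expression are assumptions made for illustration and are not part of any Lustre or autotest tooling.

```python
import re

# Call trace copied verbatim from the client console output in this ticket.
TRACE = """\
dump_stack+0x19/0x1b
panic+0xe8/0x20d
? show_regs+0x5f/0x220
watchdog_timer_fn+0x221/0x230
? watchdog+0x40/0x40
__hrtimer_run_queues+0xd4/0x260
hrtimer_interrupt+0xaf/0x1d0
local_apic_timer_interrupt+0x35/0x60
smp_apic_timer_interrupt+0x3d/0x50
apic_timer_interrupt+0x6d/0x80
? sync_inodes_sb+0x11c/0x1f0
"""

# Assumed frame layout: optional "? " marker, then symbol+offset/size in hex.
FRAME_RE = re.compile(
    r"^(?P<unverified>\?\s+)?(?P<symbol>[\w.]+)"
    r"\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)$"
)


def parse_trace(text):
    """Return one dict per recognised frame; unrecognised lines are skipped."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match is None:
            continue
        frames.append(
            {
                "symbol": match.group("symbol"),
                "offset": int(match.group("offset"), 16),
                "size": int(match.group("size"), 16),
                # A leading "?" means the unwinder could not verify the frame.
                "verified": match.group("unverified") is None,
            }
        )
    return frames


frames = parse_trace(TRACE)
for frame in frames:
    marker = " " if frame["verified"] else "?"
    print(f"{marker} {frame['symbol']} +{frame['offset']:#x}/{frame['size']:#x}")
```

In this trace the only frame listed after the timer interrupt entry is sync_inodes_sb, and it is marked unverified, so it should be read as a hint about where the CPU was spinning rather than as a confirmed location.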
