[LU-3733] Test failure on test suite conf-sanity Created: 10/Aug/13  Updated: 22/Dec/17  Resolved: 22/Dec/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Maloo Assignee: WC Triage
Resolution: Cannot Reproduce Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9633

 Description   

This issue was created by maloo for Swapnil Pimpale <spimpale@ddn.com>

This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/b4c585da-00c7-11e3-b06c-52540035b04c.

The OST dmesg shows the following error:

LustreError: 166-1: MGC10.10.17.15@tcp: Connection to MGS (at 10.10.17.15@tcp) was lost; in progress operations using this service will fail
LustreError: 137-5: lustre-OST0000_UUID: not available for connect from 10.10.17.13@tcp (no target)
LustreError: Skipped 620 previous similar messages
INFO: task ldiskfslazyinit:2059 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.



 Comments   
Comment by Swapnil Pimpale (Inactive) [ 10/Aug/13 ]

The failed subtest is test_76.

Comment by Oleg Drokin [ 14/Aug/13 ]

There's a sign of a deadlock on the OST side: unmount and lazyinit are both showing signs of a transaction being held open by something, and there's a parallel unmount that they might be racing with (one that holds the transaction?) that waits for everything to shut down.

Comment by Andreas Dilger [ 22/Dec/17 ]

Close old bug that has not been hit in a long time.
