[LU-479] Test failure on test suite sanity, subtest test_124a Created: 03/Jul/11  Updated: 28/May/17  Resolved: 28/May/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: Lai Siyao
Resolution: Cannot Reproduce
Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 4965

 Description   

This issue was created by maloo for yujian <yujian@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/282aae9a-a451-11e0-b786-52540025f9af.

The sub-test test_124a failed with the following error:

No locks dropped in 50s. LRU size: 2003
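
The error message suggests a check that polls the client lock LRU size and fails if it never shrinks within the timeout. The following is only a rough sketch of that kind of check, not the code in sanity.sh: `wait_for_lock_drop`, `failure_message`, the one-second poll interval, and the injected `read_lru_size` callback are all hypothetical names and assumptions chosen for illustration.

```python
import time


def wait_for_lock_drop(read_lru_size, timeout_s=50, poll_s=1, sleep=time.sleep):
    """Poll the LRU size until it falls below its starting value.

    Returns (dropped, last_size). `read_lru_size` is a caller-supplied
    function (hypothetical here) that returns the current LRU size.
    """
    start = read_lru_size()
    size = start
    waited = 0
    while waited < timeout_s:
        sleep(poll_s)
        waited += poll_s
        size = read_lru_size()
        if size < start:
            return True, size
    return False, size


def failure_message(timeout_s, size):
    """Format a message in the same shape as the one reported above."""
    return f"No locks dropped in {timeout_s}s. LRU size: {size}"


if __name__ == "__main__":
    # Simulate an LRU that never shrinks, without really sleeping.
    dropped, size = wait_for_lock_drop(lambda: 2003, sleep=lambda s: None)
    if not dropped:
        print(failure_message(50, size))
```

Injecting the reader and the sleep function keeps the sketch testable without a Lustre client or a real 50-second wait.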



 Comments   
Comment by Jian Yu [ 04/Jul/11 ]

More failure instances:
https://maloo.whamcloud.com/test_sets/4b97a594-a448-11e0-b786-52540025f9af
https://maloo.whamcloud.com/test_sets/c597e884-a44f-11e0-b786-52540025f9af

Comment by Lai Siyao [ 04/Jul/11 ]

The Maloo test logs show:
LRU=2003
LVF=0

Here LVF should be a large number (around 10000), but it is 0. I will print more tunables in the script to find out why.
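
As a rough illustration of spotting this condition in a saved test log, here is a minimal sketch. It assumes the log contains `NAME=value` lines in the form quoted above; the function names and the rule that a zero or negative value is suspicious are assumptions for illustration, not part of sanity.sh.

```python
import re


def parse_tunables(log_text):
    """Collect lines such as 'LRU=2003' into a dict of name -> int."""
    tunables = {}
    for line in log_text.splitlines():
        match = re.fullmatch(r"\s*([A-Z_]+)=(-?\d+)\s*", line)
        if match:
            tunables[match.group(1)] = int(match.group(2))
    return tunables


def suspicious_tunables(tunables):
    """Return the sorted names whose value is zero or negative."""
    return sorted(name for name, value in tunables.items() if value <= 0)


if __name__ == "__main__":
    values = parse_tunables("LRU=2003\nLVF=0\n")
    print(values)
    print(suspicious_tunables(values))
```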

Comment by Andreas Dilger [ 06/Jul/11 ]

This problem is causing quite a lot of sanity test failures lately. Has any investigation been done to determine when this test started failing, and what changes were made around that time?

Comment by Lai Siyao [ 06/Jul/11 ]

This doesn't always fail. A possible cause is the change in the autotest environment (moving from physical machines to VMs, and using LVM as the backend instead of /dev/sdx).

Comment by Peter Jones [ 08/Jul/11 ]

Landed for Lustre 2.1

Comment by Peter Jones [ 08/Jul/11 ]

Oops. I think the patch that landed is actually a diagnostic patch rather than a fix, so I am reopening this.

Comment by Build Master (Inactive) [ 08/Jul/11 ]

Integrated in lustre-master » x86_64,client,el5,inkernel #199
LU-479 sanity 124a failed

Oleg Drokin : c4cb5e5f0e316f986eae3b11be4f5d81756b85b6
Files :

  • lustre/tests/sanity.sh

Comment by Build Master (Inactive) [ 08/Jul/11 ]

Integrated in lustre-master » i686,client,el6,inkernel #199
LU-479 sanity 124a failed

Oleg Drokin : c4cb5e5f0e316f986eae3b11be4f5d81756b85b6
Files :

  • lustre/tests/sanity.sh

Comment by Build Master (Inactive) [ 08/Jul/11 ]

Integrated in lustre-master » x86_64,client,el6,inkernel #199
LU-479 sanity 124a failed

Oleg Drokin : c4cb5e5f0e316f986eae3b11be4f5d81756b85b6
Files :

  • lustre/tests/sanity.sh

Comment by Build Master (Inactive) [ 08/Jul/11 ]

Integrated in lustre-master » x86_64,client,el5,ofa #199
LU-479 sanity 124a failed

Oleg Drokin : c4cb5e5f0e316f986eae3b11be4f5d81756b85b6
Files :

  • lustre/tests/sanity.sh

Comment by Build Master (Inactive) [ 08/Jul/11 ]

Integrated in lustre-master » i686,client,el5,inkernel #199
LU-479 sanity 124a failed

Oleg Drokin : c4cb5e5f0e316f986eae3b11be4f5d81756b85b6
Files :

  • lustre/tests/sanity.sh

Comment by Build Master (Inactive) [ 08/Jul/11 ]

Integrated in lustre-master » i686,client,el5,ofa #199
LU-479 sanity 124a failed

Oleg Drokin : c4cb5e5f0e316f986eae3b11be4f5d81756b85b6
Files :

  • lustre/tests/sanity.sh

Comment by Build Master (Inactive) [ 08/Jul/11 ]

Integrated in lustre-master » x86_64,server,el5,ofa #199
LU-479 sanity 124a failed

Oleg Drokin : c4cb5e5f0e316f986eae3b11be4f5d81756b85b6
Files :

  • lustre/tests/sanity.sh

Comment by Build Master (Inactive) [ 08/Jul/11 ]

Integrated in lustre-master » x86_64,client,sles11,inkernel #199
LU-479 sanity 124a failed

Oleg Drokin : c4cb5e5f0e316f986eae3b11be4f5d81756b85b6
Files :

  • lustre/tests/sanity.sh

Comment by Build Master (Inactive) [ 08/Jul/11 ]

Integrated in lustre-master » x86_64,server,el5,inkernel #199
LU-479 sanity 124a failed

Oleg Drokin : c4cb5e5f0e316f986eae3b11be4f5d81756b85b6
Files :

  • lustre/tests/sanity.sh

Comment by Build Master (Inactive) [ 08/Jul/11 ]

Integrated in lustre-master » x86_64,client,ubuntu1004,inkernel #199
LU-479 sanity 124a failed

Oleg Drokin : c4cb5e5f0e316f986eae3b11be4f5d81756b85b6
Files :

  • lustre/tests/sanity.sh

Comment by Build Master (Inactive) [ 08/Jul/11 ]

Integrated in lustre-master » i686,server,el5,inkernel #199
LU-479 sanity 124a failed

Oleg Drokin : c4cb5e5f0e316f986eae3b11be4f5d81756b85b6
Files :

  • lustre/tests/sanity.sh

Comment by Build Master (Inactive) [ 08/Jul/11 ]

Integrated in lustre-master » i686,server,el5,ofa #199
LU-479 sanity 124a failed

Oleg Drokin : c4cb5e5f0e316f986eae3b11be4f5d81756b85b6
Files :

  • lustre/tests/sanity.sh

Comment by Build Master (Inactive) [ 08/Jul/11 ]

Integrated in lustre-master » x86_64,server,el6,inkernel #199
LU-479 sanity 124a failed

Oleg Drokin : c4cb5e5f0e316f986eae3b11be4f5d81756b85b6
Files :

  • lustre/tests/sanity.sh

Comment by Build Master (Inactive) [ 08/Jul/11 ]

Integrated in lustre-master » i686,server,el6,inkernel #199
LU-479 sanity 124a failed

Oleg Drokin : c4cb5e5f0e316f986eae3b11be4f5d81756b85b6
Files :

  • lustre/tests/sanity.sh

Comment by Peter Jones [ 13/Jul/11 ]

Does it make sense to try to reproduce this manually in a VM environment, as with LU-482?

Comment by Lai Siyao [ 13/Jul/11 ]

Okay, I will test it manually.

Comment by Lai Siyao [ 18/Jul/11 ]

Hi Chris,

I ran sanity and test_124a for 2 days on client16-vm[1-3], but couldn't reproduce this failure.

The Toro nodes were reserved last weekend, and today these VMs still can't be accessed. Chris, could you help me?

Thanks,
Lai

Comment by Chris Gearing (Inactive) [ 18/Jul/11 ]

I've rebuilt the VMs after Sam's work. They should be usable now.

Chris

Comment by Lai Siyao [ 20/Jul/11 ]

In the Maloo test results, one log message is repeated from the beginning of test_124a:

(ldlm_request.c:1218:ldlm_cli_update_pool()) @@@ Zero SLV or Limit found (SLV: 0, Limit: 102336)

Because a zero SLV is invalid, the client pool limit and SLV won't be updated.

It's strange that the MDS replies with a zero SLV to the client, because the only way obd->obd_pool_slv can equal 0 is if ldlm_pool_recalc() is never called.
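
To make the behaviour described above concrete, here is a simplified Python model. It is not the C code in ldlm_request.c: `ClientPool`, `update_pool`, and `parse_zero_slv` are hypothetical names, and the guard is inferred only from the wording of the log message (a zero SLV or a zero Limit causes the update to be skipped).

```python
import re
from dataclasses import dataclass


@dataclass
class ClientPool:
    """Toy stand-in for the client-side pool state (assumed fields)."""
    slv: int
    limit: int


def update_pool(pool, new_slv, new_limit):
    """Apply server-supplied values unless either is zero.

    Returns True if the pool was updated, False if the reply was
    rejected and the pool left unchanged.
    """
    if new_slv == 0 or new_limit == 0:
        return False
    pool.slv = new_slv
    pool.limit = new_limit
    return True


ZERO_SLV_RE = re.compile(r"Zero SLV or Limit found \(SLV: (\d+), Limit: (\d+)\)")


def parse_zero_slv(line):
    """Extract (slv, limit) from a log line of the form quoted above."""
    match = ZERO_SLV_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    line = ("(ldlm_request.c:1218:ldlm_cli_update_pool()) @@@ "
            "Zero SLV or Limit found (SLV: 0, Limit: 102336)")
    slv, limit = parse_zero_slv(line)
    pool = ClientPool(slv=1, limit=1)  # arbitrary starting values
    print(update_pool(pool, slv, limit), pool)
```

In this model, every reply carrying SLV 0 leaves the client pool at its previous values, which matches the observation that the message repeats for the whole test.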

Comment by Lai Siyao [ 23/Jul/11 ]

Hi Peter, I couldn't reproduce this failure (in 5 days of trying). To make debugging easier, I'd suggest enabling this test in regression runs to help collect debug logs.

Comment by Peter Jones [ 23/Jul/11 ]

Chris

Can you take care of this?

Peter

Comment by Peter Jones [ 25/Jul/11 ]

Dropping as blocker for now. May re-add if it starts reoccurring regularly.

Comment by Zhenyu Xu [ 07/Sep/11 ]

Got another hit: https://maloo.whamcloud.com/test_sets/5ecde210-d96b-11e0-8d02-52540025f9af

Comment by Kit Westneat (Inactive) [ 19/Sep/12 ]

It looks like these hit it as well:
https://maloo.whamcloud.com/test_sessions/7c897254-fdde-11e1-a1b4-52540035b04c
https://maloo.whamcloud.com/test_sessions/cb9c8b6c-f6e3-11e1-b320-52540035b04c

Comment by Andreas Dilger [ 28/May/17 ]

Close old issue.

Generated at Sat Feb 10 01:07:26 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.