[LU-479] Test failure on test suite sanity, subtest test_124a Created: 03/Jul/11 Updated: 28/May/17 Resolved: 28/May/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | Lai Siyao |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 4965 |
| Description |
|
This issue was created by maloo for yujian <yujian@whamcloud.com>. This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/282aae9a-a451-11e0-b786-52540025f9af. The sub-test test_124a failed with the following error:
|
| Comments |
| Comment by Jian Yu [ 04/Jul/11 ] |
|
More failure instances: |
| Comment by Lai Siyao [ 04/Jul/11 ] |
|
The Maloo test logs show that the LVF here is 0, when it should be a large number (around 10000). I will print more tunables in the script to find out why. |
| Comment by Andreas Dilger [ 06/Jul/11 ] |
|
This problem is causing quite a lot of sanity test failures lately. Has any investigation been done to determine when this test started failing, and what changes were made around that time? |
| Comment by Lai Siyao [ 06/Jul/11 ] |
|
This doesn't always fail; a possible cause may be the autotest environment change (moving from physical machines to VMs, and using LVM as the backend instead of /dev/sdX). |
| Comment by Peter Jones [ 08/Jul/11 ] |
|
Landed for Lustre 2.1 |
| Comment by Peter Jones [ 08/Jul/11 ] |
|
Oops. I think the patch that landed is actually a diagnostic patch rather than a fix, so reopening. |
| Comment by Build Master (Inactive) [ 08/Jul/11 ] |
|
Integrated in Oleg Drokin : c4cb5e5f0e316f986eae3b11be4f5d81756b85b6
|
| Comment by Peter Jones [ 13/Jul/11 ] |
|
Does it make sense to try to reproduce this manually in a VM environment, as with LU-482? |
| Comment by Lai Siyao [ 13/Jul/11 ] |
|
Okay, I will test it manually. |
| Comment by Lai Siyao [ 18/Jul/11 ] |
|
Hi Chris, I ran sanity and test_124a for 2 days on client16-vm[1-3], but couldn't reproduce this failure. Last weekend the Toro nodes were reserved, and today these VMs still can't be accessed. Chris, could you help me? Thanks.
|
| Comment by Chris Gearing (Inactive) [ 18/Jul/11 ] |
|
I've rebuilt the VMs after Sam's work. They should be usable now. Chris |
| Comment by Lai Siyao [ 20/Jul/11 ] |
|
In the Maloo test result, one log message is repeated from the beginning of test_124a: (ldlm_request.c:1218:ldlm_cli_update_pool()) @@@ Zero SLV or Limit found (SLV: 0, Limit: 102336). Because a zero SLV is invalid, the client pool limit and SLV won't be updated. It's strange that the MDS replies with a zero SLV to the client, because the only way obd->obd_pool_slv can be 0 is if ldlm_pool_recalc() was never called. |
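
For context, the guard the client applies here is small and easy to model. Below is a minimal, self-contained sketch of that check (the struct and function names are hypothetical stand-ins, not the actual Lustre definitions): a reply carrying a zero SLV or limit is logged and ignored, so the client pool keeps its previous values rather than being poisoned.

#include <stdio.h>
#include <stdint.h>

/* Toy model of the client-side ldlm pool state; field names are
 * illustrative, not the real Lustre structures. */
struct client_pool {
        uint64_t slv;    /* server lock volume mirrored on the client */
        uint32_t limit;  /* server-advertised lock limit */
};

/* Mimics the check described above: a reply with a zero SLV (or
 * limit) is treated as invalid, logged, and the pool is untouched. */
static void update_pool(struct client_pool *pool,
                        uint64_t reply_slv, uint32_t reply_limit)
{
        if (reply_slv == 0 || reply_limit == 0) {
                fprintf(stderr,
                        "Zero SLV or Limit found (SLV: %llu, Limit: %u)\n",
                        (unsigned long long)reply_slv, reply_limit);
                return;
        }
        pool->slv = reply_slv;
        pool->limit = reply_limit;
}

int main(void)
{
        struct client_pool pool = { .slv = 1, .limit = 1 };

        update_pool(&pool, 0, 102336);      /* the failure seen in Maloo */
        update_pool(&pool, 10000, 102336);  /* a healthy reply */
        printf("pool: slv=%llu limit=%u\n",
               (unsigned long long)pool.slv, pool.limit);
        return 0;
}

This matches the symptom in the logs: if the server never recalculates its pool, every reply carries SLV 0, so the client pool is never updated for the whole run of test_124a.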
| Comment by Lai Siyao [ 23/Jul/11 ] |
|
Hi Peter, I couldn't reproduce this failure (after 5 days of trying), and to make debugging easier, I'd suggest enabling this test in regression runs to help collect debug logs. |
| Comment by Peter Jones [ 23/Jul/11 ] |
|
Chris Can you take care of this? Peter |
| Comment by Peter Jones [ 25/Jul/11 ] |
|
Dropping as a blocker for now. May re-add if it starts recurring regularly. |
| Comment by Zhenyu Xu [ 07/Sep/11 ] |
|
Got another hit on https://maloo.whamcloud.com/test_sets/5ecde210-d96b-11e0-8d02-52540025f9af |
| Comment by Kit Westneat (Inactive) [ 19/Sep/12 ] |
|
It looks like these hit it as well: |
| Comment by Andreas Dilger [ 28/May/17 ] |
|
Close old issue. |