[LU-4442] Failure on test suite replay-vbr test_7g: Test 7g.3 failed Created: 06/Jan/14 Updated: 14/Jul/15 Resolved: 11/Feb/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.6, Lustre 2.6.0, Lustre 2.5.1, Lustre 2.4.3 |
| Fix Version/s: | Lustre 2.6.0, Lustre 2.5.1 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | Emoly Liu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | mn1, mn4 | ||
| Environment: |
client and server: lustre-master build 1823 RHEL6 ldiskfs |
||
| Severity: | 3 |
| Rank (Obsolete): | 12186 |
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com> This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/930d9194-74df-11e3-96b0-52540035b04c. The sub-test test_7g failed with the following error:
test log shows:
CMD: client-31vm7 /usr/sbin/lctl list_param osp.*osc*.old_sync_processed 2> /dev/null
osp.lustre-OST0000-osc-MDT0000.old_sync_processed
osp.lustre-OST0001-osc-MDT0000.old_sync_processed
osp.lustre-OST0002-osc-MDT0000.old_sync_processed
osp.lustre-OST0003-osc-MDT0000.old_sync_processed
osp.lustre-OST0004-osc-MDT0000.old_sync_processed
osp.lustre-OST0005-osc-MDT0000.old_sync_processed
osp.lustre-OST0006-osc-MDT0000.old_sync_processed
wait mds1 secs maximumly for client-31vm7 mds-ost sync done.
/usr/lib64/lustre/tests/test-framework.sh: line 2135: [: mds1: integer expression expected
CMD: client-31vm7 /usr/sbin/lctl get_param -n osp.*osc*.old_sync_processed
1 1 1 1 1 1 1
recovery node iozone not done in mds1 sec.
replay-vbr test_7g: @@@@@@ FAIL: Test 7g.3 failed |
| Comments |
| Comment by Oleg Drokin [ 09/Jan/14 ] |
|
There is certainly a parsing error somewhere that makes us pick up the MDS name instead of a timeout, and things go downhill from there: /usr/lib64/lustre/tests/test-framework.sh: line 2135: [: mds1: integer expression expected |
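The message in the log is what bash prints when a `[ ... -lt ... ]` test receives a non-numeric operand, for example `[ 0 -lt mds1 ]`. The following is a minimal sketch of how a swapped argument can lead there and how a guard would catch it early. The function name `wait_for_sync` and its argument order are hypothetical and are not taken from test-framework.sh.

```shell
#!/bin/bash
# Hypothetical stand-in for a wait helper; NOT the real test-framework.sh code.
# Assumed signature: wait_for_sync <max_seconds> <node>
wait_for_sync() {
    local max=$1    # expected to be an integer number of seconds
    local node=$2
    local waited=0

    # Guard: reject a non-integer timeout up front. Without it, the
    # comparison below would fail with "integer expression expected".
    case $max in
        ''|*[!0-9]*)
            echo "wait_for_sync: timeout '$max' is not an integer" >&2
            return 2 ;;
    esac

    while [ "$waited" -lt "$max" ]; do
        # A real helper would poll the node here; this sketch only counts.
        waited=$((waited + 1))
    done
    echo "waited $waited secs for $node"
}

wait_for_sync 3 mds1                                   # arguments in the assumed order
wait_for_sync mds1 3 || echo "rejected bad arguments"  # node name passed as the timeout
```

With the guard removed, the second call would reach the `-lt` comparison with `mds1` as an operand, the test would return a non-zero status, and the loop body would never run, which is consistent with the "not done in mds1 sec" line in the log.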
| Comment by Peter Jones [ 09/Jan/14 ] |
|
Emoly, could you please look into this one? Thanks, Peter |
| Comment by Emoly Liu [ 10/Jan/14 ] |
|
I can reproduce it easily. I will investigate and fix it. |
| Comment by Emoly Liu [ 10/Jan/14 ] |
|
patch at: http://review.whamcloud.com/8796, which fixes the test script issue. |
| Comment by Jian Yu [ 10/Jan/14 ] |
|
Lustre build: http://build.whamcloud.com/job/lustre-reviews/20841/ The same failure occurred: |
| Comment by Emoly Liu [ 13/Jan/14 ] |
|
The Maloo test report https://maloo.whamcloud.com/test_logs/2a98cbc0-7bfa-11e3-a7b6-52540035b04c/show_text shows that test_7g has another problem besides the test script issue fixed by http://review.whamcloud.com/8796. I will investigate and provide another patch for it. |
| Comment by Emoly Liu [ 17/Jan/14 ] |
|
By searching maloo, I notice this error has occurred since Dec. 21, and finally I find it is related to I am working on the patch. |
| Comment by Emoly Liu [ 23/Jan/14 ] |
|
The root cause of this failure is that, since mdt_object_exists() was added to mdt_reint_link() in http://review.whamcloud.com/#/c/8371, the object version check is never performed when the child object doesn't exist, so client1 is not evicted. I created the following two patches to fix this problem, and I am not sure which is better:
Tappro, could you please give any advice? Thanks. |
| Comment by Emoly Liu [ 24/Jan/14 ] |
|
Thanks, Tappro, I saw your choice of http://review.whamcloud.com/8973 . |
| Comment by Emoly Liu [ 11/Feb/14 ] |
|
Both patches have been landed to 2.6. |
| Comment by Jian Yu [ 17/Feb/14 ] |
|
Patches for Lustre b2_5 branch: |
| Comment by Jian Yu [ 17/Feb/14 ] |
|
Landing http://review.whamcloud.com/8371 on Lustre b2_5 build #25 also caused this regression failure on Lustre b2_5 branch: |
| Comment by Sarah Liu [ 25/Mar/14 ] |
|
Also hit on interop test between 2.5.1 server and master client: https://maloo.whamcloud.com/test_sets/8e4f6b3a-b244-11e3-a93f-52540035b04c |
| Comment by Jian Yu [ 17/Apr/14 ] |
This is because http://review.whamcloud.com/9213 was landed on the Lustre b2_5 branch. We need to change the Lustre version number checked in replay-vbr test_7g() from 2.5.52 to 2.5.1. |
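To illustrate why the threshold matters in the interop case, here is a minimal sketch of a version-gated check. The helper names `to_code` and `has_fix` and the integer encoding are assumptions made for this example, not the functions used by replay-vbr. Which branch the real test takes in each case is not shown in this ticket.

```shell
#!/bin/bash
# Hypothetical helper: pack "major.minor.patch" into one comparable integer.
to_code() {
    local major minor patch
    IFS=. read -r major minor patch <<< "$1"
    echo $(( (major << 16) | (minor << 8) | patch ))
}

# Hypothetical check: is the server at or above the version carrying the fix?
has_fix() {
    local server=$1 threshold=$2
    [ "$(to_code "$server")" -ge "$(to_code "$threshold")" ]
}

# A 2.5.1 server compared against the old threshold and the corrected one.
has_fix 2.5.1 2.5.52 && echo "new behaviour" || echo "old behaviour"   # prints "old behaviour"
has_fix 2.5.1 2.5.1  && echo "new behaviour" || echo "old behaviour"   # prints "new behaviour"
```

With the threshold left at 2.5.52, a 2.5.1 server that already carries the back-ported patch is classified as lacking it, which matches the interop failure reported above.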
| Comment by Jian Yu [ 17/Apr/14 ] |
|
Back-ported patch for Lustre b2_4 branch: http://review.whamcloud.com/9987 |