[LU-1745] Test failure on test suite recovery-small, subtest test_105 Created: 14/Aug/12 Updated: 17/Sep/12 Resolved: 17/Aug/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.2.0, Lustre 2.3.0, Lustre 2.1.2 |
| Fix Version/s: | Lustre 2.3.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Maloo | Assignee: | Bob Glossman (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Server: lustre-master-tag-2.2.92-RHEL6 |
||
| Issue Links: |
|
||||||||||||||||
| Severity: | 3 | ||||||||||||||||
| Rank (Obsolete): | 4489 | ||||||||||||||||
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>.

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/822950b2-e650-11e1-afac-52540035b04c.

The sub-test test_105 failed with the following error:

== recovery-small test 105: IR: NON IR clients support == 22:47:10 (1344923230)
mgs.MGS.ir_timeout
Stopping client client-2 /mnt/lustre (opts:)
Starting client: client-2: -o flock,user_xattr,acl,noir fat-amd-1@tcp:/lustre /mnt/lustre
 recovery-small test_105: @@@@@@ FAIL: IR state must be OFF at client-2
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:3614:error_noexit()
  = /usr/lib64/lustre/tests/test-framework.sh:3636:error()
  = /usr/lib64/lustre/tests/recovery-small.sh:1446:test_105()
  = /usr/lib64/lustre/tests/test-framework.sh:3869:run_one()
  = /usr/lib64/lustre/tests/test-framework.sh:3898:run_one_logged()
  = /usr/lib64/lustre/tests/test-framework.sh:3772:run_test()
  = /usr/lib64/lustre/tests/recovery-small.sh:1477:main()

I checked the IR state on client-2 and it's enabled.
|
| Comments |
| Comment by Andreas Dilger [ 15/Aug/12 ] |
|
It isn't clear from the comments what Lustre version the client-2 node is running. If one node is running master but the other is running a version without LU-1095 (http://review.whamcloud.com/2853) applied, then I think it would cause this problem. What is suspicious is that the test reports that the IR state should be "OFF" instead of "DISABLED", as was introduced with the new patch. What is needed here is for the LU-1095 patch to be landed on b2_2 as well so that interop tests can pass. |
| Comment by Sarah Liu [ 15/Aug/12 ] |
|
Both client-1 and client-2 are running master, which contains this commit. |
| Comment by James A Simmons [ 15/Aug/12 ] |
|
If that is the case, then how did it pass Maloo before? Something is strange here. |
| Comment by James A Simmons [ 15/Aug/12 ] |
|
The log above shows that recovery-small.sh was not updated. With current master the error output will always be ENABLED/DISABLED. I bet a diff between the recovery-small.sh used by the test and the one in the git repo will show them out of sync. Andreas is right about needing a patch for b2_2. I will whip up a patch for you. |
| Comment by James A Simmons [ 15/Aug/12 ] |
|
Doh! I see where the code was not updated in master. The patch is at http://review.whamcloud.com/#change,3667 |
| Comment by James A Simmons [ 16/Aug/12 ] |
|
Patch for b2_2 to pass the interop test: http://review.whamcloud.com/#change,3698 |
| Comment by Peter Jones [ 16/Aug/12 ] |
|
Bob will take care of this one |
| Comment by Bob Glossman (Inactive) [ 16/Aug/12 ] |
|
James, thanks for the patch to b2_2. That branch is closed for updates right now, so we won't be landing it right away. This is not an issue for interop between 2.1 and current, as the changes are all in subtests that didn't exist in the 2.1 version of recovery-small.sh. |
| Comment by Peter Jones [ 17/Aug/12 ] |
|
Landed for 2.3 |
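
The mismatch discussed in the comments comes down to a string comparison: the test expected "OFF", while code carrying the LU-1095 change reports "DISABLED". A minimal sketch of an interop-tolerant check is shown here. The helper name ir_is_disabled is hypothetical and is not taken from recovery-small.sh or from either patch above, and the state string is hard-coded instead of being read from a client.

```shell
#!/bin/sh
# Hypothetical helper, not part of recovery-small.sh or test-framework.sh.
# Succeeds (returns 0) when the reported IR state means imperative recovery
# is turned off, accepting both the older spelling (OFF) and the spelling
# introduced with LU-1095 (DISABLED).
ir_is_disabled() {
	case "$1" in
	OFF|DISABLED) return 0 ;;
	*) return 1 ;;
	esac
}

# Assumption: in a real test run this string would be read from the client
# node; it is hard-coded here so the sketch runs anywhere.
state="DISABLED"
if ir_is_disabled "$state"; then
	echo "IR is disabled"
else
	echo "IR state is $state, expected OFF or DISABLED"
fi
```

Accepting both spellings lets one copy of the check pass against nodes with and without the LU-1095 change. Whether the landed patches took this approach is not stated in this ticket.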