Lustre / LU-1193

test script incompatibility when running server as 2.1 and client as 2.2

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version: Lustre 2.3.0

    Description

      When running conf-sanity.sh with a 2.1 server and a 2.2 client, sub-test 61 failed because the test script differs between the server and the client: the client-side script was effectively run against the server, which caused the error. After copying conf-sanity.sh from the server to the client, the test passed. A similar issue was also found in some sanity sub-tests, such as 133a, 133d, and 160.


          Activity

            [LU-1193] test script incompatibility when running server as 2.1 and client as 2.2
            ys Yang Sheng added a comment -

            Closing this; please reopen if any further work is needed.


            Yang Sheng,

            Could you create a new ticket for each of these tests, or possibly for groups of tests? We don't group source-code issues like this, and we shouldn't do so with test-code issues either.

            Thanks

            Chris


            Yang Sheng,
            having a list is the starting point, but what still needs to be done is to fix the test scripts so that they are skipped if the server does not have the right functionality to run the test from the client. This can be checked at the start of these failing tests by looking at "lctl get_param mdc.*.connect_flags" (for features that have a connect flag) or by "do_facet $SINGLEMDS {some check}" for other features. The check might look for the presence of "lctl list_param mdt.*.rename_stats" for sanity.sh test_133d, or simply use "lctl get_param version" for others.

            Having just a list of failing tests here in Jira does not stop those tests from failing, and it pollutes the test results with failures, which wastes everyone's time.
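            The feature-presence check Andreas describes could be sketched along these lines. This is a minimal illustration, not the actual fix: "do_facet", "skip", and "$SINGLEMDS" are assumed to come from Lustre's test-framework.sh, the helper name "mds_has_param" is hypothetical, and the parameter name is the one quoted in the comment above.

```shell
#!/bin/bash
# Sketch of a skip-guard for tests whose server-side support may be
# missing.  do_facet/skip/$SINGLEMDS are test-framework.sh names;
# mds_has_param is a hypothetical helper for this illustration.

# Succeed if the MDS exports the given parameter via lctl list_param.
mds_has_param() {
	local param=$1
	do_facet $SINGLEMDS lctl list_param "$param" >/dev/null 2>&1
}

test_133d() {
	# Skip instead of failing when an older server lacks rename_stats.
	mds_has_param "mdt.*.rename_stats" ||
		{ skip "MDS does not export rename_stats"; return 0; }
	# ... the real test body would follow here
}
```

            Run at the top of each affected subtest, a guard like this turns a spurious failure against a 2.1 server into an explicit skip.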

            sarah Sarah Liu added a comment -

            recovery-small subtests 100 to 105 are new in the 2.2 script and should not be run against 2.1.

            sarah Sarah Liu added a comment - - edited

            sanity-225a and sanity-225b should not be run on 2.1.x; running them causes an oops.

            ys Yang Sheng added a comment -

            OK, so we have the list below:

            conf-sanity-61
            sanity-133a,133d,160

            Please comment if any other tests are missing from the list.


            This is causing many tests (14% or more) to fail during autotest. It makes sense to add a simple check to each of the failing subtests to verify whether the server is capable of running this test properly.

            The easiest way would be something like "do_facet mds lctl get_param version" and check if it is new enough.

            A better solution would be to have a test-specific check, like for sanity test_133d to see if the MDS has the right proc stats or not. In other cases it might not be so easy and a version check may be needed.
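            The fallback version check suggested here could look roughly as follows. This is a sketch under stated assumptions: version_code mirrors the helper of that name in Lustre's test-framework.sh but is reimplemented standalone here, "need_mds_version" is a hypothetical wrapper, and the awk pattern for parsing "lctl get_param -n version" output is an assumption.

```shell
#!/bin/bash
# Sketch of a minimum-server-version gate.  do_facet and skip are
# test-framework.sh names; need_mds_version is hypothetical.

# Pack "major.minor.patch" into one integer so versions compare with -lt/-ge.
version_code() {
	local IFS='.'
	set -- $1
	echo $(( (${1:-0} << 16) | (${2:-0} << 8) | ${3:-0} ))
}

# Skip the current subtest unless the MDS version is at least $1 (e.g. "2.2.0").
need_mds_version() {
	local need=$1 have
	have=$(do_facet mds lctl get_param -n version |
		awk '/^lustre:/ { print $2 }')
	[ "$(version_code "$have")" -ge "$(version_code "$need")" ] ||
		{ skip "MDS $have older than required $need"; return 1; }
}
```

            A feature-specific check is still preferable where one exists; the version gate is only the easy catch-all for tests without a distinguishing parameter or connect flag.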

            pjones Peter Jones added a comment -

            Yangsheng

            Could you please look into this one?

            Thanks

            Peter


            People

              ys Yang Sheng
              sarah Sarah Liu
              Votes: 0
              Watchers: 2
