
Failure on test suite sanity test_170: expected 31 bad lines, but got 34

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Blocker
    • None
    • Affects Version/s: Lustre 2.5.0, Lustre 2.6.0, Lustre 2.5.1, Lustre 2.7.0, Lustre 2.5.3, Lustre 2.8.0
    • Environment: server and client: lustre-master build # 1784
      client is running SLES11 SP3
    • 3
    • 11880

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/7756e5f2-5bb9-11e3-8d79-52540035b04c.

      The sub-test test_170 failed with the following error:

      expected 31 bad lines, but got 34

      == sanity test 170: test lctl df to handle corrupted log ============================================= 00:50:22 (1385974222)
       sanity test_170: @@@@@@ FAIL: expected 31 bad lines, but got 34 
      

          Activity

            [LU-4341] Failure on test suite sanity test_170: expected 31 bad lines, but got 34

            adilger Andreas Dilger added a comment - Can someone please determine definitively why this test is failing on SLES, and either fix it or add it to the ALWAYS_EXCEPT list for SLES? It doesn't make sense to exclude this via envdefinitions for SLES patches, when it will still fail whenever someone forgets to exclude it.
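
The second option in the comment above (excepting the test on SLES inside the test script itself) could be sketched roughly as follows. This is an illustration, not the change made for this ticket: the helper name `is_sles` is hypothetical, and it assumes the host provides an os-release file with `ID=sles`. Older SLES releases may not ship that file, in which case the detection would need adapting.

```shell
# Hypothetical sketch: append sub-test 170 to ALWAYS_EXCEPT only on SLES.
# The os-release path is a parameter so the check can be exercised
# without a SLES host.
is_sles() {
    local os_release_file="${1:-/etc/os-release}"
    [ -r "$os_release_file" ] || return 1
    # Matches ID=sles as well as ID="sles".
    grep -Eq '^ID="?sles"?$' "$os_release_file"
}

ALWAYS_EXCEPT=${ALWAYS_EXCEPT:-}
if is_sles; then
    ALWAYS_EXCEPT="$ALWAYS_EXCEPT 170"
fi
```

Keeping the exception next to the test, rather than in per-run environment definitions, means it applies even when nobody remembers to set it for a particular run.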
            sarah Sarah Liu added a comment - another instance: https://testing.hpdd.intel.com/test_sets/02e36236-fe29-11e4-be9d-5254006e85c2
            sarah Sarah Liu added a comment - another instance https://testing.hpdd.intel.com/test_sets/834ace12-d75c-11e4-a678-5254006e85c2
            bogl Bob Glossman (Inactive) added a comment - another seen in master: https://testing.hpdd.intel.com/test_sets/83277a58-d3cd-11e4-8c98-5254006e85c2
            sarah Sarah Liu added a comment - hit this error in tag-2.6.94 test: https://testing.hpdd.intel.com/test_sets/c53f0196-b22a-11e4-af8e-5254006e85c2
            bogl Bob Glossman (Inactive) added a comment - seen in master with sles11sp3 client/server: https://testing.hpdd.intel.com/test_sets/14d6386e-8c7e-11e4-b81b-5254006e85c2
            yujian Jian Yu added a comment - Lustre Build: https://build.hpdd.intel.com/job/lustre-b2_5/86/ (2.5.3 RC1) The same failure occurred: https://testing.hpdd.intel.com/test_sets/b2a57a6e-30a3-11e4-9f57-5254006e85c2
            yujian Jian Yu added a comment -

            Thanks, Di. I can reproduce the failure every time with debug="rpctrace". I'll look into "$TMP/${tfile}_log_good".

            di.wang Di Wang added a comment -

            Hi, Yujian

            test_170 is supposed to verify that "lctl df can identify and skip corrupted debug records" instead of abandoning the whole debug log file. I guess there is some debug format problem with "rpctrace", though I'm not sure what the real cause is. I think you need to get "$TMP/${tfile}_log_good" and have a look, or follow the test steps to repeat the test locally. Thanks.
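
The behaviour described above (skip corrupted records and keep going, rather than give up on the whole file) can be illustrated with a toy counter. This is only a sketch of the idea: it does not reproduce `lctl df` or the real Lustre debug log format, and the colon-separated record layout and the function name `count_records` are hypothetical.

```shell
# Toy illustration only: this is NOT the Lustre debug log format or lctl df.
# Hypothetical record layout: "<hex mask>:<hex subsystem>:<message>".
count_records() {
    awk -F: '
        # Well-formed record: at least three fields, first field is hex.
        NF >= 3 && $1 ~ /^[0-9a-fA-F]+$/ { good++; next }
        # Anything else is treated as corrupted: count it and keep going.
        { bad++ }
        END { printf "good=%d bad=%d\n", good, bad }
    ' "$1"
}
```

Calling `count_records FILE` prints one summary line with the number of good and bad lines. A test built on counts like these is sensitive to anything that changes how many records end up in the log.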

            yujian Jian Yu added a comment -

            Hi Di,

            I saw that sanity test_170() was added by you in commit d9bf86ae95a599bf10bbb05818317b48eb71db1b. Could you please give me some hints about why debug="rpctrace" affects the results of sanity test 170 on a SLES client? Thanks a lot.

            yujian Jian Yu added a comment -

            It turns out that sanity test 170 is affected by the lctl debug value. With debug=-1 it passed, and with debug="rpctrace" it failed.

            In sanity.sh, debug=-1 is set before the sub-tests run, which is why running test 170 on its own passed. In test 150, "set_default_debug_nodes $client" changed the debug value to debug="vfstrace rpctrace dlmtrace neterror ha config ioctl super", which caused test 170 to fail.
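
One way to make a sub-test independent of whatever debug mask an earlier sub-test left behind is to save the current mask, set the value the test needs, and restore the old mask afterwards. The sketch below shows that pattern. It is not the fix applied for this ticket, and the helper name `run_with_debug` is hypothetical. It assumes `lctl get_param -n debug` and `lctl set_param debug=...` are available, and that the `LCTL` variable may be overridden.

```shell
# Hypothetical helper, not part of sanity.sh.
# Usage: run_with_debug MASK COMMAND [ARGS...]
LCTL=${LCTL:-lctl}

run_with_debug() {
    local mask="$1" saved rc=0
    shift
    # Remember the mask that earlier sub-tests left behind.
    saved=$($LCTL get_param -n debug) || return 1
    # Force the mask this test needs.
    $LCTL set_param debug="$mask" >/dev/null || return 1
    # Run the test body and keep its exit status.
    "$@" || rc=$?
    # Put the previous mask back, whatever the outcome.
    $LCTL set_param debug="$saved" >/dev/null
    return $rc
}
```

For example, `run_with_debug -1 some_test_function` would run the function with debug=-1, the value under which test 170 was reported to pass, and then restore the previous mask.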


            People

              wc-triage WC Triage
              maloo Maloo