Details
-
Bug
-
Resolution: Unresolved
-
Blocker
-
None
-
Lustre 2.5.0, Lustre 2.6.0, Lustre 2.5.1, Lustre 2.7.0, Lustre 2.5.3, Lustre 2.8.0
-
server and client: lustre-master build # 1784
client is running SLES11 SP3
-
3
-
11880
Description
This issue was created by maloo for sarah <sarah@whamcloud.com>
This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/7756e5f2-5bb9-11e3-8d79-52540035b04c.
The sub-test test_170 failed with the following error:
expected 31 bad lines, but got 34
== sanity test 170: test lctl df to handle corrupted log ============================================= 00:50:22 (1385974222)
 sanity test_170: @@@@@@ FAIL: expected 31 bad lines, but got 34
This appears to be failing on master on SLES11.3 tests in the past week:
https://testing.hpdd.intel.com/sub_tests/05b5baa6-73f7-11e5-ada9-5254006e85c2
https://testing.hpdd.intel.com/sub_tests/07fb912c-73df-11e5-ab44-5254006e85c2
https://testing.hpdd.intel.com/sub_tests/9c5611ca-73ea-11e5-ab44-5254006e85c2
https://testing.hpdd.intel.com/sub_tests/bae5eb2c-722f-11e5-b344-5254006e85c2
https://testing.hpdd.intel.com/sub_tests/49587898-7073-11e5-b705-5254006e85c2
https://testing.hpdd.intel.com/sub_tests/8bfb6e00-7071-11e5-b705-5254006e85c2
https://testing.hpdd.intel.com/sub_tests/31313890-6f24-11e5-83a9-5254006e85c2
The patch http://review.whamcloud.com/16146 has landed, but it does not solve the underlying problem. An improved patch would skip only the "expected N bad lines, but got M" check in test_170 on SLES11.3+, and on RHEL7 as well, rather than skipping the whole of test_170. While in there, it should also drop the "-rf" from rm -rf $DIR/$tfile, since $DIR/$tfile is a regular file rather than a directory and removing it should not fail in any case.
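A rough sketch of what "skip only the count check" could look like is below. Everything in it is hypothetical: the function names, the distro labels, and the way the distro is passed in are made up for illustration and are not taken from sanity.sh or the test framework. The pattern match is also simplistic and only covers the literal labels shown.

```shell
#!/bin/bash
# Hypothetical sketch: none of these names come from sanity.sh.

# Return 0 if the strict bad-line count should be enforced for this distro.
# The distro strings are made-up labels used only for illustration.
strict_count_check_enabled() {
    case "$1" in
        sles11.3*|rhel7*) return 1 ;;
        *) return 0 ;;
    esac
}

# Compare expected and actual bad-line counts unless the check is skipped.
# The rest of the test would still run either way.
verify_bad_lines() {
    local distro=$1 expected=$2 actual=$3

    if ! strict_count_check_enabled "$distro"; then
        echo "skipping bad-line count check on $distro"
        return 0
    fi
    if [ "$expected" -ne "$actual" ]; then
        echo "expected $expected bad lines, but got $actual"
        return 1
    fi
    return 0
}
```

With this shape, a call such as verify_bad_lines sles11.3 31 34 would report a skip and succeed, while the same counts under any other label would still fail the check.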
One possible source of the bug is that the first cat $TMP/${tfile}_log_good >> $TMP/${tfile}_logs_corrupt appends to the file (>>) instead of truncating it first (>). If ${tfile}_logs_corrupt is left over from a previous test run for some reason, its stale contents would be kept and could add extra bad lines. The number of bad lines reported does seem to always be higher than the number expected, which makes this a plausible candidate.
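The append-versus-truncate difference can be reproduced in isolation. The following is only an illustration with stand-in file names and contents in a scratch directory. It does not use the real test_170 logs and does not show that this is the actual cause of the failure.

```shell
#!/bin/bash
# Illustration only: stand-in files, not the real test_170 logs.
workdir=$(mktemp -d)
good="$workdir/log_good"
corrupt="$workdir/logs_corrupt"

# A "good" log with three lines.
printf 'line1\nline2\nline3\n' > "$good"

# Simulate a file left over from a previous run (two stale lines).
printf 'stale1\nstale2\n' > "$corrupt"

# Appending (>>) keeps the stale lines in front of the new content.
cat "$good" >> "$corrupt"
appended_lines=$(( $(wc -l < "$corrupt") ))

# Recreate the leftover file, then truncate (>) instead of appending.
printf 'stale1\nstale2\n' > "$corrupt"
cat "$good" > "$corrupt"
truncated_lines=$(( $(wc -l < "$corrupt") ))

echo "append: $appended_lines lines, truncate: $truncated_lines lines"
# prints "append: 5 lines, truncate: 3 lines"

rm -r "$workdir"
```

If a leftover file is the cause, switching the first redirection to > (or removing the file before the test starts) would make the line count independent of earlier runs.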
In any case, this bug cannot be closed until the actual test failure is understood and fixed.