Details
- Bug
- Resolution: Unresolved
- Minor
- None
- Lustre 2.6.0, Lustre 2.8.0, Lustre 2.12.0
- server: lustre-master build # 1752 RHEL6 ldiskfs
  client: 2.5.0 RHEL6 ldiskfs
  Also seen in review-dne, where both server and client are RHEL6 ldiskfs
- 3
- 11613
Description
This issue was created by maloo for sarah <sarah@whamcloud.com>
This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/45f323be-49e3-11e3-8efa-52540035b04c.
The sub-test test_2b failed with the following error:
test failed to respond and timed out
test log shows:
lustre_rsync took 22 seconds
Changelog records consumed: 926
Only in /mnt/lustre/d0.lustre-rsync-test/d2/clients/client0/~dmtmp/WORDPRO: BENCHS1A.PRN
lustre-rsync-test test_2b: @@@@@@ FAIL: Failure in replication; differences found.
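The "Only in" line in the log has the shape that diff prints during a recursive comparison when a file exists on one side only. A minimal sketch of how such a message arises, assuming the test compares source and target with something like diff -r (the temporary directories and the OTHER.TXT file are made up for illustration and are not from the test run):

```shell
#!/bin/bash
# Hypothetical stand-ins for the replication source and target.
src=$(mktemp -d)
tgt=$(mktemp -d)
mkdir -p "$src/WORDPRO" "$tgt/WORDPRO"

# A file that reached both sides (illustrative name).
echo same > "$src/WORDPRO/OTHER.TXT"
echo same > "$tgt/WORDPRO/OTHER.TXT"

# A file that exists only on the source, as in the failure above.
echo data > "$src/WORDPRO/BENCHS1A.PRN"

# diff exits with status 1 when differences are found; capture it
# without aborting the script. LC_ALL=C keeps the message untranslated.
rc=0
out=$(LC_ALL=C diff -r "$src" "$tgt") || rc=$?

echo "$out"
echo "diff exit status: $rc"

rm -rf "$src" "$tgt"
```

If that assumption holds, the message means BENCHS1A.PRN was present in the source tree but missing from the target when the comparison ran.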
Issue Links
- is duplicated by
  - LU-4978 Failure on test suite lustre-rsync-test test_2b: Failure in replication; differences found. (Resolved)
- is related to
  - LU-17361 lustre-rsync-test test_2a: Timeout occurred after 97 minutes, last suite running was lustre-rsync-test (Open)
  - LU-4781 lustre-rsync-test test_2b: Replication of operation failed(-17) (Resolved)
This failed about 11x per week for the past 4 weeks.
I suspect the test times out because the cleanup_src_tgt step takes a long time to run rm -rf $DIR/$tdir after dbench, though that doesn't explain the original replication failure. At least there aren't any Lustre errors in the console logs for any of the nodes.
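One way to check the cleanup theory would be to time the removal on its own. A rough sketch, assuming a scratch directory stands in for $DIR/$tdir (the tree layout and file count are arbitrary, and this is not the test framework's own cleanup_src_tgt):

```shell
#!/bin/bash
# Hypothetical stand-in for $DIR/$tdir; on a real run this would
# live on the Lustre mount rather than in a temp directory.
tdir=$(mktemp -d)

# Build a small dbench-like tree (200 files is an arbitrary choice).
for i in $(seq 1 200); do
    sub="$tdir/clients/client0/d$((i % 10))"
    mkdir -p "$sub"
    echo data > "$sub/f$i"
done

# Time only the removal step.
start=$(date +%s)
rm -rf "$tdir"
end=$(date +%s)
elapsed=$((end - start))

echo "rm -rf took ${elapsed} seconds"
```

Comparing that number against the suite timeout on a tree the size dbench leaves behind would show whether cleanup alone can account for the hang.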
The recent failures have a lot of the following errors, though I'm not sure if these are new or not: