[LU-4256] lustre-rsync-test test_2b: Failure in replication; differences found | Created: 15/Nov/13 | Updated: 04/Sep/18 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.6.0, Lustre 2.8.0, Lustre 2.12.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | always_except |
| Environment: |
server: lustre-master build #1752, RHEL6, ldiskfs. Also seen in review-dne, where both server and client are RHEL6 ldiskfs. |
| Severity: | 3 |
| Rank (Obsolete): | 11613 |
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>.

This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/45f323be-49e3-11e3-8efa-52540035b04c.

The sub-test test_2b failed with the following error:

The test log shows:

lustre_rsync took 22 seconds
Changelog records consumed: 926
Only in /mnt/lustre/d0.lustre-rsync-test/d2/clients/client0/~dmtmp/WORDPRO: BENCHS1A.PRN
lustre-rsync-test test_2b: @@@@@@ FAIL: Failure in replication; differences found. |
| Comments |
| Comment by nasf (Inactive) [ 06/Jan/14 ] |
|
I found the same failure in non-interoperability mode. |
| Comment by Sarah Liu [ 11/Feb/14 ] |
|
Also seen in lustre-master build #1876: https://maloo.whamcloud.com/test_sets/7edd3618-90d0-11e3-91ee-52540035b04c |
| Comment by Andreas Dilger [ 18/Mar/14 ] |
|
This is related to, but looks different than, |
| Comment by nasf (Inactive) [ 25/Apr/14 ] |
|
Another failure instance: https://maloo.whamcloud.com/test_sets/24f9ee20-cc4e-11e3-bda1-52540035b04c |
| Comment by Bob Glossman (Inactive) [ 19/Nov/14 ] |
|
I think this is another: It reports as a TIMEOUT, not a FAIL, but the test log says:

Only in /mnt/lustre/d2b.lustre-rsync-test/clients/client1/~dmtmp/PARADOX: __QB4.MB |
| Comment by James Nunez (Inactive) [ 21/Jul/15 ] |
|
A few more recent occurrences (non-interop, review-dne-part-1): |
| Comment by Saurabh Tandan (Inactive) [ 10/Feb/16 ] |
|
Another instance found for interop tag 2.7.66 - 2.5.5 Server/EL6.7 Client, build #3316 |
| Comment by Niu Yawei (Inactive) [ 21/Sep/16 ] |
|
+1 on master review: https://testing.hpdd.intel.com/test_sets/3c38bf1a-7f4f-11e6-8a8c-5254006e85c2 |
| Comment by Emoly Liu [ 28/Apr/18 ] |
|
+1 on master: |
| Comment by Hongchao Zhang [ 29/May/18 ] |
|
+1 on master |
| Comment by Mikhail Pershin [ 26/Jul/18 ] |
|
on master: |
| Comment by Gerrit Updater [ 15/Aug/18 ] |
|
John L. Hammond (jhammond@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33006 |
| Comment by Andreas Dilger [ 15/Aug/18 ] |
|
This failed about 11x per week for the past 4 weeks. I suspect the test timeout is because the cleanup_src_tgt step is taking a long time to do rm -rf $DIR/$tdir after running dbench, though that doesn't absolve the original error. At least there aren't any Lustre errors in the console logs for any of the nodes.

The recent failures have a lot of the following errors, though I'm not sure if these are new or not:

Error replicating xattr for /tmp/target/d2b.lustre-rsync-test/clients/client0/~dmtmp/WORD/TIPS.DOC: 2
Error replicating xattr for /tmp/target/d2b.lustre-rsync-test/clients/client0/~dmtmp/WORD/TIPS.DOC: 2
Error replicating xattr for /tmp/target/d2b.lustre-rsync-test/clients/client0/~dmtmp/WORD/TIPS.DOC: 2 |
| Comment by Gerrit Updater [ 23/Aug/18 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33006/ |
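
For triaging further occurrences, the two log patterns quoted in this ticket (the "Only in ..." difference lines and the "Error replicating xattr for ..." lines) can be pulled out of a saved test log with a small script. The following is a minimal sketch and not part of the Lustre test suite: the function name summarize_log and the sample text are hypothetical, and the line formats are assumed to match the excerpts quoted above. The trailing number on the xattr lines is kept as an opaque integer, since the ticket does not say what it means.

```python
import os
import re
from collections import Counter

# Line formats assumed from the log excerpts quoted in this ticket.
ONLY_IN = re.compile(r"^Only in (?P<dir>.+?): (?P<name>.+)$")
XATTR_ERR = re.compile(
    r"^Error replicating xattr for (?P<path>.+): (?P<code>-?\d+)$"
)


def summarize_log(text):
    """Return (unreplicated_paths, xattr_error_counts) for a test log.

    unreplicated_paths: full paths built from "Only in DIR: NAME" lines.
    xattr_error_counts: Counter keyed by (path, code), where code is the
    trailing integer, kept as-is (its meaning is not stated in the ticket).
    """
    unreplicated = []
    xattr_errors = Counter()
    for raw in text.splitlines():
        line = raw.strip()
        m = ONLY_IN.match(line)
        if m:
            unreplicated.append(os.path.join(m.group("dir"), m.group("name")))
            continue
        m = XATTR_ERR.match(line)
        if m:
            xattr_errors[(m.group("path"), int(m.group("code")))] += 1
    return unreplicated, xattr_errors


if __name__ == "__main__":
    # Hypothetical sample assembled from lines quoted in this ticket.
    sample = "\n".join([
        "lustre_rsync took 22 seconds",
        "Changelog records consumed: 926",
        "Only in /mnt/lustre/d0.lustre-rsync-test/d2/clients/client0/~dmtmp/WORDPRO: BENCHS1A.PRN",
        "Error replicating xattr for /tmp/target/d2b.lustre-rsync-test/clients/client0/~dmtmp/WORD/TIPS.DOC: 2",
        "Error replicating xattr for /tmp/target/d2b.lustre-rsync-test/clients/client0/~dmtmp/WORD/TIPS.DOC: 2",
    ])
    files, errors = summarize_log(sample)
    for path in files:
        print("not replicated:", path)
    for (path, code), count in errors.items():
        print(f"xattr error {code} x{count}: {path}")
```

Grouping the xattr errors by path and code makes it easier to see whether a run hit many distinct files or the same file repeatedly, which is the distinction the 15/Aug/18 comment leaves open.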