[LU-6233] recovery-small test_10d failed with 'file contents differ' Created: 11/Feb/15 Updated: 20/Jan/17 Resolved: 20/Jan/17 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | James Nunez (Inactive) | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Environment: | OpenSFS Cluster with two MDSs each with one MDT, three OSSs each with two OSTs and three clients running lustre-master tag 2.6.93 build 2835 |
| Attachments: |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 17460 |
| Description |
|
recovery-small test 10d failed with error message 'file contents differ'. Results and logs are at https://testing.hpdd.intel.com/test_sets/48de3eb8-ade9-11e4-a0b6-5254006e85c2 .

From the client test log, the test output is as expected until:

...
ldlm.namespaces.scratch-OST0005-osc-ffff8807dc5d1000.early_lock_cancel=1
ldlm.namespaces.scratch-OST0005-osc-ffff88080bd5ac00.early_lock_cancel=1
Connected clients: c13 c12 c11 c13
cmp: /lustre/scratch/f10d.recovery-small: Cannot send after transport endpoint shutdown
recovery-small test_10d: @@@@@@ FAIL: file contents differ |
| Comments |
| Comment by Andreas Dilger [ 11/Feb/15 ] |
|
This test was added in http://review.whamcloud.com/11752 " |
| Comment by James Nunez (Inactive) [ 12/Feb/15 ] |
|
I've reproduced this issue with lustre-master tag 2.6.94 and captured logs with full debug from the two MDSs (test10d_mds01_log.txt and test10d_mds02_log.txt) and from the client running recovery-small (test10d_client_log.txt), all attached here. I added a cat of both files for when this error is hit. You can see below that I can't read /lustre/scratch/f10d.recovery-small ($DIR/$tfile); I get a "Cannot send after transport endpoint shutdown" error.

...
Connected clients: c13 c13 c12 c11
cmp: /lustre/scratch/f10d.recovery-small: Cannot send after transport endpoint shutdown
cat /lustre/scratch/f10d.recovery-small:
cat: /lustre/scratch/f10d.recovery-small: Cannot send after transport endpoint shutdown
end /lustre/scratch/f10d.recovery-small
cat /lustre/scratch2/f10d.recovery-small:
, worldend /lustre/scratch2/f10d.recovery-small
recovery-small test_10d: @@@@@@ FAIL: file contents differ

I can reproduce this error in about one out of ten runs of recovery-small. |
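For anyone trying to reproduce this, the extra diagnostic described above can be sketched roughly as follows. This is not the actual recovery-small code; `dump_file` and `compare_or_dump` are hypothetical helper names, and the "cat <path>:" / "end <path>" markers are only modelled on the log output above.

```shell
#!/bin/bash
# Hypothetical sketch, NOT the real test_10d code: compare two copies of a
# file and, on mismatch, dump each one between "cat <path>:" and "end <path>".

dump_file() {
    local f=$1
    echo "cat $f:"
    # Merge stderr into stdout so a read error (for example
    # "Cannot send after transport endpoint shutdown") lands in the log.
    cat "$f" 2>&1
    echo "end $f"
}

compare_or_dump() {
    local a=$1 b=$2
    if cmp "$a" "$b"; then
        return 0
    fi
    dump_file "$a"
    dump_file "$b"
    echo "FAIL: file contents differ"
    return 1
}
```

Called as `compare_or_dump "$DIR/$tfile" "$DIR2/$tfile"`, where `$DIR2` is assumed to be the second mount point (/lustre/scratch2 in the log above). If a file has no trailing newline, the "end" marker is printed directly after its contents, which would explain the ", worldend" line.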
| Comment by Andreas Dilger [ 20/Jan/17 ] |
|
I did a check and recovery-small 10d has passed about 250 times in a row on master. |