[LU-10347] sanity-hsm test_252: archive request fails rather than canceling out
Created: 07/Dec/17 Updated: 28/Mar/18 Resolved: 28/Mar/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.11.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for John Hammond <john.hammond@intel.com>.
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/42d77f44-db21-11e7-a066-52540065bddc.
The sub-test test_252 failed with the following error:
request on 0x200000405:0x133:0x0 is not CANCELED on mds1
Info required for matching: sanity-hsm 252 |
| Comments |
| Comment by John Hammond [ 07/Dec/17 ] |
|
The copytool (CT) calls ct_begin() before opening the file to be archived, so there is a small race in this test:
$LFS hsm_archive --archive $HSM_ARCHIVE_NUMBER $f
wait_request_state $fid ARCHIVE STARTED
rm -f $f
If rm removes the file after the request reaches STARTED but before the CT has opened it, the CT cannot open the file, which causes the archive request to fail rather than be canceled. |
| Comment by Gerrit Updater [ 07/Dec/17 ] |
|
John L. Hammond (john.hammond@intel.com) uploaded a new patch: https://review.whamcloud.com/30434 |
| Comment by Quentin Bouget [ 07/Dec/17 ] |
|
No, adding a delay is not reliable enough. The copytool just needs to be suspended until the request times out (although in that case you will hit |
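The idea of suspending a process so that it cannot act on a request can be sketched in plain shell. This is only an illustration of the mechanism, not the patch under review: it assumes suspension is done with SIGSTOP/SIGCONT, and it uses a background `sleep` as a hypothetical stand-in for the copytool. The variable names are illustrative and do not come from sanity-hsm.

```shell
# Hypothetical stand-in for the copytool: a background worker that would
# otherwise be free to act on a request as soon as it is scheduled.
sleep 30 &
worker_pid=$!

# Suspend the worker. While stopped it makes no progress, so it cannot
# race with whatever the test does next (e.g. removing the file).
kill -STOP "$worker_pid"

# Poll briefly until ps reports the stopped state ("T" on Linux and BSD).
state=""
for _ in 1 2 3 4 5 6 7 8 9 10; do
    state=$(ps -o stat= -p "$worker_pid" | tr -d ' ' | cut -c1)
    [ "$state" = "T" ] && break
    sleep 0.1
done
echo "worker state while suspended: $state"

# ... a test would perform its racy step and its checks here ...

# Resume the worker, then terminate it and reap it.
kill -CONT "$worker_pid"
kill "$worker_pid"
wait "$worker_pid" 2>/dev/null || true
```

Unlike a fixed delay, a stopped process cannot advance at all until it is explicitly resumed, so the outcome does not depend on timing.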
| Comment by Bob Glossman (Inactive) [ 08/Dec/17 ] |
|
another on master: |
| Comment by Gerrit Updater [ 12/Dec/17 ] |
|
Quentin Bouget (quentin.bouget@cea.fr) uploaded a new patch: https://review.whamcloud.com/30492 |
| Comment by Gerrit Updater [ 09/Feb/18 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/30492/ |
| Comment by Joseph Gmitter (Inactive) [ 28/Mar/18 ] |
|
The fix (suspending the copytool) has landed for 2.11.0 |