[LU-7105] sanityn test_28 fails with 'error() without useful message, please fix' Created: 04/Sep/15 Updated: 18/Feb/22 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | James Nunez (Inactive) | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | always_except, easy, tests | ||
| Environment: | autotest |
| Severity: | 3 |
| Bugzilla ID: | 9977 |
| Description |
|
sanityn test 28 was recently removed from the ALWAYS_EXCEPT list by accident and is still failing. There is no real error message, but the output from the test on failure is 'error() without useful message, please fix'.

Recently, there are many examples of this test failing and, thus, many logs of the failures. Here are just a couple:

From the test log output, it’s clear that this test needs to be updated; newdev was removed as an option to lctl many years ago:

== sanityn test 28: read/write/truncate file with lost stripes == 08:31:03 (1441355463)
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 0.0383377 s, 54.7 MB/s
No such command, type help
error: setup: Operation already in progress
error: destroy: invalid objid '12745:0'
destroy OST object <objid> [num [verbose]]
usage: destroy <num> objects, starting at objid <objid>
run <command> after connecting to device <devno>
--device <devno> <command [args ...]>

Until we fix the obvious issues, we don’t really know if the original bug/reason for putting test 28 on the ALWAYS_EXCEPT list is still valid. In sanityn, the reason given for putting this test on the ALWAYS_EXCEPT list is bz=9977. |
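For readers unfamiliar with why losing one OST object affects only part of the file, the following is a minimal sketch of round-robin (RAID-0 style) offset mapping. The stripe size of 1 MiB and stripe count of 2 are assumptions chosen to match the 2097152 bytes written by dd in the log above; the ticket does not state the actual layout, and `locate` is a hypothetical helper, not part of the test suite.

```python
# Toy round-robin mapping from a file offset to
# (stripe index, offset inside that stripe's object).
# ASSUMPTION: stripe_size=1 MiB and stripe_count=2; the ticket does not state the layout.

def locate(offset, stripe_size=1 << 20, stripe_count=2):
    """Return (stripe_index, object_offset) for a byte offset in a striped file."""
    chunk = offset // stripe_size          # which stripe-sized chunk of the file
    stripe_index = chunk % stripe_count    # chunks are dealt round-robin to stripes
    object_offset = (chunk // stripe_count) * stripe_size + offset % stripe_size
    return stripe_index, object_offset

total = 2097152          # bytes written by dd in the log
record = total // 2      # dd reported 2 records
for i in range(2):
    # Each record starts on a different stripe under the assumed layout.
    print(i, locate(i * record))
```

Under these assumptions each of the two dd records lands on a different object, so deleting one object should break access to exactly one half of the file.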
| Comments |
| Comment by Andreas Dilger [ 01/Oct/15 ] |
|
It is possible to use fail_loc added for LFSCK to create files that are missing stripes. That would be a lot less heavyweight than configuring the echo_client to delete one object. |
| Comment by Mikhail Pershin [ 18/Feb/22 ] |
|
While working on unrelated test fixes I tried to reanimate test_28 by deleting an OST object with debugfs, but the test is still failing. In general, the idea of the test is that reading from a missing stripe should return an error, but the stripe can be recreated by writing to it. The test name also mentions truncate, but the test does not actually truncate anything.

Using debugfs, I remove stripe #2 of the file and then get the following:

# read from stripe #1, successful
1048576 bytes (1,0 MB) copied, 0,00574064 s, 183 MB/s
# read from stripe #2 failed as expected
dd: cannot fstat '/mnt/lustre2/f28.sanityn': No such file or directory
# write to both stripes again fails also with ENOENT
dd: failed to open '/mnt/lustre/f28.sanityn': No such file or directory
sanityn test_28: @@@@@@ FAIL: re-creating write failed

I am not sure how this is really supposed to work. Is it really an error that the write failed, or is it not supposed to work and the test is simply obsolete?

The patch for the test is attached to the ticket. |
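The open question in the comment above can be stated as a choice between two policies for writing to a lost stripe. The sketch below is a toy in-memory model, not Lustre code: `ToyStripedFile`, `run_test_28`, and the `recreate_on_write` switch are all hypothetical names introduced for illustration. It also simplifies the observed behaviour, where the failure occurred at open rather than at write.

```python
import errno

class ToyStripedFile:
    """In-memory stand-in for a two-stripe file; NOT Lustre code."""

    def __init__(self, stripe_count=2, stripe_size=1 << 20):
        self.stripe_size = stripe_size
        # One bytearray per stripe object; None marks a lost object.
        self.objects = [bytearray(stripe_size) for _ in range(stripe_count)]

    def lose_stripe(self, index):
        self.objects[index] = None

    def read(self, index):
        if self.objects[index] is None:
            raise OSError(errno.ENOENT, "stripe object is missing")
        return bytes(self.objects[index])

    def write(self, index, data, recreate_on_write):
        if self.objects[index] is None:
            if not recreate_on_write:
                raise OSError(errno.ENOENT, "stripe object is missing")
            self.objects[index] = bytearray(self.stripe_size)  # recreate the object
        self.objects[index][:len(data)] = data

def run_test_28(recreate_on_write):
    """Mirror the steps described in the comment under the chosen policy."""
    f = ToyStripedFile()
    f.lose_stripe(1)                 # second stripe ("stripe #2" in the comment)
    f.read(0)                        # intact stripe: must succeed
    try:
        f.read(1)
        return "FAIL: read of lost stripe succeeded"
    except OSError as e:
        if e.errno != errno.ENOENT:
            return "FAIL: unexpected errno"
    try:
        for i in range(2):           # write to both stripes again
            f.write(i, b"x" * f.stripe_size, recreate_on_write)
    except OSError:
        return "FAIL: re-creating write failed"
    f.read(1)                        # raises if the stripe was not recreated
    return "PASS"

print(run_test_28(recreate_on_write=True))
print(run_test_28(recreate_on_write=False))
```

In this model the result reported in the comment corresponds to `recreate_on_write=False`, while the test as written expects the `True` behaviour; which of the two is the intended behaviour is exactly what the comment leaves open.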