[LU-7603] sanity test_69: read succeeded, expect -ENOENT Created: 23/Dec/15 Updated: 09/Jan/17 Resolved: 09/Jan/17 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | Yang Sheng |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for Bob Glossman <bob.glossman@intel.com> This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/8c62edae-a975-11e5-b85c-5254006e85c2. The sub-test test_69 failed with the following error: read succeeded, expect -ENOENT only a small number of these found on a maloo search Info required for matching: sanity 69 |
| Comments |
| Comment by Bob Glossman (Inactive) [ 29/Dec/15 ] |
|
another on master, sles12 client: |
| Comment by Bob Glossman (Inactive) [ 12/Jan/16 ] |
|
another on master with sles12: starting to wonder if this is a 100% fail on sles12 |
| Comment by Peter Jones [ 12/Jan/16 ] |
|
Yang Sheng Could you please look into this one? Thanks Peter |
| Comment by James Nunez (Inactive) [ 13/Jan/16 ] |
|
Another failure on master: |
| Comment by Yang Sheng [ 14/Jan/16 ] |
|
This is interesting. Looks like range_lock was invoked: 170574:00000001:00000001:1.0:1452583769.626693:0:9171:0:(range_lock.c:200:range_lock()) Process entered 170581:00000001:00000001:1.0:1452583769.626703:0:9171:0:(range_lock.c:237:range_lock()) Process leaving (rc=0 : 0 : 0) But ll_direct_IO was not. 00000080:00000002:1.0:1452583769.626962:0:9171:0:(vvp_io.c:816:vvp_io_read_start()) header@ffff88005f1c3768[0x0, 2049, [0x200000404:0xeab8:0x0] hash] 00000080:00000002:1.0:1452583769.626964:0:9171:0:(vvp_io.c:816:vvp_io_read_start()) Read ino 144115205322894008, 4194304 bytes, offset 4194304, size 8388608 00020000:00000001:1.0:1452583769.628779:0:9171:0:(lov_io.c:531:lov_io_start()) Process entered 00020000:00000001:1.0:1452583769.628781:0:9171:0:(lov_io.c:509:lov_io_call()) Process entered 00000020:00000001:1.0:1452583769.628782:0:9171:0:(cl_io.c:571:cl_io_start()) Process entered It is very strange. I'll produce a debug patch to see what happened. |
| Comment by Gerrit Updater [ 14/Jan/16 ] |
|
Yang Sheng (yang.sheng@intel.com) uploaded a new patch: http://review.whamcloud.com/18008 |
| Comment by Yang Sheng [ 05/Feb/16 ] |
|
I haven't got a instance from autotest system since debug patch. Can other one hit it? |
| Comment by Yang Sheng [ 17/Feb/16 ] |
|
There some interesting log catch with debug patch while test_69 passed. 02:21:29:[ 5165.102167] Lustre: DEBUG MARKER: == sanity test 69: verify oa2dentry return -ENOENT doesn't LBUG ====================================== 02:20:46 (1455445246) 02:21:29:[ 5166.932255] LustreError: 5490:0:(vvp_io.c:831:vvp_io_read_start()) LU7603: type: 0 02:21:29:[ 5166.937738] LustreError: 5490:0:(vvp_io.c:831:vvp_io_read_start()) Skipped 102 previous similar messages 02:21:29:[ 5166.941704] LustreError: 5490:0:(vvp_io.c:838:vvp_io_read_start()) LU7603: file: f69.sanity, type: 0, flags=0, ns=1 --------------------------------------------------------------------------------------------------------------------------------------------------------^^^^ 02:21:29:[ 5166.944069] LustreError: 5490:0:(vvp_io.c:838:vvp_io_read_start()) Skipped 102 previous similar messages 02:21:29:[ 5166.944071] LustreError: 5490:0:(vvp_io.c:843:vvp_io_read_start()) LU7603: scnt=4194304, nr_segs=1, rv=0, rs = 0 02:21:29:[ 5166.944071] LustreError: 5490:0:(vvp_io.c:843:vvp_io_read_start()) Skipped 102 previous similar messages 02:21:29:[ 5167.848196] Lustre: DEBUG MARKER: lctl set_param -n fail_loc=0 fail_val=0 2>/dev/null || true 02:21:29:[ 5168.213028] Lustre: DEBUG MARKER: rc=0; 02:21:29:[ 5168.213028] val=$(/usr/sbin/lctl get_param -n catastrophe 2>&1); 02:21:29:[ 5168.213028] if [[ $? -eq 0 && $val -ne 0 ]]; then 02:21:29:[ 5168.213028] echo $(hostname -s): $val; 02:21:29:[ 5168.213028] rc=$val; 02:21:29:[ 5168.213028] fi; 02:21:29:[ 5168.213028] exit $rc 02:21:29:[ 5168.642126] Lustre: DEBUG MARKER: /usr/sbin/lctl mark SKIP: sanity test_71 skipping SLOW test 71 02:21:29:[ 5168.782172] Lustre: DEBUG MARKER: SKIP: sanity test_71 skipping SLOW test 71 02:21:29:[ 5169.109483] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == sanity test 72a: Test that remove suid works properly \(bug5695\) =================================== 02:20:50 \(1455445250\) As mark line reveal the flags = 0. It means kiocb->ki_ctx == NULL. That say direct IO flag is disappeared. It is abnormal though test case is passed. I have update debug patch trying to catch detail information for further work. |
| Comment by Yang Sheng [ 09/Jan/17 ] |
|
Please feel free to reopen it. |