[LU-394] LND failure caused by discontiguous KIOV pages Created: 06/Jun/11 Updated: 26/Jul/11 Resolved: 26/Jul/11 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.0 |
| Fix Version/s: | Lustre 2.1.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Wally Wang (Inactive) | Assignee: | Jinshan Xiong (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Attachments: |
|
| Severity: | 2 |
| Rank (Obsolete): | 4943 |
| Description |
|
Cray's gnilnd is running into a hole in the kiov list in Lustre 2.1:
LustreError: 17837:0:(gnilnd_cb.c:594:kgnilnd_setup_phys_buffer()) Can't make payload
It used to be that only the first and last page in an IOV were allowed
The problem does not occur with a 1.8 client and a 2.1 server. It can be reproduced with "fsx-linux -WR -dn -N 10000 junkfile". osc_brw() is never called, and the unfragmented pages logic is not exercised in 2.1. |
| Comments |
| Comment by Oleg Drokin [ 06/Jun/11 ] |
|
Partial pages in the middle of a transfer are strange indeed and should not be happening. The diagnostic printed is somewhat strange, too. For page 17 with offset 0 there is no way nob could be 6350: that is bigger than the 4096 that could fit into a page, and there is an assertion in osc that enforces that. Can you obtain a fuller list of what sort of IOV was passed in? I'll try to run fsx with some extra debug locally soon, but I think existing LNDs would fail very similarly if such partial pages were ever sent through them. |
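The rule discussed in this comment (no partial pages in the middle of a transfer, and no fragment larger than a page) can be sketched as a small check. This is only an illustration: `struct sketch_kiov`, `sketch_find_hole()` and `SKETCH_PAGE_SIZE` are hypothetical names, not the real LNet or gnilnd definitions, and the 4096-byte page size is an assumption taken from the figure quoted above.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */

/* Hypothetical stand-in for one kiov entry; not the real LNet structure. */
struct sketch_kiov {
    unsigned int offset; /* byte offset of the fragment within its page */
    unsigned int len;    /* number of bytes in the fragment */
};

/*
 * Return the index of the first entry that breaks contiguity, or -1 if
 * the whole list can be treated as one contiguous payload.
 */
static int sketch_find_hole(const struct sketch_kiov *kiov, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* A fragment must be non-empty and must fit inside its page. */
        if (kiov[i].len == 0 ||
            kiov[i].offset >= SKETCH_PAGE_SIZE ||
            kiov[i].len > SKETCH_PAGE_SIZE - kiov[i].offset)
            return (int)i;
        /* Every page except the first must start at offset 0. */
        if (i > 0 && kiov[i].offset != 0)
            return (int)i;
        /* Every page except the last must run to the page boundary. */
        if (i + 1 < n &&
            kiov[i].offset + kiov[i].len != SKETCH_PAGE_SIZE)
            return (int)i;
    }
    return -1;
}
```

Under these assumptions, a list whose middle entry is shorter than a page is reported at that entry, which is the kind of hole the LND refuses to send.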
| Comment by Wally Wang (Inactive) [ 07/Jun/11 ] |
|
Attached is the client debug log. I added a CDEBUG at the end of ptlrpc_add_bulk_page() <marked by ww -"> and also "turned on" the CDEBUG <put page..> in osc_build_req() for more information. Ignore the osc_brw comment, where I named what I thought used to be the place for unfragmenting the list. |
| Comment by Oleg Drokin [ 07/Jun/11 ] |
|
OK, I reviewed the log and I do see a problem of sorts. Basically we have a setup like this: Now since there is page 49 already, that means either truncate failed to chop it off or page 47 was not extended like it should have been. Digging a bit more into the log, I see there is a "seek" between the page 47 write and the page 49 write, so it seems to be a case of ap_make_ready not extending the (now middle) page 47 to the page boundary. |
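To make the "page 47 was not extended" case concrete, here is a minimal sketch that assumes the byte count for a cached page is derived from the file size currently known to the client. `sketch_refresh_count()` is a hypothetical helper written for this illustration, not the real osc callback, and the 4096-byte page size is assumed.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */

/*
 * Hypothetical helper: how many bytes of page `index` lie below the
 * file size `known_size` that the client currently believes in.
 */
static unsigned int sketch_refresh_count(uint64_t known_size, uint64_t index)
{
    uint64_t start = index * SKETCH_PAGE_SIZE;

    if (known_size <= start)
        return 0;                    /* page is entirely past the known size */
    if (known_size - start >= SKETCH_PAGE_SIZE)
        return SKETCH_PAGE_SIZE;     /* page is fully covered */
    return (unsigned int)(known_size - start); /* partial page */
}
```

With a known size that ends 1000 bytes into page 47, the helper returns 1000 for page 47. If page 49 is then sent in the same transfer without the known size having been extended, page 47 is a short page in the middle of the list.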
| Comment by Oleg Drokin [ 07/Jun/11 ] |
|
Jinshan, please see my previous comment. |
| Comment by Jinshan Xiong (Inactive) [ 07/Jun/11 ] |
|
When it was writing page #49, it ran out of quota, so a sync write was issued, which triggered the cached pages to be written as well. Since page #49 is a sync write, it won't update the file size until the write succeeds, so ap_refresh_count() for page #47 didn't return the full page size. This is all right. The problem is that we cannot combine cached partial pages with a sync page in one RPC. For example, as seen in this bug, we shouldn't issue pages #47 and #49 in the same RPC. Otherwise, we don't know the exact kms in case the write of page #49 fails.
I realize there is one more issue with quota on the OST side, in filter_commitrw_write():
        if ((flags & OBD_BRW_NOQUOTA) ||
            (flags & (OBD_BRW_FROM_GRANT | OBD_BRW_SYNC)) ==
            OBD_BRW_FROM_GRANT)
                iobuf->dr_ignore_quota = 1;
It checks whether the page is written in sync; otherwise ignore_quota is set. This means we cannot mix SYNC and ASYNC pages. I've got another idea to fix this issue: check the OBD_BRW_SYNC flag in osc_send_oap_rpc() and do not mix sync and async pages. |
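The idea of not mixing sync and async pages can be sketched as follows. This is not the patch under review: the `SKETCH_*` flag values are illustrative placeholders rather than the real `OBD_BRW_*` constants, and `sketch_batch_len()` is a hypothetical helper that only shows one way the rule could be applied when choosing pages for an RPC.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values only; the real OBD_BRW_* constants are defined
 * in the Lustre headers. */
#define SKETCH_BRW_SYNC       0x08u
#define SKETCH_BRW_FROM_GRANT 0x20u
#define SKETCH_BRW_NOQUOTA    0x100u

/* Same shape as the condition quoted from filter_commitrw_write(). */
static bool sketch_ignore_quota(unsigned int flags)
{
    return (flags & SKETCH_BRW_NOQUOTA) ||
           (flags & (SKETCH_BRW_FROM_GRANT | SKETCH_BRW_SYNC)) ==
           SKETCH_BRW_FROM_GRANT;
}

/*
 * Hypothetical batching rule: return how many leading pages can share
 * one RPC, stopping at the first page whose SYNC bit differs from the
 * SYNC bit of the first page.
 */
static size_t sketch_batch_len(const unsigned int *flags, size_t n)
{
    size_t i;

    if (n == 0)
        return 0;
    for (i = 1; i < n; i++) {
        if ((flags[i] ^ flags[0]) & SKETCH_BRW_SYNC)
            break;
    }
    return i;
}
```

In this sketch a page written from grant ignores quota only when it is not a sync write, so a batch that mixes both kinds would carry pages with different quota treatment. Cutting the batch at the first change of the SYNC bit avoids that.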
| Comment by Jinshan Xiong (Inactive) [ 07/Jun/11 ] |
|
Hi Wally, Can you please try patch: http://review.whamcloud.com/911 Thanks. |
| Comment by Wally Wang (Inactive) [ 07/Jun/11 ] |
|
Hit assertion soon after client mount: [2011-06-07 19:01:09][c0-0c1s7n3]LustreError: 3606:0:(osc_request.c:2586:osc_send_oap_rpc()) ASSERTION(ergo(grant, oap->oap_cmd == OBD_BRW_WRITE)) failed |
| Comment by Jinshan Xiong (Inactive) [ 08/Jun/11 ] |
|
My fault. I pushed another one to http://review.whamcloud.com/911, please try it again. |
| Comment by Wally Wang (Inactive) [ 09/Jun/11 ] |
|
It works fine now with the patch. |
| Comment by Peter Jones [ 22/Jun/11 ] |
|
For the record, the patch has been updated since Cray confirmed that it worked. Cray will re-test to confirm that the latest version still works. |
| Comment by Jinshan Xiong (Inactive) [ 22/Jun/11 ] |
|
Hi Wally, Can you please repull the patch and do the test again? Thanks, |
| Comment by Wally Wang (Inactive) [ 23/Jun/11 ] |
|
Will test patch set 6 tonight. I tested patch set 5 and got an LBUG: c0-0c1s6n3 LustreError: 3680:0:(osc_request.c:2593:osc_send_oap_rpc()) Uncontiguous: max_off:278528off:0, cnt:1576 |
| Comment by Jinshan Xiong (Inactive) [ 23/Jun/11 ] |
|
Hi Wally, please try patch set 7, thanks. |
| Comment by Wally Wang (Inactive) [ 24/Jun/11 ] |
|
Patch set 7 works. Thanks! |
| Comment by Peter Jones [ 11/Jul/11 ] |
|
Dropping priority to remove from the 2.1 blockers list. It is being tracked as a blocker under LU-437 |
| Comment by Wally Wang (Inactive) [ 15/Jul/11 ] |
|
I just tried patchset 14 and the problem is reoccurring: 2011-07-15T17:52:24.194477-05:00 c0-0c1s5n2 LNetError: 22816:0:(gnilnd_cb.c:594:kgnilnd_setup_phys_buffer()) Can't make payload contiguous in I/O VM:page 17, offset 0, nob 6350, kiov_offset 0 kiov_len 2254 |
| Comment by Jinshan Xiong (Inactive) [ 16/Jul/11 ] |
|
Hi Wally, Patch set 14 is supposed to catch this kind of error in the osc layer, but unfortunately there was a defect in the patch. Can you please try patch set 15 and collect a log for me when you hit the bug again? Thanks, |
| Comment by Wally Wang (Inactive) [ 18/Jul/11 ] |
|
The debug log is attached; the OST console shows: 2011-07-18T15:15:28.004617-05:00 c0-0c1s5n2 LNetError: 15810:0:(gnilnd_cb.c:594:kgnilnd_setup_phys_buffer()) Can't make payload contiguous in I/O VM:page 17, offset 0, nob 6350, kiov_offset 0 kiov_len 2254 |
| Comment by Jinshan Xiong (Inactive) [ 18/Jul/11 ] |
|
From the log, it looks like the last page (index 49) was still added to the request. Are you sure you're using the correct patch? Can you please run the test case sanity:219 to check if it works? I'm going to provide you a new patch with more debug info. |
| Comment by Wally Wang (Inactive) [ 19/Jul/11 ] |
|
Sorry, it actually works. I must have mis-labeled the image. After a total rebuild/install, it all works fine. |
| Comment by Build Master (Inactive) [ 26/Jul/11 ] |
|
Integrated in Oleg Drokin : 419016ac3e53e798453106ec04412a4843620916
|
| Comment by Peter Jones [ 26/Jul/11 ] |
|
Landed for 2.1 |
| Comment by Build Master (Inactive) [ 26/Jul/11 ] |
|
Integrated in Oleg Drokin : 0cdae78b8751fee4f9da67750ffda488d6ea2638
|