[LU-485] Remove CPT_TRANSIENT pages out of radix tree Created: 05/Jul/11 Updated: 08/Jul/11 Resolved: 08/Jul/11 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Jinshan Xiong (Inactive) | Assignee: | Jinshan Xiong (Inactive) |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Rank (Obsolete): | 9722 |
| Description |
|
In clio, we currently put both cache and transient pages into radix tree, this makes the code complex and ugly. We should remove transient pages out of radix tree. |
| Comments |
| Comment by Jinshan Xiong (Inactive) [ 05/Jul/11 ] |
|
patch is at http://review.whamcloud.com/1055. |
| Comment by Johann Lombardi (Inactive) [ 06/Jul/11 ] |
|
hm, it might conflict with the new grant approach in Sequoia. We consume grant for direct I/O so we might need to keep those pages in the radix tree ... need to think further |
| Comment by Niu Yawei (Inactive) [ 06/Jul/11 ] |
|
Hi, Johann I don't quite understand why dio need to consume grant? Storing those disposable 'transient' pages in the radix tree doesn't make sense to me, and it makes unaligned dio hard to be implemented (in |
| Comment by Johann Lombardi (Inactive) [ 06/Jul/11 ] |
|
> I don't quite understand why dio need to consume grant? To prevent DIOs from failing with ENOSPC while the client which sent the request still has available grants. > Storing those disposable 'transient' pages in the radix tree doesn't make sense to me I tend to agree. The problem is that the new RPC engine is supposed to use the radix tree to select pages to put together in a BRW, instead of the LRU list. > and it makes unaligned dio hard to be implemented (in I see, let me think about it ... |
| Comment by Jinshan Xiong (Inactive) [ 06/Jul/11 ] |
|
Actually, I tend to think DIO would be fine with radix tree grant. This is because: |
| Comment by Johann Lombardi (Inactive) [ 06/Jul/11 ] |
|
> Actually, I tend to think DIO would be fine with radix tree grant. This is because: Right. > 2. for DIO pages, they are submitted via cl_io_submit() function and should not have FROM_GRANT set. For DIO, grant space is consumed in osc_io_submit_page() via a call to osc_enter_cache_try(). > Do we really have to consume grant under this case? I don't think this is a strong requirement. > I mean application is able to see ENOSPC error in this case. Even we need to hold grant for DIO pages, Unlike b1_8, CLIO uses the same path to send RPCs for buffered & DIO writes (through osc_oap_to_pending()). The result is that a bulk write can have a mix of pages for direct I/O and pages from buffered writes. |
| Comment by Jinshan Xiong (Inactive) [ 06/Jul/11 ] |
|
>> Unlike b1_8, CLIO uses the same path to send RPCs for buffered & DIO writes (through osc_oap_to_pending()). The result is that a Niu is working on a patch where DIO pages are written independently. |
| Comment by Niu Yawei (Inactive) [ 07/Jul/11 ] |
|
I think it's possible that there are both 'transient' page and cached page at the same time for a given file offset (imagine there are cocurrent dio and buffered read), but we store the 'transient' page and the cached page in the same radix tree with same key(page offset), then I don't see how can we handle the cocurrent dio and buffered read? I think it has caused trouble for us (see |
| Comment by Johann Lombardi (Inactive) [ 07/Jul/11 ] |
|
> Niu is working on a patch where DIO pages are written independently. That's good news actually. As long as those pages are not submitted through osc_oap_to_pending(), it is fine with me. We can handle grants for DIO separately. > I think it's possible that there are both 'transient' page and cached page at the same time We cannot have a transient & a cached page at the same offset in the radix tree (at least, not with the current CLIO code) because ll_direct_rw_pages() looks up the index in the radix tree and uses the cache page for the transfer if there is one. But what you can have is two contiguous pages, one being transient and the other one cached, both bundled in the same bulk write. |
| Comment by Jinshan Xiong (Inactive) [ 07/Jul/11 ] |
|
to Niu: actually we cannot have transient and caching pages for the same offset at the same time, with current clio implementation or you're working on(where dio pages should be submitted separately). If a node is doing dio and cache write at the same time, the result is undefined, because it may make cache inconsistent, or stale data to be read. |
| Comment by Niu Yawei (Inactive) [ 08/Jul/11 ] |
|
Will fix it in |