
Lustre client slow performance on BG/P IONs: unaligned DIRECT_IO

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • None
    • Lustre 2.0.0
    • None
    • 3
    • 18,801
    • 9105

    Description

      Port the fix for bug 18801 to 2.1.


          Activity

            [LU-247] Lustre client slow performance on BG/P IONs: unaligned DIRECT_IO
            paf0186 Patrick Farrell added a comment - See LU-13805
            spitzcor Cory Spitz added a comment -

            If we follow Niu's suggestion to land change #980 and then follow it up, can it land now, or must we wait until after we branch for b2_5?


            adilger Andreas Dilger added a comment -

            I don't like the idea of adding new users of the obd_brw() APIs. These are pretty much obsolete on both the client and server, and we should be removing them entirely instead of adding new users.

            niu Niu Yawei (Inactive) added a comment -

            The patch in http://review.whamcloud.com/980 uses the obd_brw() API in osc_io_dio(), since that requires fewer code changes and is easier to land. However, the best way to implement osc_io_dio() would be to introduce a new API that naturally accepts (file offset, bytes, offset in first page, non-contiguous pages) and builds IO requests directly from this information (brw_page should not be used for direct IO, and the 'pshift' hack should be eliminated). osc_io_dio() would then call the new API to build IO requests and send them asynchronously.

            The proper implementation requires more code changes, so let's fix it in a follow-up patch.
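
The interface shape suggested in the comment above, taking (file offset, bytes, offset in first page, non-contiguous pages), can be illustrated with a small user-space sketch. This is not Lustre code: `struct dio_desc`, `struct dio_seg`, `dio_build_segs()` and the 4096-byte `DIO_PAGE_SIZE` are hypothetical names and values chosen only for illustration. It shows how such a description could be split into per-page segments without any 'pshift'-style adjustment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIO_PAGE_SIZE 4096u /* assumed page size for this sketch */

/* Hypothetical descriptor of one direct IO request (not a Lustre type). */
struct dio_desc {
    uint64_t file_off;   /* starting offset in the file */
    size_t   bytes;      /* total bytes to transfer */
    size_t   first_off;  /* offset of the data within the first page */
    void   **pages;      /* user pages, not necessarily contiguous */
    size_t   npages;     /* number of entries in pages[] */
};

/* One piece of the transfer, confined to a single page. */
struct dio_seg {
    void    *page;       /* page holding this piece */
    size_t   page_off;   /* where the piece starts inside the page */
    size_t   len;        /* length of the piece in bytes */
    uint64_t file_off;   /* file offset the piece maps to */
};

/*
 * Split the descriptor into per-page segments.
 * Returns the number of segments written to segs[], or -1 if the
 * descriptor is inconsistent or segs[] is too small.
 */
static int dio_build_segs(const struct dio_desc *d,
                          struct dio_seg *segs, size_t max_segs)
{
    size_t left, off, i;
    uint64_t foff;

    assert(d != NULL && segs != NULL);
    if (d->first_off >= DIO_PAGE_SIZE)
        return -1;

    left = d->bytes;
    off = d->first_off;
    foff = d->file_off;

    for (i = 0; i < d->npages && left > 0; i++) {
        size_t room = DIO_PAGE_SIZE - off;
        size_t len = left < room ? left : room;

        if (i >= max_segs)
            return -1;
        segs[i].page = d->pages[i];
        segs[i].page_off = off;
        segs[i].len = len;
        segs[i].file_off = foff;

        foff += len;
        left -= len;
        off = 0; /* only the first page can start mid-page */
    }
    /* Fail if the pages supplied do not cover the byte count. */
    return left == 0 ? (int)i : -1;
}
```

In this sketch only the first segment carries a non-zero in-page offset, so an unaligned request is described directly rather than by shifting page indices.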
            niu Niu Yawei (Inactive) added a comment - Patch is at http://review.whamcloud.com/980
            pjones Peter Jones added a comment -

            I think that we will defer this task for now

            green Oleg Drokin added a comment -

            Frankly, I am not even sure why we need radix trees or anything like that for directio or otherwise uncached pages.
            It's a purely one-shot job anyway.

            Lai, your description of patchless io partial page write in clio, where an unlocked page is marked as clean, sounds totally broken.
            Good thing there is no easy way to trigger this mode or we'd have another blocker on our hands.


            jay Jinshan Xiong (Inactive) added a comment -

            I realize we should give it a try, because this may not be a problem in clio. Clio has finer-grained lockless IO control: it can do lockless IO per OST object, instead of per file as in b18.

            jay Jinshan Xiong (Inactive) added a comment -

            Hi Niu,

            I composed a patch to bypass the cl_page cache for TRANSIENT pages at http://review.whamcloud.com/#change,495, just for reference.

            jay Jinshan Xiong (Inactive) added a comment -

            Hi Niu,

            Thanks for the points. Yes, this is a good chance for us to fix the directIO problem in one shot.

            I am still inclined to do this on top of the clio infrastructure. Also, the reason to have a radix tree in clio is for porting purposes: we've done a lot of work to decouple linux vfs/vm in both the MDT and client stacks.

            Here is my idea: let's fix the CPT_TRANSIENT page implementation to allow arbitrary buffers for this kind of page. However, such pages will no longer fit into the cl_page cache, which means stale data may be read if applications are doing regular IO and directIO on the same node. This may violate POSIX semantics a little bit, but I tend to think this is fine since the Linux kernel is doing the same thing.

            I'm going to compose a draft patch for this. It should be easy.
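
The stale-read concern raised in the comment above can be shown with a toy model. This is a user-space sketch under simplifying assumptions, not Lustre or kernel code: `struct toy_page`, `buffered_read()` and `direct_write_bypass()` are hypothetical names, and a single integer stands in for a page of data.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one page: backing store plus an optional cached copy. */
struct toy_page {
    int  stored;       /* value held by the backing store */
    int  cached;       /* value held in the client-side cache */
    bool cache_valid;  /* true once a buffered read has filled the cache */
};

/* Buffered read: served from the cache if valid, otherwise fills it. */
static int buffered_read(struct toy_page *p)
{
    if (!p->cache_valid) {
        p->cached = p->stored;
        p->cache_valid = true;
    }
    return p->cached;
}

/* Direct write that bypasses the cache: the cached copy is left untouched. */
static void direct_write_bypass(struct toy_page *p, int value)
{
    p->stored = value;
}
```

In this model, a buffered read followed by a bypassing direct write and a second buffered read returns the old value, because nothing tells the cache that the backing store has changed.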

            People

              niu Niu Yawei (Inactive)
              niu Niu Yawei (Inactive)