Details

    • Review task
    • Resolution: Won't Do
    • Minor
    • 17,845
    • 10115

    Description

      eeb, after further investigation of readdir+, I found some issues with my original lockless-mode readdir+. I have described them in the HLD; they are difficult to resolve. So I have written the HLD for a simplified lock-mode readdir+, and I also hope to implement a Lustre-specific directory traversal tool.

      Sir, please review the HLD when you have time and give your suggestions. Thanks!

      Happy Christmas~

      Attachments

        1. readdir+.doc
          68 kB
        2. readdir+.pdf
          123 kB

        Activity

          [LU-31] Please inspect the HLD for readdir+

          yong.fan nasf (Inactive) added a comment -

          The related documents have been inactive for a long time; the readdir+ feature needs more consideration.
          laisiyao Lai Siyao added a comment -

          Considering that a file may have multiple stripes (up to 160), async glimpse may not help much. Besides, if the system is WAN-based, async glimpse may not help at all.

          IMHO the current SOM design is too complicated because it tries to ensure that the client updates SOM in all cases (server and client recovery). If a client is allowed to miss a SOM update and the MDS updates it later itself (when the MDS finds SOM not set while handling a getattr from a client), SOM can be much simplified.
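
          A minimal sketch of the simplified SOM idea above, as a toy model rather than Lustre code. The names `ToyMDS`, `close_file`, `open_for_write`, and the `glimpse` callback are hypothetical; the only behaviour taken from the comment is that a missed SOM update is tolerated and the MDS fills the size in itself on the next getattr.

          ```python
          from dataclasses import dataclass
          from typing import Callable, Dict, Optional


          @dataclass
          class Inode:
              # Size cached on the metadata server; None means "SOM not set".
              som_size: Optional[int] = None


          class ToyMDS:
              """Toy metadata server (hypothetical, for illustration only)."""

              def __init__(self, glimpse: Callable[[str], int]) -> None:
                  self.inodes: Dict[str, Inode] = {}
                  self.glimpse = glimpse  # assumed helper: asks the OSTs for the size
                  self.glimpse_calls = 0

              def open_for_write(self, name: str) -> None:
                  # A writer may change the size, so the cached value is dropped.
                  self.inodes.setdefault(name, Inode()).som_size = None

              def close_file(self, name: str, size: Optional[int] = None) -> None:
                  # The client may report the size on close, or miss the update
                  # (size=None), for example after a recovery.
                  self.inodes.setdefault(name, Inode()).som_size = size

              def getattr(self, name: str) -> int:
                  inode = self.inodes.setdefault(name, Inode())
                  if inode.som_size is None:
                      # SOM not set: the MDS repairs it itself, once.
                      self.glimpse_calls += 1
                      inode.som_size = self.glimpse(name)
                  return inode.som_size
          ```

          In this model a second getattr on the same file is served from the cached size, so the cost of a missed update is one extra glimpse rather than a recovery protocol.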

          green Oleg Drokin added a comment -

          I sort of agree and disagree on this point from Lai:
          > size glimpse RPCs make up most of the RPCs in readdir+. Compared with aggregate
          > glimpse, SOM can avoid most of these RPCs. We had better evaluate the current progress
          > and remaining work of SOM; if there are no roadblocks, SOM should be a better option
          > than aggregate glimpse.

          SOM will definitely bring a big boost here, but it will take a lot of work to make SOM really work in all cases.

          As for glimpse aggregation, I sort of agree that it won't bring many benefits. But what will bring many benefits, almost at SOM level, is async "glimpse ahead", much like what we currently do with statahead. As long as we can get the data faster than the app processes it, of course.
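
          A rough sketch of the "glimpse ahead" idea, assuming a thread pool stands in for asynchronous glimpse RPCs. The function name `glimpse_ahead`, the `glimpse` callback, and the `window` parameter are hypothetical and do not correspond to the actual statahead code; the point is only that size requests for upcoming entries are issued before the application asks for them.

          ```python
          from collections import deque
          from concurrent.futures import ThreadPoolExecutor
          from typing import Callable, Iterable, Iterator, Tuple


          def glimpse_ahead(
              names: Iterable[str],
              glimpse: Callable[[str], int],
              window: int = 8,
          ) -> Iterator[Tuple[str, int]]:
              """Yield (name, size) in directory order, keeping up to `window`
              size requests in flight ahead of the consumer."""
              if window < 1:
                  raise ValueError("window must be at least 1")
              pending = deque()
              with ThreadPoolExecutor(max_workers=window) as pool:
                  for name in names:
                      # Issue the request early, before the consumer needs it.
                      pending.append((name, pool.submit(glimpse, name)))
                      if len(pending) >= window:
                          ready_name, future = pending.popleft()
                          yield ready_name, future.result()
                  # Drain the requests still in flight at the end of the directory.
                  while pending:
                      ready_name, future = pending.popleft()
                      yield ready_name, future.result()
          ```

          The benefit depends on the condition stated above: if the consumer processes entries faster than the sizes arrive, `future.result()` blocks and the pipeline degrades toward one synchronous round trip per file.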

          laisiyao Lai Siyao added a comment - - edited

          I have some concerns:

          • size glimpse RPCs make up most of the RPCs in readdir+. Compared with aggregate
            glimpse, SOM can avoid most of these RPCs. We had better evaluate the current progress
            and remaining work of SOM; if there are no roadblocks, SOM should be a better option
            than aggregate glimpse.
          • I prefer STL (Sub Tree Lock) to the new LIST lock, because STL has two
            advantages:
            • it can avoid access conflicts (both pseudo and real ones) because it's weak, while
              the LIST lock looks more like a strong STL.
            • STL can be used in the future for WBC.
          • there is no need to introduce a new READDIRPLUS RPC; we can extend the current
            MDS_READPAGE in a backward-compatible way: in the request to the MDS, the client
            specifies which attributes it wants along with the entry names. In this way readdir()
            will issue a normal MDS_READPAGE and store directory entries in the directory page
            cache, while the statahead thread issues MDS_READPAGE to fetch file attributes.
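
          The last point can be illustrated with a small model, assuming the extension is an optional attribute mask on the request. The names `Want`, `Entry`, and `readpage` are hypothetical and say nothing about the real MDS_READPAGE wire format; the sketch only shows that a request without the mask behaves as before, which is what makes the change backward compatible.

          ```python
          from dataclasses import dataclass
          from enum import IntFlag
          from typing import Dict, List, Optional


          class Want(IntFlag):
              # Hypothetical attribute mask carried in the request.
              NONE = 0
              MODE = 1
              SIZE = 2
              MTIME = 4


          @dataclass
          class Entry:
              name: str
              attrs: Optional[Dict[str, int]] = None  # None: names only


          def readpage(
              directory: Dict[str, Dict[str, int]], want: Want = Want.NONE
          ) -> List[Entry]:
              """Return directory entries, with attributes only if requested."""
              entries = []
              for name, attrs in directory.items():
                  if want == Want.NONE:
                      # Old-style request: entry names only, as readdir() expects.
                      entries.append(Entry(name))
                      continue
                  picked = {
                      flag.name.lower(): attrs[flag.name.lower()]
                      for flag in (Want.MODE, Want.SIZE, Want.MTIME)
                      if flag in want
                  }
                  entries.append(Entry(name, picked))
              return entries
          ```

          Here a plain `readpage(directory)` plays the role of the normal readdir() request, and `readpage(directory, Want.SIZE | Want.MTIME)` plays the role of the statahead thread asking for attributes.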

          yong.fan nasf (Inactive) added a comment -

          In fact, in the current master implementation, the pfid is recorded by the child as a kind of xattr at create time; this information can be used by readdir+.

          liang Liang Zhen (Inactive) added a comment -

          Hi, I got a chance to read through the document out of interest, and have a question:

          • I think we don't pass the parent fid over the wire for setattr and setxattr. 0.3.4 says we want to take a CW LIST lock on the parent for these operations. Does this mean that we have to change the RPC format for setattr and setxattr, which may cause compatibility issues with old clients such as 1.8.* or 2.0.*?

          yong.fan nasf (Inactive) added a comment - updated at 2010-12-29

          yong.fan nasf (Inactive) added a comment - updated at 2010-12-29

          People

            eeb Eric Barton (Inactive)
            yong.fan nasf (Inactive)