Improve performance for traversing large directory with readdir+ (LU-23)

[LU-31] Please inspect the HLD for readdir+ Created: 23/Dec/10  Updated: 06/Jun/18  Resolved: 06/Jun/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Review task Priority: Minor
Reporter: nasf (Inactive) Assignee: Eric Barton (Inactive)
Resolution: Won't Do Votes: 0
Labels: None

Attachments: readdir+.doc (Microsoft Word), readdir+.pdf (PDF)
Bugzilla ID: 17845
Rank (Obsolete): 10115

 Description   

eeb, after some further investigation of readdir+, I found some issues with my original lockless-mode readdir+. I have described them in the HLD; they are difficult to resolve. So I have written an HLD for a simplified lock-mode readdir+, and I also hope to implement a Lustre-specific directory traversal tool.

Sir, please inspect the HLD when you have time and give your suggestions. Thanks!

Happy Christmas~



 Comments   
Comment by nasf (Inactive) [ 29/Dec/10 ]

updated at 2010-12-29

Comment by nasf (Inactive) [ 29/Dec/10 ]

updated at 2010-12-29

Comment by Liang Zhen (Inactive) [ 30/Dec/10 ]

Hi, I had a chance to read through the document out of interest, and I have a question:

  • I think we don't pass the parent FID over the wire for setattr and setxattr. 0.3.4 says we want to take a CW LIST lock on the parent for these operations. Does this mean we have to change the RPC format for setattr and setxattr, which may cause compatibility issues with older clients such as 1.8.* or 2.0.*?
Comment by nasf (Inactive) [ 30/Dec/10 ]

In fact, in the current master implementation, the parent FID (pfid) is recorded as a kind of xattr on the child at create time. readdir+ can use that information.

Comment by Lai Siyao [ 31/Dec/10 ]

I have some concerns:

  • Size glimpse RPCs make up most of the RPCs in readdir+. Compared with aggregate
    glimpse, SOM can avoid most of them. We should evaluate the current progress and
    remaining work of SOM; if there are no roadblocks, SOM should be a better option
    than aggregate glimpse.
  • I prefer STL (Sub Tree Lock) to the new LIST lock, because STL has two
    advantages:
    • it can avoid access conflicts (both pseudo and real ones) because it is weak, while
      the LIST lock looks more like a strong STL.
    • STL can be used in the future for WBC.
  • There is no need to introduce a new READDIRPLUS RPC. We can extend the current
    MDS_READPAGE in a backward-compatible way: in its request to the MDS, the client
    specifies which attributes it wants along with the entry names. In this way readdir()
    will issue a normal MDS_READPAGE and store directory entries in the directory page
    cache, while the statahead thread issues MDS_READPAGE to fetch file attributes.
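
A minimal sketch of the last point, assuming the request carries an attribute mask whose zero value means "names only" (so an old client that never sets it still gets the old behaviour). The names `ReadPageRequest`, `want_attrs`, `handle_readpage` and the `ATTR_*` bits are hypothetical and are not Lustre's wire format:

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical attribute bits a client may request; not real Lustre flags.
ATTR_MODE = 0x1
ATTR_SIZE = 0x2
ATTR_MTIME = 0x4


@dataclass
class ReadPageRequest:
    dir_name: str
    # 0 means "names only", which is what an old client implicitly sends.
    want_attrs: int = 0


@dataclass
class Entry:
    name: str
    attrs: Dict[int, int] = field(default_factory=dict)


# Toy in-memory namespace: directory -> {entry name -> {attr bit -> value}}.
NAMESPACE = {
    "/d": {
        "a": {ATTR_MODE: 0o644, ATTR_SIZE: 10, ATTR_MTIME: 100},
        "b": {ATTR_MODE: 0o755, ATTR_SIZE: 20, ATTR_MTIME: 200},
    }
}


def handle_readpage(req: ReadPageRequest) -> List[Entry]:
    """Return entry names, plus only the attributes the request asked for."""
    entries = []
    for name, stored in sorted(NAMESPACE[req.dir_name].items()):
        picked = {bit: val for bit, val in stored.items() if req.want_attrs & bit}
        entries.append(Entry(name, picked))
    return entries
```

In this sketch a plain readdir() would call `handle_readpage(ReadPageRequest("/d"))`, while a statahead-style caller would pass `want_attrs=ATTR_MODE | ATTR_SIZE` to the same handler.
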
Comment by Oleg Drokin [ 12/Jan/11 ]

I sort of agree and disagree on this point from Lai:
> Size glimpse RPCs make up most of the RPCs in readdir+. Compared with aggregate
> glimpse, SOM can avoid most of them. We should evaluate the current progress and
> remaining work of SOM; if there are no roadblocks, SOM should be a better option
> than aggregate glimpse.

SOM will definitely bring a big boost here, but it will take a long time to make SOM really work in all cases.

As for glimpse aggregation, I sort of agree that it won't bring many benefits. What will bring many benefits, almost at the SOM level, is an async "glimpse ahead", much like what we currently do with statahead. As long as we can get the data faster than the application processes it, of course.
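
A rough sketch of the "glimpse ahead" idea, assuming a fixed-size window of in-flight size requests that stays ahead of the consumer. `glimpse_size` is a hypothetical stand-in for the real glimpse RPC (here just a dictionary lookup), and the thread pool only models asynchrony:

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def glimpse_size(name, sizes):
    """Stand-in for a size glimpse RPC (hypothetical; here a dict lookup)."""
    return sizes[name]


def stat_with_glimpse_ahead(names, sizes, window=4):
    """Yield (name, size) in order, keeping up to `window` glimpses in flight."""
    with ThreadPoolExecutor(max_workers=window) as pool:
        pending = deque()
        it = iter(names)
        # Prime the window before the consumer asks for the first entry.
        for name in it:
            pending.append((name, pool.submit(glimpse_size, name, sizes)))
            if len(pending) >= window:
                break
        while pending:
            name, fut = pending.popleft()
            # Refill the window so later glimpses overlap the consumer's work.
            nxt = next(it, None)
            if nxt is not None:
                pending.append((nxt, pool.submit(glimpse_size, nxt, sizes)))
            yield name, fut.result()
```

The benefit depends on the condition stated above: the window only hides latency if results arrive faster than the caller consumes them.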

Comment by Lai Siyao [ 13/Jan/11 ]

Considering that a file may have multiple stripes (max 160), async glimpse may not help much. Besides, if the system is WAN-based, async glimpse may not help at all.

IMHO the current SOM design is too complicated, because it tries to ensure that the client updates SOM in all cases (server and client recovery). If it is allowed to miss a SOM update and let the MDS update it later itself (when the MDS finds SOM not set while handling a getattr from a client), SOM can be much simplified.
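
A small sketch of this simplification, assuming the MDS keeps an optional cached size per file and falls back to querying the OSTs only when it is missing. `LazySomMds`, `client_close`, `getattr_size` and the `query_osts` callback are hypothetical names; invalidation while a file is open for write is not modelled:

```python
class LazySomMds:
    """Toy MDS that tolerates missed SOM updates and repairs them on getattr."""

    def __init__(self, query_osts):
        # Hypothetical callback: file name -> size as computed from the OSTs.
        self._query_osts = query_osts
        # File name -> cached size; a missing key means "SOM not set".
        self._som = {}

    def client_close(self, name, size=None):
        """Client reports the size on close; size=None models a lost update."""
        if size is None:
            self._som.pop(name, None)
        else:
            self._som[name] = size

    def getattr_size(self, name):
        """Serve the size from SOM, filling it in lazily when it is not set."""
        if name not in self._som:
            self._som[name] = self._query_osts(name)
        return self._som[name]
```

With this shape, only the first getattr after a missed update pays for the OST query; later getattrs are served from the cached value.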

Comment by nasf (Inactive) [ 06/Jun/18 ]

The related documents have been inactive for a long time; the readdir+ feature needs more consideration.
