[LU-18] Allow 100k open files on single client Created: 22/Nov/10  Updated: 11/May/21  Due: 10/Dec/10  Resolved: 11/May/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.0.0, Lustre 2.1.0
Fix Version/s: Lustre 2.11.0

Type: Improvement Priority: Minor
Reporter: Niu Yawei (Inactive) Assignee: WC Triage
Resolution: Fixed Votes: 0
Labels: None

Attachments: Text File open_100kfiles.patch    
Issue Links:
Related
is related to LU-5703 Quiesce client mountpoints from the s... Open
is related to LU-5964 test large number of concurrent open ... Resolved
Bugzilla ID: 24217
Epic: interoperability, performance
Rank (Obsolete): 10276

 Description   

Allow 100k open files per client. Fix the client to not store committed open RPCs in the resend list, but instead to reopen files from the file handles upon recovery (see Simplified Interop), to avoid O(n) behaviour when adding new RPCs to the RPCs-for-recovery list on the client. Fix the MDS to store the "mfd" in a hash table instead of a linked list to avoid O(n) behaviour when searching for an open file handle. For debugging, it would be useful to have a /proc entry on the MDS showing the open FIDs for each client export.
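
For illustration only, here is a minimal userspace C sketch of the server-side data-structure change, assuming a simple chained hash keyed on the open handle cookie. The names (mfd_model, mfd_hash, MFD_HASH_BITS) are hypothetical, not the actual mdt_file_data code; the point is that lookup becomes O(1) on average instead of an O(n) list walk.

/*
 * Hypothetical userspace model: replace the per-export linked list of
 * open-file descriptors ("mfd") with a hash table keyed on the file
 * handle cookie, so lookup drops from O(n) to O(1) on average.
 */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MFD_HASH_BITS 10
#define MFD_HASH_SIZE (1U << MFD_HASH_BITS)

struct mfd_model {
	uint64_t mfd_cookie;        /* open file handle cookie */
	struct mfd_model *mfd_next; /* hash chain */
};

static struct mfd_model *mfd_hash[MFD_HASH_SIZE];

/* Simple multiplicative hash on the handle cookie. */
static unsigned int mfd_hashfn(uint64_t cookie)
{
	return (unsigned int)((cookie * 0x9e3779b97f4a7c15ULL) >>
			      (64 - MFD_HASH_BITS));
}

static void mfd_insert(struct mfd_model *mfd)
{
	unsigned int i = mfd_hashfn(mfd->mfd_cookie);

	mfd->mfd_next = mfd_hash[i];
	mfd_hash[i] = mfd;
}

/* O(1) average-case lookup; the old list walk was O(n). */
static struct mfd_model *mfd_lookup(uint64_t cookie)
{
	struct mfd_model *mfd = mfd_hash[mfd_hashfn(cookie)];

	while (mfd != NULL && mfd->mfd_cookie != cookie)
		mfd = mfd->mfd_next;
	return mfd;
}

int main(void)
{
	/* Model 100k open files on one client export. */
	for (uint64_t c = 1; c <= 100000; c++) {
		struct mfd_model *mfd = malloc(sizeof(*mfd));

		mfd->mfd_cookie = c;
		mfd_insert(mfd);
	}
	printf("found cookie 99999: %s\n",
	       mfd_lookup(99999) ? "yes" : "no");
	return 0;
}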



 Comments   
Comment by Niu Yawei (Inactive) [ 26/Nov/10 ]

I talked with Andreas and Ericm. To avoid conflicts with the simplified interop work, and to make patch/feature management easier, I decided to use a separate list for the committed opens on the client (as Andreas suggested) in the first stage.

For the server side mfd list, I found that in normal operation the mfd can always be found in the general handle hash table (class_handle_hash); the list is only scanned in the following two cases:

  • For a resent open (and setattr in SOM), search for the mfd in the list by matching the xid;
  • For a replayed close (and setattr/done_writing in SOM), search for the mfd in the list by matching mfd_old_handle (I don't quite understand this; why can't we just keep the old handle for the replayed open? Then this mfd_old_handle trick would be gone);

So I suppose what we want is:

  • Store the mfd in a cfs_hash instead of the global handle hash table (indexed by handle), which requires modifying the general handle hash code to export a handle generator function.
  • Keep the old handle for the replayed open, so the mfd_old_handle matching work can be avoided.
  • Create another cfs_hash for the mfd, indexed by xid, so the list search for resent opens can be avoided (see the sketch after this list).
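
Below is a minimal userspace sketch of the dual-index idea from the last two bullets: one hash keyed by handle for the normal close path and one keyed by xid for the resent-open path, so neither needs a list walk. All names (xmfd, by_handle, by_xid) are hypothetical; the real implementation would use Lustre's cfs_hash.

#include <stdint.h>
#include <stdio.h>

#define NBKT 256

struct xmfd {
	uint64_t handle;            /* open handle cookie */
	uint64_t xid;               /* xid of the original open RPC */
	struct xmfd *next_by_handle;
	struct xmfd *next_by_xid;
};

static struct xmfd *by_handle[NBKT];
static struct xmfd *by_xid[NBKT];

static unsigned int bkt(uint64_t key)
{
	return (unsigned int)(key % NBKT);
}

/* Link the same mfd into both indexes. */
static void xmfd_insert(struct xmfd *m)
{
	unsigned int h = bkt(m->handle), x = bkt(m->xid);

	m->next_by_handle = by_handle[h];
	by_handle[h] = m;
	m->next_by_xid = by_xid[x];
	by_xid[x] = m;
}

/* Resent open: find the existing mfd by the xid of the original RPC. */
static struct xmfd *xmfd_find_by_xid(uint64_t xid)
{
	struct xmfd *m = by_xid[bkt(xid)];

	while (m != NULL && m->xid != xid)
		m = m->next_by_xid;
	return m;
}

int main(void)
{
	static struct xmfd m = { .handle = 0xabcd, .xid = 42 };

	xmfd_insert(&m);
	printf("resent open with xid 42 -> handle %#lx\n",
	       (unsigned long)xmfd_find_by_xid(42)->handle);
	return 0;
}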

I have exchanged these ideas with Andreas.

Comment by Niu Yawei (Inactive) [ 30/Nov/10 ]

I have submitted the patch for review.

Comment by Niu Yawei (Inactive) [ 10/Dec/10 ]

Updated the patch according to the reviewers' comments and submitted it for the second round of review.

Comment by Andreas Dilger [ 05/Feb/14 ]

I was trying to find the patch for this ticket, but it seems it was in the old "lustre" project (not "fs/lustre-release" used today) at http://review.whamcloud.com/171/, which has since been removed.

Niu, do you still have a copy of this patch that you could upload to fs/lustre-release?

Comment by Niu Yawei (Inactive) [ 07/Feb/14 ]

Unfortunately, I can't find a local copy either (it may have been lost when I replaced my laptop).

The most significant change in the patch (the client side changes) has been merged to master along with the fix for LU-2613. The server side change is about not reusing the open handle on the server side when doing open replay.

Comment by Andreas Dilger [ 07/Feb/14 ]

It looks like there is a copy of the patch at https://bugzilla.lustre.org/show_bug.cgi?id=24217

Comment by Andreas Dilger [ 28/May/17 ]

Patch https://review.whamcloud.com/12885 "LU-5964 tests: open a large number of files at once" adds a test case for this, which has exposed some issues with memory usage when many files are open. That issue is being addressed by patch https://review.whamcloud.com/27208 "LU-9514 ptlrpc: free reply buffer for replay RPC".

While that patch reduces the memory usage of the saved open replay RPC buffers, it would be better to fix the open replay code as described here: regenerate a new RPC to reopen files after the initial create has committed, rather than saving the open RPC indefinitely. Saving the RPCs wastes memory, and makes recovery more complex because the RPC format cannot be changed if the server is upgraded while it is offline.
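
As a conceptual sketch only (plain C with hypothetical names, not Lustre code), the idea is to retain just the small per-file open state and regenerate a fresh open request from it at recovery time, rather than pinning the original RPC buffer for the life of the open:

#include <stdint.h>
#include <stdio.h>

struct open_state {
	uint64_t fid;      /* file identifier */
	uint32_t flags;    /* open flags to replay */
	uint64_t handle;   /* server-assigned open handle */
};

/* Recovery path: rebuild an open request from the retained state only. */
static void reopen_on_recovery(const struct open_state *os)
{
	/* A real client would pack a new open request here; the point
	 * is that no saved RPC buffer is needed, so the format can
	 * follow whatever the recovering server currently speaks. */
	printf("reopen fid %#lx flags %#x (old handle %#lx)\n",
	       (unsigned long)os->fid, os->flags,
	       (unsigned long)os->handle);
}

int main(void)
{
	struct open_state open_files[] = {
		{ .fid = 0x200000401ULL, .flags = 0x2, .handle = 0x11 },
		{ .fid = 0x200000402ULL, .flags = 0x0, .handle = 0x12 },
	};

	for (size_t i = 0; i < sizeof(open_files) / sizeof(open_files[0]); i++)
		reopen_on_recovery(&open_files[i]);
	return 0;
}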

Comment by Andreas Dilger [ 11/May/21 ]

The referenced patches have landed, and this is likely fixed. There doesn't seem to be any value keeping it open longer.
