Details
- Epic
- Resolution: Fixed
- Minor
- Lustre 2.4.0
- None
- DNE Phase 1
- 3974
Description
DNE Phase 1: Remote Directories
This issue is for tracking progress of DNE Phase 1: Remote Directories.
Introduction
Today's HPC systems routinely configure Lustre with tens of thousands of clients, each of which draws resources from a single metadata server (MDS). The Lustre community continues to improve MDS performance; however, as client counts keep growing, the single MDS represents a fundamental limit to filesystem scalability.
The goal of the Distributed Namespace (DNE) project is to deliver a documented and tested implementation of Lustre that addresses this scaling limit by distributing the filesystem metadata over multiple metadata servers. This is an ambitious engineering project that will take place over a period of two years. It requires considerable engineering and testing resources and will therefore be performed in two phases.
Phase 1: Remote Directories
This phase introduces a useful minimum of distributed metadata functionality. Its purpose is primarily to ensure that efforts concentrate on clean code restructuring for DNE. The phase focuses on extensive testing to shake out bugs and oversights, not only in the implementation but also in administrative procedures. DNE brings new usage patterns, and administrative procedures must adapt to manage multiple metadata servers.
The Lustre namespace will be distributed by allowing directory entries to reference sub-directories on different metadata targets (MDTs). Individual directories will remain bound to a single MDT. Metadata throughput on a single directory will therefore stay limited by the performance of a single MDS, but metadata throughput aggregated over distributed directories will scale.
The creation of non-local subdirectories will initially be restricted to administrators, using a Lustre-specific mkdir command. This ensures that administrators retain control over namespace distribution and can guarantee performance isolation.
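The placement rules described above can be illustrated with a small toy model. This is only a sketch for reasoning about the design, not Lustre code: the `Directory` and `Namespace` classes, the `mkdir` method, and the `is_admin` flag are all hypothetical names introduced here, and the model assumes that an ordinary mkdir inherits the parent's MDT while only an administrator may pick a different one.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Directory:
    """Toy directory: bound to exactly one MDT for its whole lifetime."""
    name: str
    mdt_index: int
    # Entries may point at sub-directories that live on a different MDT.
    entries: Dict[str, "Directory"] = field(default_factory=dict)


class Namespace:
    """Hypothetical model of a namespace spread over several MDTs."""

    def __init__(self, mdt_count: int) -> None:
        self.mdt_count = mdt_count
        self.root = Directory(name="/", mdt_index=0)

    def mkdir(
        self,
        parent: Directory,
        name: str,
        mdt_index: Optional[int] = None,
        is_admin: bool = False,
    ) -> Directory:
        # Default (assumed) behaviour: stay on the parent's MDT.
        target = parent.mdt_index if mdt_index is None else mdt_index
        if not 0 <= target < self.mdt_count:
            raise ValueError(f"no such MDT index: {target}")
        # A non-local (remote) subdirectory is an administrator-only operation.
        if target != parent.mdt_index and not is_admin:
            raise PermissionError("remote directory creation requires an administrator")
        if name in parent.entries:
            raise FileExistsError(name)
        child = Directory(name=name, mdt_index=target)
        parent.entries[name] = child
        return child


if __name__ == "__main__":
    ns = Namespace(mdt_count=2)
    local = ns.mkdir(ns.root, "scratch")                               # stays on MDT 0
    remote = ns.mkdir(ns.root, "projects", mdt_index=1, is_admin=True)  # placed on MDT 1
    nested = ns.mkdir(remote, "run1")                                   # inherits MDT 1
    print(local.mdt_index, remote.mdt_index, nested.mdt_index)
```

In this model each directory's operations are served by one MDT, so a single busy directory gains nothing from additional MDTs, while separate subtrees placed on different MDTs can be served in parallel.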
Issue Links
- is duplicated by
  - LU-2359 fldb_seq_start() needs to check for halting condition (Resolved)
  - LU-1183 Cleanup old CMD code (Closed)
- is related to
  - LU-1445 fid on OST landing (Resolved)
  - LU-1186 Object update handler (Closed)
  - LU-1184 Delete hash methods from LMV (Closed)
  - LU-1185 FID on OST (Closed)
  - LU-991 cleanup LMV for DNE (Resolved)
  - LUDOC-86 DNE Phase I Doc Changes (Closed)
- is related to
  - LU-2240 implement index range lookup for osd-zfs. (Resolved)