Lustre / LU-10158 FLR2: improve mirror selection policy functions / LU-17973

FLR2: improve read mirror selection for many copies

Details

    • Technical task
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.16.0

    Description

      When selecting a mirror to read from, the client will examine all of the mirrors:

      • any mirror component marked LCME_FL_STALE should be skipped
      • any mirror component marked LCME_FL_PREF_RD should be selected first (LU-10282)
      • any mirror component on OSTs marked OS_STATE_NONROT should be preferred (LU-14996)
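
      The three rules above can be sketched as a filter that leaves a set of equally ranked candidate mirrors. This is only an illustration, not the Lustre client code: the Mirror class, the candidate_mirrors() helper, and the flag bit values are hypothetical stand-ins, and applying the LCME_FL_PREF_RD rule before the OS_STATE_NONROT rule is an assumption about how the rules combine.

```python
from dataclasses import dataclass
from enum import Flag, auto


class MirrorFlag(Flag):
    """Placeholder bits; the real LCME_FL_* / OS_STATE_* values are in the Lustre headers."""
    NONE = 0
    STALE = auto()     # stands in for LCME_FL_STALE
    PREF_RD = auto()   # stands in for LCME_FL_PREF_RD
    NONROT = auto()    # stands in for OS_STATE_NONROT on the mirror's OSTs


@dataclass(frozen=True)
class Mirror:
    mirror_id: int
    flags: MirrorFlag = MirrorFlag.NONE


def candidate_mirrors(mirrors):
    """Return the mirrors that remain after applying the three rules in order."""
    # Rule 1: never read from a stale mirror.
    usable = [m for m in mirrors if not (m.flags & MirrorFlag.STALE)]
    # Rules 2 and 3: narrow to preferred mirrors only when at least one qualifies,
    # so a file without any preferred mirror keeps all of its non-stale mirrors.
    for wanted in (MirrorFlag.PREF_RD, MirrorFlag.NONROT):
        preferred = [m for m in usable if m.flags & wanted]
        if preferred:
            usable = preferred
    return usable
```

      When this returns more than one mirror, the candidates are equivalent as far as these flags are concerned, which is the case the rest of this description addresses.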

      If there are multiple mirrors matching all of these criteria, then it is likely that the mirrors were created for performance or availability reasons, rather than tiered OST storage (which would likely be excluded by OS_STATE_NONROT and/or LCME_FL_PREF_RD).

      In this case, it is desirable to maximize usage of the page cache on the OSS nodes. For small files (e.g. <= 128MiB), clients should deterministically pick the same replica (e.g. the first one) and always read from that mirror copy, on the assumption that other "small" files are being accessed concurrently and that aggregate system performance is maximized by caching different files in each OSS node's RAM.

      For larger files, it is desirable to spread the read workload across multiple OSS nodes to better utilize their RAM and network bandwidth. Clients should deterministically round-robin reads of large files across replicas (e.g. switching every 1GiB). If there is a large number of equivalent replicas of a file (e.g. more than 2 or 3?), clients should distribute their mirror selection evenly and deterministically, e.g. by (client NID + offset in GiB) modulo mirror count (see the implementation of lmv_select_statfs_mdt() in patch https://review.whamcloud.com/29136 for how clients distribute MDT_STATFS RPCs across MDS nodes).
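
      A minimal sketch of the proposed policy, assuming the candidate mirrors are already known to be equivalent. The function name select_mirror_index(), the constants, and the representation of the client NID as a plain integer are all hypothetical. For simplicity the sketch applies the (client NID + offset in GiB) formula to every large file, not only to files with many replicas.

```python
SMALL_FILE_LIMIT = 128 * 2**20  # 128 MiB example threshold from the description
STRIDE = 2**30                  # switch mirrors every 1 GiB of file offset


def select_mirror_index(mirror_count, file_size, offset, client_nid):
    """Pick an index into the list of equivalent mirrors for a read at `offset`.

    `client_nid` is assumed to be some stable integer derived from the client NID.
    """
    if mirror_count <= 0:
        raise ValueError("no usable mirrors")
    # Small files: every client reads the same copy so one OSS caches the file.
    if file_size <= SMALL_FILE_LIMIT:
        return 0
    # Large files: rotate per 1 GiB region, shifted by client so that different
    # clients start on different mirrors.
    return (client_nid + offset // STRIDE) % mirror_count
```

      With three mirrors and client_nid = 7, a 10GiB file is read from mirror index 1 for the first GiB, index 2 for the second, and index 0 for the third, while a client with client_nid = 8 starts on index 2. Because the result depends only on its inputs, repeated reads of the same region by the same client always go to the same mirror.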

          People

            wc-triage WC Triage
            adilger Andreas Dilger