Details

    • Type: Technical task
    • Status: In Progress
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: Lustre 2.10.0, Upstream
    • Fix Version/s: Upstream

      Description

      The new done callback can be invoked in one of three ways:

      1- Direct (no polling)
      2- softirq (our callback gets called by the IRQ_POLL_SOFTIRQ mechanism)
      3- WorkQueues

      It is tempting to replace kiblnd_scheduler() and its use of wait queues with the WorkQueue approach. However, this has two major problems:

      1- There is no way to bind the WorkQueue to a specific CPT without submitting a change to the RDMA code base. I'm not interested in doing this.
      2- It is unclear how the kernel threads for WorkQueues are created and destroyed. If this is not done efficiently, it will degrade LNet performance.

      So, my recommendation is to bind our current kiblnd_cq_completion() to the softirq callback (with the necessary semantic changes). The main loop of the scheduler, kiblnd_scheduler(), will need to be updated so that it no longer polls the cq, as that will be done for us by the new callback mechanism. All of o2iblnd needs to be scanned for any remaining cq polling, which must be turned off.
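
      For illustration only, here is a minimal user-space sketch of how a done-callback completion path is structured, as opposed to a scheduler loop that polls the cq itself. It is not o2iblnd or kernel code: struct cqe and struct wc are simplified stand-ins for the kernel's struct ib_cqe and struct ib_wc, and demo_tx, demo_tx_done(), demo_process_cq() and demo_run() are hypothetical names. In the kernel, the polling context (for example IB_POLL_SOFTIRQ) is selected when the cq is allocated with ib_alloc_cq(), and the real done callback also receives the cq pointer.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-ins for the kernel types; NOT the real definitions. */
struct wc;

struct cqe {                         /* models struct ib_cqe */
	void (*done)(struct wc *wc);
};

struct wc {                          /* models struct ib_wc */
	struct cqe *wr_cqe;          /* completion entry of the finished request */
	int status;                  /* 0 means success in this model */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical transmit descriptor that embeds its completion entry. */
struct demo_tx {
	int id;
	int completed;
	struct cqe cqe;
};

/* Done callback: recover the owning descriptor from the embedded cqe. */
static void demo_tx_done(struct wc *wc)
{
	struct demo_tx *tx = container_of(wc->wr_cqe, struct demo_tx, cqe);

	tx->completed = (wc->status == 0);
}

/*
 * Models the dispatch step that the callback mechanism performs on our
 * behalf: walk up to 'budget' completions and call each done callback.
 * Returns the number of completions processed.
 */
static int demo_process_cq(struct wc *wcs, int n, int budget)
{
	int i;

	for (i = 0; i < n && i < budget; i++) {
		assert(wcs[i].wr_cqe && wcs[i].wr_cqe->done);
		wcs[i].wr_cqe->done(&wcs[i]);
	}
	return i;
}

/* Small driver: three requests, the middle one fails. */
static int demo_run(void)
{
	struct demo_tx txs[3] = { { .id = 0 }, { .id = 1 }, { .id = 2 } };
	struct wc wcs[3];
	int i, ok = 0;

	for (i = 0; i < 3; i++) {
		txs[i].cqe.done = demo_tx_done;
		wcs[i].wr_cqe = &txs[i].cqe;
		wcs[i].status = (i == 1) ? -1 : 0;
	}
	demo_process_cq(wcs, 3, 16);
	for (i = 0; i < 3; i++)
		ok += txs[i].completed;
	return ok;
}
```

      The point of the sketch is that the consumer only supplies done callbacks; the loop that drains completions lives outside the consumer, which is why kiblnd_scheduler() would no longer need to poll the cq.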

              People

              • Assignee: Doug Oucharek (dougo)
              • Reporter: Doug Oucharek (doug, Inactive)
