Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-8802

Dynamically allocate obd_devices.

    XMLWordPrintable

Details

    • Improvement
    • Resolution: Unresolved
    • Minor
    • None
    • None
    • None
    • 9223372036854775807

    Description

      One proposed cleanup upstream was removal of class_devno_max(). It was pointed out that was for future work to enable dynamic obd_device allocation.
      Aya Mahfouz has stepped forward wanting to do this work. Created this ticket to keep track of the progress and have reviewers look over his work.

      Andreas replied to Aya's email with the following suggestions:

      There are several ways to approach this problem to make the allocation of the obd_devs[] array dynamic. In most cases, there isn't any value to dynamically shrink this array, since the filesystem(s) will typically be mounted until the node is rebooted, and it is only in the tens of KB size range, so this will not affect ongoing operations, and that simplifies the implementation.

      The easiest way would be to have a dynamically-sized obd_devs[] array that is reallocated in class_newdev() in PAGE_SIZE chunks whenever the current array has no more free slots and copied to the new array, using obd_dev_lock to protect the array while it is being reallocated and copied. In most cases, this would save memory over the static array (not many filesystems have so many servers), but for the few sites that have 10000+ servers, and/or mounting multiple large filesystems on a single node, they don't need to change the source to handle this. Using libcfs_kvzalloc() would avoid issues with allocating large chunks of memory.

      There are a few places where obd_devs[] is accessed outside obd_dev_lock that would need to be fixed now that this array may be changed at runtime.

      A second approach that may scale better is to change obd_devs from an array to a doubly linked list (using standard list_head helpers). In many cases the whole list is seached linearly in the cases where a hash table lookup for the device name/UUID is not used, and most of the uses of class_num2obd() are just used to walk that list in order, which could be replaced with list_for_each_entry() list traversal. The class_name2dev() function should be changed to return the pointer to struct obd_device, and a new helper class_dev2num() would just return the obd_minor number from struct obd_device for the one use in class_resolve_dev_name(). Using a linked list has the advantage that there is no need to search for free slots in the array, since devices would be removed from the list when it is freed.

      Attachments

        Issue Links

          Activity

            People

              timday Tim Day
              simmonsja James A Simmons
              Votes:
              0 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

                Created:
                Updated: