  1. Lustre
  2. LU-17307

osd_dirent_count() keeps multiple threads busy

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.16.0
    • Lustre 2.14.0, Lustre 2.15.0
    • None
    • 3

    Description

      When osd_attr_get() is accessing a directory for the first time, it calls osd_dirent_count() to iterate over the directory and fill in obj->oo_dirent_count:

       iterate_dir+0x70/0x140
       osd_ldiskfs_it_fill+0xbd/0x290 [osd_ldiskfs]
       osd_it_ea_next+0xc2/0x150 [osd_ldiskfs]
       osd_attr_get+0x4bc/0x730 [osd_ldiskfs]
       lod_attr_get+0x8b/0x170 [lod]
       mdd_la_get+0x70/0x200 [mdd]
       mdd_attr_get+0x38/0x100 [mdd]
       mdt_attr_get_complex+0x4dd/0x800 [mdt]
       mdt_getattr_internal+0x445/0x1590 [mdt]
       mdt_getattr_name_lock+0x74d/0x2640 [mdt]
       mdt_intent_getattr+0x2a5/0x470 [mdt]
       mdt_intent_opc+0x1ba/0xb40 [mdt]
       mdt_intent_policy+0x1a9/0x370 [mdt]
       ldlm_lock_enqueue+0x3d4/0xb00 [ptlrpc]
       ldlm_handle_enqueue0+0x8b6/0x16d0 [ptlrpc]
       tgt_enqueue+0x62/0x220 [ptlrpc]
       tgt_request_handle+0x8bf/0x18c0 [ptlrpc]
       ptlrpc_server_handle_request+0x253/0xc40 [ptlrpc]
       ptlrpc_main+0xc88/0x26c0 [ptlrpc]
       kthread+0xd1/0xe0
      

      However, there are several issues with this:

      • If the directory is very large (e.g. millions of entries), the iteration can take considerable time, and it blocks the MDS service thread until it finishes.
      • If multiple MDS threads access the same directory, every one of them tries to count the entries, so all of those threads are blocked at once.
      • The oo_dirent_count value is only needed for directory auto-split, which is not enabled today.

      The entry counting should only be done if directory auto-split is enabled, which avoids the overhead in the normal case. Since mdt_enable_dir_auto_split is controlled at the MDT level, the need for the count should be passed down to the OSD via an LA_DIRENT_COUNT valid flag. For osd-ldiskfs, the count would be skipped if this flag is not set. For osd-zfs, the count can optionally be returned anyway, since it is available for free.
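
      A minimal user-space sketch of the flag-gated count described above, not the actual osd-ldiskfs code. Only LA_DIRENT_COUNT and oo_dirent_count are named in this ticket; the bit value, the sentinel value, struct sketch_dir and both functions are hypothetical stand-ins for illustration.

```c
#include <stdint.h>

/* Hypothetical bit value; the ticket only names the LA_DIRENT_COUNT flag. */
#define SKETCH_LA_DIRENT_COUNT   (1ULL << 0)
/* Hypothetical sentinel meaning "entries have not been counted". */
#define SKETCH_COUNT_UNSET       UINT64_MAX

/* Stand-in for an OSD directory object. */
struct sketch_dir {
	uint64_t     nr_entries;    /* entries a real iterator would walk */
	uint64_t     cached_count;  /* plays the role of oo_dirent_count */
	unsigned int scans;         /* number of full iterations performed */
};

/* Stand-in for the expensive directory walk: O(number of entries). */
static uint64_t sketch_count_entries(struct sketch_dir *dir)
{
	uint64_t n = 0;

	for (uint64_t i = 0; i < dir->nr_entries; i++)
		n++;
	dir->scans++;
	return n;
}

/*
 * Return the entry count only when the caller asked for it through the
 * valid mask; otherwise skip the walk and report the count as unset.
 */
static uint64_t sketch_attr_get(struct sketch_dir *dir, uint64_t valid)
{
	if (!(valid & SKETCH_LA_DIRENT_COUNT))
		return SKETCH_COUNT_UNSET;

	if (dir->cached_count == SKETCH_COUNT_UNSET)
		dir->cached_count = sketch_count_entries(dir);

	return dir->cached_count;
}
```

      In this sketch, a caller that does not set the flag never pays for the walk, while a caller that sets it triggers at most one walk per object.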

      If directory auto-split is enabled, it would be much more efficient to have only one thread do the directory iteration. This should be controlled by a flag in the object, and the other threads can ignore oo_dirent_count (return "0", "LU_DIRENT_COUNT_UNSET", or the number of entries found so far). At worst this would defer the auto-split by a few entries, but that does not matter in the end, since the thread doing the counting will always perform the check itself.
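
      The single-counter idea can be sketched in user space with C11 atomics. This is an illustration under stated assumptions, not the fix that landed: struct sketch_obj, its fields, the sentinel value and both functions are hypothetical, and a kernel implementation would use kernel primitives rather than <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical sentinel; the ticket suggests 0 or LU_DIRENT_COUNT_UNSET. */
#define SKETCH_DIRENT_COUNT_UNSET UINT64_MAX

/* Stand-in for an OSD object with a per-object "counting" flag. */
struct sketch_obj {
	atomic_flag      counting;      /* set while one thread iterates */
	_Atomic uint64_t dirent_count;  /* plays the role of oo_dirent_count */
	uint64_t         nr_entries;    /* entries a real iterator would walk */
	atomic_uint      scans;         /* number of full iterations performed */
};

static void sketch_obj_init(struct sketch_obj *o, uint64_t nr_entries)
{
	atomic_flag_clear(&o->counting);
	atomic_init(&o->dirent_count, SKETCH_DIRENT_COUNT_UNSET);
	atomic_init(&o->scans, 0);
	o->nr_entries = nr_entries;
}

/*
 * Return the cached count if known. Otherwise let exactly one caller do
 * the walk; concurrent callers return UNSET immediately instead of
 * blocking or repeating the iteration.
 */
static uint64_t sketch_dirent_count(struct sketch_obj *o)
{
	uint64_t cur = atomic_load(&o->dirent_count);

	if (cur != SKETCH_DIRENT_COUNT_UNSET)
		return cur;

	/* Another thread already owns the iteration: do not wait for it. */
	if (atomic_flag_test_and_set(&o->counting))
		return SKETCH_DIRENT_COUNT_UNSET;

	/* Re-check: the previous owner may have finished just before us. */
	cur = atomic_load(&o->dirent_count);
	if (cur == SKETCH_DIRENT_COUNT_UNSET) {
		uint64_t n = 0;

		for (uint64_t i = 0; i < o->nr_entries; i++)
			n++;            /* stand-in for the directory walk */
		atomic_store(&o->dirent_count, n);
		atomic_fetch_add(&o->scans, 1);
		cur = n;
	}

	atomic_flag_clear(&o->counting);
	return cur;
}
```

      Threads that lose the race see the count as unset and carry on, which matches the point above that only the counting thread needs to perform the auto-split check.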

            People

              laisiyao Lai Siyao
              adilger Andreas Dilger