  1. Lustre
  2. LU-2115

ldlm_bl_xx thread hangs under high memory pressure


Details

    • Bug
    • Resolution: Cannot Reproduce
    • Minor
    • None
    • Lustre 2.1.0, Lustre 1.8.8
    • None
    • It occurred with FEFS, which is based on Lustre 1.8.5, in the RIKEN K computer environment:
      MDS x1, OSS x2592, OST x5184, Client x84672
    • 3
    • 5110

    Description

      Since reviewing the patch for Bugzilla 24320 (https://bugzilla.lustre.org/show_bug.cgi?id=24320), I have begun to doubt whether the patch is sufficient.

      The reason is this: suppose all ldlm_bl_xx threads try to create new threads and fail because of a lack of memory. All ldlm_bl_xx threads will then call try_to_free_pages via kmalloc, which has failed to allocate slab memory, and try_to_free_pages will in turn call __ldlm_bl_to_thread with LDLM_SYNC via shrink_slab(). This can end in a deadlock: every ldlm_bl_xx thread waits for __ldlm_bl_to_thread to return, but there is no ldlm_bl_xx thread left to handle the blocking requests.

      So I think we should set PF_MEMALLOC before calling cfs_kernel_thread/cfs_create_thread and clear it afterwards, to prevent ldlm_bl_xx threads from calling __ldlm_bl_to_thread().
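
      The idea can be illustrated with a small userspace model. This is only a sketch, not kernel or Lustre code: every model_* name and MODEL_PF_MEMALLOC is hypothetical and merely stands in for the real task flags, the thread-creation call, and the reclaim path described above. It assumes that a task with the flag set is not sent into reclaim by the allocator, and that thread creation still fails under pressure either way.

```c
#include <stdbool.h>

/* Userspace model only. None of these names exist in the kernel or Lustre. */
#define MODEL_PF_MEMALLOC 0x1u      /* stands in for the kernel's PF_MEMALLOC */

struct model_task {
    unsigned int flags;             /* stands in for current->flags */
    int reclaim_entries;            /* times this task entered reclaim */
};

/* Stands in for try_to_free_pages -> shrink_slab() ->
 * __ldlm_bl_to_thread(LDLM_SYNC), the path that can block forever. */
static void model_reclaim(struct model_task *t)
{
    t->reclaim_entries++;
}

/* Stands in for an allocation under memory pressure: a task without
 * the flag is pushed into reclaim; the allocation fails in both cases. */
static bool model_alloc_under_pressure(struct model_task *t)
{
    if (!(t->flags & MODEL_PF_MEMALLOC))
        model_reclaim(t);
    return false;
}

/* Stands in for cfs_kernel_thread/cfs_create_thread, which allocates. */
static int model_create_thread(struct model_task *t)
{
    return model_alloc_under_pressure(t) ? 0 : -1;
}

/* The proposed change: set the flag around thread creation and
 * restore the previous state afterwards. */
static int model_create_thread_noreclaim(struct model_task *t)
{
    unsigned int had = t->flags & MODEL_PF_MEMALLOC;
    int rc;

    t->flags |= MODEL_PF_MEMALLOC;
    rc = model_create_thread(t);
    if (!had)
        t->flags &= ~MODEL_PF_MEMALLOC;
    return rc;
}

/* Returns how many times the task entered the reclaim path. */
int model_run(bool use_flag)
{
    struct model_task t = { 0, 0 };

    if (use_flag)
        model_create_thread_noreclaim(&t);
    else
        model_create_thread(&t);
    return t.reclaim_entries;
}
```

      In this model, model_run(false) enters the reclaim path once, while model_run(true) never does. Thread creation fails in both cases, but only the first one can reach the blocking callback. The wrapper saves and restores the earlier flag state so that a caller which already had the flag set keeps it.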

          People

            wc-triage WC Triage
            nozaki Hiroya Nozaki
