Lustre / LU-11063

RHEL7.[345] RCU breakage

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Major
    • None
    • Lustre 2.13.0
    • None
    • 3
    • 18,015

    Description

      I finally traced my debug-kernel problems on later RHEL releases to RCU breakage of some sort.

      The ldlm_locks slab is declared with SLAB_DESTROY_BY_RCU if that flag is defined. This goes back to bugzilla 18015 https://bugzilla.lustre.org/show_bug.cgi?id=18015, a patch by BobiJam.

      Now it appears that when we schedule a free in that slab and then destroy the slab, the actual free is delayed and executes after the slab has already been destroyed, despite rcu_barrier() being present.

      This is a clear bug, and I will file a RH bugzilla ticket for it.

      In addition to that, I wonder how much we need this flag nowadays, especially considering that newer kernels renamed it to SLAB_TYPESAFE_BY_RCU, which we do not detect, so we simply do not set it in that case.
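
      For illustration, detecting either spelling of the flag could look roughly like the sketch below. This is an assumption about how a compat check might be written, not the code currently in the tree; LDLM_SLAB_RCU_FLAG is a hypothetical name. It relies on both flags being preprocessor macros in the kernel's slab header, and it falls back to 0 when neither is visible (as happens when the snippet is compiled outside a kernel build).

```c
/*
 * Sketch only: select whichever RCU slab flag the kernel headers provide.
 * In a kernel build the flags would come from <linux/slab.h>; outside the
 * kernel neither macro exists, so the final fallback branch is taken.
 * LDLM_SLAB_RCU_FLAG is a hypothetical name used for illustration.
 */
#if defined(SLAB_TYPESAFE_BY_RCU)
/* Newer kernels: the flag carries its new name. */
# define LDLM_SLAB_RCU_FLAG SLAB_TYPESAFE_BY_RCU
#elif defined(SLAB_DESTROY_BY_RCU)
/* Older kernels: the flag still has its original name. */
# define LDLM_SLAB_RCU_FLAG SLAB_DESTROY_BY_RCU
#else
/* Neither name is defined: create the cache as a normal slab. */
# define LDLM_SLAB_RCU_FLAG 0UL
#endif
```

      The resulting value would then be OR-ed into the flags argument of kmem_cache_create() for the ldlm_locks cache. Setting the fallback to 0 is also what "a normal slab" amounts to.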

      Should we just convert ldlm_locks back into a normal slab?

            People

              wc-triage WC Triage
              green Oleg Drokin