Lustre / LU-4725

wrong lock ordering in rename leads to deadlocks

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Fix Version/s: Lustre 2.6.0, Lustre 2.5.4
    • Affects Version/s: Lustre 2.1.6, Lustre 2.6.0, Lustre 2.5.2, Lustre 2.4.3
    • 3
    • 12984

    Description

      The current rename code locks objects in the order: src parent, dst parent, src child, dst child. However, dst may be a parent of src, which can lead to a deadlock.
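
To make the ordering problem concrete, here is a minimal sketch in Python. The paths (`/a/b/c`, `/a/d`) and both function names are hypothetical, and `ancestor_first_order` is only one conventional remedy for this class of bug, not necessarily the change that landed for this ticket.

```python
from pathlib import PurePosixPath as P

def current_rename_order(src, dst):
    """Order described in the ticket: src parent, dst parent, src child, dst child."""
    src, dst = P(src), P(dst)
    return [src.parent, dst.parent, src, dst]

def ancestor_first_order(src, dst):
    """Hypothetical alternative: if the dst parent is an ancestor of the
    src parent, lock the dst parent first so parents always precede children."""
    order = current_rename_order(src, dst)
    if order[1] in order[0].parents:
        order[0], order[1] = order[1], order[0]
    return order

# rename /a/b/c -> /a/d: the src parent (/a/b) lives under the dst parent (/a)
print([str(p) for p in current_rename_order("/a/b/c", "/a/d")])
print([str(p) for p in ancestor_first_order("/a/b/c", "/a/d")])
```

With the ticket's order, `/a/b` is locked before `/a`, while a lookup-style operation such as getattr on `/a/b` would take `/a` first and then `/a/b`. The two threads acquire the same pair of locks in opposite orders.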

      Example from a core dump:
      RES1 - dst parent
      RES2 - dst parent, PDO
      RES3 - src parent

      Thread 1 (T1), rename:
      Has RES3 (CW,0x2)
      Wants RES1 (CW,0x2)

      Thread 2 (T2), getattr:
      Has RES1 (CR,0x2)
      Has RES2 (PR,0x2)
      Wants RES3 (PR,0x2) - blocked by T1

      Thread 3 (T3), create or open|create:
      Has RES1 (CW,0x2)
      Wants RES2 (PW,0x2) - blocked by T2

      Thread 4 (T4), getattr or similar:
      Wants RES1 (PR,0x2) - blocked by T3

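      T1's request for RES1 does not conflict with any granted lock, but it sits in the waiting queue behind T4 and is therefore not granted. This closes the cycle: T1 waits for T4, T4 for T3, T3 for T2, and T2 for T1.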
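
The dump above can be replayed as a small wait-for graph. This is a simplified model, not LDLM code: it assumes the classic DLM mode-compatibility matrix, ignores the inodebits (every lock here uses 0x2), and assumes a waiter is blocked by any incompatible lock that is granted or queued ahead of it. The names `COMPAT`, `RESOURCES`, `wait_for_edges` and `find_cycle` are made up for the illustration.

```python
# Modes each requested mode is compatible with (assumed classic DLM matrix).
COMPAT = {
    "CR": {"NL", "CR", "CW", "PR", "PW"},
    "CW": {"NL", "CR", "CW"},
    "PR": {"NL", "CR", "PR"},
    "PW": {"NL", "CR"},
}

# resource -> (granted locks, waiting locks in queue order), taken from the dump
RESOURCES = {
    "RES1": ([("T2", "CR"), ("T3", "CW")], [("T4", "PR"), ("T1", "CW")]),
    "RES2": ([("T2", "PR")], [("T3", "PW")]),
    "RES3": ([("T1", "CW")], [("T2", "PR")]),
}

def wait_for_edges(resources):
    """Map each waiting thread to the set of threads it is blocked by."""
    edges = {}
    for granted, waiting in resources.values():
        for i, (waiter, mode) in enumerate(waiting):
            ahead = granted + waiting[:i]  # granted locks plus earlier waiters
            blockers = {t for t, m in ahead
                        if t != waiter and m not in COMPAT[mode]}
            edges.setdefault(waiter, set()).update(blockers)
    return edges

def find_cycle(edges, start):
    """Follow the first blocker from `start`; return the cycle if one is hit."""
    path = [start]
    while True:
        blockers = sorted(edges.get(path[-1], ()))
        if not blockers:
            return None
        if blockers[0] in path:
            return path[path.index(blockers[0]):]
        path.append(blockers[0])

edges = wait_for_edges(RESOURCES)
print(edges)
print(find_cycle(edges, "T1"))
```

In this model T1 is blocked only by T4, which is itself a waiter: the granted CR and CW locks on RES1 are both compatible with T1's CW request.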

            People

              Assignee: Hongchao Zhang (hongchao.zhang)
              Reporter: Vitaly Fertman (vitaly_fertman)