Lustre / LU-14357

Simplify locking in fid_request

Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.15.0

    Description

      The lu_client_seq contains a mutex, lcs_mutex, which is used for serialising updates to the fid, in fid_request.

      It also contains a waitq and a flag, lcs_waitq and lcs_update, which effectively provide a second mutex. This appears to add no value.  The second mutex is held while an RPC to the server to get a new fid is pending.
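
      The waitq-plus-flag pattern described above can be sketched in userspace C with pthreads. This is an illustration only, not the Lustre code: the field names lcs_mutex, lcs_waitq and lcs_update come from the description, while the struct name, the seq field, fake_seq_rpc and the helper names are hypothetical, and a pthread condition variable stands in for the kernel wait queue. It assumes the real mutex is dropped while the RPC is pending, as the description implies.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Userspace stand-in for lu_client_seq. Only the three fields named in
 * the description are modelled; everything else is hypothetical. */
struct client_seq_sketch {
	pthread_mutex_t lcs_mutex;   /* serialises updates to the fid */
	pthread_cond_t  lcs_waitq;   /* stands in for the kernel wait queue */
	int             lcs_update;  /* 1 while a request to the server is pending */
	uint64_t        seq;         /* hypothetical: current sequence number */
};

/* Hypothetical placeholder for the RPC to the server. */
static uint64_t fake_seq_rpc(uint64_t old_seq)
{
	return old_seq + 1;
}

/* "Lock" the open-coded mutex. Caller must hold lcs_mutex. */
static void seq_update_begin(struct client_seq_sketch *s)
{
	while (s->lcs_update)
		pthread_cond_wait(&s->lcs_waitq, &s->lcs_mutex);
	s->lcs_update = 1;
}

/* "Unlock" the open-coded mutex. Caller must hold lcs_mutex. */
static void seq_update_end(struct client_seq_sketch *s)
{
	assert(s->lcs_update == 1);
	s->lcs_update = 0;
	pthread_cond_broadcast(&s->lcs_waitq);
}

/* Allocate a new sequence using both the real and the open-coded mutex. */
static uint64_t alloc_seq_two_locks(struct client_seq_sketch *s)
{
	uint64_t old_seq, new_seq;

	pthread_mutex_lock(&s->lcs_mutex);
	seq_update_begin(s);
	old_seq = s->seq;
	pthread_mutex_unlock(&s->lcs_mutex);  /* real mutex dropped for the RPC */

	new_seq = fake_seq_rpc(old_seq);      /* only the flag owner gets here */

	pthread_mutex_lock(&s->lcs_mutex);
	s->seq = new_seq;
	seq_update_end(s);
	pthread_mutex_unlock(&s->lcs_mutex);
	return new_seq;
}
```

      In this sketch, any caller that finds lcs_update set sleeps until the owner clears it, so the flag and waitq exclude other allocators exactly as a mutex held across the RPC would.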

      Originally there was just the one mutex, but commit 23e2a370c8a3 ("b=24255 move seq_client_alloc_seq out of lcs_sem") added the second.  The apparent reason was that "in a case of recovery the recovery thread takes [lcs_mutex] too and deadlocks".

      This presumably refers to seq_client_flush, as that seems to be the only relevant place which takes lcs_mutex but doesn't take the new open-coded mutex.  However, this changed in commit d1feb5c774d4 ("LU-662 fix conflict between seq_client_flush and seq_client_alloc_fid"), which made seq_client_flush take the open-coded mutex as well after a new race was noticed.

      As this doesn't appear to have brought back the deadlock that was originally a concern, we must assume that the deadlock possibility has disappeared for other reasons.

      So this open-coded mutex can be removed and the code simplified.
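
      A minimal userspace sketch of the simplified scheme, assuming it is safe to hold lcs_mutex across the request to the server. This is not the actual patch: only lcs_mutex is taken from the description, while the struct name, the seq field, fake_seq_rpc, worker and run_workers are hypothetical names used for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NTHREADS 4
#define NALLOCS  1000

/* Userspace stand-in for lu_client_seq with the waitq and flag removed. */
struct client_seq_simple {
	pthread_mutex_t lcs_mutex;  /* the only lock left */
	uint64_t        seq;        /* hypothetical: current sequence number */
};

/* Hypothetical placeholder for the RPC to the server. */
static uint64_t fake_seq_rpc(uint64_t old_seq)
{
	return old_seq + 1;
}

/* Allocate a new sequence while holding lcs_mutex for the whole update,
 * including the (simulated) RPC. */
static uint64_t alloc_seq_one_lock(struct client_seq_simple *s)
{
	uint64_t new_seq;

	pthread_mutex_lock(&s->lcs_mutex);
	new_seq = fake_seq_rpc(s->seq);
	s->seq = new_seq;
	pthread_mutex_unlock(&s->lcs_mutex);
	return new_seq;
}

static void *worker(void *arg)
{
	struct client_seq_simple *s = arg;

	for (int i = 0; i < NALLOCS; i++)
		alloc_seq_one_lock(s);
	return NULL;
}

/* Start NTHREADS concurrent allocators and return the final sequence. */
static uint64_t run_workers(struct client_seq_simple *s)
{
	pthread_t tid[NTHREADS];

	for (int i = 0; i < NTHREADS; i++) {
		int rc = pthread_create(&tid[i], NULL, worker, s);
		assert(rc == 0);
	}
	for (int i = 0; i < NTHREADS; i++)
		pthread_join(tid[i], NULL);
	return s->seq;
}
```

      With a single lock there is no flag to set, no wait loop and no wake-up call, and no window in which the real mutex is dropped and retaken around the request.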

       

          People

            neilb Neil Brown