LU-17089 (Lustre): Bug in the barrier code could cause barrier freeze to fail every time

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.16.0

    Description

      The barrier code has a bug that can cause every freeze attempt to fail. barrier freeze is called before attempting a filesystem backup, but it fails repeatedly because of an issue in the mdd_trans_create() function.

      https://git.whamcloud.com/?p=fs/lustre-release.git;a=blob;f=lustre/mdd/mdd_trans.c;hb=2b0a71081d9c2465cb4b6368fede266fcde91b82#l49

      Entering the barrier increments the global counter barrier_writer, but the counter is not decremented if mdd_child_ops() returns an error. Each such failure leaves barrier_writer one higher than it should be, and the freeze cannot happen until the counter returns to 0.
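      The leak described above can be illustrated with a small standalone model. This is only a sketch of the pattern, not the Lustre source: barrier_model, handle_model, child_trans_create(), trans_create_leaky(), trans_create_balanced(), trans_stop() and barrier_can_freeze() are hypothetical names invented for the illustration, and a plain long stands in for the barrier_writer counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the barrier state; NOT the Lustre structures. */
struct barrier_model {
	long writers;	/* plays the role of the barrier_writer counter */
};

/* Hypothetical transaction handle, only used to tell success from failure. */
struct handle_model {
	int unused;
};

static void barrier_entry(struct barrier_model *b)
{
	b->writers++;
}

static void barrier_exit(struct barrier_model *b)
{
	assert(b->writers > 0);	/* an exit must always pair with an entry */
	b->writers--;
}

/* Hypothetical lower layer: returns NULL when asked to fail. */
static struct handle_model *child_trans_create(bool fail)
{
	static struct handle_model handle;

	return fail ? NULL : &handle;
}

/* Leaky shape: the error path returns without undoing barrier_entry(). */
static struct handle_model *trans_create_leaky(struct barrier_model *b,
					       bool fail)
{
	barrier_entry(b);
	return child_trans_create(fail);
}

/* Balanced shape: the error path pairs the entry with an exit. */
static struct handle_model *trans_create_balanced(struct barrier_model *b,
						  bool fail)
{
	struct handle_model *th;

	barrier_entry(b);
	th = child_trans_create(fail);
	if (th == NULL)
		barrier_exit(b);
	return th;
}

/* A successfully created transaction leaves the barrier when it stops. */
static void trans_stop(struct barrier_model *b, struct handle_model *th)
{
	(void)th;
	barrier_exit(b);
}

/* In this model a freeze can only proceed once no writers remain. */
static bool barrier_can_freeze(const struct barrier_model *b)
{
	return b->writers == 0;
}
```

      In this model, one failed call to trans_create_leaky() leaves writers at 1 permanently, so barrier_can_freeze() never returns true again. With trans_create_balanced(), the counter returns to 0 after either a failed create or a completed trans_stop().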

          People

            timday Tim Day