Possible records skipping during changelog processing when an ENOSPC occurred while writing a record

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.17.0

    Description

      Multiple threads enter mdd_changelog_write_rec(); each gets its own offset, and they then call dt_record_write() in parallel.
      Due to scheduling, a thread whose offset falls in block X+1 may run first and allocate block X+1, after which a few more threads can write to X+1.
      Another thread whose offset falls in block X then cannot allocate it because of ENOSPC.
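
      The interleaving above can be reproduced with a toy model. This is only a sketch: SparseLog, reserve(), write(), the 4096-byte block and the fixed 1024-byte record are all hypothetical names and sizes chosen for illustration, and none of it is Lustre code. It shows how reserving offsets first and allocating blocks lazily can leave block X unallocated while block X+1 holds data.

```python
import errno

BLOCK_SIZE = 4096    # assumed block size, for illustration only
RECORD_SIZE = 1024   # assumed fixed record size (real changelog records vary)


class SparseLog:
    """Toy model of a log file whose blocks are allocated on first write."""

    def __init__(self, free_blocks):
        self.free_blocks = free_blocks  # blocks the backing store can still allocate
        self.blocks = {}                # block index -> bytearray
        self.next_offset = 0            # next offset handed out to a writer

    def reserve(self):
        """Hand out the next record offset before any data is written."""
        off = self.next_offset
        self.next_offset += RECORD_SIZE
        return off

    def write(self, offset, payload):
        """Write one record, allocating its block if needed."""
        idx = offset // BLOCK_SIZE
        if idx not in self.blocks:
            if self.free_blocks == 0:
                raise OSError(errno.ENOSPC, "no space left on device")
            self.free_blocks -= 1
            self.blocks[idx] = bytearray(BLOCK_SIZE)
        start = offset % BLOCK_SIZE
        self.blocks[idx][start:start + len(payload)] = payload

    def read_block(self, idx):
        """Unallocated blocks (holes) read back as zeros."""
        return bytes(self.blocks.get(idx, bytes(BLOCK_SIZE)))


def simulate():
    log = SparseLog(free_blocks=1)               # room for exactly one more block
    offsets = [log.reserve() for _ in range(8)]  # 4 records in block 0, 4 in block 1
    # Scheduling: the writers for block 1 happen to run before those for block 0.
    order = offsets[4:] + offsets[:4]
    failed = []
    for off in order:
        try:
            log.write(off, b"R" * RECORD_SIZE)
        except OSError as e:
            if e.errno != errno.ENOSPC:
                raise
            failed.append(off)
    return log, failed


if __name__ == "__main__":
    log, failed = simulate()
    print("allocated blocks:", sorted(log.blocks))
    print("offsets that hit ENOSPC:", failed)
    print("block 0 reads as zeros:", log.read_block(0) == bytes(BLOCK_SIZE))
```

      In this model the file ends up with a hole at block 0 followed by a fully written block 1, which is the layout the reader later has to cope with.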

      This leaves the changelog file sparse. During processing, dt_read() returns a zeroed block for an offset that was never written, and llog_process_thread() skips the entire chunk (2 blocks), losing the changelog records in the second block. The chance of this happening is small, but such a regression is still possible after LU-18218.
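
      The read-side effect can be sketched the same way. The code below is a simplified illustration, not the logic of llog_process_thread() or of the merged patch: the function names, the fixed 1024-byte record and the "all-zero bytes means unwritten" test are assumptions made for the example. It contrasts a reader that abandons the whole 2-block chunk at the first zeroed record with one that skips only the zeroed block.

```python
BLOCK_SIZE = 4096    # assumed block size, for illustration only
CHUNK_BLOCKS = 2     # a chunk is 2 blocks, as in the description
RECORD_SIZE = 1024   # assumed fixed record size


def make_chunk():
    """Chunk whose first block is a hole (zeros) and whose second block has 4 records."""
    hole = bytes(BLOCK_SIZE)
    data = b"".join(bytes([i + 1]) * RECORD_SIZE for i in range(4))
    return hole + data


def is_zeroed(buf):
    """True if every byte in buf is zero."""
    return not any(buf)


def split_records(block):
    return [block[i:i + RECORD_SIZE] for i in range(0, BLOCK_SIZE, RECORD_SIZE)]


def process_chunk_naive(chunk):
    """Abandon the rest of the chunk as soon as a zeroed record is seen."""
    out = []
    for b in range(CHUNK_BLOCKS):
        block = chunk[b * BLOCK_SIZE:(b + 1) * BLOCK_SIZE]
        for rec in split_records(block):
            if is_zeroed(rec):
                return out  # everything after this point in the chunk is lost
            out.append(rec)
    return out


def process_chunk_hole_aware(chunk):
    """Skip a fully zeroed block but keep processing the following block."""
    out = []
    for b in range(CHUNK_BLOCKS):
        block = chunk[b * BLOCK_SIZE:(b + 1) * BLOCK_SIZE]
        if is_zeroed(block):
            continue  # unwritten block: nothing to read here, move on
        for rec in split_records(block):
            if is_zeroed(rec):
                break  # tail of a partially written block
            out.append(rec)
    return out


if __name__ == "__main__":
    chunk = make_chunk()
    print("naive reader recovered:", len(process_chunk_naive(chunk)))
    print("hole-aware reader recovered:", len(process_chunk_hole_aware(chunk)))
```

      The design point is the granularity of the skip: treating a zeroed block as a hole to step over, rather than as the end of the chunk, keeps the records that were successfully written after it.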

          Activity

            pjones Peter Jones added a comment -

            Merged for 2.17

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/59267/
            Subject: LU-19015 llog: logic for skipping a zeroed record
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 919c5d25fc45121466ae0ea803558039a2162538

            gerrit Gerrit Updater added a comment -

            "Alexander Boyko <alexander.boyko@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/59267
            Subject: LU-19015 llog: logic for skipping a zeroed record
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 89c2d5b51cc436311b755ba90bb3408450c719eb

            People

              aboyko Alexander Boyko
              aboyko Alexander Boyko