[LU-6556] changelog catalog corruption if all possible records are defined


    Description

      After our last Lustre upgrade, some Lustre file systems at the
      tera100 and TGCC sites hit the same corruption of the
      changelog_catalog file. The robinhood node panics as in LU-6471,
      and the crash analysis shows that the changelog_catalog file is
      corrupted. The file is larger than the maximum size for this type
      of file, and the record that triggers the panic is not in the
      right place.

          Activity

            adilger Andreas Dilger added a comment - Backporting the http://review.whamcloud.com/14912 patch also needs the http://review.whamcloud.com/17052 "LU-7329 obdclass: sync device to flush journal callbacks" patch, to avoid introducing test failures in sanity test_60a.

            jgmitter Joseph Gmitter (Inactive) added a comment - Landed to 2.8

            bfaccini Bruno Faccini (Inactive) added a comment - LU-7340 has been created to address the ChangeLogs-related concerns raised earlier and more graceful handling of ENOSPC conditions.

            adilger Andreas Dilger added a comment - Bruno, that is fine. Please file a separate bug and copy over relevant comments before closing this one, so that they are not forgotten.

            bfaccini Bruno Faccini (Inactive) added a comment - Andreas, Robert, I also think that your concerns are really good points for further ChangeLogs-related enhancements, but also that they should be addressed in a separate ticket, while this ticket could now be closed. Do you agree?

            gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14912/
            Subject: LU-6556 obdclass: re-allow catalog to wrap around
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 4691290f6d39bffaa3e463697fbc3ac351015e76


            rread Robert Read added a comment - I suggest going a step further and proactively removing stale watchers after a configurable period or when hitting a max watermark, to try to avoid running out of space. Also, being unregistered is a reasonable notification to the application that it has lost its changelog feed and needs to resync.
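
As a rough illustration of the policy suggested in the comment above, a stale-watcher check might look like the following Python sketch. This is not code from Lustre or from any patch referenced in this ticket; the `Watcher` fields, the function name `find_stale_watchers`, and both thresholds are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Watcher:
    """Hypothetical view of one registered changelog consumer."""
    name: str
    last_ack_time: float   # when the consumer last cleared records (seconds)
    last_ack_index: int    # highest record index the consumer has cleared


def find_stale_watchers(watchers: List[Watcher], now: float, newest_index: int,
                        max_idle_secs: float, max_backlog: int) -> List[Watcher]:
    """Return the watchers that a cleanup pass would unregister.

    A watcher is treated as stale if it has been idle longer than the
    configurable period, or if its backlog of uncleared records exceeds
    the watermark.
    """
    stale = []
    for w in watchers:
        idle = now - w.last_ack_time
        backlog = newest_index - w.last_ack_index
        if idle > max_idle_secs or backlog > max_backlog:
            stale.append(w)
    return stale
```

Under this sketch, a consumer that finds itself unregistered would treat that as the signal to resync, as the comment proposes.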

            People

              bfaccini Bruno Faccini (Inactive)
              apercher Antoine Percher