  1. Lustre
  2. LU-17432

add "slow start" to some CWARN/CERROR messages

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.17.0
    • Affects Version/s: Lustre 2.14.0, Lustre 2.16.0
    • Severity: 3

      In some cases an occasional error (e.g. an RPC timeout) is acceptable because it is handled transparently by RPC retry, but repeated errors on the local node or with the same peer indicate a more significant problem.

      It would be useful to re-enable some CWARN/CERROR messages that were quieted because they were too noisy, since we are now losing insight into problems on nodes that have continuous errors. There should be new variants of CERROR/CWARN that take a "skip first N messages" parameter and only then start printing to the console as normal.
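
      The "skip first N messages" idea could be sketched roughly as below. This is a minimal userspace illustration, not the libcfs implementation: the names slow_start_state, slow_start_should_print() and WARN_SLOW_START() are hypothetical, fprintf() stands in for the console, and the plain counter is not thread-safe (kernel code would presumably want an atomic counter, and possibly a way to reset it after a quiet period).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-call-site state: how many times the site was reached. */
struct slow_start_state {
	unsigned long hits;
};

/*
 * Count one hit and report whether the message should be printed.
 * The first 'skip' hits are suppressed; every later hit is printed.
 * Not thread-safe: this sketch uses a plain, non-atomic counter.
 */
static bool slow_start_should_print(struct slow_start_state *st,
				    unsigned long skip)
{
	assert(st != NULL);
	st->hits++;
	return st->hits > skip;
}

/*
 * Hypothetical "slow start" warning macro. Each expansion gets its own
 * static counter, so suppression is tracked per call site.
 * Arguments after 'skip' are a printf-style format and its values.
 */
#define WARN_SLOW_START(skip, ...)					\
do {									\
	static struct slow_start_state ss_state_;			\
	if (slow_start_should_print(&ss_state_, (skip)))		\
		fprintf(stderr, __VA_ARGS__);				\
} while (0)
```

      With this sketch, a call such as WARN_SLOW_START(3, "RPC to %s timed out\n", peer) stays silent the first three times that line is reached and prints on every hit after that, so a one-off timeout stays quiet while a persistently failing peer still shows up in the log.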

            fdilger Fred Dilger
            adilger Andreas Dilger
            Votes: 0
            Watchers: 3
