Lustre / LU-11981

lnet_is_health_check() Msg is in inconsistent state, don't perform health checking (0, 2)

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.12.4
    • Affects Version/s: Lustre 2.12.0
    • Environment: clients and routers: Lustre 2.12.0_1.chaos
      Lustre servers: Lustre 2.10.6_2.chaos

      Linux version 3.10.0-957.1.3.1chaos.ch6.x86_64
      Clients OmniPath <-> routers <-> Servers mlx5
    • Severity: 3

    Description

      Over the span of about 20 minutes, while the Lustre servers were being rebooted, the routers reported the following in their console logs:
      2019-02-19 10:05:02 [330235.278414] LNetError: 33048:0:(lib-msg.c:811:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 2)
      2019-02-19 10:05:02 [330235.294305] LNetError: 33048:0:(lib-msg.c:811:lnet_is_health_check()) Skipped 1646 previous similar messages
      (0, 2) corresponds to:
      msg->msg_ev.status == 0 (success)
      msg->msg_health_status == 2 (LNET_MSG_STATUS_LOCAL_DROPPED)
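
For anyone triaging similar console output, here is a minimal standalone sketch of how the trailing pair could be pulled out of a log line and checked for the disagreement described above. It is not the code in lib-msg.c. The helper names `parse_pair` and `pair_is_inconsistent` are hypothetical, and treating health status 0 as "OK" is an assumption; only "0 means success" for the event status and "2 means LNET_MSG_STATUS_LOCAL_DROPPED" come from this report.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* From the report: event status 0 is success, and health status 2 is
 * LNET_MSG_STATUS_LOCAL_DROPPED. HS_OK == 0 is an assumption made for
 * this sketch only. */
enum { EV_STATUS_SUCCESS = 0 };
enum { HS_OK = 0, HS_LOCAL_DROPPED = 2 };

/* Hypothetical helper: extract the trailing "(status, health)" pair
 * from a console log line. Returns true when both numbers are found. */
static bool parse_pair(const char *line, int *ev_status, int *health_status)
{
	const char *p = strrchr(line, '(');

	return p != NULL &&
	       sscanf(p, "(%d, %d)", ev_status, health_status) == 2;
}

/* Hypothetical helper: the two fields disagree when exactly one of
 * them reports a failure. */
static bool pair_is_inconsistent(int ev_status, int health_status)
{
	bool ev_failed = (ev_status != EV_STATUS_SUCCESS);
	bool health_failed = (health_status != HS_OK);

	return ev_failed != health_failed;
}
```

Under these assumptions, the line from the router console parses to the pair 0 and 2, and `pair_is_inconsistent(0, 2)` is true because the event reports success while the health status reports a local drop.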

      See https://github.com/LLNL/lustre/releases for contents of 2.12.0_1.chaos.

          People

            Assignee: Amir Shehata (ashehata, Inactive)
            Reporter: Olaf Faaland (ofaaland)