  1. Lustre
  2. LU-5629

osp_sync_interpret() ASSERTION( rc || req->rq_transno ) failed

Details

    • Bug
    • Resolution: Duplicate
    • Critical
    • None
    • Lustre 2.6.0, Lustre 2.4.2, Lustre 2.5.3
    • Lustre 2.4.2-14chaos (see github.com/chaos/lustre)
    • 3
    • 15744

    Description

      One of our MDS nodes crashed today with the following assertion:

      client.c:304:ptlrpc_at_adj_net_latency()) Reported service time 548 > total measured time 165
      osp_sync.c:355:osp_sync_interpret())  ASSERTION( rc || req->rq_transno ) failed
      

      Note that the two messages above were printed in the same second (as reported by syslog) and by the same kernel thread. I don't know if the ptlrpc_at_adj_net_latency() message is actually related to the assertion or not, but the proximity makes it worth noting.
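      To make the two logged conditions concrete, here is a minimal sketch in C. The struct and function names are hypothetical stand-ins, not Lustre code; only the two expressions are taken from the messages above. It shows that the assertion can only fail for a reply that reports success (rc == 0) and carries no transaction number, and that the warning corresponds to a reported service time larger than the total measured time.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two values the assertion reads.
 * This is NOT the real struct ptlrpc_request. */
struct sync_reply {
    int      rc;       /* result passed to the interpret callback; 0 assumed to mean success */
    uint64_t transno;  /* stands in for req->rq_transno */
};

/* Mirrors the asserted expression: rc || req->rq_transno.
 * Returns false only when rc == 0 and transno == 0. */
static bool sync_reply_passes_assertion(const struct sync_reply *r)
{
    return r->rc != 0 || r->transno != 0;
}

/* Mirrors the condition named in the warning text:
 * reported service time > total measured time. */
static bool service_time_exceeds_total(unsigned int service_time,
                                       unsigned int total_time)
{
    return service_time > total_time;
}
```

      With the values from the log, service_time_exceeds_total(548, 165) is true, which matches the warning being printed. Whether that condition has anything to do with a reply arriving with rc == 0 and no transno is exactly the open question here.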

      A couple of minutes earlier in the log, the MDS lost and then reestablished its connection to a few OSTs.

      The backtrace was:

      panic
      lbug_with_loc
      osp_sync_interpret
      ptlrpc_check_set
      ptlrpcd_check
      ptlrpcd
      kernel_thread
      

      It was running Lustre version 2.4.2-14chaos (see github.com/chaos/lustre).

      We cannot provide logs or crash dumps for this machine.

      Attachments

        1. lbugmay2.zip
          53.38 MB
        2. LU-5629-syslog.bz2
          174 kB

            People

              dmiter Dmitry Eremin (Inactive)
              morrone Christopher Morrone (Inactive)