  1. Lustre
  2. LU-13763

ptlrpc_invalidate_import()) lsrza-OST0000_UUID: Unregistering RPCs found (0). Network is sluggish? Waiting them to error out.

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.14.0, Lustre 2.12.6
    • Affects Version/s: Lustre 2.12.4
    • Environment: TOSS 3.6-3 / RH78
      in-kernel OFED
      3.10.0-1127.8.2.1chaos.ch6.x86_64
      lustre-2.12.4_6.chaos-1.ch6.x86_64
    • Severity: 3

    Description

      Console log messages following this pattern have been repeating for several days:

      LustreError: 67801:0:(import.c:361:ptlrpc_invalidate_import()) lsrza-OST0000_UUID: rc = -110 waiting for callback (1 != 0)
      LustreError: 67801:0:(import.c:387:ptlrpc_invalidate_import()) @@@ still on sending list  req@ffff8c3eb65f7500 x1669124751850560/t0(0) o4->lsrza-OST0000-osc-ffff8c44c608a000@172.21.3.5@o2ib700:6/4 lens 488/448 e 2 to 0 dl 1592847228 ref 1 fl Interpret:E/0/ffffffff rc -5/-1
      LustreError: 67801:0:(import.c:401:ptlrpc_invalidate_import()) lsrza-OST0000_UUID: Unregistering RPCs found (0). Network is sluggish? Waiting them to error out.
      

      Note that the number in the parentheses is 0. This refers to imp->imp_unregistering, an atomic variable that appears intended to track the number of RPC buffers we are waiting for the underlying network to unregister, so that we know no data will be lost. But there is still one RPC on the sending list, so why is imp_unregistering 0?
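
To spot this pattern in a long console log, a small script can pull the numbers out of the three messages quoted above and flag the case where a request is still on the sending list while the unregistering count is 0. This is only a rough sketch: the regular expressions are modeled on the sample lines in this report, and the names `summarize`, `is_inconsistent`, and the `pending` key are hypothetical helpers, not part of Lustre.

```python
import re

# Sample lines copied from the console log in this report.
SAMPLE = [
    "LustreError: 67801:0:(import.c:361:ptlrpc_invalidate_import()) lsrza-OST0000_UUID: rc = -110 waiting for callback (1 != 0)",
    "LustreError: 67801:0:(import.c:387:ptlrpc_invalidate_import()) @@@ still on sending list  req@ffff8c3eb65f7500 x1669124751850560/t0(0) o4->lsrza-OST0000-osc-ffff8c44c608a000@172.21.3.5@o2ib700:6/4 lens 488/448 e 2 to 0 dl 1592847228 ref 1 fl Interpret:E/0/ffffffff rc -5/-1",
    "LustreError: 67801:0:(import.c:401:ptlrpc_invalidate_import()) lsrza-OST0000_UUID: Unregistering RPCs found (0). Network is sluggish? Waiting them to error out.",
]

# Patterns are assumptions based only on the three messages above.
CALLBACK_RE = re.compile(r"rc = (-?\d+) waiting for callback \((\d+) != 0\)")
SENDING_RE = re.compile(r"still on sending list\s+req@(\w+)")
UNREG_RE = re.compile(r"Unregistering RPCs found \((\d+)\)")


def summarize(lines):
    """Collect the values reported by ptlrpc_invalidate_import() messages."""
    summary = {"rc": None, "pending": None, "unregistering": None, "stuck_reqs": []}
    for line in lines:
        if "ptlrpc_invalidate_import()" not in line:
            continue
        m = CALLBACK_RE.search(line)
        if m:
            summary["rc"] = int(m.group(1))
            # "pending" is the first number in "(1 != 0)".
            summary["pending"] = int(m.group(2))
        m = SENDING_RE.search(line)
        if m:
            summary["stuck_reqs"].append(m.group(1))
        m = UNREG_RE.search(line)
        if m:
            # The number in parentheses, which the report ties to imp_unregistering.
            summary["unregistering"] = int(m.group(1))
    return summary


def is_inconsistent(summary):
    """True if a request is still on the sending list but the unregistering count is 0."""
    return bool(summary["stuck_reqs"]) and summary["unregistering"] == 0


if __name__ == "__main__":
    result = summarize(SAMPLE)
    print(result)
    print("inconsistent:", is_inconsistent(result))
```

Run against the sample lines, the summary shows one stuck request alongside an unregistering count of 0, which is the mismatch this report asks about.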

      Attachments

        1. dk.rzgenie28.1594702860
          12.70 MB
        2. console.rzgenie28-20200705.gz
          121 kB
        3. console.rzgenie28-20200619.gz
          235 kB
        4. console.rzgenie28
          1.33 MB

          People

            Assignee: tappro Mikhail Pershin
            Reporter: ofaaland Olaf Faaland