Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-1517

no retry for the bulk operation

XMLWordPrintable

    • 3
    • 3998

      lnd_cb.c:558:kgnilnd_setup_phys_buffer()) failed to allocate tx_phys
      [2012-04-07 02:08:24][c5-0c0s5n2]LNet: 29099:0:(gnilnd_cb.c:1068:kgnilnd_tx_done()) $$ error -12 on tx 0xffff88000fe06b40-><?> id 0/0 state GNILND_TX_ALLOCD age 17481575s  msg@0xffff88000fe06bc0 m/v/ty/ck/pck/pl b00fbabe/8/3/0/78db/0 x0:GNILND_MSG_PUT_REQ
      [2012-04-07 02:08:24][c5-0c0s5n2]LustreError: 29099:0:(events.c:198:client_bulk_callback()) event type 0, status -5, desc ffff880627c24000
      

      The error is detected on both client and server; the server expects the client to retry but it doesn't. In the mean time, the OSS issues a lock callback to the client, but the client does not respond because it is waiting for the I/O to complete. Eventually the OSS evicts the client. Lustre does not retry the bulk op when it detects the error.

            keith Keith Mannthey (Inactive)
            aboyko Alexander Boyko
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

              Created:
              Updated:
              Resolved: