Lustre / LU-1517

no retry for the bulk operation

Details

    • 3
    • 3998

    Description

      lnd_cb.c:558:kgnilnd_setup_phys_buffer()) failed to allocate tx_phys
      [2012-04-07 02:08:24][c5-0c0s5n2]LNet: 29099:0:(gnilnd_cb.c:1068:kgnilnd_tx_done()) $$ error -12 on tx 0xffff88000fe06b40-><?> id 0/0 state GNILND_TX_ALLOCD age 17481575s  msg@0xffff88000fe06bc0 m/v/ty/ck/pck/pl b00fbabe/8/3/0/78db/0 x0:GNILND_MSG_PUT_REQ
      [2012-04-07 02:08:24][c5-0c0s5n2]LustreError: 29099:0:(events.c:198:client_bulk_callback()) event type 0, status -5, desc ffff880627c24000
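
The two status codes in the log above are negative Linux errno values. The following is a minimal sketch for decoding them, assuming a Linux host where errno 12 is ENOMEM and errno 5 is EIO. The helper name `describe` is hypothetical and is not part of Lustre or LNet.

```python
import errno
import os


def describe(status: int) -> str:
    """Map a negative kernel-style status to its symbolic errno name and message."""
    code = abs(status)
    # errno.errorcode maps numeric codes to names such as 'ENOMEM'.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{status} = -{name} ({os.strerror(code)})"


# -12 comes from kgnilnd_tx_done(), -5 from client_bulk_callback().
for status in (-12, -5):
    print(describe(status))
```

Read this way, the LND fails to allocate memory for the transmit (-ENOMEM), and the client bulk callback then reports that failure as an I/O error (-EIO).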
      

      The error is detected on both the client and the server. The server expects the client to retry the bulk transfer, but the client does not. In the meantime, the OSS issues a lock callback to the client, and the client does not respond because it is still waiting for the I/O to complete. Eventually the OSS evicts the client. In short, Lustre does not retry the bulk operation when it detects the error.
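
The behaviour the server expects can be illustrated with a bounded retry loop. This is only a sketch of the general technique, not Lustre's code or the eventual fix: `bulk_with_retry`, `make_flaky_sender`, the attempt limit, and the choice of which errors count as transient are all assumptions made for illustration.

```python
import errno
import time
from typing import Callable

# Assumption: errors treated as transient (worth retrying) in this sketch.
TRANSIENT = {errno.ENOMEM, errno.EIO}


def bulk_with_retry(send_bulk: Callable[[], int],
                    max_attempts: int = 3,
                    delay_s: float = 0.0) -> int:
    """Call send_bulk() until it succeeds, fails permanently, or attempts run out.

    send_bulk returns 0 on success or a negative errno on failure.
    The last status seen is returned to the caller.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status = 0
    for attempt in range(1, max_attempts + 1):
        status = send_bulk()
        # Stop on success or on an error that retrying cannot fix.
        if status == 0 or -status not in TRANSIENT:
            return status
        if attempt < max_attempts:
            time.sleep(delay_s)  # optional pause before the next attempt
    return status


def make_flaky_sender(failures: int) -> Callable[[], int]:
    """Hypothetical transport that fails with -ENOMEM `failures` times, then succeeds."""
    remaining = [failures]

    def send() -> int:
        if remaining[0] > 0:
            remaining[0] -= 1
            return -errno.ENOMEM
        return 0

    return send


if __name__ == "__main__":
    print(bulk_with_retry(make_flaky_sender(2)))  # recovers within three attempts
    print(bulk_with_retry(make_flaky_sender(5)))  # gives up, returns the last error
```

Bounding the number of attempts matters here: a client that retries indefinitely could still fail to answer the lock callback in time and be evicted anyway.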

            People

              keith Keith Mannthey (Inactive)
              aboyko Alexander Boyko
              Votes: 0
              Watchers: 7
