[LU-9828] LBUG ASSERTION( desc->bd_nob_transferred == 0 ) failed:

    Description

      One of the clients crashed due to the following LBUG.

      LustreError: 11818:0:(events.c:201:client_bulk_callback()) event type 2, status -103, desc ffff880827971600
      LustreError: 11840:0:(niobuf.c:329:ptlrpc_register_bulk()) ASSERTION( desc->bd_nob_transferred == 0 ) failed:
      LustreError: 11818:0:(events.c:201:client_bulk_callback()) event type 2, status -103, desc ffff880d40623400
      Lustre: yshare1-OST0023-osc-ffff882049a1c800: Connection to yshare1-OST0023 (at 172.28.8.204@o2ib1) was lost; in progress operations using this service will wait for recovery to complete
      Lustre: Skipped 21 previous similar messages
      LNet: 11818:0:(o2iblnd_cb.c:1364:kiblnd_reconnect_peer()) Abort reconnection of 172.28.8.204@o2ib1: connected
      LNet: 11818:0:(o2iblnd_cb.c:1364:kiblnd_reconnect_peer()) Skipped 1 previous similar message
      LustreError: 11840:0:(niobuf.c:329:ptlrpc_register_bulk()) LBUG
      Pid: 11840, comm: ptlrpcd_01_01
      
      Call Trace:
       [<ffffffffa0967895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
       [<ffffffffa0967e97>] lbug_with_loc+0x47/0xb0 [libcfs]
       [<ffffffffa0cae07c>] ptlrpc_register_bulk+0xfc/0x9c0 [ptlrpc]
       [<ffffffffa0985c74>] ? cfs_percpt_unlock+0x24/0xb0 [libcfs]
       [<ffffffffa0a1b7b4>] ? LNetMDUnlink+0xd4/0x160 [lnet]
       [<ffffffffa0cb5c64>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc]
       [<ffffffffa0caf5af>] ptl_send_rpc+0x1af/0xea0 [ptlrpc]
       [<ffffffffa0ce6804>] ? sptlrpc_req_refresh_ctx+0x154/0x910 [ptlrpc]
       [<ffffffffa0ca90b2>] ptlrpc_check_set+0x1462/0x1bf0 [ptlrpc]
       [<ffffffffa0cd6d83>] ptlrpcd_check+0x3d3/0x610 [ptlrpc]
       [<ffffffffa0cd7232>] ptlrpcd+0x272/0x4f0 [ptlrpc]
       [<ffffffff8106c500>] ? default_wake_function+0x0/0x20
       [<ffffffffa0cd6fc0>] ? ptlrpcd+0x0/0x4f0 [ptlrpc]
       [<ffffffff810a640e>] kthread+0x9e/0xc0
       [<ffffffff8100c28a>] child_rip+0xa/0x20
       [<ffffffff810a6370>] ? kthread+0x0/0xc0
       [<ffffffff8100c280>] ? child_rip+0x0/0x20
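      The `status -103` in the client_bulk_callback() messages is a negated Linux errno value. On Linux, 103 is ECONNABORTED ("Software caused connection abort"), which is consistent with the lost connection to the OST reported in the same log. As a small aid for reading such logs, the sketch below decodes a kernel-style status in user space; `status_to_text()` and the `STATUS_DEMO` macro are hypothetical names introduced here, not part of Lustre.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Map a kernel-style negative status (e.g. -103) to the C library's text. */
static const char *status_to_text(int status)
{
	assert(status != INT_MIN);	/* negating INT_MIN would overflow */
	return strerror(status < 0 ? -status : status);
}

#ifdef STATUS_DEMO
int main(void)
{
	/* -103 is the status reported by client_bulk_callback() in the log. */
	printf("status -103: %s\n", status_to_text(-103));
	return 0;
}
#endif
```

      Building with `-DSTATUS_DEMO` enables the small `main()`; errno numbering differs between operating systems, so the mapping of 103 applies to Linux only.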
      
      

       

          Activity

            Andriy Skulysh (c17819@cray.com) uploaded a new patch: https://review.whamcloud.com/30368
            Subject: LU-9828 ptlrpc: ASSERTION(desc->bd_nob_transferred == 0)
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: d4d5658f3cdad115c06a998e7fa91a2bd89e33dd

            askulysh Andriy Skulysh added a comment - The assertion failure can happen only during a resend vs. reply race. It is better to skip the reply and restore the assertion. I'll commit the patch.
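            The race described in this comment can be pictured with a toy model: a completion for the first attempt updates the bulk descriptor after the request has already been queued for resend, so the descriptor is no longer "clean" when it is registered again. The sketch below is only an illustration and not the Lustre code. `struct bulk_desc`, `late_bulk_completion()`, `register_bulk_for_resend()` and the `BULK_RACE_DEMO` macro are hypothetical names, and returning -EIO where the original code asserted is an assumption about how "Do not assert when bd_nob_transferred != 0" might be realized; the patches themselves are the authority on the real change.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical, heavily simplified stand-in for the ptlrpc bulk descriptor. */
struct bulk_desc {
	int bd_nob_transferred;	/* bytes accounted to the current attempt */
};

/* Models a bulk completion for the first attempt arriving late. */
static void late_bulk_completion(struct bulk_desc *desc, int nob)
{
	assert(nob >= 0);
	desc->bd_nob_transferred = nob;
}

/*
 * Models registering the descriptor again for a resend.
 * Assumption: report -EIO where the original code asserted.
 */
static int register_bulk_for_resend(struct bulk_desc *desc)
{
	if (desc->bd_nob_transferred != 0)
		return -EIO;	/* stale state left by the earlier attempt */
	return 0;
}

#ifdef BULK_RACE_DEMO
int main(void)
{
	struct bulk_desc desc = { 0 };

	/* The late completion lands first, then the resend is registered. */
	late_bulk_completion(&desc, 4096);
	printf("resend registration: %d\n", register_bulk_for_resend(&desc));
	return 0;
}
#endif
```

            The alternative proposed in the comment, skipping the stale reply so that the descriptor is never touched, would correspond to not calling `late_bulk_completion()` at all in this model, in which case a strict assertion on a zero counter could stay in place.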

            John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/28759/
            Subject: LU-9828 ptlrpc: Do not assert when bd_nob_transferred != 0
            Project: fs/lustre-release
            Branch: b2_10
            Current Patch Set:
            Commit: 39a275578e5d77d14f5b50b3c2a3fc924081e03c

            Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/28759
            Subject: LU-9828 ptlrpc: Do not assert when bd_nob_transferred != 0
            Project: fs/lustre-release
            Branch: b2_10
            Current Patch Set: 1
            Commit: 7289ae9b0767ac65323bad97471b15f735154024

            pjones Peter Jones added a comment - Landed for 2.11

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/28491/
            Subject: LU-9828 ptlrpc: Do not assert when bd_nob_transferred != 0
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: e6490ea6cf0b793c0b47f17ac5a5fa3a2a136e0d

            green Oleg Drokin added a comment - I just hit this on my testbed as well.

            Amir Shehata (amir.shehata@intel.com) uploaded a new patch: https://review.whamcloud.com/28491
            Subject: LU-9828 ptlrpc: Do not assert when bd_nob_transferred != 0
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 88b65811680989d7dbc7620716ed1fa5d9388d28

            People

              ashehata Amir Shehata (Inactive)
              mdiep Minh Diep