Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-4330

LustreError: 46336:0:(events.c:433:ptlrpc_master_callback()) ASSERTION( callback == request_out_callback || callback == reply_in_callback || callback == client_bulk_callback || callback == request_in_callback || callback == reply_out_callback ... ) failed

    XMLWordPrintable

Details

    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • Lustre 2.1.6
    • None
    • 3
    • 11850

    Description

      Hi,

      More and more compute nodes of several different customer clusters are hitting an LBUG on this 'assertion failed' issue:

      2013-11-21 14:06:54 LustreError: 46336:0:(events.c:433:ptlrpc_master_callback()) ASSERTION( callback == request_out_callback || callback == reply_in_callback || callback == client_bulk_callback || callback == request_in_callback || callback == reply_out_callback || callback == server_bulk_callback ) failed:
      2013-11-21 14:06:54 LustreError: 46336:0:(events.c:433:ptlrpc_master_callback()) LBUG
      2013-11-21 14:06:54 Nov 21 14:06:54 Pid: 46336, comm: kiblnd_sd_00
      2013-11-21 14:06:54 compute5666 kernel
      2013-11-21 14:06:54 : LustreError: 4Call Trace:
      2013-11-21 14:06:54 6336:0:(events.c [<ffffffffa041c7f5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      2013-11-21 14:06:54 :433:ptlrpc_mast [<ffffffffa041ce07>] lbug_with_loc+0x47/0xb0 [libcfs]
      2013-11-21 14:06:54 er_callback()) A [<ffffffffa06a426c>] ptlrpc_master_callback+0xcc/0xd0 [ptlrpc]
      2013-11-21 14:06:54 SSERTION( callba [<ffffffffa048ebd2>] lnet_enq_event_locked+0x62/0xd0 [lnet]
      2013-11-21 14:06:54 ck == request_ou [<ffffffffa048ecdb>] lnet_finalize+0x9b/0x2f0 [lnet]
      2013-11-21 14:06:54 t_callback || ca [<ffffffffa083d073>] kiblnd_recv+0x103/0x570 [ko2iblnd]
      2013-11-21 14:06:54 llback == reply_ [<ffffffffa04928dd>] lnet_ni_recv+0xad/0x2f0 [lnet]
      2013-11-21 14:06:54 in_callback || c [<ffffffffa0492c06>] lnet_recv_put+0xe6/0x120 [lnet]
      2013-11-21 14:06:54 allback == clien [<ffffffffa0499c33>] lnet_parse+0x1273/0x1b80 [lnet]
      2013-11-21 14:06:54 t_bulk_callback [<ffffffff81042ca3>] ? enqueue_task+0x43/0x90
      2013-11-21 14:06:54 || callback == r [<ffffffffa083d7ab>] kiblnd_handle_rx+0x2cb/0x680 [ko2iblnd]
      2013-11-21 14:06:54 equest_in_callba [<ffffffffa083e590>] kiblnd_rx_complete+0x2d0/0x440 [ko2iblnd]
      2013-11-21 14:06:54 ck || callback = [<ffffffff81042a63>] ? __wake_up+0x53/0x70
      2013-11-21 14:06:54 = reply_out_call [<ffffffffa083e762>] kiblnd_complete+0x62/0xe0 [ko2iblnd]
      2013-11-21 14:06:54 back || callback [<ffffffffa083eb19>] kiblnd_scheduler+0x339/0x7a0 [ko2iblnd]
      2013-11-21 14:06:54 == server_bulk_ [<ffffffff8104a320>] ? default_wake_function+0x0/0x20
      2013-11-21 14:06:54 callback ) faile [<ffffffffa083e7e0>] ? kiblnd_scheduler+0x0/0x7a0 [ko2iblnd]
      2013-11-21 14:06:54 d:
      2013-11-21 14:06:54 Nov 21 14:06 [<ffffffff8100412a>] child_rip+0xa/0x20
      2013-11-21 14:06:54 :54 compute5666 ke [<ffffffffa083e7e0>] ? kiblnd_scheduler+0x0/0x7a0 [ko2iblnd]
      2013-11-21 14:06:54 rnel: LustreErro [<ffffffffa083e7e0>] ? kiblnd_scheduler+0x0/0x7a0 [ko2iblnd]
      2013-11-21 14:06:54 r: 46336:0:(even [<ffffffff81004120>] ? child_rip+0x0/0x20
      2013-11-21 14:06:54 ts.c:433:ptlrpc_
      2013-11-21 14:06:54 master_callback()) LBUG
      2013-11-21 14:06:54 Nov 21 14:06:54 compute566Kernel panic - not syncing: LBUG
      

      For information, systems are running with kernel boot parameter 'tolerant=1' set.

      We have a crash dump that we will upload to ftp.

      Sebastien.

      Attachments

        Issue Links

          Activity

            People

              bfaccini Bruno Faccini (Inactive)
              sebastien.buisson Sebastien Buisson (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              13 Start watching this issue

              Dates

                Created:
                Updated: