Lustre / LU-13652

[1575337.260035] LNetError: 8719:0:(peer.c:280:lnet_destroy_peer_locked()) LBUG


Details

    • Bug
    • Resolution: Duplicate
    • Major
    • None
    • Lustre 2.12.4
    • None
    • 2

    Description

      An OSS hit an LBUG in lnet_destroy_peer_locked(), called from the lnet_discovery thread. This is the first time we have seen it.

       

       [1574769.939126] LNetError: 7420:0:(o2iblnd_cb.c:3351:kiblnd_check_txs_locked()) Timed out tx: active_txs, 1 seconds
      [1574769.972906] LNetError: 7420:0:(o2iblnd_cb.c:3426:kiblnd_check_conns()) Timed out RDMA with 10.151.11.102@o2ib (293): c: 32, oc: 0, rc: 32
      [1574968.944839] LNetError: 7420:0:(o2iblnd_cb.c:3351:kiblnd_check_txs_locked()) Timed out tx: active_txs, 1 seconds
      [1574968.978608] LNetError: 7420:0:(o2iblnd_cb.c:3351:kiblnd_check_txs_locked()) Skipped 3 previous similar messages
      [1574969.012379] LNetError: 7420:0:(o2iblnd_cb.c:3426:kiblnd_check_conns()) Timed out RDMA with 10.151.24.203@o2ib (247): c: 32, oc: 0, rc: 32
      [1574969.053585] LNetError: 7420:0:(o2iblnd_cb.c:3426:kiblnd_check_conns()) Skipped 3 previous similar messages
      [1575256.183968] Lustre: nbp8-OST0103: Connection restored to 10dfb7c6-2481-1ba8-d8c9-5458677b6b29 (at 10.151.31.52@o2ib)
      [1575256.183973] Lustre: Skipped 15281 previous similar messages
      [1575337.223394] LNetError: 8719:0:(peer.c:280:lnet_destroy_peer_locked()) ASSERTION( list_empty(&lp->lp_peer_nets) ) failed: 
      [1575337.260035] LNetError: 8719:0:(peer.c:280:lnet_destroy_peer_locked()) LBUG
      [1575337.283229] Pid: 8719, comm: lnet_discovery 3.10.0-1062.12.1.el7_lustre2124.x86_64 #1 SMP Tue Mar 17 13:32:19 PDT 2020
      [1575337.283233] Call Trace:
      [1575337.283243]  [<ffffffffc0cbd7cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [1575337.305316]  [<ffffffffc0cbd87c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [1575337.305340]  [<ffffffffc0d56a8a>] lnet_destroy_peer_locked+0x24a/0x350 [lnet]
      [1575337.305351]  [<ffffffffc0d570c5>] lnet_peer_discovery_complete+0x2a5/0x350 [lnet]
      [1575337.305361]  [<ffffffffc0d5bd20>] lnet_peer_discovery+0x6c0/0x1150 [lnet]
      [1575337.305365]  [<ffffffffb20c61f1>] kthread+0xd1/0xe0
      [1575337.305368]  [<ffffffffb278dd37>] ret_from_fork_nospec_end+0x0/0x39
      [1575337.305389]  [<ffffffffffffffff>] 0xffffffffffffffff
      [1575337.305391] Kernel panic - not syncing: LBUG
      [1575337.305393] CPU: 11 PID: 8719 Comm: lnet_discovery Kdump: loaded Tainted: G           OE  ------------   3.10.0-1062.12.1.el7_lustre2124.x86_64 #1
      [1575337.305394] Hardware name: SGI.COM SUMMIT/S2600GZ, BIOS SE5C600.86B.02.01.0002.082220131453 08/22/2013
      [1575337.305395] Call Trace:
      [1575337.305399]  [<ffffffffb277ac43>] dump_stack+0x19/0x1b
      [1575337.305402]  [<ffffffffb2774987>] panic+0xe8/0x21f
      [1575337.305408]  [<ffffffffc0cbd8cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
      [1575337.305417]  [<ffffffffc0d56a8a>] lnet_destroy_peer_locked+0x24a/0x350 [lnet]
      [1575337.305425]  [<ffffffffc0d570c5>] lnet_peer_discovery_complete+0x2a5/0x350 [lnet]
      [1575337.305434]  [<ffffffffc0d5bd20>] lnet_peer_discovery+0x6c0/0x1150 [lnet]
      [1575337.305436]  [<ffffffffb20c72e0>] ? wake_up_atomic_t+0x30/0x30
      [1575337.305444]  [<ffffffffc0d5b660>] ? lnet_peer_merge_data+0xde0/0xde0 [lnet]
      [1575337.305446]  [<ffffffffb20c61f1>] kthread+0xd1/0xe0
      [1575337.305448]  [<ffffffffb20c6120>] ? insert_kthread_work+0x40/0x40
      [1575337.305450]  [<ffffffffb278dd37>] ret_from_fork_nospec_begin+0x21/0x21
      [1575337.305452]  [<ffffffffb20c6120>] ? insert_kthread_work+0x40/0x40
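
      For triage, the LNetError lines in the log above share a regular layout, so the source location of the failing assertion can be pulled out mechanically. The following is a minimal sketch, not a Lustre tool: the regular expression is inferred only from the lines in this ticket, and the names `LNET_LINE`, `parse_lnet_error`, and `field2` are hypothetical. The second numeric field after the PID is captured but not interpreted.

```python
import re

# Layout assumed from the log lines in this ticket only:
#   [seconds.micros] LNetError: pid:N:(file.c:line:function()) message
LNET_LINE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+LNetError:\s+"
    r"(?P<pid>\d+):(?P<field2>\d+):"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s*"
    r"(?P<msg>.*)"
)


def parse_lnet_error(line):
    """Return a dict of fields for an LNetError line, or None if it does not match."""
    m = LNET_LINE.search(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["ts"] = float(rec["ts"])      # kernel timestamp in seconds
    rec["pid"] = int(rec["pid"])      # PID of the reporting thread
    rec["line"] = int(rec["line"])    # source line number
    rec["msg"] = rec["msg"].strip()
    return rec


sample = ("[1575337.260035] LNetError: "
          "8719:0:(peer.c:280:lnet_destroy_peer_locked()) LBUG")
rec = parse_lnet_error(sample)
print(rec["file"], rec["line"], rec["func"], rec["msg"])
```

      Lines that are not LNetError messages, such as the call trace entries, return None and can be skipped by the caller.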
      

            People

              ashehata Amir Shehata (Inactive)
              mhanafi Mahmoud Hanafi