Lustre / LU-2410

Interop 2.3<->2.4 Failure on test suite recovery-small test_50: unable to handle kernel NULL pointer

Details

    • Bug
    • Resolution: Duplicate
    • Minor
    • None
    • Lustre 2.4.0
    • None
    • server: 2.3 RHEL6
      client: lustre-master build #1065 RHEL6
    • 3
    • 5716

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/8788848a-3982-11e2-9fda-52540035b04c.

      The sub-test test_50 failed with the following error:

      test failed to respond and timed out

      08:25:35:LustreError: 166-1: MGC10.10.4.154@tcp: Connection to MGS (at 10.10.4.154@tcp) was lost; in progress operations using this service will fail
      08:25:46:------------[ cut here ]------------
      08:25:46:WARNING: at lib/list_debug.c:26 __list_add+0x6d/0xa0() (Not tainted)
      08:25:46:Hardware name: KVM
      08:25:46:list_add corruption. next->prev should be prev (ffff88007c6f7710), but was (null). (next=ffff88007196b340).
      08:25:46:Modules linked in: osd_ldiskfs(U) fsfilt_ldiskfs(U) ldiskfs(U) lustre(U) ofd(U) ost(U) cmm(U) mdt(U) mdd(U) mds(U) mgs(U) jbd2 obdecho(U) mgc(U) lquota(U) lov(U) osc(U) mdc(U) lmv(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) lnet(U) sha512_generic sha256_generic libcfs(U) nfs fscache nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core microcode virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
      08:25:46:Pid: 8350, comm: ll_ost_io00_017 Not tainted 2.6.32-279.5.1.el6_lustre.gb16fe80.x86_64 #1
      08:25:46:Call Trace:
      08:25:46: [<ffffffff8106b747>] ? warn_slowpath_common+0x87/0xc0
      08:25:46: [<ffffffff8106b836>] ? warn_slowpath_fmt+0x46/0x50
      08:25:46: [<ffffffff8128343d>] ? __list_add+0x6d/0xa0
      08:25:46: [<ffffffffa04f4c73>] ? lnet_md_link+0x53/0xf0 [lnet]
      08:25:46: [<ffffffffa04f59ed>] ? LNetMDBind+0x20d/0x4c0 [lnet]
      08:25:46: [<ffffffffa076edf5>] ? ptlrpc_start_bulk_transfer+0xf5/0x640 [ptlrpc]
      08:25:46: [<ffffffffa07407b0>] ? target_bulk_io+0x180/0x950 [ptlrpc]
      08:25:46: [<ffffffffa04465f1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      08:25:46: [<ffffffffa0436885>] ? cfs_waitq_init+0x15/0x20 [libcfs]
      08:25:46: [<ffffffffa0765c56>] ? new_bulk+0x106/0x210 [ptlrpc]
      08:25:46: [<ffffffffa05c9081>] ? class_export_get+0x81/0x90 [obdclass]
      08:25:47: [<ffffffffa0c2b467>] ? ost_brw_write+0x1357/0x1600 [ost]
      08:25:47: [<ffffffff8127ce76>] ? vsnprintf+0x2b6/0x5f0
      08:25:47: [<ffffffffa04465f1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      08:25:47: [<ffffffffa0c3102c>] ? ost_handle+0x360c/0x4850 [ost]
      08:25:47: [<ffffffffa04465f1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      08:25:47: [<ffffffffa04423f4>] ? libcfs_id2str+0x74/0xb0 [libcfs]
      08:25:47: [<ffffffffa0784b3c>] ? ptlrpc_server_handle_request+0x41c/0xe00 [ptlrpc]
      08:25:47: [<ffffffffa043665e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
      08:25:47: [<ffffffffa077bf37>] ? ptlrpc_wait_event+0xa7/0x2a0 [ptlrpc]
      08:25:47: [<ffffffffa04465f1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      08:25:47: [<ffffffff810533f3>] ? __wake_up+0x53/0x70
      08:25:47: [<ffffffffa0786111>] ? ptlrpc_main+0xbf1/0x19e0 [ptlrpc]
      08:25:47: [<ffffffffa0785520>] ? ptlrpc_main+0x0/0x19e0 [ptlrpc]
      08:25:47: [<ffffffff8100c14a>] ? child_rip+0xa/0x20
      08:25:47: [<ffffffffa0785520>] ? ptlrpc_main+0x0/0x19e0 [ptlrpc]
      08:25:47: [<ffffffffa0785520>] ? ptlrpc_main+0x0/0x19e0 [ptlrpc]
      08:25:47: [<ffffffff8100c140>] ? child_rip+0x0/0x20
      08:25:47:---[ end trace dc5e882009dc5dab ]---
      08:25:47:Lustre: lustre-OST0000: Received new MDS connection from 10.10.4.154@tcp, removing former export from same NID
      08:25:47:Lustre: Skipped 6 previous similar messages
      08:25:47:BUG: unable to handle kernel NULL pointer dereference at (null)
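
The `list_add corruption` warning in the log above is reported from `__list_add` at `lib/list_debug.c:26`, a consistency check on a doubly linked list insert. The following is a minimal userspace sketch of that kind of check, not the kernel source: the function name `checked_list_add` and its return value are hypothetical and exist only for illustration. It shows how an insert between `prev` and `next` can detect that `next->prev` no longer points back at `prev`, which is the condition the log reports (`next->prev should be prev ..., but was (null)`).

```c
#include <stdio.h>
#include <stddef.h>

/* Minimal doubly linked list node, modeled on the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/*
 * Hypothetical userspace analogue of a debug-checked list insert.
 * Returns the number of consistency violations found (0 means prev and
 * next were correctly linked to each other). Like a warning rather than
 * a fatal assertion, the insert is still performed after reporting.
 */
static int checked_list_add(struct list_head *new_node,
                            struct list_head *prev,
                            struct list_head *next)
{
    int violations = 0;

    /* The neighbour after the insert point must point back at prev. */
    if (next->prev != prev) {
        fprintf(stderr,
                "list_add corruption. next->prev should be prev (%p), "
                "but was %p. (next=%p).\n",
                (void *)prev, (void *)next->prev, (void *)next);
        violations++;
    }

    /* The neighbour before the insert point must point forward at next. */
    if (prev->next != next) {
        fprintf(stderr,
                "list_add corruption. prev->next should be next (%p), "
                "but was %p. (prev=%p).\n",
                (void *)next, (void *)prev->next, (void *)prev);
        violations++;
    }

    /* Splice new_node in between prev and next. */
    next->prev = new_node;
    new_node->next = next;
    new_node->prev = prev;
    prev->next = new_node;

    return violations;
}
```

In this sketch a node whose `prev` pointer has been cleared (for example because it was zeroed or freed while still linked) triggers the first check on the next insert in front of it. Whether that is what happened to the LNet MD list in this run is not established by the log excerpt alone.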
      

          People

            wc-triage WC Triage
            maloo Maloo