Lustre / LU-9110

sanity test_255a: test failed to respond and timed out

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.9.0, Lustre 2.10.0
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      This issue was created by maloo for Bob Glossman <bob.glossman@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/043a9cd8-efec-11e6-8c0d-5254006e85c2.

      The sub-test test_255a failed with the following error:

      test failed to respond and timed out
      

      This failure occurs in the same test as LU-8582 but is an entirely different failure. It was not seen in interop testing but was seen on master. The OST panicked:

      17:58:34:[ 4537.926362] ------------[ cut here ]------------
      17:58:34:[ 4537.927011] WARNING: at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0()
      17:58:34:[ 4537.927011] list_del corruption. prev->next should be ffffc9000486e010, but was           (null)
      17:58:34:[ 4537.927011] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_ssse3 sha512_generic crypto_null libcfs(OE) zfs(POE) zunicode(POE) zavl(POE) zcommon(POE) znvpair(POE) spl(OE) zlib_deflate dm_mod rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw ppdev gf128mul glue_helper ablk_helper cryptd pcspkr virtio_balloon i2c_piix4 parport_pc parport nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm crct10dif_pclmul crct10dif_common ata_piix virtio_blk crc32c_intel 8139too drm serio_raw floppy libata virtio_pci virtio_ring virtio 8139cp i2c_core mii
      17:58:34:[ 4537.927011] CPU: 1 PID: 14765 Comm: sh Tainted: P           OE  ------------   3.10.0-514.6.1.el7_lustre.x86_64 #1
      17:58:34:[ 4537.927011] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
      17:58:34:[ 4537.927011]  ffff88001dcf3c28 00000000f729102d ffff88001dcf3be0 ffffffff816863f8
      17:58:34:[ 4537.927011]  ffff88001dcf3c18 ffffffff81085940 ffffc9000486e010 ffff880079378240
      17:58:34:[ 4537.927011]  0000000000000040 0000000000000028 ffff88007b8c3200 ffff88001dcf3c80
      17:58:34:[ 4537.927011] Call Trace:
      17:58:34:[ 4537.927011]  [<ffffffff816863f8>] dump_stack+0x19/0x1b
      17:58:34:[ 4537.927011]  [<ffffffff81085940>] warn_slowpath_common+0x70/0xb0
      17:58:34:[ 4537.927011]  [<ffffffff810859dc>] warn_slowpath_fmt+0x5c/0x80
      17:58:34:[ 4537.927011]  [<ffffffff813332f1>] __list_del_entry+0xa1/0xd0
      17:58:34:[ 4537.927011]  [<ffffffff8133332d>] list_del+0xd/0x30
      17:58:34:[ 4537.927011]  [<ffffffffa0672f1d>] __spl_cache_flush+0xed/0x150 [spl]
      17:58:34:[ 4537.927011]  [<ffffffffa0673046>] spl_cache_flush+0x36/0x50 [spl]
      17:58:34:[ 4537.927011]  [<ffffffffa067371f>] spl_kmem_cache_reap_now+0x10f/0x120 [spl]
      17:58:34:[ 4537.927011]  [<ffffffffa07213c9>] arc_kmem_reap_now+0x79/0xe0 [zfs]
      17:58:34:[ 4537.927011]  [<ffffffffa0726bb7>] arc_shrinker_func+0x97/0x130 [zfs]
      17:58:34:[ 4537.927011]  [<ffffffff81194203>] shrink_slab+0x163/0x330
      17:58:34:[ 4537.927011]  [<ffffffff8121aa1b>] ? iput+0x3b/0x180
      17:58:34:[ 4537.927011]  [<ffffffff81260a23>] drop_caches_sysctl_handler+0xc3/0x120
      17:58:34:[ 4537.927011]  [<ffffffff812776e3>] proc_sys_call_handler+0xd3/0xf0
      17:58:34:[ 4537.927011]  [<ffffffff81277714>] proc_sys_write+0x14/0x20
      17:58:34:[ 4537.927011]  [<ffffffff811fe27d>] vfs_write+0xbd/0x1e0
      17:58:34:[ 4537.927011]  [<ffffffff811fed9f>] SyS_write+0x7f/0xe0
      17:58:34:[ 4537.927011]  [<ffffffff81696a09>] system_call_fastpath+0x16/0x1b
      17:58:34:[ 4537.927011] ---[ end trace 3af46e318dee0d65 ]---
      17:58:34:[ 4538.002361] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      17:58:34:[ 4538.004088] IP: [<ffffffff8133319f>] __list_add+0xf/0xc0
      17:58:34:[ 4538.004088] PGD 1517f067 PUD 88aa067 PMD 0 
      17:58:34:[ 4538.004088] Oops: 0000 [#1] SMP 
      17:58:34:[ 4538.004088] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_ssse3 sha512_generic crypto_null libcfs(OE) zfs(POE) zunicode(POE) zavl(POE) zcommon(POE) znvpair(POE) spl(OE) zlib_deflate dm_mod rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw ppdev gf128mul glue_helper ablk_helper cryptd pcspkr virtio_balloon i2c_piix4 parport_pc parport nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm crct10dif_pclmul crct10dif_common ata_piix virtio_blk crc32c_intel 8139too drm serio_raw floppy libata virtio_pci virtio_ring virtio 8139cp i2c_core mii
      17:58:34:[ 4538.004088] CPU: 1 PID: 14765 Comm: sh Tainted: P        W  OE  ------------   3.10.0-514.6.1.el7_lustre.x86_64 #1
      17:58:34:[ 4538.004088] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
      17:58:34:[ 4538.004088] task: ffff88003ed73ec0 ti: ffff88001dcf0000 task.ti: ffff88001dcf0000
      17:58:34:[ 4538.004088] RIP: 0010:[<ffffffff8133319f>]  [<ffffffff8133319f>] __list_add+0xf/0xc0
      17:58:34:[ 4538.004088] RSP: 0018:ffff88001dcf3c90  EFLAGS: 00010086
      17:58:34:[ 4538.004088] RAX: 0000000000020000 RBX: ffffc90004a6c000 RCX: 0000000000000004
      17:58:34:[ 4538.004088] RDX: 0000000000000000 RSI: ffffc90004a6c020 RDI: ffffc90004af0018
      17:58:34:[ 4538.004088] RBP: ffff88001dcf3ca8 R08: ffffc90004486018 R09: 007b8c3050400000
      17:58:34:[ 4538.004088] R10: ff6673ee92cc1410 R11: 00000000a2f6ee67 R12: 0000000000000000
      17:58:34:[ 4538.004088] R13: ffffc90004a6c020 R14: 0000000000000010 R15: ffff88007b8c3200
      17:58:34:[ 4538.004088] FS:  00007fd3a13c0740(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
      17:58:34:[ 4538.004088] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      17:58:34:[ 4538.004088] CR2: 0000000000000008 CR3: 000000003d4b7000 CR4: 00000000000406e0
      17:58:34:[ 4538.004088] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      17:58:34:[ 4538.004088] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      17:58:34:[ 4538.004088] Stack:
      17:58:34:[ 4538.004088]  ffffc90004a6c000 ffff880079378240 0000000000000020 ffff88001dcf3cf8
      17:58:34:[ 4538.004088]  ffffffffa0672ee5 0000000400001000 ffffffffa06731f2 ffff88007b8c32a0
      17:58:34:[ 4538.004088]  ffff88007b8c32b8 ffff88007b8c3200 ffff880079378240 0000000000000004
      17:58:34:[ 4538.004088] Call Trace:
      17:58:34:[ 4538.004088]  [<ffffffffa0672ee5>] __spl_cache_flush+0xb5/0x150 [spl]
      17:58:34:[ 4538.004088]  [<ffffffffa06731f2>] ? spl_slab_reclaim+0x122/0x220 [spl]
      17:58:34:[ 4538.004088]  [<ffffffffa0673046>] spl_cache_flush+0x36/0x50 [spl]
      17:58:34:[ 4538.004088]  [<ffffffffa067371f>] spl_kmem_cache_reap_now+0x10f/0x120 [spl]
      17:58:34:[ 4538.004088]  [<ffffffffa07213c9>] arc_kmem_reap_now+0x79/0xe0 [zfs]
      17:58:34:[ 4538.004088]  [<ffffffffa0726bb7>] arc_shrinker_func+0x97/0x130 [zfs]
      17:58:34:[ 4538.004088]  [<ffffffff81194203>] shrink_slab+0x163/0x330
      17:58:34:[ 4538.004088]  [<ffffffff8121aa1b>] ? iput+0x3b/0x180
      17:58:34:[ 4538.004088]  [<ffffffff81260a23>] drop_caches_sysctl_handler+0xc3/0x120
      17:58:34:[ 4538.004088]  [<ffffffff812776e3>] proc_sys_call_handler+0xd3/0xf0
      17:58:34:[ 4538.004088]  [<ffffffff81277714>] proc_sys_write+0x14/0x20
      17:58:34:[ 4538.004088]  [<ffffffff811fe27d>] vfs_write+0xbd/0x1e0
      17:58:34:[ 4538.004088]  [<ffffffff811fed9f>] SyS_write+0x7f/0xe0
      17:58:34:[ 4538.004088]  [<ffffffff81696a09>] system_call_fastpath+0x16/0x1b
      17:58:34:[ 4538.004088] Code: 48 89 df e8 d4 99 ea ff b8 f4 ff ff ff e9 3b ff ff ff b8 f4 ff ff ff e9 31 ff ff ff 55 48 89 e5 41 55 49 89 f5 41 54 49 89 d4 53 <4c> 8b 42 08 48 89 fb 49 39 f0 75 2a 4d 8b 45 00 4d 39 c4 75 68 
      17:58:34:[ 4538.004088] RIP  [<ffffffff8133319f>] __list_add+0xf/0xc0
      17:58:34:[ 4538.004088]  RSP <ffff88001dcf3c90>
      17:58:34:[ 4538.004088] CR2: 0000000000000008
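
One way to read the oops above: the faulting address (CR2) is 0x8, RDX is 0, and the marked instruction bytes `<4c> 8b 42 08` decode to a 64-bit load from `[rdx+8]`. Under the x86_64 calling convention RDX carries the third argument, which for `__list_add(new, prev, next)` is `next`, so the fault looks like a read of `next->prev` through a NULL `next`. The sketch below only checks the offset arithmetic. `ListHead` and `field_at` are illustrative names, not kernel or SPL code, and this is an interpretation of the dump rather than a confirmed root cause.

```python
import ctypes


class ListHead(ctypes.Structure):
    # Mirrors the layout of the kernel's struct list_head { next, prev } on
    # x86_64. Fixed 64-bit fields keep the layout independent of the host.
    _fields_ = [("next", ctypes.c_uint64), ("prev", ctypes.c_uint64)]


def field_at(base, fault_addr):
    """Name the ListHead field located at fault_addr, given the struct base."""
    offset = fault_addr - base
    for name, _ctype in ListHead._fields_:
        if getattr(ListHead, name).offset == offset:
            return name
    return None


RDX = 0x0  # value of RDX in the register dump (the 'next' argument)
CR2 = 0x8  # faulting address reported by the oops

print(field_at(RDX, CR2))
```

If this reading is right, the list that `__spl_cache_flush` was walking already contained a NULL link, which would be consistent with the `list_del corruption` warning logged just before the oops.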
      

      Info required for matching: sanity 255a
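
Both traces in the log above follow the same path, from a write to the drop_caches sysctl down into the SPL cache flush. A minimal sketch for pulling that call chain out of a console log is shown here. The regular expression and the `call_chain` helper are assumptions made for illustration and are not part of Lustre or Maloo tooling. The sample is an abridged copy of the second trace.

```python
import re

# Frame lines look like: "[<ffffffffa0672f1d>] __spl_cache_flush+0xed/0x150 [spl]".
# A "?" before the symbol marks a frame the unwinder considers unreliable.
# Note: "IP:" and "RIP" lines use the same shape and would also match, so
# pass only the "Call Trace:" portion of a log to this helper.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<module>\w+)\])?"
)


def call_chain(log_text, keep_unreliable=False):
    """Return (function, module) pairs, innermost frame first."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m is None:
            continue
        if m.group("unreliable") and not keep_unreliable:
            continue
        # Frames without a "[module]" suffix belong to the core kernel image.
        frames.append((m.group("func"), m.group("module") or "kernel"))
    return frames


SAMPLE = """\
17:58:34:[ 4538.004088] Call Trace:
17:58:34:[ 4538.004088]  [<ffffffffa0672ee5>] __spl_cache_flush+0xb5/0x150 [spl]
17:58:34:[ 4538.004088]  [<ffffffffa06731f2>] ? spl_slab_reclaim+0x122/0x220 [spl]
17:58:34:[ 4538.004088]  [<ffffffffa0673046>] spl_cache_flush+0x36/0x50 [spl]
17:58:34:[ 4538.004088]  [<ffffffffa0726bb7>] arc_shrinker_func+0x97/0x130 [zfs]
17:58:34:[ 4538.004088]  [<ffffffff81194203>] shrink_slab+0x163/0x330
17:58:34:[ 4538.004088]  [<ffffffff81260a23>] drop_caches_sysctl_handler+0xc3/0x120
"""

for func, module in call_chain(SAMPLE):
    print(f"{module:>6}  {func}")
```

Dropping the `?` frames by default keeps the chain limited to frames the kernel reported as reliable. Pass `keep_unreliable=True` to see every frame.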

            People

              Assignee: WC Triage (wc-triage)
              Reporter: Maloo (maloo)
              Votes: 0
              Watchers: 5

              Dates

                Created:
                Updated: