Lustre / LU-9279

coral-beta-combined build 124 kernel BUG at include/linux/scatterlist.h:65! invalid opcode: 0000 [#1] SMP


Details

    • Bug
    • Resolution: Fixed
    • Critical
    • None
    • Lustre 2.9.0
    • Lustre 2.9.0, but with special zfs: fs/zfs -b coral-betel-combined build 124
    • 1

    Description

      While running IOR, mdtest, fsx, and FileAger on four clients against two OSSes (dRAID pools with metadata segregation) and one MDS, we hit the following kernel BUG on an OSS I/O thread:

      [78289.557925] -----------[ cut here ]-----------
      [78289.564140] kernel BUG at include/linux/scatterlist.h:65!
      [78289.571153] invalid opcode: 0000 [#1] SMP
      [78289.576735] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) zfs(OE) zunicode(OE) zavl(OE) icp(OE) zcommon(OE) znvpair(OE) spl(OE) zlib_deflate lustre(OE) lmv(OE) mdc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) sha512_generic crypto_null rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache xprtrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ses dm_service_time enclosure intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd mpt3sas ipmi_ssif sb_edac iTCO_wdt ipmi_devintf iTCO_vendor_support
      [78289.665352] edac_core ioatdma ipmi_si sg pcspkr raid_class scsi_transport_sas ipmi_msghandler mei_me shpchp lpc_ich acpi_pad i2c_i801 mei acpi_power_meter mfd_core wmi dm_multipath dm_mod ip_tables ext4 mbcache jbd2 mlx4_en mlx4_ib vxlan ib_sa ip6_udp_tunnel ib_mad udp_tunnel ib_core ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt igb drm_kms_helper crct10dif_pclmul ttm crct10dif_common ptp crc32c_intel ahci pps_core libahci drm mlx4_core dca libata i2c_algo_bit i2c_core [last unloaded: zunicode]
      [78289.725227] CPU: 37 PID: 51095 Comm: ll_ost_io00_005 Tainted: G IOE ------------ 3.10.0-327.36.3.el7.x86_64 #1
      [78289.739202] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
      [78289.752574] task: ffff882001da8b80 ti: ffff882008ed8000 task.ti: ffff882008ed8000
      [78289.762680] RIP: 0010:[<ffffffffa0996fef>] [<ffffffffa0996fef>] cfs_crypto_hash_update_page+0x9f/0xb0 [libcfs]
      [78289.775760] RSP: 0018:ffff882008edbab8 EFLAGS: 00010202
      [78289.783469] RAX: 0000000000000002 RBX: ffff88068f158580 RCX: 0000000000000000
      [78289.793264] RDX: 0000000000000020 RSI: 0000000000000000 RDI: ffff882008edbad8
      [78289.803076] RBP: ffff882008edbb00 R08: 00000000000195a0 R09: ffff882008edbab8
      [78289.812907] R10: ffff88103e807900 R11: 0000000000000001 R12: 3534333231303635
      [78289.822752] R13: 0000000032313036 R14: 0000000000000433 R15: 0000000000000000
      [78289.832613] FS: 0000000000000000(0000) GS:ffff88103f0c0000(0000) knlGS:0000000000000000
      [78289.843577] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [78289.851940] CR2: 00007fa078dc8028 CR3: 000000000194a000 CR4: 00000000001407e0
      [78289.861897] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [78289.871871] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [78289.881850] Stack:
      [78289.886131] 0000000000000002 0000000000000000 0000000000000000 0000000000000000
      [78289.896576] 00000000ec45cb06 0000000000000000 ffff880bc199b201 ffff880e5e4a7e00
      [78289.907042] 0000000000000000 ffff882008edbb68 ffffffffa0dd5459 ffff880ff80640a8
      [78289.917526] Call Trace:
      [78289.922480] [<ffffffffa0dd5459>] tgt_checksum_bulk.isra.33+0x35a/0x4e7 [ptlrpc]
      [78289.932997] [<ffffffffa0dae21d>] tgt_brw_write+0x114d/0x1640 [ptlrpc]
      [78289.942464] [<ffffffff81632d15>] ? __slab_free+0x10e/0x277
      [78289.950833] [<ffffffff810c15cc>] ? update_curr+0xcc/0x150
      [78289.959081] [<ffffffff810be46e>] ? account_entity_dequeue+0xae/0xd0
      [78289.968337] [<ffffffffa0d04560>] ? target_send_reply_msg+0x170/0x170 [ptlrpc]
      [78289.978578] [<ffffffffa0daa225>] tgt_request_handle+0x915/0x1320 [ptlrpc]
      [78289.988447] [<ffffffffa0d561ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
      [78289.999139] [<ffffffffa099d128>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
      [78290.008864] [<ffffffffa0d53d68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
      [78290.018520] [<ffffffff810b8952>] ? default_wake_function+0x12/0x20
      [78290.027562] [<ffffffff810af0b8>] ? __wake_up_common+0x58/0x90
      [78290.036171] [<ffffffffa0d5a260>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
      [78290.045239] [<ffffffffa0d597c0>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
      [78290.055472] [<ffffffff810a5b8f>] kthread+0xcf/0xe0
      [78290.062884] [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
      [78290.072171] [<ffffffff81646a98>] ret_from_fork+0x58/0x90
      [78290.080149] [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
      [78290.089344] Code: 89 43 38 48 8b 43 20 ff 50 c0 48 8b 55 d8 65 48 33 14 25 28 00 00 00 75 0d 48 83 c4 28 5b 41 5c 41 5d 41 5e 5d c3 e8 61 40 6e e0 <0f> 0b 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00
      [78290.115466] RIP [<ffffffffa0996fef>] cfs_crypto_hash_update_page+0x9f/0xb0 [libcfs]
      [78290.126018] RSP <ffff882008edbab8>
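
One thing in the register dump above may be worth a closer look: R12 and R13 contain only printable byte values. The short sketch below decodes them as little-endian ASCII and checks their two low bits. It rests on assumptions the ticket does not confirm: that the BUG at scatterlist.h:65 is the low-bit alignment check in `sg_assign_page()` (reached through `sg_set_page()`), and that one of these registers held the page pointer passed to `cfs_crypto_hash_update_page()`. The helper names `as_ascii` and `low_bits_set` are made up for this illustration.

```python
# Register values copied from the oops above.
REGISTERS = {
    "R12": 0x3534333231303635,
    "R13": 0x0000000032313036,
}


def as_ascii(value: int, width: int = 8) -> str:
    """Decode a register as little-endian bytes (x86_64 memory order).

    Non-printable bytes are shown as '.'.
    """
    raw = value.to_bytes(width, "little")
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


def low_bits_set(value: int) -> bool:
    """True if either of the two low bits is set (not 4-byte aligned).

    Assumption: this mirrors the kind of check that fires the BUG; a
    valid struct page pointer would have both bits clear.
    """
    return (value & 0x3) != 0


for name, value in REGISTERS.items():
    print(f"{name} = {value:#018x} "
          f"ascii={as_ascii(value)!r} low_bits_set={low_bits_set(value)}")
```

If the assumptions hold, a page pointer that reads as text such as `56012345` would suggest that the page array handed to the checksum code was overwritten with file data (possibly a benchmark fill pattern) rather than that the crypto layer itself misbehaved. This is a hypothesis to check against the vmcore, not a diagnosis.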

      Version information:
      [root@wolf-3 10.8.1.3-2017-03-15-15:01:39]# rpm -qa |grep -i lustre
      kmod-lustre-tests-2.9.0_dirty-1.el7.centos.x86_64
      lustre-tests-2.9.0_dirty-1.el7.centos.x86_64
      lustre-osd-zfs-mount-2.9.0_dirty-1.el7.centos.x86_64
      lustre-2.9.0_dirty-1.el7.centos.x86_64
      lustre-iokit-2.9.0_dirty-1.el7.centos.x86_64
      kmod-lustre-2.9.0_dirty-1.el7.centos.x86_64
      kmod-lustre-osd-zfs-2.9.0_dirty-1.el7.centos.x86_64
      [root@wolf-3 10.8.1.3-2017-03-15-15:01:39]# rpm -qa |grep zfs
      libzfs2-0.7.0-rc3_21_g6324695.el7.centos.x86_64
      kmod-zfs-0.7.0-rc3_21_g6324695.el7.centos.x86_64
      zfs-test-0.7.0-rc3_21_g6324695.el7.centos.x86_64
      lustre-osd-zfs-mount-2.9.0_dirty-1.el7.centos.x86_64
      zfs-0.7.0-rc3_21_g6324695.el7.centos.x86_64
      kmod-lustre-osd-zfs-2.9.0_dirty-1.el7.centos.x86_64

      PID: 51095 TASK: ffff882001da8b80 CPU: 37 COMMAND: "ll_ost_io00_005"
      [ffff882008edac50] list_del at ffffffff8130c6dd
      [ffff882008edac68] __rmqueue at ffffffff8117069a
      [ffff882008edacb0] zone_statistics at ffffffff81189b89
      [ffff882008edadb0] list_del at ffffffff8130c6dd
      [ffff882008edadc8] __rmqueue at ffffffff8117069a
      [ffff882008edae10] zone_statistics at ffffffff81189b89
      [ffff882008edaed0] list_del at ffffffff8130c6dd
      [ffff882008edaee8] get_partial_node at ffffffff8163306c
      [ffff882008edaf40] __alloc_pages_nodemask at ffffffff81173327
      [ffff882008edafd8] mempool_alloc_slab at ffffffff8116c235
      [ffff882008edb070] kmem_cache_alloc at ffffffff811c1693
      [ffff882008edb0b0] mempool_alloc_slab at ffffffff8116c235
      [ffff882008edb0c0] mempool_alloc at ffffffff8116c379
      [ffff882008edb118] __blk_segment_map_sg at ffffffff812d0736
      [ffff882008edb128] update_curr at ffffffff810c15cc
      [ffff882008edb140] account_entity_dequeue at ffffffff810be46e
      [ffff882008edb168] dequeue_entity at ffffffff810c1a96
      [ffff882008edb1b0] list_del at ffffffff8130c6dd
      [ffff882008edb1e0] mga_dirty_update at ffffffffa01614e7 [mgag200]
      [ffff882008edb230] mga_imageblit at ffffffffa016162f [mgag200]
      [ffff882008edb250] bit_putcs at ffffffff81356997
      [ffff882008edb260] mga_dirty_update at ffffffffa01614e7 [mgag200]
      [ffff882008edb2b0] mga_imageblit at ffffffffa016162f [mgag200]
      [ffff882008edb330] sys_fillrect at ffffffffa00101a8 [sysfillrect]
      [ffff882008edb350] mga_dirty_update at ffffffffa01614e7 [mgag200]
      [ffff882008edb380] mga_dirty_update at ffffffffa01614e7 [mgag200]
      [ffff882008edb3e0] mga_dirty_update at ffffffffa01614e7 [mgag200]
      [ffff882008edb430] mga_imageblit at ffffffffa016162f [mgag200]
      [ffff882008edb450] bit_putcs at ffffffff81356997
      [ffff882008edb460] mga_dirty_update at ffffffffa01614e7 [mgag200]
      [ffff882008edb4b0] mga_imageblit at ffffffffa016162f [mgag200]
      [ffff882008edb530] sys_fillrect at ffffffffa00101a8 [sysfillrect]
      [ffff882008edb550] mga_dirty_update at ffffffffa01614e7 [mgag200]
      [ffff882008edb5a0] append_elf_note at ffffffff810f12e4
      [ffff882008edb5f8] crash_save_cpu at ffffffff810f2619
      [ffff882008edb700] cfs_crypto_hash_update_page at ffffffffa0996fef [libcfs]
      [ffff882008edb778] machine_kexec at ffffffff81051e9b
      [ffff882008edb7d8] crash_kexec at ffffffff810f27e2
      [ffff882008edb860] cfs_crypto_hash_update_page at ffffffffa0996fef [libcfs]
      [ffff882008edb8a8] oops_end at ffffffff8163f448
      [ffff882008edb8d0] die at ffffffff8101859b
      [ffff882008edb900] do_trap at ffffffff8163eb00
      [ffff882008edb950] do_invalid_op at ffffffff81015204
      [ffff882008edb968] cfs_crypto_hash_update_page at ffffffffa0996fef [libcfs]
      [ffff882008edb9a0] crypto_create_tfm at ffffffff812aaa08
      [ffff882008edb9d0] crypto_init_shash_ops_async at ffffffff812b2607
      [ffff882008edba00] invalid_op at ffffffff8164825e
      [ffff882008edba88] cfs_crypto_hash_update_page at ffffffffa0996fef [libcfs]
      [ffff882008edbab0] cfs_crypto_hash_update_page at ffffffffa0996f94 [libcfs]
      [ffff882008edbb08] tgt_checksum_bulk at ffffffffa0dd5459 [ptlrpc]
      [ffff882008edbb70] tgt_brw_write at ffffffffa0dae21d [ptlrpc]
      [ffff882008edbb98] __slab_free at ffffffff81632d15
      [ffff882008edbbc8] update_curr at ffffffff810c15cc
      [ffff882008edbbe0] account_entity_dequeue at ffffffff810be46e
      [ffff882008edbc88] target_bulk_timeout at ffffffffa0d04560 [ptlrpc]
      [ffff882008edbcd8] tgt_request_handle at ffffffffa0daa225 [ptlrpc]
      [ffff882008edbd20] ptlrpc_server_handle_request at ffffffffa0d561ab [ptlrpc]
      [ffff882008edbd28] lc_watchdog_touch at ffffffffa099d128 [libcfs]
      [ffff882008edbd50] ptlrpc_wait_event at ffffffffa0d53d68 [ptlrpc]
      [ffff882008edbd58] default_wake_function at ffffffff810b8952
      [ffff882008edbd68] __wake_up_common at ffffffff810af0b8
      [ffff882008edbde8] ptlrpc_main at ffffffffa0d5a260 [ptlrpc]
      [ffff882008edbea8] ptlrpc_main at ffffffffa0d597c0 [ptlrpc]
      [ffff882008edbec8] kthread at ffffffff810a5b8f
      [ffff882008edbf30] kthread at ffffffff810a5ac0
      [ffff882008edbf50] ret_from_fork at ffffffff81646a98
      [ffff882008edbf80] kthread at ffffffff810a5ac0
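
The listing above mixes live frames with stale return addresses left on the stack (the mgag200 and page allocator entries, for example, and everything written by the trap and kexec path). A rough way to narrow it down, sketched below, is to keep only entries whose stack slot is at or above the RSP reported in the oops (ffff882008edbab8). This is a heuristic under stated assumptions: the function name `live_frames` and the regular expression are illustrative, and entries above RSP can still be stale, as the `?` markers in the oops call trace show.

```python
import re

# Matches lines such as:
#   [ffff882008edbb08] tgt_checksum_bulk at ffffffffa0dd5459 [ptlrpc]
FRAME_RE = re.compile(
    r"\[(?P<sp>[0-9a-f]{16})\]\s+(?P<sym>\S+)\s+at\s+"
    r"(?P<ip>[0-9a-f]{16})(?:\s+\[(?P<mod>\w+)\])?"
)


def live_frames(bt_text, rsp):
    """Return (symbol, module) pairs whose stack slot is at or above rsp.

    Slots below rsp were not part of the faulting thread's active stack
    when the trap fired, so they are dropped.
    """
    frames = []
    for line in bt_text.splitlines():
        m = FRAME_RE.search(line)
        if m and int(m.group("sp"), 16) >= rsp:
            frames.append((m.group("sym"), m.group("mod")))
    return frames


# A few lines copied from the listing above; RSP is taken from the oops.
SAMPLE = """\
[ffff882008edb950] do_invalid_op at ffffffff81015204
[ffff882008edbab0] cfs_crypto_hash_update_page at ffffffffa0996f94 [libcfs]
[ffff882008edbb08] tgt_checksum_bulk at ffffffffa0dd5459 [ptlrpc]
[ffff882008edbb70] tgt_brw_write at ffffffffa0dae21d [ptlrpc]
"""
RSP = 0xFFFF882008EDBAB8

for sym, mod in live_frames(SAMPLE, RSP):
    print(sym, mod or "kernel")
```

Applied to the full listing, the filter leaves the same chain the oops call trace reports: `tgt_checksum_bulk`, `tgt_brw_write`, `tgt_request_handle`, `ptlrpc_server_handle_request`, `ptlrpc_main`, `kthread`, plus a handful of unreliable entries.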

            People

              utopiabound Nathaniel Clark
              jsalians_intel John Salinas (Inactive)