Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.9.0

    Description

      The logs on the OST:

      07:05:17:[25848.056002] BUG: soft lockup - CPU#0 stuck for 22s! [ll_ost_seq00_00:30636]
      07:05:17:[25848.056002] Modules linked in: lustre(OE) lod(OE) mdt(OE) mdd(OE) mgs(OE) obdecho(OE) lov(OE) osc(OE) mdc(OE) lmv(OE) ptlrpc_gss(OE) osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) ldiskfs(OE) sha512_generic crypto_null dm_mod rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xprtrdma ppdev pcspkr ib_isert virtio_balloon iscsi_target_mod i2c_piix4 parport_pc parport ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic crct10dif_common ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi virtio_blk cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ttm 8139too ata_piix drm 8139cp serio_raw mii virtio_pci virtio_ring i2c_core virtio libata floppy [last unloaded: libcfs]
      07:05:17:[25848.056002] CPU: 0 PID: 30636 Comm: ll_ost_seq00_00 Tainted: G           OE  ------------   3.10.0-327.18.2.el7_lustre.x86_64 #1
      07:05:17:[25848.056002] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
      07:05:17:[25848.056002] task: ffff880047004500 ti: ffff8800546cc000 task.ti: ffff8800546cc000
      07:05:17:[25848.056002] RIP: 0010:[<ffffffff8163d5f2>]  [<ffffffff8163d5f2>] _raw_spin_lock+0x32/0x50
      07:05:17:[25848.056002] RSP: 0018:ffff8800546cf850  EFLAGS: 00000293
      07:05:17:[25848.056002] RAX: 0000000000000fb8 RBX: ffff880079f5fea0 RCX: 0000000000000bde
      07:05:17:[25848.056002] RDX: 0000000000000b24 RSI: 0000000000000b24 RDI: ffff8800478e5ba0
      07:05:17:[25848.056002] RBP: ffff8800546cf850 R08: 4010000000000000 R09: 0079f5fea0080000
      07:05:17:[25848.056002] R10: ff680a1fdd77a802 R11: 000000000000000f R12: ffff88007bb5c340
      07:05:17:[25848.056002] R13: ffff880079f5fea0 R14: 00000000659b4173 R15: ffff88004b4648f0
      07:05:17:[25848.056002] FS:  0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
      07:05:17:[25848.056002] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      07:05:17:[25848.056002] CR2: 00007f4a800054a9 CR3: 000000000194a000 CR4: 00000000000006f0
      07:05:17:[25848.056002] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      07:05:17:[25848.056002] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      07:05:17:[25848.056002] Stack:
      07:05:17:[25848.056002]  ffff8800546cf8d8 ffffffffa015cbfc ffffffffa0698656 01ff88007922ff38
      07:05:17:[25848.056002]  0000000000008603 ffff8800478e5800 0000000000000000 00000000659b4173
      07:05:17:[25848.056002]  ffff8800546cf904 0000000000000001 ffff8800546cf940 00000000659b4173
      07:05:17:[25848.056002] Call Trace:
      07:05:17:[25848.056002]  [<ffffffffa015cbfc>] do_get_write_access+0x32c/0x4e0 [jbd2]
      07:05:17:[25848.056002]  [<ffffffffa0698656>] ? ldiskfs_getblk+0xa6/0x200 [ldiskfs]
      07:05:17:[25848.056002]  [<ffffffffa015cdd7>] jbd2_journal_get_write_access+0x27/0x40 [jbd2]
      07:05:17:[25848.056002]  [<ffffffffa064670b>] __ldiskfs_journal_get_write_access+0x3b/0x80 [ldiskfs]
      07:05:17:[25848.056002]  [<ffffffffa0c2a108>] iam_txn_add+0x28/0x60 [osd_ldiskfs]
      07:05:17:[25848.056002]  [<ffffffffa0c2c953>] iam_add_rec+0x43/0x2e0 [osd_ldiskfs]
      07:05:17:[25848.056002]  [<ffffffffa0c2b29e>] ? __iam_it_get+0x1ae/0x1b0 [osd_ldiskfs]
      07:05:17:[25848.056002]  [<ffffffffa0c2d49e>] iam_insert+0xce/0x120 [osd_ldiskfs]
      07:05:17:[25848.056002]  [<ffffffffa0c18925>] osd_index_iam_insert+0x225/0x530 [osd_ldiskfs]
      07:05:17:[25848.056002]  [<ffffffffa0967388>] fld_index_create+0x1d8/0x7a0 [fld]
      07:05:17:[25848.056002]  [<ffffffffa05e8654>] ? libcfs_log_return+0x24/0x30 [libcfs]
      07:05:17:[25848.056002]  [<ffffffffa0967bb1>] fld_insert_entry+0x261/0x370 [fld]
      07:05:17:[25848.056002]  [<ffffffffa097c9d1>] seq_server_check_and_alloc_super+0x1c1/0x310 [fid]
      07:05:17:[25848.056002]  [<ffffffffa097cb8f>] seq_server_alloc_meta+0x6f/0x5d0 [fid]
      07:05:17:[25848.056002]  [<ffffffffa097df85>] seq_handler+0x235/0x4b0 [fid]
      07:05:17:[25848.056002]  [<ffffffffa0a61f25>] tgt_request_handle+0x915/0x1320 [ptlrpc]
      07:05:17:[25848.056002]  [<ffffffffa0a0e4bb>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
      07:05:17:[25848.056002]  [<ffffffffa0a0c078>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
      07:05:17:[25848.056002]  [<ffffffffa05eb957>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
      07:05:17:[25848.056002]  [<ffffffffa0a12570>] ptlrpc_main+0xaa0/0x1dd0 [ptlrpc]
      07:05:17:[25848.056002]  [<ffffffff81013588>] ? __switch_to+0xf8/0x4b0
      07:05:17:[25848.056002]  [<ffffffffa0a11ad0>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
      07:05:17:[25848.056002]  [<ffffffff810a5acf>] kthread+0xcf/0xe0
      07:05:17:[25848.056002]  [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
      07:05:17:[25848.056002]  [<ffffffff81646318>] ret_from_fork+0x58/0x90
      07:05:17:[25848.056002]  [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
      07:05:17:[25848.056002] Code: 89 e5 b8 00 00 02 00 f0 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 02 5d c3 83 e2 fe 0f b7 f2 b8 00 80 00 00 eb 0c 0f 1f 44 00 00 f3 90 <83> e8 01 74 0a 0f b7 0f 66 39 ca 75 f1 5d c3 0f 1f 80 00 00 00 
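A trace like the one above is easier to compare against other reports once it is reduced to its call chain. The following is a minimal sketch, assuming the usual `[<address>] symbol+0xoffset/0xsize [module]` frame layout seen in this log; the helper names `parse_frames` and `call_chain` and the `vmlinux` placeholder for frames without a module are illustrative, not part of any Lustre or kernel tool. Frames that the kernel prints with a leading `?` are treated as unreliable and left out of the chain.

```python
import re

# Matches frames such as:
#   [<ffffffffa015cbfc>] do_get_write_access+0x32c/0x4e0 [jbd2]
#   [<ffffffffa0698656>] ? ldiskfs_getblk+0xa6/0x200 [ldiskfs]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\][ \t]+(?P<unreliable>\?[ \t]+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:[ \t]+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return one dict per stack frame found in log_text, in log order."""
    frames = []
    for m in FRAME_RE.finditer(log_text):
        frames.append({
            "addr": int(m.group("addr"), 16),
            "symbol": m.group("symbol"),
            "offset": int(m.group("offset"), 16),
            "size": int(m.group("size"), 16),
            # "vmlinux" is only a placeholder label for core-kernel frames.
            "module": m.group("module") or "vmlinux",
            "reliable": m.group("unreliable") is None,
        })
    return frames


def call_chain(frames):
    """Innermost-first 'symbol [module]' strings, reliable frames only."""
    return [f"{f['symbol']} [{f['module']}]" for f in frames if f["reliable"]]


# A few lines copied from the OST trace in the description.
SAMPLE = """\
[25848.056002] Call Trace:
[25848.056002]  [<ffffffffa015cbfc>] do_get_write_access+0x32c/0x4e0 [jbd2]
[25848.056002]  [<ffffffffa0698656>] ? ldiskfs_getblk+0xa6/0x200 [ldiskfs]
[25848.056002]  [<ffffffffa015cdd7>] jbd2_journal_get_write_access+0x27/0x40 [jbd2]
[25848.056002]  [<ffffffff810a5acf>] kthread+0xcf/0xe0
"""

for entry in call_chain(parse_frames(SAMPLE)):
    print(entry)
```

Running the same helper over the full OST trace and over a trace from another node gives two lists that can be diffed to see where the code paths converge.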
      

          Activity

            [LU-8327] conf_sanity test_61: soft lockup

            bfaccini Bruno Faccini (Inactive) added a comment - In case it helps: I just found this ticket and the others that have been linked or duped to it, and I wonder whether some of them might actually be related to LU-8685 instead. According to my own debugging and disassembly, the spinlock on which the affected threads are stuck, at do_get_write_access()+0x32c, is (journal_t *)->j_list_lock, so the bug and patch identified in LU-8685 could well be related.
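To repeat this kind of disassembly, the `symbol+0xoffset/0xsize` notation in the trace can be turned into absolute addresses. The sketch below uses only values copied from the OST call trace in the description; `function_bounds` is a hypothetical helper name. Note that addresses in a call trace are return addresses, so the call into `_raw_spin_lock` sits just before offset 0x32c.

```python
def function_bounds(frame_addr, offset, size):
    """Return (start, end) addresses of the function containing frame_addr."""
    start = frame_addr - offset
    return start, start + size


# do_get_write_access+0x32c/0x4e0 at ffffffffa015cbfc (OST trace)
dgwa_start, dgwa_end = function_bounds(0xffffffffa015cbfc, 0x32c, 0x4e0)

# jbd2_journal_get_write_access+0x27/0x40 at ffffffffa015cdd7 (OST trace)
jgwa_start, _ = function_bounds(0xffffffffa015cdd7, 0x27, 0x40)

print(f"do_get_write_access:           {dgwa_start:#x} .. {dgwa_end:#x}")
print(f"jbd2_journal_get_write_access: {jgwa_start:#x}")
```

As a sanity check on the arithmetic, the computed end of `do_get_write_access` equals the computed start of `jbd2_journal_get_write_access`, which is consistent with the two functions being adjacent in the jbd2 module.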

            pjones Peter Jones added a comment - Landed for 2.9

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/22738/
            Subject: LU-8327 ldiskfs: release bh in make_indexed_dir
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 8f7759cad5692b628a662b27fd60677dc806f1b7

            We hit a similar problem during mdtest.

            <format>

            [327157.963523] Lustre: scratch-MDT0000: Recovery over after 0:23, of 3 clients 3 recovered and 0 were evicted.
            [360914.220229] Lustre: scratch-MDT0000: haven't heard from client a0cd3281-9880-d337-3d07-517afc288361 (at 10.0.10.69@o2ib) in 241 seconds. I think it's dead, and I am evicting it. exp ffff881fd2af1c00, cur 1475220380 expire 1475220230 last 1475220139
            [360914.220233] Lustre: Skipped 21 previous similar messages
            [375716.231473] BUG: soft lockup - CPU#26 stuck for 23s! [mdt00_030:34083]
            [375716.238822] Modules linked in: nls_utf8 isofs ofd(OE) ost(OE) loop iptable_filter rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) ldiskfs(OE) ksocklnd(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) fid(OE) fld(OE) ptlrpc(OE) obdclass(OE) ko2iblnd(OE) lnet(OE) sha512_generic crypto_null libcfs(OE) ib_srp(OE) scsi_transport_srp(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) mlx4_en(OE) mlx4_ib(OE) ib_sa(OE) ib_mad(OE) mlx4_core(OE) dm_service_time intel_powerclamp coretemp intel_rapl dm_round_robin kvm crc32_pclmul ghash_clmulni_intel cryptd iTCO_wdt shpchp mxm_wmi iTCO_vendor_support lpc_ich mfd_core sb_edac edac_core mei_me mei i2c_i801 ioatdma pcspkr ipmi_devintf acpi_power_meter
            [375716.238855] ipmi_si ipmi_msghandler wmi acpi_pad nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_multipath ip_tables ext4 mbcache jbd2 mlx5_ib(OE) ib_core(OE) ib_addr(OE) ib_netlink(OE) sd_mod crc_t10dif crct10dif_generic ast syscopyarea sysfillrect sysimgblt drm_kms_helper crct10dif_pclmul crct10dif_common crc32c_intel ttm ahci igb drm mlx5_core(OE) libahci qla2xxx vxlan dca ip6_udp_tunnel i2c_algo_bit udp_tunnel libata i2c_core mlx_compat(OE) ptp scsi_transport_fc pps_core scsi_tgt dm_mirror dm_region_hash dm_log dm_mod sg
            [375716.238878] CPU: 26 PID: 34083 Comm: mdt00_030 Tainted: G OE ------------ 3.10.0-327.28.3.el7_lustre.2.7.18.ddn0.gd4e0769.x86_64 #1
            [375716.238879] Hardware name: Supermicro X10DDW-i/X10DDW-i, BIOS 2.0 01/11/2016
            [375716.238880] task: ffff880fd0527300 ti: ffff880fcd528000 task.ti: ffff880fcd528000
            [375716.238881] RIP: 0010:[<ffffffff8163e1a0>] [<ffffffff8163e1a0>] _raw_spin_lock+0x30/0x50
            [375716.238887] RSP: 0018:ffff880fcd52b638 EFLAGS: 00000287
            [375716.238888] RAX: 0000000000007420 RBX: ffff880fb2ba2b60 RCX: 000000000000e160
            [375716.238889] RDX: 000000000000dec6 RSI: 000000000000dec6 RDI: ffff880fc6011ba0
            [375716.238889] RBP: ffff880fcd52b638 R08: 8010000000000000 R09: 10264119c0080000
            [375716.238890] R10: efbbc2efd03e7002 R11: ffffea0040975b80 R12: ffff880fb60f4750
            [375716.238891] R13: ffff880f96f91138 R14: ffff880fb60f51a0 R15: ffff880fefb5d820
            [375716.238892] FS: 0000000000000000(0000) GS:ffff88103fb80000(0000) knlGS:0000000000000000
            [375716.238893] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
            [375716.238894] CR2: 00007f5af2025854 CR3: 000000000194e000 CR4: 00000000001407e0
            [375716.238894] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
            [375716.238895] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
            [375716.238896] Stack:
            [375716.238896] ffff880fcd52b6c0 ffffffffa0218bfc 8000000000012820 0000000000012820
            [375716.238899] 00000000267cb810 ffff880fc6011800 0000000000000000 0000000000000001
            [375716.238902] ffff880fb0c54f01 ffff880fcd52b6e0 ffffffffa1004e96 00000000e0d19ef6
            [375716.238904] Call Trace:
            [375716.238912] [<ffffffffa0218bfc>] do_get_write_access+0x32c/0x4e0 [jbd2]
            [375716.238924] [<ffffffffa1004e96>] ? ldiskfs_getblk+0xa6/0x200 [ldiskfs]
            [375716.238928] [<ffffffffa0218dd7>] jbd2_journal_get_write_access+0x27/0x40 [jbd2]
            [375716.238932] [<ffffffffa0fe11db>] __ldiskfs_journal_get_write_access+0x3b/0x80 [ldiskfs]
            [375716.238936] [<ffffffffa0219144>] ? jbd2_journal_dirty_metadata+0xd4/0x260 [jbd2]
            [375716.238951] [<ffffffffa10a32e8>] osd_ldiskfs_write_record+0xa8/0x360 [osd_ldiskfs]
            [375716.238957] [<ffffffffa10a3698>] osd_write+0xf8/0x230 [osd_ldiskfs]
            [375716.238987] [<ffffffffa0b3a295>] dt_record_write+0x45/0x130 [obdclass]
            [375716.238997] [<ffffffffa0af769f>] llog_osd_write_rec+0x72f/0x1210 [obdclass]
            [375716.239003] [<ffffffffa109a602>] ? iam_path_release+0x42/0x60 [osd_ldiskfs]
            [375716.239013] [<ffffffffa0ae7f0a>] llog_write_rec+0xaa/0x280 [obdclass]
            [375716.239023] [<ffffffffa0aebfae>] llog_cat_add_rec+0x46e/0xe00 [obdclass]
            [375716.239031] [<ffffffffa0ae514a>] llog_add+0x7a/0x1a0 [obdclass]
            [375716.239044] [<ffffffffa13c789d>] osp_sync_add_rec+0x24d/0x9a0 [osp]
            [375716.239050] [<ffffffffa1096e71>] ? osd_oi_delete+0x1a1/0x420 [osd_ldiskfs]
            [375716.239055] [<ffffffffa13cb147>] osp_sync_add+0x47/0x50 [osp]
            [375716.239059] [<ffffffffa13b7f1f>] osp_object_destroy+0x10f/0x170 [osp]
            [375716.239073] [<ffffffffa1310d87>] lod_object_destroy+0x677/0xa50 [lod]
            [375716.239084] [<ffffffffa135d2e7>] ? mdd_mark_dead_object+0x27/0x3d0 [mdd]
            [375716.239091] [<ffffffffa136a20e>] mdd_finish_unlink+0x2fe/0x460 [mdd]
            [375716.239097] [<ffffffffa136e5ed>] mdd_unlink+0x8dd/0xa90 [mdd]
            [375716.239120] [<ffffffffa122d936>] mdt_reint_unlink+0xa96/0x11f0 [mdt]
            [375716.239137] [<ffffffffa0b5699e>] ? lu_ucred+0x1e/0x30 [obdclass]
            [375716.239146] [<ffffffffa1231420>] mdt_reint_rec+0x80/0x210 [mdt]
            [375716.239155] [<ffffffffa1212299>] mdt_reint_internal+0x5d9/0xb30 [mdt]
            [375716.239164] [<ffffffffa121d237>] mdt_reint+0x67/0x140 [mdt]
            [375716.239208] [<ffffffffa0db4adb>] tgt_request_handle+0x8fb/0x11f0 [ptlrpc]
            [375716.239232] [<ffffffffa0d5797b>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
            [375716.239249] [<ffffffffa07b7d78>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
            [375716.239272] [<ffffffffa0d54a48>] ? ptlrpc_wait_event+0x98/0x330 [ptlrpc]
            [375716.239296] [<ffffffffa0d5b2a0>] ptlrpc_main+0xc00/0x1f60 [ptlrpc]
            [375716.239300] [<ffffffff81013588>] ? __switch_to+0xf8/0x4b0
            [375716.239323] [<ffffffffa0d5a6a0>] ? ptlrpc_register_service+0x1070/0x1070 [ptlrpc]
            [375716.239328] [<ffffffff810a5b2f>] kthread+0xcf/0xe0
            [375716.239330] [<ffffffff810a5a60>] ? kthread_create_on_node+0x140/0x140
            [375716.239333] [<ffffffff81646e58>] ret_from_fork+0x58/0x90
            [375716.239335] [<ffffffff810a5a60>] ? kthread_create_on_node+0x140/0x140
            [375716.239336] Code: 55 48 89 e5 b8 00 00 02 00 f0 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 02 5d c3 83 e2 fe 0f b7 f2 b8 00 80 00 00 eb 0c 0f 1f 44 00 00 <f3> 90 83 e8 01 74 0a 0f b7 0f 66 39 ca 75 f1 5d c3 0f 1f 80 00
            [375744.242810] BUG: soft lockup - CPU#26 stuck for 23s! [mdt00_030:34083]
            [375744.250127] Modules linked in: nls_utf8 isofs ofd(OE) ost(OE) loop iptable_filter rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) ldiskfs(OE) ksocklnd(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) fid(OE) fld(OE) ptlrpc(OE) obdclass(OE) ko2iblnd(OE) lnet(OE) sha512_generic crypto_null libcfs(OE) ib_srp(OE) scsi_transport_srp(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) mlx4_en(OE) mlx4_ib(OE) ib_sa(OE) ib_mad(OE) mlx4_core(OE) dm_service_time intel_powerclamp coretemp intel_rapl dm_round_robin kvm crc32_pclmul ghash_clmulni_intel cryptd iTCO_wdt shpchp mxm_wmi iTCO_vendor_support lpc_ich mfd_core sb_edac edac_core mei_me mei i2c_i801 ioatdma pcspkr ipmi_devintf acpi_power_meter
            [375744.250147] ipmi_si ipmi_msghandler wmi acpi_pad nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_multipath ip_tables ext4 mbcache jbd2 mlx5_ib(OE) ib_core(OE) ib_addr(OE) ib_netlink(OE) sd_mod crc_t10dif crct10dif_generic ast syscopyarea sysfillrect sysimgblt drm_kms_helper crct10dif_pclmul crct10dif_common crc32c_intel ttm ahci igb drm mlx5_core(OE) libahci qla2xxx vxlan dca ip6_udp_tunnel i2c_algo_bit udp_tunnel libata i2c_core mlx_compat(OE) ptp scsi_transport_fc pps_core scsi_tgt dm_mirror dm_region_hash dm_log dm_mod sg
            [375744.250162] CPU: 26 PID: 34083 Comm: mdt00_030 Tainted: G OEL ------------ 3.10.0-327.28.3.el7_lustre.2.7.18.ddn0.gd4e0769.x86_64 #1
            [375744.250163] Hardware name: Supermicro X10DDW-i/X10DDW-i, BIOS 2.0 01/11/2016
            [375744.250164] task: ffff880fd0527300 ti: ffff880fcd528000 task.ti: ffff880fcd528000
            [375744.250165] RIP: 0010:[<ffffffff8163e1a2>] [<ffffffff8163e1a2>] _raw_spin_lock+0x32/0x50
            [375744.250168] RSP: 0018:ffff880fcd52b638 EFLAGS: 00000287
            [375744.250169] RAX: 000000000000184a RBX: ffff880fb2ba2b60 RCX: 000000000000e160
            [375744.250170] RDX: 000000000000dec6 RSI: 000000000000dec6 RDI: ffff880fc6011ba0
            [375744.250170] RBP: ffff880fcd52b638 R08: 8010000000000000 R09: 10264119c0080000
            [375744.250171] R10: efbbc2efd03e7002 R11: ffffea0040975b80 R12: ffff880fb60f4750
            [375744.250172] R13: ffff880f96f91138 R14: ffff880fb60f51a0 R15: ffff880fefb5d820
            [375744.250173] FS: 0000000000000000(0000) GS:ffff88103fb80000(0000) knlGS:0000000000000000
            [375744.250173] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
            [375744.250174] CR2: 00007f5af2025854 CR3: 000000000194e000 CR4: 00000000001407e0
            [375744.250175] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
            [375744.250176] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
            [375744.250176] Stack:
            [375744.250177] ffff880fcd52b6c0 ffffffffa0218bfc 8000000000012820 0000000000012820
            [375744.250180] 00000000267cb810 ffff880fc6011800 0000000000000000 0000000000000001
            [375744.250182] ffff880fb0c54f01 ffff880fcd52b6e0 ffffffffa1004e96 00000000e0d19ef6
            [375744.250185] Call Trace:
            [375744.250191] [<ffffffffa0218bfc>] do_get_write_access+0x32c/0x4e0 [jbd2]
            [375744.250198] [<ffffffffa1004e96>] ? ldiskfs_getblk+0xa6/0x200 [ldiskfs]
            [375744.250203] [<ffffffffa0218dd7>] jbd2_journal_get_write_access+0x27/0x40 [jbd2]
            [375744.250207] [<ffffffffa0fe11db>] __ldiskfs_journal_get_write_access+0x3b/0x80 [ldiskfs]
            [375744.250211] [<ffffffffa0219144>] ? jbd2_journal_dirty_metadata+0xd4/0x260 [jbd2]
            [375744.250219] [<ffffffffa10a32e8>] osd_ldiskfs_write_record+0xa8/0x360 [osd_ldiskfs]
            [375744.250225] [<ffffffffa10a3698>] osd_write+0xf8/0x230 [osd_ldiskfs]
            [375744.250240] [<ffffffffa0b3a295>] dt_record_write+0x45/0x130 [obdclass]
            [375744.250250] [<ffffffffa0af769f>] llog_osd_write_rec+0x72f/0x1210 [obdclass]
            [375744.250256] [<ffffffffa109a602>] ? iam_path_release+0x42/0x60 [osd_ldiskfs]
            [375744.250266] [<ffffffffa0ae7f0a>] llog_write_rec+0xaa/0x280 [obdclass]
            [375744.250275] [<ffffffffa0aebfae>] llog_cat_add_rec+0x46e/0xe00 [obdclass]
            [375744.250283] [<ffffffffa0ae514a>] llog_add+0x7a/0x1a0 [obdclass]
            [375744.250288] [<ffffffffa13c789d>] osp_sync_add_rec+0x24d/0x9a0 [osp]
            [375744.250294] [<ffffffffa1096e71>] ? osd_oi_delete+0x1a1/0x420 [osd_ldiskfs]
            [375744.250299] [<ffffffffa13cb147>] osp_sync_add+0x47/0x50 [osp]
            [375744.250302] [<ffffffffa13b7f1f>] osp_object_destroy+0x10f/0x170 [osp]
            [375744.250311] [<ffffffffa1310d87>] lod_object_destroy+0x677/0xa50 [lod]
            [375744.250316] [<ffffffffa135d2e7>] ? mdd_mark_dead_object+0x27/0x3d0 [mdd]
            [375744.250321] [<ffffffffa136a20e>] mdd_finish_unlink+0x2fe/0x460 [mdd]
            [375744.250325] [<ffffffffa136e5ed>] mdd_unlink+0x8dd/0xa90 [mdd]
            [375744.250334] [<ffffffffa122d936>] mdt_reint_unlink+0xa96/0x11f0 [mdt]
            [375744.250347] [<ffffffffa0b5699e>] ? lu_ucred+0x1e/0x30 [obdclass]
            [375744.250355] [<ffffffffa1231420>] mdt_reint_rec+0x80/0x210 [mdt]
            [375744.250361] [<ffffffffa1212299>] mdt_reint_internal+0x5d9/0xb30 [mdt]
            [375744.250367] [<ffffffffa121d237>] mdt_reint+0x67/0x140 [mdt]
            [375744.250390] [<ffffffffa0db4adb>] tgt_request_handle+0x8fb/0x11f0 [ptlrpc]
            [375744.250410] [<ffffffffa0d5797b>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
            [375744.250421] [<ffffffffa07b7d78>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
            [375744.250439] [<ffffffffa0d54a48>] ? ptlrpc_wait_event+0x98/0x330 [ptlrpc]
            [375744.250457] [<ffffffffa0d5b2a0>] ptlrpc_main+0xc00/0x1f60 [ptlrpc]
            [375744.250460] [<ffffffff81013588>] ? __switch_to+0xf8/0x4b0
            [375744.250478] [<ffffffffa0d5a6a0>] ? ptlrpc_register_service+0x1070/0x1070 [ptlrpc]
            [375744.250480] [<ffffffff810a5b2f>] kthread+0xcf/0xe0
            [375744.250483] [<ffffffff810a5a60>] ? kthread_create_on_node+0x140/0x140
            [375744.250485] [<ffffffff81646e58>] ret_from_fork+0x58/0x90
            [375744.250487] [<ffffffff810a5a60>] ? kthread_create_on_node+0x140/0x140
            [375744.250488] Code: 89 e5 b8 00 00 02 00 f0 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 02 5d c3 83 e2 fe 0f b7 f2 b8 00 80 00 00 eb 0c 0f 1f 44 00 00 f3 90 <83> e8 01 74 0a 0f b7 0f 66 39 ca 75 f1 5d c3 0f 1f 80 00 00 00
            [375750.710430] INFO: rcu_sched self-detected stall on CPU
            [375750.713449] INFO: rcu_sched detected stalls on CPUs/tasks:
            [375750.713449] {
            [375750.713450] 26

            </format>
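The log above reports the same thread twice. A minimal sketch for grouping `BUG: soft lockup` lines by CPU and task, assuming the message format shown in this log (`lockup_events` is an illustrative helper name), makes that kind of repetition easy to spot in longer console captures:

```python
import re
from collections import defaultdict

# Matches e.g.:
#   [375716.231473] BUG: soft lockup - CPU#26 stuck for 23s! [mdt00_030:34083]
LOCKUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] BUG: soft lockup - CPU#(?P<cpu>\d+) "
    r"stuck for (?P<secs>\d+)s! \[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def lockup_events(log_text):
    """Map (cpu, task name, pid) to the list of report timestamps."""
    events = defaultdict(list)
    for m in LOCKUP_RE.finditer(log_text):
        key = (int(m.group("cpu")), m.group("comm"), int(m.group("pid")))
        events[key].append(float(m.group("ts")))
    return dict(events)


# The two report headers copied from the MDS log above.
SAMPLE = """\
[375716.231473] BUG: soft lockup - CPU#26 stuck for 23s! [mdt00_030:34083]
[375744.242810] BUG: soft lockup - CPU#26 stuck for 23s! [mdt00_030:34083]
"""

for (cpu, comm, pid), stamps in lockup_events(SAMPLE).items():
    span = stamps[-1] - stamps[0]
    print(f"CPU#{cpu} {comm}:{pid}: {len(stamps)} reports over {span:.1f}s")
```

Here both reports come from mdt00_030 (PID 34083) on CPU#26, about 28 seconds apart, with identical call traces, so the thread never left the spin loop between the two watchdog reports.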

            wangshilong Wang Shilong (Inactive) added a comment - We hit a similar problem during mdtest (soft lockup log quoted above).

            gerrit Gerrit Updater added a comment -

            Yang Sheng (yang.sheng@intel.com) uploaded a new patch: http://review.whamcloud.com/22738
            Subject: LU-8327 ldiskfs: release bh in make_indexed_dir
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: df68010b3459742dfb137b75eb6dd82beef911d1

            yong.fan nasf (Inactive) added a comment -

            All available logs can be accessed via the Maloo test results:
            https://testing.hpdd.intel.com/test_sets/bcc051cc-3a33-11e6-acf3-5254006e85c2

            cheneva1 Evan D. Chen (Inactive) added a comment -

            Assign this to Alex.

            Fan, do you have crash dumps to share with Alex? Thanks!

            green Oleg Drokin added a comment -

            The rhel7.2 kernel has a bug where virtualized spinlocks don't work correctly.

            Not saying this is the case here, but it is something to keep in mind.

            Fixes from upstream:
            4f9d1382e6f8 ("x86/spinlock: Replace ACCESS_ONCE with READ_ONCE")
            78bff1c8684f ("x86/ticketlock: Fix spin_unlock_wait() livelock")
            d6abfdb20223 ("x86/spinlocks/paravirt: Fix memory corruption on unlock")

            yong.fan nasf (Inactive) added a comment - https://testing.hpdd.intel.com/test_sets/bcc051cc-3a33-11e6-acf3-5254006e85c2

            bzzz Alex Zhuravlev added a comment -

            I think this has been seen before. Did you get it locally or with Maloo?

            People

              ys Yang Sheng
              yong.fan nasf (Inactive)
              Votes: 0
              Watchers: 10

              Dates

                Created:
                Updated:
                Resolved: