Details
Bug
Resolution: Fixed
Major
Description
I ran a debug kernel built from the latest master under autotest, and sanity-quota fell apart with multiple sleeping-under-spinlock problems, culminating in a spinlock being taken twice on the same CPU:
08:09:46:[10054.232995] BUG: sleeping function called from invalid context at mm/slab.c:3054
08:09:46:[10054.237505] in_atomic(): 1, irqs_disabled(): 0, pid: 25313, name: mdt00_003
08:09:46:[10054.241875] CPU: 0 PID: 25313 Comm: mdt00_003 Tainted: G W OE ------------ 3.10.0-327.22.2.el7_lustre.x86_64 #1
08:09:46:[10054.243752] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
08:09:46:[10054.245245] ffff8800399ede50 000000001b0e0c50 ffff88003cec7a28 ffffffff8164bed6
08:09:46:[10054.246953] ffff88003cec7a38 ffffffff810b5639 ffff88003cec7ad0 ffffffff811cb595
08:09:46:[10054.248630] ffffffffa016233f 0000000000000046 ffff880027ae7410 ffffffffa0bfcfb5
08:09:46:[10054.250317] Call Trace:
08:09:46:[10054.251601] [<ffffffff8164bed6>] dump_stack+0x19/0x1b
08:09:46:[10054.253108] [<ffffffff810b5639>] __might_sleep+0xd9/0x100
08:09:46:[10054.254619] [<ffffffff811cb595>] kmem_cache_alloc_trace+0x65/0x630
08:09:46:[10054.256179] [<ffffffffa016233f>] ? jbd2_journal_stop+0x1ef/0x400 [jbd2]
08:09:46:[10054.257791] [<ffffffffa0bfcfb5>] ? qmt_glimpse_lock+0x155/0x780 [lquota]
08:09:46:[10054.259396] [<ffffffff810bcab6>] ? try_to_wake_up+0x1b6/0x320
08:09:46:[10054.260937] [<ffffffffa0bfcfb5>] qmt_glimpse_lock+0x155/0x780 [lquota]
08:09:46:[10054.262581] [<ffffffffa0c00a2f>] qmt_glb_lock_notify+0x12f/0x310 [lquota]
08:09:46:[10054.264180] [<ffffffffa0bfae19>] qmt_set.constprop.14+0x4d9/0x700 [lquota]
08:09:46:[10054.265796] [<ffffffffa0bfb1fe>] qmt_quotactl+0x1be/0x630 [lquota]
08:09:46:[10054.267394] [<ffffffffa0dde014>] mdt_quotactl+0x514/0x610 [mdt]
08:09:46:[10054.269032] [<ffffffffa0a8b7e5>] tgt_request_handle+0x925/0x1330 [ptlrpc]
08:09:46:[10054.270655] [<ffffffffa0a3924e>] ptlrpc_server_handle_request+0x22e/0xaa0 [ptlrpc]
08:09:46:[10054.272376] [<ffffffffa0a37aee>] ? ptlrpc_wait_event+0xae/0x350 [ptlrpc]
08:09:46:[10054.273983] [<ffffffff810bcc92>] ? default_wake_function+0x12/0x20
08:09:46:[10054.275549] [<ffffffff810b2cd8>] ? __wake_up_common+0x58/0x90
08:09:46:[10054.277121] [<ffffffffa0a3d018>] ptlrpc_main+0xa58/0x1db0 [ptlrpc]
08:09:46:[10054.278700] [<ffffffffa0a3c5c0>] ? ptlrpc_register_service+0xe60/0xe60 [ptlrpc]
08:09:46:[10054.280352] [<ffffffff810a8a24>] kthread+0xe4/0xf0
08:09:46:[10054.281827] [<ffffffff810a8940>] ? kthread_create_on_node+0x140/0x140
08:09:46:[10054.283415] [<ffffffff8165d3d8>] ret_from_fork+0x58/0x90
08:09:46:[10054.284925] [<ffffffff810a8940>] ? kthread_create_on_node+0x140/0x140
08:09:46:[10058.183002] BUG: scheduling while atomic: qmt_reba_lustre/24447/0x10000002
08:09:46:[10058.184521] Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_generic crypto_null libcfs(OE) ldiskfs(OE) dm_mod rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic crct10dif_common ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ppdev pcspkr virtio_balloon i2c_piix4 parport_pc parport nfsd nfs_acl lockd auth_rpcgss grace sunrpc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi virtio_blk cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ttm 8139too drm ata_piix i2c_core serio_raw virtio_pci virtio_ring virtio libata 8139cp mii floppy
08:09:46:[10058.198939] CPU: 0 PID: 24447 Comm: qmt_reba_lustre Tainted: G W OE ------------ 3.10.0-327.22.2.el7_lustre.x86_64 #1
08:09:46:[10058.202211] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
08:09:46:[10058.203895] ffff880027ae3fd8 00000000444ee29f ffff880027ae3c20 ffffffff8164bed6
08:09:46:[10058.205776] ffff880027ae3c30 ffffffff81648241 ffff880027ae3c90 ffffffff8165223c
08:09:46:[10058.207645] ffff880027ae7410 ffff880027ae3fd8 ffff880027ae3fd8 ffff880027ae3fd8
08:09:46:[10058.209514] Call Trace:
08:09:46:[10058.210983] [<ffffffff8164bed6>] dump_stack+0x19/0x1b
08:09:46:[10058.212645] [<ffffffff81648241>] __schedule_bug+0x4d/0x5b
08:09:46:[10058.214325] [<ffffffff8165223c>] __schedule+0x7bc/0x900
08:09:46:[10058.215984] [<ffffffff810b9ce6>] __cond_resched+0x26/0x30
08:09:46:[10058.217643] [<ffffffff8165264a>] _cond_resched+0x3a/0x50
08:09:46:[10058.219298] [<ffffffff811cb59a>] kmem_cache_alloc_trace+0x6a/0x630
08:09:46:[10058.221019] [<ffffffffa0bfcfb5>] ? qmt_glimpse_lock+0x155/0x780 [lquota]
08:09:46:[10058.222777] [<ffffffffa0bfcfb5>] qmt_glimpse_lock+0x155/0x780 [lquota]
08:09:46:[10058.224528] [<ffffffffa0bfdcf5>] qmt_reba_thread+0x715/0xc90 [lquota]
08:09:46:[10058.226260] [<ffffffff810bcc80>] ? wake_up_state+0x20/0x20
08:09:46:[10058.227914] [<ffffffffa0bfd5e0>] ? qmt_glimpse_lock+0x780/0x780 [lquota]
08:09:46:[10058.229668] [<ffffffff810a8a24>] kthread+0xe4/0xf0
08:09:46:[10058.231276] [<ffffffff810a8940>] ? kthread_create_on_node+0x140/0x140
08:09:46:[10058.232990] [<ffffffff8165d3d8>] ret_from_fork+0x58/0x90
08:09:46:[10058.234594] [<ffffffff810a8940>] ? kthread_create_on_node+0x140/0x140
08:09:46:[10058.240296] BUG: spinlock cpu recursion on CPU#0, ldlm_cn00_002/17112
08:09:46:[10058.242027] lock: 0xffff880036e6a4a0, .magic: dead4ead, .owner: qmt_reba_lustre/24447, .owner_cpu: 0
08:10:08:[10058.243896] CPU: 0 PID: 17112 Comm: ldlm_cn00_002 Tainted: G W OE ------------ 3.10.0-327.22.2.el7_lustre.x86_64 #1
08:10:08:[10058.247148] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
08:10:08:[10058.248814] ffff880027ae7410 00000000d868321e ffff88002a1f7b98 ffffffff8164bed6
08:10:08:[10058.250668] ffff88002a1f7bb8 ffffffff8164bf64 ffff880036e6a4a0 ffffffff818b3096
08:10:08:[10058.252541] ffff88002a1f7bd8 ffffffff8164bf8a ffff880036e6a4a0 0000000000000000
08:10:08:[10058.254397] Call Trace:
08:10:08:[10058.255848] [<ffffffff8164bed6>] dump_stack+0x19/0x1b
08:10:08:[10058.257488] [<ffffffff8164bf64>] spin_dump+0x8c/0x91
08:10:08:[10058.259116] [<ffffffff8164bf8a>] spin_bug+0x21/0x26
08:10:08:[10058.260725] [<ffffffff8131c008>] do_raw_spin_lock+0x118/0x170
08:10:08:[10058.262421] [<ffffffff8165413e>] _raw_spin_lock+0x1e/0x20
08:10:08:[10058.264094] [<ffffffffa09d902c>] lock_res_and_lock+0x2c/0x50 [ptlrpc]
08:10:08:[10058.265831] [<ffffffffa09e15dd>] ldlm_lock_cancel+0x2d/0x1e0 [ptlrpc]
08:10:08:[10058.267560] [<ffffffffa0a06251>] ldlm_request_cancel+0x151/0x710 [ptlrpc]
08:10:08:[10058.269316] [<ffffffffa0a09b4a>] ldlm_handle_cancel+0xba/0x250 [ptlrpc]
08:10:08:[10058.271051] [<ffffffffa0a09e21>] ldlm_cancel_handler+0x141/0x490 [ptlrpc]
08:10:08:[10058.272791] [<ffffffffa0a3924e>] ptlrpc_server_handle_request+0x22e/0xaa0 [ptlrpc]
08:10:08:[10058.274534] [<ffffffffa0a37aee>] ? ptlrpc_wait_event+0xae/0x350 [ptlrpc]
08:10:08:[10058.276176] [<ffffffff810bcc92>] ? default_wake_function+0x12/0x20
08:10:08:[10058.277743] [<ffffffff810b2cd8>] ? __wake_up_common+0x58/0x90
08:10:08:[10058.279278] [<ffffffffa0a3d018>] ptlrpc_main+0xa58/0x1db0 [ptlrpc]
08:10:08:[10058.280804] [<ffffffffa0a3c5c0>] ? ptlrpc_register_service+0xe60/0xe60 [ptlrpc]
08:10:08:[10058.282370] [<ffffffff810a8a24>] kthread+0xe4/0xf0
08:10:08:[10058.283727] [<ffffffff810a8940>] ? kthread_create_on_node+0x140/0x140
08:10:08:[10058.285182] [<ffffffff8165d3d8>] ret_from_fork+0x58/0x90
08:10:08:[10058.286542] [<ffffffff810a8940>] ? kthread_create_on_node+0x140/0x140
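In the first two traces, qmt_glimpse_lock reaches kmem_cache_alloc_trace while in_atomic() is 1, and the third trace reports the lock as still owned by qmt_reba_lustre/24447, the thread from the second trace. The following is a minimal userspace sketch of that pattern, not Lustre or kernel code: every name in it (model_spin_lock, model_alloc_may_sleep, notify_buggy, notify_prealloc and so on) is hypothetical, and moving the allocation ahead of the lock is only one common way to avoid this class of bug, not necessarily what the landed patch does.

```c
#include <stdlib.h>

/*
 * Hypothetical userspace stand-ins for kernel primitives.
 * None of these are real kernel or Lustre APIs.
 */
static int atomic_depth;      /* models in_atomic(): > 0 while a spinlock is held */
static int sleep_violations;  /* counts "sleeping function called from invalid context" */

static void model_spin_lock(void)   { atomic_depth++; }
static void model_spin_unlock(void) { atomic_depth--; }

/* Models __might_sleep(): record a violation when called in atomic context. */
static void model_might_sleep(void)
{
    if (atomic_depth > 0)
        sleep_violations++;
}

/* Models an allocation that is allowed to sleep. */
static void *model_alloc_may_sleep(size_t size)
{
    model_might_sleep();
    return malloc(size);
}

/* Problem ordering: the sleeping allocation runs while the lock is held. */
int notify_buggy(void)
{
    void *work;

    model_spin_lock();
    work = model_alloc_may_sleep(64);
    model_spin_unlock();

    if (work == NULL)
        return -1;
    free(work);
    return 0;
}

/* Safer ordering: allocate first, then take the lock only to use the buffer. */
int notify_prealloc(void)
{
    void *work = model_alloc_may_sleep(64);

    if (work == NULL)
        return -1;

    model_spin_lock();
    /* ... fill in 'work' from the lock-protected state here ... */
    model_spin_unlock();

    free(work);
    return 0;
}
```

In this model, calling notify_buggy() increments sleep_violations once, while notify_prealloc() leaves it unchanged.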
Full report is at https://testing.hpdd.intel.com/test_sets/c73d6a92-5e4e-11e6-b5b1-5254006e85c2
Landed for 2.10