Details
-
Bug
-
Resolution: Fixed
-
Minor
-
Lustre 2.7.0, Lustre 2.12.0, Lustre 2.13.0, Lustre 2.10.6
-
None
-
kernel 3.10.0-514.26.2.el7_lustre.2.7.21.1.ddn4.g3b21639.x86_64
lustre 2.7.21.3-ddn36
Description
I think this may be a deadlock on the OSS. Many threads were blocked while there was no I/O activity.
[root@foss22 ~]# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r   b swpd    free    buff    cache si so  bi  bo    in    cs us sy id wa st
2 265    0 4842020 1118532 57555660  0  0 752 983     8     5  0  1 89 10  0
0 265    0 4840220 1118532 57557244  0  0   0  12 12538 20455  0  1  0 99  0
0 265    0 4841724 1118532 57555684  0  0   0   4 17744 25132  0  1  0 99  0
0 265    0 4839416 1118532 57554168  0  0   0  16 13079 20067  0  1  0 99  0
2 265    0 4839468 1118532 57553992  0  0   0   4 12368 20798  0  1  0 99  0
0 265    0 4838772 1118532 57553444  0  0   0   8 11571 18796  0  1  0 99  0
0 265    0 4846112 1118540 57552180  0  0   0  24 12476 18410  0  1  0 99  0
2 265    0 4846332 1118540 57549600  0  0   0   4 12562 18163  0  1  0 99  0
0 265    0 4844584 1118540 57550976  0  0   0  16 10843 18739  0  1  0 99  0
1 265    0 4846072 1118540 57546488  0  0   0   8 20940 27705  0  1  0 98  0
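For anyone who wants to check for the same pattern on another server, here is a minimal sketch that parses `vmstat 1` output and flags the "many blocked tasks, high iowait, almost no block I/O" signature seen above. The function names and the thresholds (`min_blocked`, `min_iowait`, `max_block_io`) are assumptions chosen for illustration, not values from this ticket or from Lustre.

```python
# Column order as printed by the vmstat header line in this report.
VMSTAT_COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()

def parse_vmstat(text):
    """Return one dict per numeric sample row; header lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(VMSTAT_COLUMNS):
            continue
        if not all(f.isdigit() for f in fields):
            continue
        rows.append(dict(zip(VMSTAT_COLUMNS, map(int, fields))))
    return rows

def looks_stalled(rows, min_blocked=100, min_iowait=90, max_block_io=100):
    """Heuristic (thresholds are illustrative assumptions).

    The first vmstat sample reports averages since boot, so it is ignored.
    """
    samples = rows[1:]
    return bool(samples) and all(
        r["b"] >= min_blocked
        and r["wa"] >= min_iowait
        and r["bi"] + r["bo"] <= max_block_io
        for r in samples
    )

SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
2 265 0 4842020 1118532 57555660 0 0 752 983 8 5 0 1 89 10 0
0 265 0 4840220 1118532 57557244 0 0 0 12 12538 20455 0 1 0 99 0
0 265 0 4841724 1118532 57555684 0 0 0 4 17744 25132 0 1 0 99 0
"""

if __name__ == "__main__":
    rows = parse_vmstat(SAMPLE)
    print(len(rows), "samples; stalled:", looks_stalled(rows))
```

The check only describes the symptom (tasks stuck in uninterruptible sleep without disk traffic); it does not identify the cause.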
It started with the following call trace.
Feb 23 15:18:42 foss22 kernel: INFO: task kswapd0:101 blocked for more than 90 seconds.
Feb 23 15:18:42 foss22 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 23 15:18:42 foss22 kernel: intel_powerclamp: No package C-state available
Feb 23 15:18:42 foss22 kernel: kswapd0 D
Feb 23 15:18:42 foss22 kernel: ffff881546baf200 0 101 2 0x00000000
Feb 23 15:18:42 foss22 kernel: ffff88165e4b37d0 0000000000000046 ffff88165e433ec0 ffff88165e4b3fd8
Feb 23 15:18:42 foss22 kernel: ffff88165e4b3fd8 ffff88165e4b3fd8 ffff88165e433ec0 ffff88165e4b3938
Feb 23 15:18:42 foss22 kernel: ffff88165e4b3940 7fffffffffffffff ffff88165e433ec0 ffff881546baf200
Feb 23 15:18:42 foss22 kernel: Call Trace:
Feb 23 15:18:42 foss22 kernel: [<ffffffff8168d629>] schedule+0x29/0x70
Feb 23 15:18:42 foss22 kernel: [<ffffffff8168b069>] schedule_timeout+0x239/0x2c0
Feb 23 15:18:42 foss22 kernel: [<ffffffff8168da06>] wait_for_completion+0x116/0x170
Feb 23 15:18:42 foss22 kernel: [<ffffffff810c54e0>] ? wake_up_state+0x20/0x20
Feb 23 15:18:42 foss22 kernel: [<ffffffff810b08e8>] kthread_create_on_node+0xa8/0x140
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0c6e5d0>] ? qsd_reint_index+0x1700/0x1700 [lquota]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0c6fde8>] ? qsd_start_reint_thread+0x778/0xd70 [lquota]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0c6fe73>] qsd_start_reint_thread+0x803/0xd70 [lquota]
Feb 23 15:18:42 foss22 kernel: [<ffffffff812635ee>] ? dqput+0x16e/0x1f0
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0c74691>] qsd_ready+0x231/0x3c0 [lquota]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0c77722>] qsd_adjust+0xa2/0x900 [lquota]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0c68d1a>] ? qsd_refresh_usage+0x6a/0x2b0 [lquota]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0c78b34>] qsd_op_adjust+0x4d4/0x720 [lquota]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0d85a00>] osd_object_delete+0x1f0/0x510 [osd_ldiskfs]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0836f0d>] lu_object_free.isra.30+0x9d/0x1a0 [obdclass]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0837b96>] lu_site_purge_objects+0x326/0x4a0 [obdclass]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0838dd9>] lu_cache_shrink+0x259/0x2d0 [obdclass]
Feb 23 15:18:42 foss22 kernel: [<ffffffff811947b3>] shrink_slab+0x163/0x330
Feb 23 15:18:42 foss22 kernel: [<ffffffff811f5b77>] ? vmpressure+0x87/0x90
Feb 23 15:18:42 foss22 kernel: [<ffffffff811985b1>] balance_pgdat+0x4b1/0x5e0
Feb 23 15:18:42 foss22 kernel: [<ffffffff81198853>] kswapd+0x173/0x450
Feb 23 15:18:42 foss22 kernel: [<ffffffff810b1b20>] ? wake_up_atomic_t+0x30/0x30
Feb 23 15:18:42 foss22 kernel: [<ffffffff811986e0>] ? balance_pgdat+0x5e0/0x5e0
Feb 23 15:18:42 foss22 kernel: [<ffffffff810b0a4f>] kthread+0xcf/0xe0
Feb 23 15:18:42 foss22 kernel: [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
Feb 23 15:18:42 foss22 kernel: [<ffffffff81698598>] ret_from_fork+0x58/0x90
Feb 23 15:18:42 foss22 kernel: [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
Then an I/O service thread blocked on a mutex in lu_cache_shrink():
Feb 23 15:18:42 foss22 kernel: INFO: task ll_ost_io00_002:3904 blocked for more than 90 seconds.
Feb 23 15:18:42 foss22 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 23 15:18:42 foss22 kernel: ll_ost_io00_002 D ffffffffa089d1e8 0 3904 2 0x00000080
Feb 23 15:18:42 foss22 kernel: ffff88161b11b530 0000000000000046 ffff88161d214e70 ffff88161b11bfd8
Feb 23 15:18:42 foss22 kernel: ffff88161b11bfd8 ffff88161b11bfd8 ffff88161d214e70 ffffffffa089d1e0
Feb 23 15:18:42 foss22 kernel: ffffffffa089d1e4 ffff88161d214e70 00000000ffffffff ffffffffa089d1e8
Feb 23 15:18:42 foss22 kernel: Call Trace:
Feb 23 15:18:42 foss22 kernel: [<ffffffff8168e719>] schedule_preempt_disabled+0x29/0x70
Feb 23 15:18:42 foss22 kernel: [<ffffffff8168c365>] __mutex_lock_slowpath+0xc5/0x1d0
Feb 23 15:18:42 foss22 kernel: [<ffffffff8168b7bf>] mutex_lock+0x1f/0x2f
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0838bed>] lu_cache_shrink+0x6d/0x2d0 [obdclass]
Feb 23 15:18:42 foss22 kernel: [<ffffffff811946f9>] shrink_slab+0xa9/0x330
Feb 23 15:18:42 foss22 kernel: [<ffffffff811f5b11>] ? vmpressure+0x21/0x90
Feb 23 15:18:42 foss22 kernel: [<ffffffff81197ab2>] do_try_to_free_pages+0x3c2/0x4e0
Feb 23 15:18:42 foss22 kernel: [<ffffffff81197ccc>] try_to_free_pages+0xfc/0x180
Feb 23 15:18:42 foss22 kernel: [<ffffffff81683898>] __alloc_pages_slowpath+0x458/0x725
Feb 23 15:18:42 foss22 kernel: [<ffffffff8118b655>] __alloc_pages_nodemask+0x405/0x420
Feb 23 15:18:42 foss22 kernel: [<ffffffff811cfa0a>] alloc_pages_current+0xaa/0x170
Feb 23 15:18:42 foss22 kernel: [<ffffffff81180be7>] __page_cache_alloc+0x97/0xb0
Feb 23 15:18:42 foss22 kernel: [<ffffffff81181905>] find_or_create_page+0x45/0xa0
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0da8ef7>] osd_bufs_get+0x3a7/0x870 [osd_ldiskfs]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0eecbf8>] ofd_preprw+0x688/0x1220 [ofd]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0ac8faf>] ? __req_capsule_get+0x15f/0x710 [ptlrpc]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0b0cd91>] tgt_brw_read+0x9a1/0x1850 [ptlrpc]
Feb 23 15:18:42 foss22 kernel: [<ffffffff811dda53>] ? __kmalloc+0x1f3/0x240
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0addfd0>] ? null_alloc_rs+0xa0/0x380 [ptlrpc]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0ade109>] ? null_alloc_rs+0x1d9/0x380 [ptlrpc]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0aa098f>] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0aa0b2f>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0aa0cb1>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0b0ab3b>] tgt_request_handle+0x8fb/0x11f0 [ptlrpc]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0aad91b>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa06e2668>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0aab1f8>] ? ptlrpc_wait_event+0x98/0x330 [ptlrpc]
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0ab1240>] ptlrpc_main+0xc00/0x1f70 [ptlrpc]
Feb 23 15:18:42 foss22 kernel: [<ffffffff81029569>] ? __switch_to+0xd9/0x4c0
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0ab0640>] ? ptlrpc_register_service+0x1070/0x1070 [ptlrpc]
Feb 23 15:18:42 foss22 kernel: [<ffffffff810b0a4f>] kthread+0xcf/0xe0
Feb 23 15:18:42 foss22 kernel: [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
Feb 23 15:18:42 foss22 kernel: [<ffffffff81698598>] ret_from_fork+0x58/0x90
Feb 23 15:18:42 foss22 kernel: [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
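To see which hung tasks share a code path when the log holds many such reports, a small parser can help. The sketch below is illustrative only: `parse_hung_tasks` and `tasks_in` are hypothetical helper names, and the regular expression assumes the hung-task and frame formats shown in the traces above. Frames marked with `?` are skipped because the kernel prints them as unreliable stack entries.

```python
import re

# Matches either a hung-task header or a single stack frame.
EVENT_RE = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) blocked for more than \d+ seconds"
    r"|\[<[0-9a-f]+>\]\s+(?P<guess>\?\s+)?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

def parse_hung_tasks(log_text):
    """Return a list of {"name", "pid", "frames"} dicts in log order."""
    tasks = []
    for m in EVENT_RE.finditer(log_text):
        if m.group("name"):
            tasks.append({"name": m.group("name"),
                          "pid": int(m.group("pid")),
                          "frames": []})
        elif tasks and not m.group("guess"):
            # Keep only reliable frames, innermost first.
            tasks[-1]["frames"].append(m.group("func"))
    return tasks

def tasks_in(tasks, func):
    """Names of hung tasks whose reliable frames include func."""
    return [t["name"] for t in tasks if func in t["frames"]]

# Abbreviated excerpt of the two traces in this report.
SAMPLE = """\
Feb 23 15:18:42 foss22 kernel: INFO: task kswapd0:101 blocked for more than 90 seconds.
Feb 23 15:18:42 foss22 kernel: [<ffffffff8168da06>] wait_for_completion+0x116/0x170
Feb 23 15:18:42 foss22 kernel: [<ffffffff810c54e0>] ? wake_up_state+0x20/0x20
Feb 23 15:18:42 foss22 kernel: [<ffffffff810b08e8>] kthread_create_on_node+0xa8/0x140
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0838dd9>] lu_cache_shrink+0x259/0x2d0 [obdclass]
Feb 23 15:18:42 foss22 kernel: INFO: task ll_ost_io00_002:3904 blocked for more than 90 seconds.
Feb 23 15:18:42 foss22 kernel: [<ffffffff8168b7bf>] mutex_lock+0x1f/0x2f
Feb 23 15:18:42 foss22 kernel: [<ffffffffa0838bed>] lu_cache_shrink+0x6d/0x2d0 [obdclass]
"""

if __name__ == "__main__":
    tasks = parse_hung_tasks(SAMPLE)
    for t in tasks:
        print(t["name"], t["pid"], " <- ".join(t["frames"]))
    print("in lu_cache_shrink:", tasks_in(tasks, "lu_cache_shrink"))
```

On the two traces above, both tasks pass through lu_cache_shrink(): kswapd0 is waiting in kthread_create_on_node() from inside it, while ll_ost_io00_002 is waiting in mutex_lock() at its entry. That is consistent with the deadlock suspicion, though the parser itself only groups tasks and proves nothing about lock ownership.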
Issue Links
- is duplicated by
-
LU-12162 Major issue on OSS after upgrading Oak to 2.10.7
- Open