Lustre / LU-10133

Multi-page allocation failures in mlx4/mlx5

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Components: None
    • Fix Version: Lustre 2.11.0
    • Environment: Soak cluster - lustre-master build 3654 lustre version=2.10.54_13_g84f690e
    • Severity: 3

    Description

      I am seeing multiple page allocation failures from soak-clients. Failures seem to be semi-random.
      Example:

      Oct 17 02:20:07 soak-17 kernel: kworker/u480:1: page allocation failure: order:8, mode:0x80d0
      Oct 17 02:20:07 soak-17 kernel: CPU: 9 PID: 58714 Comm: kworker/u480:1 Tainted: G           OE  ------------   3.10.0-693.2.2.el7.x86_64 #1
      Oct 17 02:20:07 soak-17 kernel: Hardware name: Intel Corporation S2600GZ ........../S2600GZ, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
      Oct 17 02:20:08 soak-17 kernel: Workqueue: rdma_cm cma_work_handler [rdma_cm]
      Oct 17 02:20:08 soak-17 kernel: 00000000000080d0 00000000a9e78c95 ffff8803ee9bf848 ffffffff816a3db1
      Oct 17 02:20:08 soak-17 kernel: ffff8803ee9bf8d8 ffffffff81188810 0000000000000000 ffff88043ffdb000
      Oct 17 02:20:08 soak-17 kernel: 0000000000000008 00000000000080d0 ffff8803ee9bf8d8 00000000a9e78c95
      Oct 17 02:20:08 soak-17 kernel: Call Trace:
      Oct 17 02:20:08 soak-17 kernel: [<ffffffff816a3db1>] dump_stack+0x19/0x1b
      Oct 17 02:20:08 soak-17 kernel: [<ffffffff81188810>] warn_alloc_failed+0x110/0x180
      Oct 17 02:20:08 soak-17 kernel: [<ffffffff8169fd8a>] __alloc_pages_slowpath+0x6b6/0x724
      Oct 17 02:20:08 soak-17 kernel: [<ffffffff8118cd85>] __alloc_pages_nodemask+0x405/0x420
      Oct 17 02:20:08 soak-17 kernel: [<ffffffff81030f8f>] dma_generic_alloc_coherent+0x8f/0x140
      Oct 17 02:20:08 soak-17 kernel: [<ffffffff81064341>] x86_swiotlb_alloc_coherent+0x21/0x50
      Oct 17 02:20:08 soak-17 kernel: [<ffffffffc02914d3>] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core]
      Oct 17 02:20:09 soak-17 kernel: [<ffffffffc029176b>] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core]
      Oct 17 02:20:09 soak-17 kernel: [<ffffffffc02940d0>] ? __mlx4_cmd+0x560/0x920 [mlx4_core]
      Oct 17 02:20:09 soak-17 kernel: [<ffffffffc061085e>] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib]
      Oct 17 02:20:09 soak-17 kernel: [<ffffffffc061144e>] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib]
      Oct 17 02:20:09 soak-17 kernel: [<ffffffffc03c9c3a>] ib_create_qp+0x7a/0x2f0 [ib_core]
      Oct 17 02:20:09 soak-17 kernel: [<ffffffffc04f66d4>] rdma_create_qp+0x34/0xb0 [rdma_cm]
      Oct 17 02:20:09 soak-17 kernel: [<ffffffffc0bd8539>] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd]
      Oct 17 02:20:09 soak-17 kernel: [<ffffffffc0be8649>] kiblnd_cm_callback+0x1429/0x2300 [ko2iblnd]
      Oct 17 02:20:09 soak-17 kernel: [<ffffffffc04fa57c>] cma_work_handler+0x6c/0xa0 [rdma_cm]
      Oct 17 02:20:09 soak-17 kernel: [<ffffffff810a881a>] process_one_work+0x17a/0x440
      Oct 17 02:20:09 soak-17 kernel: [<ffffffff810a94e6>] worker_thread+0x126/0x3c0
      Oct 17 02:20:09 soak-17 kernel: [<ffffffff810a93c0>] ? manage_workers.isra.24+0x2a0/0x2a0
      Oct 17 02:20:09 soak-17 kernel: [<ffffffff810b098f>] kthread+0xcf/0xe0
      Oct 17 02:20:09 soak-17 kernel: [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
      Oct 17 02:20:10 soak-17 kernel: [<ffffffff816b4f58>] ret_from_fork+0x58/0x90
      Oct 17 02:20:10 soak-17 kernel: [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
      Oct 17 02:20:10 soak-17 kernel: Mem-Info:
      Oct 17 02:20:10 soak-17 kernel: active_anon:36658 inactive_anon:27590 isolated_anon:6#012 active_file:2710466 inactive_file:345768 isolated_file:10#012 unevictable:0 dirty:14 writeback:0 unstable:0#012 slab_reclaimable:30971 slab_unreclaimable:3983583#012 mapped:10108 shmem:6384 pagetables:3086 bounce:0#012 free:776253 free_pcp:359 free_cma:0
      Oct 17 02:20:11 soak-17 kernel: Node 0 DMA free:15784kB min:40kB low:48kB high:60kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15932kB managed:15848kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
      Oct 17 02:20:11 soak-17 kernel: lowmem_reserve[]: 0 2580 15620 15620
      Oct 17 02:20:11 soak-17 kernel: Node 0 DMA32 free:132736kB min:7320kB low:9148kB high:10980kB active_anon:6472kB inactive_anon:8768kB active_file:1063620kB inactive_file:27644kB unevictable:0kB isolated(anon):24kB isolated(file):40kB present:3051628kB managed:2643828kB mlocked:0kB dirty:8kB writeback:0kB mapped:2140kB shmem:116kB slab_reclaimable:9352kB slab_unreclaimable:1306892kB kernel_stack:1152kB pagetables:1196kB unstable:0kB bounce:0kB free_pcp:4kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
      Oct 17 02:20:11 soak-17 kernel: lowmem_reserve[]: 0 0 13040 13040
      Oct 17 02:20:11 soak-17 kernel: Node 0 Normal free:1149812kB min:37012kB low:46264kB high:55516kB active_anon:69848kB inactive_anon:32420kB active_file:4495364kB inactive_file:737992kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:13631488kB managed:13353036kB mlocked:0kB dirty:24kB writeback:0kB mapped:9156kB shmem:248kB slab_reclaimable:54264kB slab_unreclaimable:6303688kB kernel_stack:7248kB pagetables:5096kB unstable:0kB bounce:0kB free_pcp:860kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
      Oct 17 02:20:12 soak-17 kernel: lowmem_reserve[]: 0 0 0 0
      Oct 17 02:20:12 soak-17 kernel: Node 1 Normal free:1805688kB min:45728kB low:57160kB high:68592kB active_anon:70700kB inactive_anon:69172kB active_file:5282880kB inactive_file:617436kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:16777216kB managed:16498508kB mlocked:0kB dirty:24kB writeback:0kB mapped:29136kB shmem:25172kB slab_reclaimable:60268kB slab_unreclaimable:8323752kB kernel_stack:5568kB pagetables:6052kB unstable:0kB bounce:0kB free_pcp:1468kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
      Oct 17 02:20:13 soak-17 kernel: lowmem_reserve[]: 0 0 0 0
      Oct 17 02:20:13 soak-17 kernel: Node 0 DMA: 0*4kB 1*8kB (U) 0*16kB 1*32kB (U) 0*64kB 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15784kB
      Oct 17 02:20:13 soak-17 kernel: Node 0 DMA32: 2018*4kB (UEM) 1070*8kB (UEM) 670*16kB (UEM) 685*32kB (UEM) 594*64kB (UEM) 199*128kB (UEM) 80*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 133240kB
      Oct 17 02:20:13 soak-17 kernel: Node 0 Normal: 8492*4kB (UEM) 5207*8kB (UEM) 3978*16kB (UEM) 8657*32kB (UEM) 8319*64kB (EM) 1594*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1152744kB
      Oct 17 02:20:13 soak-17 kernel: Node 1 Normal: 14583*4kB (UEM) 8566*8kB (UEM) 5482*16kB (UEM) 13112*32kB (UEM) 11765*64kB (UEM) 2443*128kB (UM) 418*256kB (UM) 5*512kB (M) 0*1024kB 0*2048kB 0*4096kB = 1809388kB
      Oct 17 02:20:13 soak-17 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
      Oct 17 02:20:13 soak-17 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
      Oct 17 02:20:13 soak-17 kernel: Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
      Oct 17 02:20:14 soak-17 kernel: Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
      Oct 17 02:20:14 soak-17 kernel: 3062619 total pagecache pages
      Oct 17 02:20:14 soak-17 kernel: 6 pages in swap cache
      Oct 17 02:20:14 soak-17 kernel: Swap cache stats: add 13, delete 7, find 0/0
      Oct 17 02:20:14 soak-17 kernel: Free swap  = 16319432kB
      Oct 17 02:20:14 soak-17 kernel: Total swap = 16319484kB
      Oct 17 02:20:14 soak-17 kernel: 8369066 pages RAM
      Oct 17 02:20:14 soak-17 kernel: 0 pages HighMem/MovableOnly
      Oct 17 02:20:14 soak-17 kernel: 241261 pages reserved
      Oct 17 02:20:15 soak-17 kernel: kworker/u480:1: page allocation failure: order:8, mode:0x80d0
      Oct 17 02:20:15 soak-17 kernel: CPU: 9 PID: 58714 Comm: kworker/u480:1 Tainted: G           OE  ------------   3.10.0-693.2.2.el7.x86_64 #1
      Oct 17 02:20:15 soak-17 kernel: Hardware name: Intel Corporation S2600GZ ........../S2600GZ, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
      Oct 17 02:20:15 soak-17 kernel: Workqueue: rdma_cm cma_work_handler [rdma_cm]
      

      The systems appear to recover and continue. A Lustre log dump from soak-17 after the most recent failure is attached.

      Attachments

        Issue Links

          Activity

            [LU-10133] Multi-page allocation failures in mlx4/mlx5

            This issue is fixed in the MOFED 4.4 release.

            adilger Andreas Dilger added a comment

            The patch https://review.whamcloud.com/30164 changes two kmalloc() calls in create_qp_common() so that a __vmalloc() call is made if kmalloc() fails.

            However, both Mahmoud and Cliff White reported failures at a different location: the mlx4_buf_alloc() call inside the create_qp_common() routine. The fix from #30164 would therefore have no effect on our problem.

            [558213.837942] [<ffffffff81686d81>] dump_stack+0x19/0x1b
            [558213.837946] [<ffffffff81186160>] warn_alloc_failed+0x110/0x180
            [558213.837949] [<ffffffff8118a954>] __alloc_pages_nodemask+0x9b4/0xba0
            [558213.837951] [<ffffffff811ce868>] alloc_pages_current+0x98/0x110
            [558213.837954] [<ffffffff81184fae>] __get_free_pages+0xe/0x50
            [558213.837956] [<ffffffff8133f6fe>] swiotlb_alloc_coherent+0x5e/0x150
            [558213.837959] [<ffffffff81062551>] x86_swiotlb_alloc_coherent+0x41/0x50
            [558213.837968] [<ffffffffa04704c4>] mlx4_buf_direct_alloc.isra.7+0xc4/0x180 [mlx4_core]
            [558213.837975] [<ffffffffa047073b>] mlx4_buf_alloc+0x1bb/0x260 [mlx4_core]
            [558213.837980] [<ffffffffa05ff496>] create_qp_common+0x536/0x1000 [mlx4_ib]

            jaylan Jay Lan (Inactive) added a comment

            We are seeing what may be this issue on the 2.10.3-RC1 tag:
            Jan 17 11:20:40 soak-17 kernel: kworker/u480:2: page allocation failure: order:8, mode:0x80d0
            Jan 17 11:20:40 soak-17 kernel: CPU: 5 PID: 119497 Comm: kworker/u480:2 Tainted: G OE ------------ 3.10.0-693.11.6.el7.x86_64 #1
            Jan 17 11:20:40 soak-17 kernel: Hardware name: Intel Corporation S2600GZ ........../S2600GZ, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
            Jan 17 11:20:40 soak-17 kernel: Workqueue: rdma_cm cma_work_handler [rdma_cm]
            Jan 17 11:20:40 soak-17 kernel: Call Trace:
            Jan 17 11:20:40 soak-17 kernel: [<ffffffff816a5ea1>] dump_stack+0x19/0x1b
            Jan 17 11:20:40 soak-17 kernel: [<ffffffff8118a510>] warn_alloc_failed+0x110/0x180
            Jan 17 11:20:40 soak-17 kernel: [<ffffffff816a1e7a>] __alloc_pages_slowpath+0x6b6/0x724
            Jan 17 11:20:41 soak-17 kernel: [<ffffffff8118eaa5>] __alloc_pages_nodemask+0x405/0x420
            Jan 17 11:20:41 soak-17 kernel: [<ffffffff81030e8f>] dma_generic_alloc_coherent+0x8f/0x140
            Jan 17 11:20:41 soak-17 kernel: [<ffffffff810645d1>] x86_swiotlb_alloc_coherent+0x21/0x50
            Jan 17 11:20:41 soak-17 kernel: [<ffffffffc02dd4c4>] mlx4_buf_direct_alloc.isra.7+0xc4/0x180 [mlx4_core]
            Jan 17 11:20:41 soak-17 kernel: [<ffffffffc02dd73b>] mlx4_buf_alloc+0x1bb/0x260 [mlx4_core]
            Jan 17 11:20:41 soak-17 kernel: [<ffffffffc01db4a6>] create_qp_common+0x536/0x1000 [mlx4_ib]
            Jan 17 11:20:41 soak-17 kernel: [<ffffffffc01dc3d1>] mlx4_ib_create_qp+0x3b1/0xdc0 [mlx4_ib]
            Jan 17 11:20:41 soak-17 kernel: [<ffffffffc01c7bc2>] ? mlx4_ib_create_cq+0x2d2/0x430 [mlx4_ib]
            Jan 17 11:20:41 soak-17 kernel: [<ffffffffc01e7f30>] mlx4_ib_create_qp_wrp+0x10/0x20 [mlx4_ib]
            Jan 17 11:20:41 soak-17 kernel: [<ffffffffc00d952a>] ib_create_qp+0x7a/0x2f0 [ib_core]
            Jan 17 11:20:41 soak-17 kernel: [<ffffffffc05a65d4>] rdma_create_qp+0x34/0xb0 [rdma_cm]
            Jan 17 11:20:41 soak-17 kernel: [<ffffffffc0bf45c9>] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd]
            Jan 17 11:20:42 soak-17 kernel: [<ffffffff816af3c6>] ? common_interrupt+0x106/0x232
            Jan 17 11:20:42 soak-17 kernel: [<ffffffffc0c047af>] kiblnd_cm_callback+0x145f/0x2380 [ko2iblnd]
            Jan 17 11:20:42 soak-17 kernel: [<ffffffffc05aa11c>] cma_work_handler+0x6c/0xa0 [rdma_cm]
            Jan 17 11:20:42 soak-17 kernel: [<ffffffff810aa3ba>] process_one_work+0x17a/0x440
            Jan 17 11:20:42 soak-17 kernel: [<ffffffff810ab086>] worker_thread+0x126/0x3c0
            Jan 17 11:20:42 soak-17 kernel: [<ffffffff810aaf60>] ? manage_workers.isra.24+0x2a0/0x2a0
            Jan 17 11:20:42 soak-17 kernel: [<ffffffff810b252f>] kthread+0xcf/0xe0
            Jan 17 11:20:42 soak-17 kernel: [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
            Jan 17 11:20:42 soak-17 kernel: [<ffffffff816b8798>] ret_from_fork+0x58/0x90
            Jan 17 11:20:42 soak-17 kernel: [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
            Jan 17 11:20:42 soak-17 kernel: Mem-Info:
            Jan 17 11:20:42 soak-17 kernel: active_anon:38088 inactive_anon:40507 isolated_anon:0#012 active_file:2789491 inactive_file:301913 isolated_file:10#012 unevictable:0 dirty:20 writeback:0 unstable:0#012 slab_reclaimable:31599 slab_unreclaimable:4366903#012 mapped:9817 shmem:26652 pagetables:2238 bounce:0#012 free:316930 free_pcp:3684 free_cma:0
            Jan 17 11:20:43 soak-17 kernel: Node 0 DMA free:15848kB min:40kB low:48kB high:60kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15932kB managed:15848kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
            Jan 17 11:20:43 soak-17 kernel: lowmem_reserve[]: 0 2580 15619 15619
            Jan 17 11:20:43 soak-17 kernel: Node 0 DMA32 free:111620kB min:7320kB low:9148kB high:10980kB active_anon:4340kB inactive_anon:16340kB active_file:806572kB inactive_file:91160kB unevictable:0kB isolated(anon):0kB isolated(file):40kB present:3051628kB managed:2643792kB mlocked:0kB dirty:4kB writeback:0kB mapped:4972kB shmem:10624kB slab_reclaimable:10900kB slab_unreclaimable:1503576kB kernel_stack:1680kB pagetables:836kB unstable:0kB bounce:0kB free_pcp:4204kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
            Jan 17 11:20:44 soak-17 kernel: lowmem_reserve[]: 0 0 13039 13039
            Jan 17 11:20:44 soak-17 kernel: Node 0 Normal free:479720kB min:37012kB low:46264kB high:55516kB active_anon:80388kB inactive_anon:90464kB active_file:4307316kB inactive_file:454368kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:13631488kB managed:13352076kB mlocked:0kB dirty:40kB writeback:0kB mapped:23080kB shmem:63164kB slab_reclaimable:59148kB slab_unreclaimable:7310632kB kernel_stack:7168kB pagetables:5300kB unstable:0kB bounce:0kB free_pcp:6380kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
            Jan 17 11:20:45 soak-17 kernel: lowmem_reserve[]: 0 0 0 0
            Jan 17 11:20:45 soak-17 kernel: Node 1 Normal free:668892kB min:45728kB low:57160kB high:68592kB active_anon:67496kB inactive_anon:55352kB active_file:6041560kB inactive_file:661408kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:16777216kB managed:16497548kB mlocked:0kB dirty:20kB writeback:0kB mapped:11260kB shmem:32820kB slab_reclaimable:56348kB slab_unreclaimable:8652444kB kernel_stack:4368kB pagetables:2820kB unstable:0kB bounce:0kB free_pcp:4820kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
            Jan 17 11:20:45 soak-17 kernel: lowmem_reserve[]: 0 0 0 0
            We are running OFED 4.2, I believe.

            cliffw Cliff White (Inactive) added a comment

            We are running with MOFED 4.1 and Cent7.4 servers.

            options ko2iblnd timeout=150 retry_count=7 peer_timeout=0 map_on_demand=32 peer_credits=63 concurrent_sends=63
            
            

            Seeing this issue.

            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313194] kworker/u48:3: page allocation failure: order:5, mode:0x8010
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313196] CPU: 20 PID: 57793 Comm: kworker/u48:3 Tainted: G           OE  ------------   3.10.0-693.2.2.el7.20170918.x86_64.lustre2101 #1
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313196] Hardware name: SGI.COM CH-C2112-GP2/X10DRU-i+, BIOS 1.0b 05/08/2015
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313200] Workqueue: ipoib_wq ipoib_cm_tx_start [ib_ipoib]
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313201]  0000000000008010 0000000022ff91e8 ffff8810a141f7e0 ffffffff81684ac1
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313202]  ffff8810a141f870 ffffffff811841c0 0000000000000000 ffff88207ffd8000
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313203]  0000000000000005 0000000000008010 ffff8810a141f870 0000000022ff91e8
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313204] Call Trace:
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313205]  [<ffffffff81684ac1>] dump_stack+0x19/0x1b
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313207]  [<ffffffff811841c0>] warn_alloc_failed+0x110/0x180
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313209]  [<ffffffff81188984>] __alloc_pages_nodemask+0x9b4/0xba0
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313211]  [<ffffffff811cc688>] alloc_pages_current+0x98/0x110
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313216]  [<ffffffff8118300e>] __get_free_pages+0xe/0x50
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313217]  [<ffffffff8133d41e>] swiotlb_alloc_coherent+0x5e/0x150
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313221]  [<ffffffff810622c1>] x86_swiotlb_alloc_coherent+0x41/0x50
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313224]  [<ffffffffa05aa4c4>] mlx4_buf_direct_alloc.isra.7+0xc4/0x180 [mlx4_core]
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313228]  [<ffffffffa05aa73b>] mlx4_buf_alloc+0x1bb/0x250 [mlx4_core]
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313233]  [<ffffffffa07b8435>] create_qp_common+0x645/0x1090 [mlx4_ib]
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313237]  [<ffffffffa07b9104>] ? mlx4_ib_create_qp+0x254/0x4d0 [mlx4_ib]
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313240]  [<ffffffffa07b9157>] mlx4_ib_create_qp+0x2a7/0x4d0 [mlx4_ib]
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313244]  [<ffffffffa07c3c40>] mlx4_ib_create_qp_wrp+0x10/0x20 [mlx4_ib]
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313248]  [<ffffffffa04d02aa>] ib_create_qp+0x7a/0x2f0 [ib_core]
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313253]  [<ffffffffa055b2fc>] ipoib_cm_create_tx_qp_rss+0xcc/0x110 [ib_ipoib]
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313257]  [<ffffffffa055b9f9>] ipoib_cm_tx_init+0x89/0x2f0 [ib_ipoib]
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313260]  [<ffffffffa055d6b8>] ipoib_cm_tx_start+0x248/0x3c0 [ib_ipoib]
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313263]  [<ffffffff810a587a>] process_one_work+0x17a/0x440
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313265]  [<ffffffff810a6546>] worker_thread+0x126/0x3c0
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313266]  [<ffffffff810a6420>] ? manage_workers.isra.24+0x2a0/0x2a0
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313268]  [<ffffffff810ad9ef>] kthread+0xcf/0xe0
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313269]  [<ffffffff810ad920>] ? insert_kthread_work+0x40/0x40
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313270]  [<ffffffff81695ad8>] ret_from_fork+0x58/0x90
            Jan  9 08:37:52 nbp1-oss6 kernel: [1189787.313272]  [<ffffffff810ad920>] ? insert_kthread_work+0x40/0x40
            
            

            So is this issue fixed in MOFED 4.2?

             

            mhanafi Mahmoud Hanafi added a comment
            simmonsja James A Simmons added a comment - edited

            Which OFED/MOFED version does this fix appear in? (For those who want to avoid patched kernels on the server side at all costs.)


            Chris, I believe the problem has been fixed in the upstream kernel, but users are hitting this regularly on RHEL6/RHEL7 kernels (client and server) with the in-kernel OFED, so it would be good to get a fix for those systems as well.

            adilger Andreas Dilger added a comment

            Alexey, did you run any tests with "map_on_demand=32"? I think the default value is 256, but reducing this is important for reducing memory usage.

            adilger Andreas Dilger added a comment

            We were informed these patches are in Mellanox OFED 4.2 GA.

            Similar kvzalloc patches were applied to the mlx5 Ethernet driver.

            1) mm: introduce kv[mz]alloc helpers
            https://lwn.net/Articles/708739/
            https://patchwork.kernel.org/patch/9493657/

            The upstream mlx5 patches were committed in May 2017:
            2) {net, IB}/mlx5: Replace mlx5_vzalloc with kvzalloc
            https://github.com/torvalds/linux/commit/1b9a07ee25049724ab7f7c32282fbf5452530cea#diff-3c967034ac4fb744a569c1a4d3a115d3

            and in August 2017:
            3) IB/mlx5: use kvmalloc_array for mlx5_ib_wq
            https://github.com/torvalds/linux/commit/b588300801f3502a7de5ca897af68019fbb3bc79#diff-06ae82013eb36f3b0e0eeb9c37040f37
            https://www.spinics.net/lists/linux-rdma/msg53756.html

            There is also an upstream patch for mlx4:
            4) IB/mlx4: use kvmalloc_array to allocate wrid
            https://github.com/torvalds/linux/commit/e9105cdefbf64cd7aea300f934c92051e7cb7cff#diff-66b8f4939fabacf90437a794c44b9081
            https://www.spinics.net/lists/linux-rdma/msg53441.html

             

            chunteraa Chris Hunter (Inactive) added a comment

            My tests with map_on_demand=256 show a 1%-2% performance drop for this case. It's not a big change, I think.

            shadow Alexey Lyashkov added a comment

            One thing we should do before making this the default is some performance testing to see how setting map-on-demand to 32 will impact mlx4 and mlx5. It will reduce memory usage per qp, but we need to double check any performance impact.

            ashehata Amir Shehata (Inactive) added a comment

            People

              Assignee: ashehata Amir Shehata (Inactive)
              Reporter: cliffw Cliff White (Inactive)
              Votes: 0
              Watchers: 35
