  Lustre / LU-4357

page allocation failure. mode:0x40 caused by missing __GFP_WAIT flag

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Fix Version/s: Lustre 2.6.0, Lustre 2.5.1
    • Affects Version/s: Lustre 2.5.0, Lustre 2.6.0, Lustre 2.5.1, Lustre 2.4.3
    • Environment: server: lustre-master build # 1791 RHEL6 ldiskfs; client: 2.5.0
    • Severity: 3
    • 11936

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/5efd6668-5e3c-11e3-ae30-52540035b04c.

      The sub-test test_iorssf failed with the following error:

      ior failed! 1

      Not sure whether this is a duplicate of LU-3585, although LU-3585 has already been fixed. Client console:

      12:02:03:Lustre: DEBUG MARKER: == parallel-scale test iorssf: iorssf == 11:54:49 (1386273289)
      12:02:03:ptlrpcd_0: page allocation failure. order:1, mode:0x40
      12:02:03:IOR: page allocation failure. order:1, mode:0x40
      12:02:03:Pid: 22725, comm: IOR Not tainted 2.6.32-358.18.1.el6.x86_64 #1
      12:02:03:Call Trace:
      12:02:03: [<ffffffff8112c257>] ? __alloc_pages_nodemask+0x757/0x8d0
      12:02:03: [<ffffffff81166d92>] ? kmem_getpages+0x62/0x170
      12:02:03: [<ffffffff811679aa>] ? fallback_alloc+0x1ba/0x270
      12:02:03: [<ffffffff811673ff>] ? cache_grow+0x2cf/0x320
      12:02:03: [<ffffffff81167729>] ? ____cache_alloc_node+0x99/0x160
      12:02:03: [<ffffffffa0967c08>] ? ptlrpc_new_bulk+0x48/0x280 [ptlrpc]
      12:02:03: [<ffffffff811684f9>] ? __kmalloc+0x189/0x220
      12:02:03: [<ffffffffa0967c08>] ? ptlrpc_new_bulk+0x48/0x280 [ptlrpc]
      12:02:03: [<ffffffffa0967e98>] ? ptlrpc_prep_bulk_imp+0x58/0x190 [ptlrpc]
      12:02:03: [<ffffffffa03ca054>] ? osc_brw_prep_request+0x294/0x11e0 [osc]
      12:02:03: [<ffffffffa03de4ac>] ? osc_req_attr_set+0x16c/0x5b0 [osc]
      12:02:04: [<ffffffffa088dd21>] ? cl_req_attr_set+0xd1/0x230 [obdclass]
      12:02:04: [<ffffffffa03d01d0>] ? osc_build_rpc+0x870/0x1850 [osc]
      12:02:04: [<ffffffffa03ea1a7>] ? osc_io_unplug0+0x1257/0x1f00 [osc]
      12:02:04: [<ffffffffa0881285>] ? cl_page_slice_add+0x55/0x140 [obdclass]
      12:02:04: [<ffffffffa03ecbd1>] ? osc_io_unplug+0x11/0x20 [osc]
      12:02:04: [<ffffffffa03ecdb0>] ? osc_queue_sync_pages+0x1d0/0x360 [osc]
      12:02:04: [<ffffffffa03def80>] ? osc_io_submit+0x340/0x4b0 [osc]
      12:02:04: [<ffffffffa088d68c>] ? cl_io_submit_rw+0x6c/0x160 [obdclass]
      12:02:04: [<ffffffffa04ba3c2>] ? lov_io_submit+0x3a2/0xbb0 [lov]
      12:02:04: [<ffffffffa088d68c>] ? cl_io_submit_rw+0x6c/0x160 [obdclass]
      12:02:04: [<ffffffffa088fc9e>] ? cl_io_read_page+0xae/0x170 [obdclass]
      12:02:04: [<ffffffffa0883a77>] ? cl_page_assume+0xf7/0x220 [obdclass]
      12:02:05: [<ffffffffa0768176>] ? ll_readpage+0x96/0x1a0 [lustre]
      12:02:05: [<ffffffff81096de0>] ? wake_bit_function+0x0/0x50
      12:02:05: [<ffffffff8111b7ec>] ? generic_file_aio_read+0x1fc/0x700
      12:02:05: [<ffffffffa0797797>] ? vvp_io_read_start+0x257/0x470 [lustre]
      12:02:05: [<ffffffffa088d7ea>] ? cl_io_start+0x6a/0x140 [obdclass]
      12:02:05: [<ffffffffa0891ef4>] ? cl_io_loop+0xb4/0x1b0 [obdclass]
      12:02:05: [<ffffffffa0739f9f>] ? ll_file_io_generic+0x33f/0x610 [lustre]
      12:02:05: [<ffffffffa073a3af>] ? ll_file_aio_read+0x13f/0x2c0 [lustre]
      12:02:05: [<ffffffffa073ac4c>] ? ll_file_read+0x16c/0x2a0 [lustre]
      12:02:05: [<ffffffff81181a95>] ? vfs_read+0xb5/0x1a0
      12:02:06: [<ffffffff81181bd1>] ? sys_read+0x51/0x90
      12:02:06: [<ffffffff810dc685>] ? __audit_syscall_exit+0x265/0x290
      12:02:06: [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
      12:02:06:Mem-Info:
      12:02:06:Node 0 DMA per-cpu:
      12:02:06:CPU    0: hi:    0, btch:   1 usd:   0
      12:02:06:CPU    1: hi:    0, btch:   1 usd:   0
      12:02:07:Node 0 DMA32 per-cpu:
      12:02:07:CPU    0: hi:  186, btch:  31 usd:   0
      12:02:07:CPU    1: hi:  186, btch:  31 usd:   4
      12:02:07:active_anon:3193 inactive_anon:6004 isolated_anon:0
      12:02:07: active_file:8566 inactive_file:234838 isolated_file:0
      12:02:07: unevictable:0 dirty:1 writeback:0 unstable:0
      12:02:07: free:13356 slab_reclaimable:3430 slab_unreclaimable:197269
      12:02:07: mapped:4138 shmem:39 pagetables:1255 bounce:0
      12:02:07:Node 0 DMA free:8236kB min:332kB low:412kB high:496kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:5824kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15324kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:52kB slab_unreclaimable:1620kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
      12:02:07:lowmem_reserve[]: 0 2003 2003 2003
      12:02:07:Node 0 DMA32 free:45188kB min:44720kB low:55900kB high:67080kB active_anon:12772kB inactive_anon:24016kB active_file:34264kB inactive_file:933528kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2052064kB mlocked:0kB dirty:4kB writeback:0kB mapped:16552kB shmem:156kB slab_reclaimable:13668kB slab_unreclaimable:787456kB kernel_stack:1904kB pagetables:5020kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
      12:02:07:lowmem_reserve[]: 0 0 0 0
      12:02:07:Node 0 DMA: 3*4kB 2*8kB 3*16kB 3*32kB 4*64kB 1*128kB 2*256kB 2*512kB 2*1024kB 2*2048kB 0*4096kB = 8236kB
      12:02:08:Node 0 DMA32: 9673*4kB 6*8kB 23*16kB 2*32kB 2*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 1*4096kB = 45188kB
      12:02:08:243500 total pagecache pages
      12:02:08:0 pages in swap cache
      12:02:08:Swap cache stats: add 0, delete 0, find 0/0
      12:02:08:Free swap  = 4128760kB
      12:02:08:Total swap = 4128760kB
      12:02:08:Pid: 11613, comm: ptlrpcd_0 Not tainted 2.6.32-358.18.1.el6.x86_64 #1
      12:02:08:Call Trace:
      12:02:08: [<ffffffff8112c257>] ? __alloc_pages_nodemask+0x757/0x8d0
      12:02:08: [<ffffffffa0707a78>] ? ksocknal_queue_tx_msg_v2+0x88/0xe0 [ksocklnd]
      12:02:08: [<ffffffff81166d92>] ? kmem_getpages+0x62/0x170
      12:02:08: [<ffffffff811679aa>] ? fallback_alloc+0x1ba/0x270
      12:02:09: [<ffffffff811673ff>] ? cache_grow+0x2cf/0x320
      12:02:09: [<ffffffff81167729>] ? ____cache_alloc_node+0x99/0x160
      12:02:09: [<ffffffffa069aed7>] ? LNetMDAttach+0x157/0x5a0 [lnet]
      12:02:09: [<ffffffff811684f9>] ? __kmalloc+0x189/0x220
      12:02:09: [<ffffffffa069aed7>] ? LNetMDAttach+0x157/0x5a0 [lnet]
      12:02:09: [<ffffffffa096fac5>] ? ptlrpc_register_bulk+0x265/0x9d0 [ptlrpc]
      12:02:10: [<ffffffffa0970f22>] ? ptl_send_rpc+0x232/0xc40 [ptlrpc]
      12:02:10: [<ffffffff81281734>] ? snprintf+0x34/0x40
      12:02:10: [<ffffffffa05c77b1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      12:02:10: [<ffffffffa0966654>] ? ptlrpc_send_new_req+0x454/0x790 [ptlrpc]
      12:02:10: [<ffffffffa096a3c8>] ? ptlrpc_check_set+0x888/0x1b40 [ptlrpc]
      12:02:10: [<ffffffffa099520b>] ? ptlrpcd_check+0x53b/0x560 [ptlrpc]
      12:02:10: [<ffffffffa099572b>] ? ptlrpcd+0x20b/0x370 [ptlrpc]
      12:02:10: [<ffffffff81063410>] ? default_wake_function+0x0/0x20
      12:02:10: [<ffffffffa0995520>] ? ptlrpcd+0x0/0x370 [ptlrpc]
      12:02:10: [<ffffffff81096a36>] ? kthread+0x96/0xa0
      12:02:10: [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
      12:02:10: [<ffffffff810969a0>] ? kthread+0x0/0xa0
      12:02:10: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
      12:02:11:Mem-Info:
      12:02:11:Node 0 DMA per-cpu:
      12:02:11:CPU    0: hi:    0, btch:   1 usd:   0
      12:02:11:CPU    1: hi:    0, btch:   1 usd:   0
      12:02:12:Node 0 DMA32 per-cpu:
      12:02:12:CPU    0: hi:  186, btch:  31 usd:   0
      12:02:12:CPU    1: hi:  186, btch:  31 usd:   4
      12:02:12:active_anon:3193 inactive_anon:6004 isolated_anon:0
      12:02:12: active_file:8566 inactive_file:234838 isolated_file:0
      12:02:13: unevictable:0 dirty:1 writeback:0 unstable:0
      12:02:13: free:13356 slab_reclaimable:3430 slab_unreclaimable:197269
      12:02:13: mapped:4138 shmem:39 pagetables:1255 bounce:0
      12:02:13:Node 0 DMA free:8236kB min:332kB low:412kB high:496kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:5824kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15324kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:52kB slab_unreclaimable:1620kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
      12:02:13:lowmem_reserve[]: 0 2003 2003 2003
      12:02:13:Node 0 DMA32 free:45188kB min:44720kB low:55900kB high:67080kB active_anon:12772kB inactive_anon:24016kB active_file:34264kB inactive_file:933528kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2052064kB mlocked:0kB dirty:4kB writeback:0kB mapped:16552kB shmem:156kB slab_reclaimable:13668kB slab_unreclaimable:787456kB kernel_stack:1904kB pagetables:5020kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
      12:02:13:lowmem_reserve[]: 0 0 0 0
      12:02:13:Node 0 DMA: 3*4kB 2*8kB 3*16kB 3*32kB 4*64kB 1*128kB 2*256kB 2*512kB 2*1024kB 2*2048kB 0*4096kB = 8236kB
      12:02:14:Node 0 DMA32: 9673*4kB 6*8kB 23*16kB 2*32kB 2*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 1*4096kB = 45188kB
      12:02:14:243500 total pagecache pages
      12:02:14:0 pages in swap cache
      12:02:14:Swap cache stats: add 0, delete 0, find 0/0
      12:02:14:Free swap  = 4128760kB
      12:02:15:Total swap = 4128760kB
      12:02:15:524284 pages RAM
      12:02:15:43709 pages reserved
      12:02:15:259403 pages shared
      12:02:16:221861 pages non-shared
      12:02:16:LustreError: 22725:0:(osc_request.c:2161:osc_build_rpc()) prep_req failed: -12
      12:02:16:LustreError: 22725:0:(osc_cache.c:2142:osc_check_rpcs()) Read request failed with -12
      12:02:16:IOR: page allocation failure. order:1, mode:0x40
      12:02:16:Pid: 22725, comm: IOR Not tainted 2.6.32-358.18.1.el6.x86_64 #1
      12:02:16:Call Trace:
      12:02:16: [<ffffffff8112c257>] ? __alloc_pages_nodemask+0x757/0x8d0
      12:02:17: [<ffffffff81282796>] ? __const_udelay+0x46/0x50
      12:02:17: [<ffffffff81166d92>] ? kmem_getpages+0x62/0x170
      12:02:17: [<ffffffff811679aa>] ? fallback_alloc+0x1ba/0x270
      12:02:17:524284 pages RAM
      12:02:17:43709 pages reserved
      12:02:17:259403 pages shared
      12:02:17:221861 pages non-shared
      12:02:18:LNetError: 11613:0:(lib-lnet.h:457:lnet_md_alloc()) LNET: out of memory at /var/lib/jenkins/workspace/lustre-b2_5/arch/x86_64/build_type/client/distro/el6/ib_stack/inkernel/BUILD/BUILD/lustre-2.5.0/lnet/include/lnet/lib-lnet.h:457 (tried to alloc '(md)' = 4208)
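
      For context on the failure mode in the title: on this 2.6.32 (RHEL6) kernel,
      allocation mode 0x40 is __GFP_IO without __GFP_WAIT, so the slab allocator is
      not allowed to sleep or enter direct reclaim, and the order-1 (two contiguous
      pages) kmalloc behind ptlrpc_new_bulk()/LNetMDAttach() fails as soon as free
      memory is fragmented, even though the Mem-Info dumps above show plenty of
      clean, reclaimable page cache. The snippet below is an illustrative sketch
      only, not the actual patch: the demo_* names and DEMO_MD_SIZE are invented,
      and it uses only stock 2.6.32 slab/GFP interfaces to contrast the failing
      mode with one that includes __GFP_WAIT.

      #include <linux/slab.h>
      #include <linux/gfp.h>

      /* Size taken from the lnet_md_alloc() message above; a kmalloc of this
       * size lands in an order-1 slab and needs two physically contiguous
       * pages. */
      #define DEMO_MD_SIZE 4208

      /* How the failing path effectively behaved: __GFP_IO alone (mode:0x40)
       * forbids sleeping, so the allocator cannot wait for reclaim or
       * compaction and returns NULL once memory is fragmented. */
      static void *demo_alloc_nowait(void)
      {
              return kmalloc(DEMO_MD_SIZE, __GFP_IO);
      }

      /* The general shape of the fix: GFP_NOFS (__GFP_WAIT | __GFP_IO) lets
       * the allocator sleep and reclaim clean page cache, while still avoiding
       * recursion into filesystem writeback on this client I/O path.
       * GFP_KERNEL would also set __GFP_FS, which is normally avoided here so
       * that allocation cannot re-enter Lustre under memory pressure. */
      static void *demo_alloc_can_wait(void)
      {
              return kmalloc(DEMO_MD_SIZE, GFP_NOFS);
      }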
      

    Attachments

    Issue Links

    Activity

            [LU-4357] page allocation failure. mode:0x40 caused by missing __GFP_WAIT flag
            pjones Peter Jones added a comment -

            Landed for 2.5.1 and 2.6

            bogl Bob Glossman (Inactive) added a comment -

            backport to b2_5: http://review.whamcloud.com/9382

            pjones Peter Jones added a comment -

            ok. Thanks for the quick feedback Ann.

            Emoly,

            Could you please revise Ann's patch to reflect the review feedback from Andreas?

            Thanks

            Peter


            amk Ann Koehler (Inactive) added a comment -

            Peter,

            I am pretty busy with other bugs at the moment. I have no objections at all to one of your engineers updating the patch. On the contrary, I would really appreciate it.

            Thanks,
            Ann

            pjones Peter Jones added a comment -

            Ann,

            Do you expect to be able to refresh the patch in the near future? If you have other commitments, would you mind if one of our engineers did so?

            Thanks

            Peter


            adilger Andreas Dilger added a comment -

            This was introduced in http://review.whamcloud.com/2831 commit 1b2547843817b4b7adbeb87ea9b070d9cac35c90.

            Tao, could you please make a version of this patch for the upstream kernel?


            People

              Assignee: Emoly Liu (emoly.liu)
              Reporter: Maloo (maloo)
              Votes: 0
              Watchers: 15

              Dates

                Created:
                Updated:
                Resolved: