[LU-514] irqbalance invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0 Created: 19/Jul/11 Updated: 28/May/17 Resolved: 28/May/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 4248 |
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>. This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/66cc6868-b26c-11e0-b33f-52540025f9af. |
| Comments |
| Comment by Sarah Liu [ 20/Jul/11 ] |
|
sanity-quota also hit an OOM. I'm not sure whether it is the same issue; if it is not, I will open a new ticket to track it. |
| Comment by Oleg Drokin [ 24/Jul/11 ] |
|
Wow, 12 GB of RAM and it went OOM anyway. |
| Comment by Oleg Drokin [ 24/Jul/11 ] |
|
BTW, I think it would be a great idea to grab slab statistics for a case like this. |
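One way to capture the slab statistics suggested here is to snapshot `/proc/slabinfo` while the node is close to OOM and rank the caches by approximate size. The following is only a minimal sketch: the helper names `parse_slabinfo` and `top_slab_caches` are hypothetical, it assumes the slabinfo 2.x column layout (`name active_objs num_objs objsize ...`), and it estimates size as `num_objs * objsize`, which ignores per-slab overhead.

```python
from typing import List, Tuple


def parse_slabinfo(text: str) -> List[Tuple[str, int]]:
    """Return (cache_name, approx_bytes) pairs, largest first.

    Assumes slabinfo 2.x columns: name, active_objs, num_objs, objsize, ...
    """
    usage = []
    for line in text.splitlines():
        # Skip blank lines, the version banner and the column header.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        name = fields[0]
        num_objs = int(fields[2])
        objsize = int(fields[3])
        # Rough estimate only: ignores slab metadata and padding.
        usage.append((name, num_objs * objsize))
    usage.sort(key=lambda item: item[1], reverse=True)
    return usage


def top_slab_caches(path: str = "/proc/slabinfo", limit: int = 10) -> List[Tuple[str, int]]:
    """Read a slabinfo file and return the largest caches."""
    with open(path) as handle:
        return parse_slabinfo(handle.read())[:limit]
```

Reading `/proc/slabinfo` may require root on some kernels. Saving the raw file alongside the ranked output keeps the original numbers available for later comparison.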
| Comment by Jian Yu [ 24/Aug/11 ] |
|
Lustre Tag: v2_1_0_0_RC1

While running sanity-benchmark iozone test, the same issue occurred:

Lustre: DEBUG MARKER: == sanity-benchmark test iozone: iozone == 23:27:27 (1314167247)
Lustre: DEBUG MARKER: min OST has 1765892kB available, using 7946514kB file size
irqbalance invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0
irqbalance cpuset=/ mems_allowed=0
Pid: 1518, comm: irqbalance Tainted: G ---------------- T 2.6.32-131.6.1.el6.i686 #1
Call Trace:
[<c04df510>] ? oom_kill_process+0xb0/0x2d0
[<c04dfbba>] ? __out_of_memory+0x4a/0x90
[<c04dfc55>] ? out_of_memory+0x55/0xb0
[<c04eda4b>] ? __alloc_pages_nodemask+0x7fb/0x810
[<c0519a5c>] ? cache_alloc_refill+0x2bc/0x510
[<c0519734>] ? kmem_cache_alloc+0xa4/0x110
[<c0532968>] ? getname+0x28/0xe0
[<c052540e>] ? do_sys_open+0x1e/0x130
[<c04adc5c>] ? audit_syscall_entry+0x21c/0x240
[<c052559c>] ? sys_open+0x2c/0x40
[<c0409bdf>] ? sysenter_do_call+0x12/0x28
Mem-Info:
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
CPU 2: hi: 0, btch: 1 usd: 0
CPU 3: hi: 0, btch: 1 usd: 0
Normal per-cpu:
CPU 0: hi: 186, btch: 31 usd: 23
CPU 1: hi: 186, btch: 31 usd: 61
CPU 2: hi: 186, btch: 31 usd: 12
CPU 3: hi: 186, btch: 31 usd: 98
HighMem per-cpu:
CPU 0: hi: 186, btch: 31 usd: 0
CPU 1: hi: 186, btch: 31 usd: 5
CPU 2: hi: 186, btch: 31 usd: 0
CPU 3: hi: 186, btch: 31 usd: 30
active_anon:7375 inactive_anon:768 isolated_anon:0
active_file:2703 inactive_file:812642 isolated_file:0
unevictable:0 dirty:1 writeback:0 unstable:0
free:2085189 slab_reclaimable:4050 slab_unreclaimable:128176
mapped:3276 shmem:46 pagetables:478 bounce:0
DMA free:3500kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15792kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:140kB slab_unreclaimable:4536kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 863 12159 12159
Normal free:3664kB min:3724kB low:4652kB high:5584kB active_anon:0kB inactive_anon:0kB active_file:12kB inactive_file:72kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:883912kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:16060kB slab_unreclaimable:508132kB kernel_stack:1648kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:178041 all_unreclaimable? yes
lowmem_reserve[]: 0 0 90370 90370
HighMem free:8333468kB min:512kB low:12700kB high:24888kB active_anon:29500kB inactive_anon:3072kB active_file:10728kB inactive_file:3250568kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:11567412kB mlocked:0kB dirty:4kB writeback:0kB mapped:13100kB shmem:184kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:1912kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 1*4kB 1*8kB 2*16kB 2*32kB 1*64kB 10*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 3500kB
Normal: 402*4kB 1*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 3664kB
HighMem: 61*4kB 29*8kB 12*16kB 4*32kB 8*64kB 13*128kB 7*256kB 5*512kB 3*1024kB 2*2048kB 2031*4096kB = 8333468kB
815399 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 14565368kB
Total swap = 14565368kB
3145712 pages RAM
2918914 pages HighMem
62162 pages reserved
819682 pages shared
156112 pages non-shared
Out of memory: kill process 23877 (iozone) score 10248 or a child
Killed process 23877 (iozone) vsz:40992kB, anon-rss:17040kB, file-rss:676kB
iozone: page allocation failure. order:0, mode:0x50
Pid: 23877, comm: iozone Tainted: G ---------------- T 2.6.32-131.6.1.el6.i686 #1
Call Trace:
[<c04ed8e6>] ? __alloc_pages_nodemask+0x696/0x810
[<c0519a5c>] ? cache_alloc_refill+0x2bc/0x510
[<c0519734>] ? kmem_cache_alloc+0xa4/0x110
[<fe298582>] ? cl_page_find0+0x102/0xd60 [obdclass]
[<fe290955>] ? cl_object_attr_unlock+0x5/0x10 [obdclass]
[<fb0439e2>] ? osc_io_commit_write+0x1b2/0x2b0 [osc]
[<fb899303>] ? lov_page_stripe+0x53/0x240 [lov]
[<fe293b4a>] ? cl_page_cache_add+0x27a/0x3b0 [obdclass]
[<fe290c66>] ? cl_env_get+0x16/0x480 [obdclass]
[<fe299230>] ? cl_page_find+0x20/0x30 [obdclass]
[<fc4f3a8b>] ? ll_cl_init+0x10b/0x480 [lustre]
[<c04ed344>] ? __alloc_pages_nodemask+0xf4/0x810
[<c0520a33>] ? __mem_cgroup_commit_charge.clone.4+0x33/0x80
[<c0522771>] ? mem_cgroup_charge_common+0x61/0x80
[<fc4f4112>] ? ll_prepare_write+0x52/0x2e0 [lustre]
[<c04dcd14>] ? add_to_page_cache_locked+0xa4/0x110
[<c04dcdd4>] ? add_to_page_cache_lru+0x54/0x70
[<c04dee1e>] ? grab_cache_page_write_begin+0x8e/0xc0
[<fc517363>] ? ll_write_begin+0x83/0x370 [lustre]
[<fc5172be>] ? ll_write_end+0x3e/0x60 [lustre]
[<c04dd1bf>] ? generic_file_buffered_write+0xef/0x270
[<c04de5a0>] ? __generic_file_aio_write+0x200/0x4e0
[<fe29c9a5>] ? cl_wait_try+0xa5/0x390 [obdclass]
[<c04de8de>] ? generic_file_aio_write+0x5e/0xc0
[<fe29d12e>] ? cl_lock_mutex_put+0x3e/0x80 [obdclass]
[<fc530b16>] ? vvp_io_write_start+0xd6/0x420 [lustre]
[<fc532f40>] ? vvp_io_write_lock+0x70/0x80 [lustre]
[<fe2a3cc2>] ? cl_io_start+0x82/0x270 [obdclass]
[<fda0a852>] ? cfs_hash_bd_lookup_intent+0xa2/0xf0 [libcfs]
[<fda0a45e>] ? cfs_hash_bd_from_key+0x2e/0xa0 [libcfs]
[<fe2ab6e5>] ? cl_io_loop+0x135/0x2a0 [obdclass]
[<fe2a938a>] ? cl_io_fini+0x9a/0x260 [obdclass]
[<fe29099e>] ? cl_env_peek+0x2e/0x180 [obdclass]
[<fc4abaa1>] ? ll_file_io_generic+0x541/0x6c0 [lustre]
[<fe290c66>] ? cl_env_get+0x16/0x480 [obdclass]
[<fda09d62>] ? cfs_hash_dual_bd_unlock+0x22/0x50 [libcfs]
[<fda0dcfe>] ? cfs_hash_find_or_add+0x7e/0x160 [libcfs]
[<fc4b83c1>] ? ll_file_aio_write+0x121/0x4d0 [lustre]
[<fe28ed64>] ? cl_env_put+0x134/0x300 [obdclass]
[<c0427293>] ? lapic_next_event+0x13/0x20
[<c0481abc>] ? clockevents_program_event+0x8c/0x120
[<fc4b88d5>] ? ll_file_write+0x165/0x440 [lustre]
[<c04b63ea>] ? __rcu_process_callbacks+0x1fa/0x2d0
[<c059d18c>] ? security_file_permission+0xc/0x10
[<c0527c06>] ? rw_verify_area+0x66/0xe0
[<fc4b8770>] ? ll_file_write+0x0/0x440 [lustre]
[<c0527d20>] ? vfs_write+0xa0/0x190
[<c04adc5c>] ? audit_syscall_entry+0x21c/0x240
[<c05287a1>] ? sys_write+0x41/0x70
[<c0409bdf>] ? sysenter_do_call+0x12/0x28
Mem-Info:
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
CPU 2: hi: 0, btch: 1 usd: 0
CPU 3: hi: 0, btch: 1 usd: 0
Normal per-cpu:
CPU 0: hi: 186, btch: 31 usd: 23
CPU 1: hi: 186, btch: 31 usd: 61
CPU 2: hi: 186, btch: 31 usd: 12
CPU 3: hi: 186, btch: 31 usd: 98
HighMem per-cpu:
CPU 0: hi: 186, btch: 31 usd: 0
CPU 1: hi: 186, btch: 31 usd: 5
CPU 2: hi: 186, btch: 31 usd: 0
CPU 3: hi: 186, btch: 31 usd: 30
active_anon:7375 inactive_anon:768 isolated_anon:0
active_file:2683 inactive_file:812670 isolated_file:0
unevictable:0 dirty:1 writeback:0 unstable:0
free:2085158 slab_reclaimable:4050 slab_unreclaimable:128167
mapped:3276 shmem:46 pagetables:478 bounce:0
DMA free:3500kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15792kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:140kB slab_unreclaimable:4536kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 863 12159 12159
Normal free:3664kB min:3724kB low:4652kB high:5584kB active_anon:0kB inactive_anon:0kB active_file:4kB inactive_file:112kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:883912kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:16060kB slab_unreclaimable:508132kB kernel_stack:1648kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:178041 all_unreclaimable? yes
lowmem_reserve[]: 0 0 90370 90370
HighMem free:8333468kB min:512kB low:12700kB high:24888kB active_anon:29500kB inactive_anon:3072kB active_file:10728kB inactive_file:3250568kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:11567412kB mlocked:0kB dirty:4kB writeback:0kB mapped:13100kB shmem:184kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:1912kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 1*4kB 1*8kB 2*16kB 2*32kB 1*64kB 10*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 3500kB
Normal: 402*4kB 1*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 3664kB
HighMem: 61*4kB 29*8kB 12*16kB 4*32kB 8*64kB 13*128kB 7*256kB 5*512kB 3*1024kB 2*2048kB 2031*4096kB = 8333468kB
815399 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 14565368kB
Total swap = 14565368kB
3145712 pages RAM
2918914 pages HighMem
62162 pages reserved
819682 pages shared
156112 pages non-shared

Maloo report: https://maloo.whamcloud.com/test_sets/b4d6442e-ce1a-11e0-8d02-52540025f9af

ost-pools test 23 also hit the same issue: https://maloo.whamcloud.com/test_sets/7ad68e6e-cec9-11e0-8d02-52540025f9af |
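The Mem-Info dump in the previous comment suggests why the node went OOM with roughly 8 GB free: both failing requests are order-0 with gfp_mask=0xd0 and mode:0x50, which on a 2.6.32 kernel appear to correspond to GFP_KERNEL and GFP_NOFS, neither of which asks for HighMem. That reading is an interpretation and is not stated in the ticket. A minimal sketch for pulling the per-zone numbers out of such a report follows; the helper names `zone_summary` and `zones_below_min` are hypothetical, and the regular expression assumes the field order seen in this log.

```python
import re
from typing import Dict, List

# Matches one per-zone line such as "Normal free:3664kB min:3724kB ...".
# Field order is assumed to match the 2.6.32 report quoted in this ticket.
ZONE_RE = re.compile(
    r"\b(DMA|Normal|HighMem) free:(\d+)kB min:(\d+)kB"
    r".*?present:(\d+)kB"
    r".*?slab_unreclaimable:(\d+)kB",
    re.DOTALL,
)


def zone_summary(report: str) -> Dict[str, Dict[str, int]]:
    """Extract free, min, present and unreclaimable slab (kB) per zone."""
    zones: Dict[str, Dict[str, int]] = {}
    for name, free, minimum, present, slab in ZONE_RE.findall(report):
        # Keep the first dump if the report contains several.
        zones.setdefault(name, {
            "free_kb": int(free),
            "min_kb": int(minimum),
            "present_kb": int(present),
            "slab_unreclaimable_kb": int(slab),
        })
    return zones


def zones_below_min(report: str) -> List[str]:
    """Names of zones whose free memory is under the min watermark."""
    return [
        name for name, zone in zone_summary(report).items()
        if zone["free_kb"] < zone["min_kb"]
    ]
```

Run against the report in the previous comment, this should flag only the Normal zone (free 3664kB against min 3724kB). In that zone, unreclaimable slab accounts for 508132kB of the 883912kB present, about 57%, which is consistent with the earlier request for slab statistics.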
| Comment by Andreas Dilger [ 28/May/17 ] |
|
Close old issue. |