[LU-514] irqbalance invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0
Created: 19/Jul/11
Updated: 28/May/17
Resolved: 28/May/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: WC Triage
Resolution: Cannot Reproduce
Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 4248

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/66cc6868-b26c-11e0-b33f-52540025f9af.
ENV: RHEL6-x86_64 server with i686 client, quota enabled.



 Comments   
Comment by Sarah Liu [ 20/Jul/11 ]

sanity-quota also hit an OOM. I am not sure whether it is the same issue; if it is not, I will open a new ticket to track it.
https://maloo.whamcloud.com/test_sets/376d7774-b258-11e0-b33f-52540025f9af

Comment by Oleg Drokin [ 24/Jul/11 ]

wow, 12G RAM and it went oom anyway.

Comment by Oleg Drokin [ 24/Jul/11 ]

BTW, I think it would be a great idea to grab slab statistics for a case like this.
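One way to act on this suggestion is to snapshot /proc/slabinfo on the client around the failing test and rank caches by the memory they hold. The following is a minimal sketch, assuming the slabinfo 2.1 text layout; the helper names (parse_slabinfo, top_caches) and the sample cache names are hypothetical, and reading the real file normally requires root.

```python
from typing import List, NamedTuple


class SlabCache(NamedTuple):
    name: str
    active_objs: int
    num_objs: int
    objsize: int
    pagesperslab: int
    num_slabs: int

    def approx_kb(self, page_kb: int = 4) -> int:
        # Memory held by the cache, assuming 4 kB pages (as on i686).
        return self.num_slabs * self.pagesperslab * page_kb


def parse_slabinfo(text: str) -> List[SlabCache]:
    """Parse slabinfo 2.1 text: 'name a b c d e : tunables ... : slabdata x y z'."""
    caches = []
    for line in text.splitlines():
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue  # skip the version line and the column header
        parts = line.rsplit(":", 2)
        if len(parts) != 3:
            continue
        head = parts[0].split()
        slabdata = parts[2].split()
        if len(head) < 6 or len(slabdata) < 3 or slabdata[0] != "slabdata":
            continue
        caches.append(SlabCache(
            name=head[0],
            active_objs=int(head[1]),
            num_objs=int(head[2]),
            objsize=int(head[3]),
            pagesperslab=int(head[5]),
            num_slabs=int(slabdata[2]),
        ))
    return caches


def top_caches(caches: List[SlabCache], n: int = 10) -> List[SlabCache]:
    # Largest consumers first.
    return sorted(caches, key=lambda c: c.approx_kb(), reverse=True)[:n]


# Hypothetical sample in the slabinfo 2.1 layout (not taken from the failing node).
SAMPLE = """\
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
cache_a   100 200 256 16 1 : tunables 120 60 8 : slabdata 13 13 0
cache_b    50  60 4096 1 1 : tunables 24 12 8 : slabdata 60 60 0
"""

if __name__ == "__main__":
    for cache in top_caches(parse_slabinfo(SAMPLE)):
        print(cache.name, cache.approx_kb(), "kB")
```

On a live node the same functions could be fed `open("/proc/slabinfo").read()` before and after the test so that the growing caches stand out.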

Comment by Jian Yu [ 24/Aug/11 ]

Lustre Tag: v2_1_0_0_RC1
Lustre Build: http://newbuild.whamcloud.com/job/lustre-master/271/
Distro/Arch: RHEL6/x86_64(server), RHEL6/i686(client)

While running the sanity-benchmark iozone test, the same issue occurred:

Lustre: DEBUG MARKER: == sanity-benchmark test iozone: iozone == 23:27:27 (1314167247)
Lustre: DEBUG MARKER: min OST has 1765892kB available, using 7946514kB file size
irqbalance invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0
irqbalance cpuset=/ mems_allowed=0
Pid: 1518, comm: irqbalance Tainted: G           ---------------- T 2.6.32-131.6.1.el6.i686 #1
Call Trace:
 [<c04df510>] ? oom_kill_process+0xb0/0x2d0
 [<c04dfbba>] ? __out_of_memory+0x4a/0x90
 [<c04dfc55>] ? out_of_memory+0x55/0xb0
 [<c04eda4b>] ? __alloc_pages_nodemask+0x7fb/0x810
 [<c0519a5c>] ? cache_alloc_refill+0x2bc/0x510
 [<c0519734>] ? kmem_cache_alloc+0xa4/0x110
 [<c0532968>] ? getname+0x28/0xe0
 [<c052540e>] ? do_sys_open+0x1e/0x130
 [<c04adc5c>] ? audit_syscall_entry+0x21c/0x240
 [<c052559c>] ? sys_open+0x2c/0x40
 [<c0409bdf>] ? sysenter_do_call+0x12/0x28
Mem-Info:
DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:  23
CPU    1: hi:  186, btch:  31 usd:  61
CPU    2: hi:  186, btch:  31 usd:  12
CPU    3: hi:  186, btch:  31 usd:  98
HighMem per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
CPU    1: hi:  186, btch:  31 usd:   5
CPU    2: hi:  186, btch:  31 usd:   0
CPU    3: hi:  186, btch:  31 usd:  30
active_anon:7375 inactive_anon:768 isolated_anon:0
 active_file:2703 inactive_file:812642 isolated_file:0
 unevictable:0 dirty:1 writeback:0 unstable:0
 free:2085189 slab_reclaimable:4050 slab_unreclaimable:128176
 mapped:3276 shmem:46 pagetables:478 bounce:0
DMA free:3500kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15792kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:140kB slab_unreclaimable:4536kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 863 12159 12159
Normal free:3664kB min:3724kB low:4652kB high:5584kB active_anon:0kB inactive_anon:0kB active_file:12kB inactive_file:72kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:883912kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:16060kB slab_unreclaimable:508132kB kernel_stack:1648kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:178041 all_unreclaimable? yes
lowmem_reserve[]: 0 0 90370 90370
HighMem free:8333468kB min:512kB low:12700kB high:24888kB active_anon:29500kB inactive_anon:3072kB active_file:10728kB inactive_file:3250568kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:11567412kB mlocked:0kB dirty:4kB writeback:0kB mapped:13100kB shmem:184kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:1912kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 1*4kB 1*8kB 2*16kB 2*32kB 1*64kB 10*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 3500kB
Normal: 402*4kB 1*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 3664kB
HighMem: 61*4kB 29*8kB 12*16kB 4*32kB 8*64kB 13*128kB 7*256kB 5*512kB 3*1024kB 2*2048kB 2031*4096kB = 8333468kB
815399 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap  = 14565368kB
Total swap = 14565368kB
3145712 pages RAM
2918914 pages HighMem
62162 pages reserved
819682 pages shared
156112 pages non-shared
Out of memory: kill process 23877 (iozone) score 10248 or a child
Killed process 23877 (iozone) vsz:40992kB, anon-rss:17040kB, file-rss:676kB
iozone: page allocation failure. order:0, mode:0x50
Pid: 23877, comm: iozone Tainted: G           ---------------- T 2.6.32-131.6.1.el6.i686 #1
Call Trace:
 [<c04ed8e6>] ? __alloc_pages_nodemask+0x696/0x810
 [<c0519a5c>] ? cache_alloc_refill+0x2bc/0x510
 [<c0519734>] ? kmem_cache_alloc+0xa4/0x110
 [<fe298582>] ? cl_page_find0+0x102/0xd60 [obdclass]
 [<fe290955>] ? cl_object_attr_unlock+0x5/0x10 [obdclass]
 [<fb0439e2>] ? osc_io_commit_write+0x1b2/0x2b0 [osc]
 [<fb899303>] ? lov_page_stripe+0x53/0x240 [lov]
 [<fe293b4a>] ? cl_page_cache_add+0x27a/0x3b0 [obdclass]
 [<fe290c66>] ? cl_env_get+0x16/0x480 [obdclass]
 [<fe299230>] ? cl_page_find+0x20/0x30 [obdclass]
 [<fc4f3a8b>] ? ll_cl_init+0x10b/0x480 [lustre]
 [<c04ed344>] ? __alloc_pages_nodemask+0xf4/0x810
 [<c0520a33>] ? __mem_cgroup_commit_charge.clone.4+0x33/0x80
 [<c0522771>] ? mem_cgroup_charge_common+0x61/0x80
 [<fc4f4112>] ? ll_prepare_write+0x52/0x2e0 [lustre]
 [<c04dcd14>] ? add_to_page_cache_locked+0xa4/0x110
 [<c04dcdd4>] ? add_to_page_cache_lru+0x54/0x70
 [<c04dee1e>] ? grab_cache_page_write_begin+0x8e/0xc0
 [<fc517363>] ? ll_write_begin+0x83/0x370 [lustre]
 [<fc5172be>] ? ll_write_end+0x3e/0x60 [lustre]
 [<c04dd1bf>] ? generic_file_buffered_write+0xef/0x270
 [<c04de5a0>] ? __generic_file_aio_write+0x200/0x4e0
 [<fe29c9a5>] ? cl_wait_try+0xa5/0x390 [obdclass]
 [<c04de8de>] ? generic_file_aio_write+0x5e/0xc0
 [<fe29d12e>] ? cl_lock_mutex_put+0x3e/0x80 [obdclass]
 [<fc530b16>] ? vvp_io_write_start+0xd6/0x420 [lustre]
 [<fc532f40>] ? vvp_io_write_lock+0x70/0x80 [lustre]
 [<fe2a3cc2>] ? cl_io_start+0x82/0x270 [obdclass]
 [<fda0a852>] ? cfs_hash_bd_lookup_intent+0xa2/0xf0 [libcfs]
 [<fda0a45e>] ? cfs_hash_bd_from_key+0x2e/0xa0 [libcfs]
 [<fe2ab6e5>] ? cl_io_loop+0x135/0x2a0 [obdclass]
 [<fe2a938a>] ? cl_io_fini+0x9a/0x260 [obdclass]
 [<fe29099e>] ? cl_env_peek+0x2e/0x180 [obdclass]
 [<fc4abaa1>] ? ll_file_io_generic+0x541/0x6c0 [lustre]
 [<fe290c66>] ? cl_env_get+0x16/0x480 [obdclass]
 [<fda09d62>] ? cfs_hash_dual_bd_unlock+0x22/0x50 [libcfs]
 [<fda0dcfe>] ? cfs_hash_find_or_add+0x7e/0x160 [libcfs]
 [<fc4b83c1>] ? ll_file_aio_write+0x121/0x4d0 [lustre]
 [<fe28ed64>] ? cl_env_put+0x134/0x300 [obdclass]
 [<c0427293>] ? lapic_next_event+0x13/0x20
 [<c0481abc>] ? clockevents_program_event+0x8c/0x120
 [<fc4b88d5>] ? ll_file_write+0x165/0x440 [lustre]
 [<c04b63ea>] ? __rcu_process_callbacks+0x1fa/0x2d0
 [<c059d18c>] ? security_file_permission+0xc/0x10
 [<c0527c06>] ? rw_verify_area+0x66/0xe0
 [<fc4b8770>] ? ll_file_write+0x0/0x440 [lustre]
 [<c0527d20>] ? vfs_write+0xa0/0x190
 [<c04adc5c>] ? audit_syscall_entry+0x21c/0x240
 [<c05287a1>] ? sys_write+0x41/0x70
 [<c0409bdf>] ? sysenter_do_call+0x12/0x28
Mem-Info:
DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:  23
CPU    1: hi:  186, btch:  31 usd:  61
CPU    2: hi:  186, btch:  31 usd:  12
CPU    3: hi:  186, btch:  31 usd:  98
HighMem per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
CPU    1: hi:  186, btch:  31 usd:   5
CPU    2: hi:  186, btch:  31 usd:   0
CPU    3: hi:  186, btch:  31 usd:  30
active_anon:7375 inactive_anon:768 isolated_anon:0
 active_file:2683 inactive_file:812670 isolated_file:0
 unevictable:0 dirty:1 writeback:0 unstable:0
 free:2085158 slab_reclaimable:4050 slab_unreclaimable:128167
 mapped:3276 shmem:46 pagetables:478 bounce:0
DMA free:3500kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15792kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:140kB slab_unreclaimable:4536kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 863 12159 12159
Normal free:3664kB min:3724kB low:4652kB high:5584kB active_anon:0kB inactive_anon:0kB active_file:4kB inactive_file:112kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:883912kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:16060kB slab_unreclaimable:508132kB kernel_stack:1648kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:178041 all_unreclaimable? yes
lowmem_reserve[]: 0 0 90370 90370
HighMem free:8333468kB min:512kB low:12700kB high:24888kB active_anon:29500kB inactive_anon:3072kB active_file:10728kB inactive_file:3250568kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:11567412kB mlocked:0kB dirty:4kB writeback:0kB mapped:13100kB shmem:184kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:1912kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 1*4kB 1*8kB 2*16kB 2*32kB 1*64kB 10*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 3500kB
Normal: 402*4kB 1*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 3664kB
HighMem: 61*4kB 29*8kB 12*16kB 4*32kB 8*64kB 13*128kB 7*256kB 5*512kB 3*1024kB 2*2048kB 2031*4096kB = 8333468kB
815399 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap  = 14565368kB
Total swap = 14565368kB
3145712 pages RAM
2918914 pages HighMem
62162 pages reserved
819682 pages shared
156112 pages non-shared
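
The per-zone lines in the dumps above can be summarized mechanically. The sketch below is an illustration rather than an established tool: it pulls free, min, present and slab_unreclaimable out of each zone line and flags zones whose free memory is below the min watermark. The excerpt is abbreviated from the log above (fields not used here are dropped), and zone_summary is a hypothetical helper name. On this data the Normal zone comes out below its min watermark, with unreclaimable slab taking roughly 57% of it, while HighMem still has about 8 GB free, which appears consistent with low-memory exhaustion on the 32-bit client.

```python
import re
from typing import Dict

# Zone lines abbreviated from the Mem-Info dump above.
EXCERPT = """\
DMA free:3500kB min:64kB low:80kB high:96kB present:15792kB slab_unreclaimable:4536kB
Normal free:3664kB min:3724kB low:4652kB high:5584kB present:883912kB slab_unreclaimable:508132kB
HighMem free:8333468kB min:512kB low:12700kB high:24888kB present:11567412kB slab_unreclaimable:0kB
"""

ZONE_LINE = re.compile(r"^(\w+) free:\d+kB min:\d+kB")
FIELD = re.compile(r"(\w+):(\d+)kB")


def zone_summary(log_text: str) -> Dict[str, dict]:
    """Return per-zone figures; if a zone appears twice, the later dump wins."""
    zones = {}
    for line in log_text.splitlines():
        match = ZONE_LINE.match(line)
        if match is None:
            continue  # not a per-zone line (e.g. lowmem_reserve or buddy lists)
        fields = {key: int(value) for key, value in FIELD.findall(line)}
        present = fields.get("present", 0)
        slab = fields.get("slab_unreclaimable", 0)
        zones[match.group(1)] = {
            "free_kb": fields["free"],
            "min_kb": fields["min"],
            "below_min": fields["free"] < fields["min"],
            "slab_share": slab / present if present else 0.0,
        }
    return zones


if __name__ == "__main__":
    for name, info in zone_summary(EXCERPT).items():
        print(name, info)
```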

Maloo report: https://maloo.whamcloud.com/test_sets/b4d6442e-ce1a-11e0-8d02-52540025f9af

ost-pools test 23 also hit the same issue: https://maloo.whamcloud.com/test_sets/7ad68e6e-cec9-11e0-8d02-52540025f9af
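
For reference when reading the traces in this comment, the two allocation masks in the log (gfp_mask=0xd0 and mode:0x50) can be decoded with a small helper. This is a sketch that assumes the low flag bits from the mainline 2.6.32 include/linux/gfp.h; the RHEL6 kernel may define additional bits, and decode_gfp and may_use_highmem are hypothetical names. Under that assumption 0xd0 corresponds to GFP_KERNEL and 0x50 to GFP_NOFS, and neither carries __GFP_HIGHMEM, so the free HighMem pages reported in the dump would not satisfy these requests.

```python
from typing import List

# Low gfp bits as assumed from mainline 2.6.32 (include/linux/gfp.h).
GFP_BITS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x04: "__GFP_DMA32",
    0x08: "__GFP_MOVABLE",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
}
KNOWN_MASK = sum(GFP_BITS)  # 0xff: all bits this table knows about


def decode_gfp(mask: int) -> List[str]:
    """List the flag names set in mask, lowest bit first."""
    names = [name for bit, name in sorted(GFP_BITS.items()) if mask & bit]
    unknown = mask & ~KNOWN_MASK
    if unknown:
        names.append(hex(unknown))  # bits outside the table, left undecoded
    return names


def may_use_highmem(mask: int) -> bool:
    # Only allocations carrying __GFP_HIGHMEM may be served from HighMem.
    return bool(mask & 0x02)


if __name__ == "__main__":
    for mask in (0xD0, 0x50):
        print(hex(mask), decode_gfp(mask), "highmem:", may_use_highmem(mask))
```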

Comment by Andreas Dilger [ 28/May/17 ]

Close old issue.
