[LU-851] Test failure on test suite parallel-scale, subtest test_iorssf Created: 15/Nov/11 Updated: 29/May/17 Resolved: 29/May/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 5426 |
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>.

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/08d44894-1008-11e1-8338-52540025f9af.

The sub-test test_iorssf failed with the following error:
Info required for matching: parallel-scale iorssf |
| Comments |
| Comment by Johann Lombardi (Inactive) [ 24/Nov/11 ] |
|
OOM issue, ouch

Lustre: DEBUG MARKER: == parallel-scale test iorssf: iorssf == 19:33:32 (1321414412)
automount invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0
automount cpuset=/ mems_allowed=0
Pid: 1617, comm: automount Not tainted 2.6.32-131.6.1.el6.i686 #1
Call Trace:
 [<c04df510>] ? oom_kill_process+0xb0/0x2d0
 [<c04dfbba>] ? __out_of_memory+0x4a/0x90
 [<c04dfc55>] ? out_of_memory+0x55/0xb0
 [<c04eda4b>] ? __alloc_pages_nodemask+0x7fb/0x810
 [<c0519a5c>] ? cache_alloc_refill+0x2bc/0x510
 [<c0519734>] ? kmem_cache_alloc+0xa4/0x110
 [<c0574d6f>] ? proc_self_follow_link+0x5f/0x90
 [<c053d972>] ? touch_atime+0xf2/0x140
 [<c0534452>] ? do_follow_link+0xe2/0x3d0
 [<c053bfa8>] ? __d_instantiate+0x38/0xd0
 [<c0533e70>] ? __link_path_walk+0x1b0/0x6b0
 [<c05344be>] ? do_follow_link+0x14e/0x3d0
 [<c05342f7>] ? __link_path_walk+0x637/0x6b0
 [<c0534951>] ? path_walk+0x51/0xc0
 [<c0534ad9>] ? do_path_lookup+0x59/0x90
 [<c05355b4>] ? do_filp_open+0xc4/0xb30
 [<c0519982>] ? cache_alloc_refill+0x1e2/0x510
 [<c0525448>] ? do_sys_open+0x58/0x130
 [<c04adc5c>] ? audit_syscall_entry+0x21c/0x240
 [<c04ad970>] ? __audit_syscall_exit+0x220/0x250
 [<c052559c>] ? sys_open+0x2c/0x40
 [<c0409bdf>] ? sysenter_do_call+0x12/0x28
Mem-Info:
DMA per-cpu:
  CPU 0: hi: 0, btch: 1 usd: 0
  CPU 1: hi: 0, btch: 1 usd: 0
  CPU 2: hi: 0, btch: 1 usd: 0
  CPU 3: hi: 0, btch: 1 usd: 0
Normal per-cpu:
  CPU 0: hi: 186, btch: 31 usd: 0
  CPU 1: hi: 186, btch: 31 usd: 0
  CPU 2: hi: 186, btch: 31 usd: 0
  CPU 3: hi: 186, btch: 31 usd: 0
HighMem per-cpu:
  CPU 0: hi: 186, btch: 31 usd: 0
  CPU 1: hi: 186, btch: 31 usd: 0
  CPU 2: hi: 186, btch: 31 usd: 0
  CPU 3: hi: 186, btch: 31 usd: 0
active_anon:15644 inactive_anon:1589 isolated_anon:0
 active_file:3561 inactive_file:934696 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 free:1953182 slab_reclaimable:4012 slab_unreclaimable:127469
 mapped:4625 shmem:46 pagetables:569 bounce:0
DMA free:3464kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15792kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:148kB slab_unreclaimable:4560kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 863 12159 12159
Normal free:3808kB min:3724kB low:4652kB high:5584kB active_anon:0kB inactive_anon:0kB active_file:64kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:883912kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:15900kB slab_unreclaimable:505316kB kernel_stack:2256kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:168 all_unreclaimable? yes
lowmem_reserve[]: 0 0 90370 90370
HighMem free:7805456kB min:512kB low:12700kB high:24888kB active_anon:62576kB inactive_anon:6356kB active_file:14180kB inactive_file:3738784kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:11567412kB mlocked:0kB dirty:0kB writeback:0kB mapped:18496kB shmem:184kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:2276kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 2*4kB 0*8kB 2*16kB 3*32kB 2*64kB 9*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 3464kB
Normal: 69*4kB 0*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3844kB
HighMem: 45*4kB 27*8kB 4*16kB 10*32kB 14*64kB 4*128kB 6*256kB 6*512kB 2*1024kB 3*2048kB 1902*4096kB = 7805580kB
803188 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap  = 14565368kB
Total swap = 14565368kB
3145712 pages RAM
2918914 pages HighMem
62162 pages reserved
823317 pages shared
285915 pages non-shared
Out of memory: kill process 1536 (mpirun) score 5602 or a child
Killed process 1540 (IOR) vsz:118652kB, anon-rss:26332kB, file-rss:2704kB
IOR: page allocation failure. order:0, mode:0x50
Pid: 1540, comm: IOR Not tainted 2.6.32-131.6.1.el6.i686 #1
Call Trace:
 [<c04ed8e6>] ? __alloc_pages_nodemask+0x696/0x810
 [<c0519a5c>] ? cache_alloc_refill+0x2bc/0x510
 [<c0519734>] ? kmem_cache_alloc+0xa4/0x110
 [<fc8d0c2a>] ? osc_page_init+0x3a/0x3f0 [osc]
 [<fd139463>] ? lovsub_page_init+0x183/0x4e0 [lov]
 [<fc8d0bf0>] ? osc_page_init+0x0/0x3f0 [osc]
 [<fa0c3271>] ? cl_page_find0+0x251/0xd60 [obdclass]
 [<fd133016>] ? lov_sub_get+0xf6/0x900 [lov]
 [<fa0c3d9f>] ? cl_page_find_sub+0x1f/0x30 [obdclass]
 [<fd12a2f8>] ? lov_page_init_raid0+0x218/0xb30 [lov]
 [<fddea077>] ? vvp_page_init+0x187/0x380 [lustre]
 [<fd125d4f>] ? lov_page_init+0x4f/0xa0 [lov]
 [<fd125d00>] ? lov_page_init+0x0/0xa0 [lov]
 [<fa0c3271>] ? cl_page_find0+0x251/0xd60 [obdclass]
 [<c04dcd14>] ? add_to_page_cache_locked+0xa4/0x110
 [<fa0c3dd0>] ? cl_page_find+0x20/0x30 [obdclass]
 [<fddadc5c>] ? ll_readahead+0x10ec/0x1c10 [lustre]
 [<fddeb73d>] ? vvp_io_read_page+0x40d/0x5b0 [lustre]
 [<fa0d2d36>] ? cl_io_read_page+0xa6/0x2b0 [obdclass]
 [<fddaec99>] ? ll_readpage+0x99/0x2c0 [lustre]
 [<c04dc75d>] ? find_get_page+0x1d/0x90
 [<c04ddb86>] ? generic_file_aio_read+0x1e6/0x780
 [<c05f845a>] ? vsnprintf+0x2ea/0x3f0
 [<fddec09b>] ? vvp_io_read_start+0x1cb/0x5a0 [lustre]
 [<fa0ce862>] ? cl_io_start+0x82/0x270 [obdclass]
 [<fa0d6285>] ? cl_io_loop+0x135/0x2a0 [obdclass]
 [<fdd6656a>] ? ll_file_io_generic+0x41a/0x6c0 [lustre]
 [<fdd66931>] ? ll_file_aio_read+0x121/0x4d0 [lustre]
 [<fdd727f5>] ? ll_file_read+0x165/0x440 [lustre]
 [<c059d18c>] ? security_file_permission+0xc/0x10
 [<c0527c06>] ? rw_verify_area+0x66/0xe0
 [<fdd72690>] ? ll_file_read+0x0/0x440 [lustre]
 [<c05285fd>] ? vfs_read+0x9d/0x190
 [<c04adc5c>] ? audit_syscall_entry+0x21c/0x240
 [<c0528731>] ? sys_read+0x41/0x70
 [<c0409bdf>] ? sysenter_do_call+0x12/0x28 |
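(Editor's note.) The telling numbers in the Mem-Info dump above are in the Normal zone: free:3808kB is barely above the min:3724kB watermark and slab_unreclaimable holds 505316kB of the 883912kB present, while HighMem still has ~7.8 GB free. On an i686 kernel all slab allocations must come from the small lowmem ("Normal") zone, so the box can OOM with most of its RAM idle. A minimal sketch for spotting this pattern on a client, assuming standard Linux /proc interfaces; these commands are illustrative and are not from the ticket:

```shell
# Illustrative lowmem-exhaustion check (not from the ticket).
# Total vs. unreclaimable slab; LowTotal/LowFree appear only on
# 32-bit kernels built with CONFIG_HIGHMEM:
grep -E '^(Slab|SReclaimable|SUnreclaim|LowTotal|LowFree)' /proc/meminfo

# Free pages per zone. The oom-killer fires when "Normal" cannot
# satisfy a GFP_KERNEL allocation, even if "HighMem" is nearly empty:
awk '/zone/ {zone = $4} /pages free/ {print zone, $0}' /proc/zoneinfo
```

If SUnreclaim keeps growing across an IOR run while HighMem stays free, that points at unreclaimable slab pinning lowmem (e.g. client-side page/lock state) rather than a genuine shortage of RAM.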
| Comment by Johann Lombardi (Inactive) [ 24/Nov/11 ] |
|
I wonder if this bug is due to the combination of 32-bit clients with bugzilla 23529. |
| Comment by Peter Jones [ 24/Nov/11 ] |
|
We are deprecating i686 clients for 2.2, so if this problem is limited to i686 then we should not worry about it.
| Comment by Oleg Drokin [ 03/Jan/12 ] |
|
Is this another "async journal OOM" issue, I wonder?
| Comment by Andreas Dilger [ 29/May/17 ] |
|
Close old ticket. |