[LU-11934] replay-single test_70c: OOM on client Created: 06/Feb/19 Updated: 19/Feb/19 Resolved: 19/Feb/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.13.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Alexander Boyko | Assignee: | Alexander Boyko |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
[ 8500.380063] Kernel panic - not syncing: Out of memory and no killable processes...
[ 8500.385004] CPU: 0 PID: 25664 Comm: kworker/u4:0 Kdump: loaded Tainted: G OE ------------ 3.10.0-862.14.4.el7.x86_64 #1
[ 8500.390771] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 8500.393649] Call Trace:
[ 8500.395723] [<ffffffffa0313754>] dump_stack+0x19/0x1b
[ 8500.398474] [<ffffffffa030d29f>] panic+0xe8/0x21f
[ 8500.401112] [<ffffffff9fd9b50a>] out_of_memory+0x4ea/0x4f0
[ 8500.403935] [<ffffffffa030f423>] __alloc_pages_slowpath+0x5d6/0x724
[ 8500.406940] [<ffffffff9fda18b5>] __alloc_pages_nodemask+0x405/0x420
[ 8500.409955] [<ffffffff9fdec058>] alloc_pages_current+0x98/0x110
[ 8500.412877] [<ffffffff9fd9bf3e>] __get_free_pages+0xe/0x40
[ 8500.415664] [<ffffffff9fc775b2>] pgd_alloc+0x22/0x150
[ 8500.418237] [<ffffffff9fc90958>] mm_init+0x158/0x1b0
[ 8500.420734] [<ffffffff9fc90ee0>] mm_alloc+0x80/0x110
[ 8500.423203] [<ffffffff9fe279d9>] do_execve_common.isra.24+0x249/0x6e0
[ 8500.426004] [<ffffffff9fe34d1c>] ? poll_select_copy_remaining+0xfc/0x150
[ 8500.428928] [<ffffffff9fe30900>] ? vfs_unlink+0x170/0x190
[ 8500.431440] [<ffffffff9fe27e88>] do_execve+0x18/0x20
[ 8500.433811] [<ffffffff9fcb2bef>] ____call_usermodehelper+0xff/0x140
[ 8500.436484] [<ffffffff9fcb2c30>] ? ____call_usermodehelper+0x140/0x140
[ 8500.439191] [<ffffffff9fcb2c4e>] call_helper+0x1e/0x20
[ 8500.441533] [<ffffffffa03255f7>] ret_from_fork_nospec_begin+0x21/0x21
[ 8500.444160] [<ffffffff9fcb2c30>] ? ____call_usermodehelper+0x140/0x140
crash-7.2.5> kmem -i
PAGES TOTAL PERCENTAGE
TOTAL MEM 945937 3.6 GB ----
FREE 21480 83.9 MB 2% of TOTAL MEM
USED 924457 3.5 GB 97% of TOTAL MEM
SHARED 64 256 KB 0% of TOTAL MEM
BUFFERS 35 140 KB 0% of TOTAL MEM
CACHED 388 1.5 MB 0% of TOTAL MEM
SLAB 11322 44.2 MB 1% of TOTAL MEM
crash-7.2.5> kmem -p | awk '/head/ { print extent ; extent=1 } /tail/ { extent++ }' | sort -n | uniq -c
1
1734 2
820 4
210 8
2 16
41 32
1 64
1472 512
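The histogram above counts allocations by their size in pages. The `kmem -i` totals (945937 pages for 3.6 GB) imply 4 KiB pages, so the 512-page row can be converted to bytes. A minimal sketch of that arithmetic follows; the helper names are hypothetical and are not part of Lustre or crash.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from Lustre or crash): bytes held by
 * `count` blocks of `pages_per_block` pages each. */
static uint64_t blocks_to_bytes(uint64_t count, uint64_t pages_per_block,
				uint64_t page_size)
{
	assert(page_size > 0);
	return count * pages_per_block * page_size;
}

/* Convert a byte count to whole MiB, truncating any remainder. */
static uint64_t bytes_to_mib(uint64_t bytes)
{
	return bytes >> 20;
}
```

With the values from the dump, `blocks_to_bytes(1, 512, 4096)` is 2 MiB per block, and `bytes_to_mib(blocks_to_bytes(1472, 512, 4096))` is 2944 MiB, which is roughly 80% of the 3.6 GB total reported by `kmem -i`.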
So the vmcore shows 1472 blocks of 2 MB each. While analyzing the memory, we found that the 2 MB chunks belong to REINT_SETATTR requests. This size is set in the mdc_setattr() function:
mdc_setattr()
{
	...
	/* size the ACL field of the reply by the connection's max EA size */
	req_capsule_set_size(&req->rq_pill, &RMF_ACL, RCL_SERVER,
			     req->rq_import->imp_connect_data.ocd_max_easize);
	...
}
ocd_max_easize is 1 MB; the reply is a bit larger, and rounding up makes it 2 MB. Patrick's patch 4f78164f helps here by setting it to 64 KB. |
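The jump from 1 MB to 2 MB can be illustrated with a generic power-of-two round-up. This is only a sketch: the ticket does not show which rounding routine the reply buffer allocation uses, so `roundup_pow2` below is a hypothetical stand-in, and the exact reply overhead is not stated.

```c
#include <assert.h>
#include <stdint.h>

/* Generic round-up to the next power of two. A stand-in (assumption)
 * for whatever rounding the reply buffer allocation applies. */
static uint64_t roundup_pow2(uint64_t n)
{
	uint64_t p = 1;

	/* Reject 0 and values whose next power of two would overflow. */
	assert(n > 0 && n <= (UINT64_C(1) << 63));
	while (p < n)
		p <<= 1;
	return p;
}
```

A request of exactly 1 MiB stays at 1 MiB, but any nonzero overhead on top of it, for example `roundup_pow2((1 << 20) + 1)`, rounds to 2 MiB, which matches the 512-page blocks seen in the dump.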
| Comments |
| Comment by Gerrit Updater [ 06/Feb/19 ] |
|
Alexandr Boyko (c17825@cray.com) uploaded a new patch: https://review.whamcloud.com/34194 |
| Comment by Patrick Farrell (Inactive) [ 06/Feb/19 ] |
|
Alex, You might take a look at: https://review.whamcloud.com/#/c/34058/ The maximum allowed xattr size in Linux is 64 KiB, more than that and tar (and other tools) break. Even if you don't want the whole patch, you might consider just the change to the max xattr size on ldiskfs. It improves memory behavior with ea_inode a bunch. |
| Comment by Patrick Farrell (Inactive) [ 06/Feb/19 ] |
|
Oh, wait, I think you found my patch. "The Patrick's patch 4f78164f helps here and set 64KB." But your patch is correct and obviously still good. Saves memory.
|
| Comment by Gerrit Updater [ 18/Feb/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34194/ |
| Comment by Peter Jones [ 19/Feb/19 ] |
|
Landed for 2.13 |