[LU-14607] osd xattr cache wasting memory Created: 12/Apr/21 Updated: 18/Feb/23 Resolved: 09/Jun/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.15.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | John Hammond | Assignee: | Lai Siyao |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 | ||||
| Description |
|
In some cases we pass a 64K xattr value buffer to osp_oac_xattr_find_or_add() (see below). This turns into a 128K allocation, since we allocate a single memory block for the oxe, the xattr name, and the xattr value. If the oxe used a separate memory block for the xattr value, the allocation would be down to 64K plus change.

Apr 9 16:21:54 mds-0 kernel: mdt06_014: page allocation failure: order:5, mode:0xc050
Apr 9 16:21:54 mds-0 kernel: CPU: 12 PID: 94381 Comm: mdt06_014 Kdump: loaded Tainted: G OE ------------ T 3.10.0-1062.18.1.el7_lustre.ddn12.x86_64 #1
Apr 9 16:21:54 mds-0 kernel: Hardware name: /0XFK4K, BIOS 2.7.7 05/04/2020
Apr 9 16:21:54 mds-0 kernel: Call Trace:
Apr 9 16:21:54 mds-0 kernel: [<ffffffff8697b416>] dump_stack+0x19/0x1b
Apr 9 16:21:54 mds-0 kernel: [<ffffffff863c3fc0>] warn_alloc_failed+0x110/0x180
Apr 9 16:21:54 mds-0 kernel: [<ffffffff8697698a>] __alloc_pages_slowpath+0x6bb/0x729
Apr 9 16:21:54 mds-0 kernel: [<ffffffff863c8636>] __alloc_pages_nodemask+0x436/0x450
Apr 9 16:21:54 mds-0 kernel: [<ffffffff86416c58>] alloc_pages_current+0x98/0x110
Apr 9 16:21:54 mds-0 kernel: [<ffffffff863e3658>] kmalloc_order+0x18/0x40
Apr 9 16:21:54 mds-0 kernel: [<ffffffff86422216>] kmalloc_order_trace+0x26/0xa0
Apr 9 16:21:54 mds-0 kernel: [<ffffffff864261a1>] __kmalloc+0x211/0x230
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc1ac43f2>] osp_oac_xattr_find_or_add+0x72/0x270 [osp]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc1ac8639>] osp_xattr_get+0xd29/0x1140 [osp]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc1ac7f61>] ? osp_xattr_get+0x651/0x1140 [osp]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc17b828a>] ? ldiskfs_xattr_trusted_get+0x2a/0x30 [ldiskfs]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc1a47b8e>] lod_xattr_get+0xee/0x700 [lod]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc16f294c>] __mdd_permission_internal+0x71c/0x9a0 [mdd]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc16cc96f>] __mdd_lookup.isra.17+0x19f/0x440 [mdd]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc16cccbf>] mdd_lookup+0xaf/0x170 [mdd]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc1970332>] mdt_lookup_version_check+0x72/0x2c0 [mdt]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc1971efb>] mdt_reint_rename+0xddb/0x28a0 [mdt]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc12dd826>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc197a533>] mdt_reint_rec+0x83/0x210 [mdt]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc1956483>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc1961e37>] mdt_reint+0x67/0x140 [mdt]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc130ff9e>] tgt_request_handle+0xaee/0x15f0 [ptlrpc]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc12e70a1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc0e73bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc12b237b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc12af195>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
Apr 9 16:21:54 mds-0 kernel: [<ffffffff862d3a33>] ? __wake_up+0x13/0x20
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc12b5ce4>] ptlrpc_main+0xb34/0x1470 [ptlrpc]
Apr 9 16:21:54 mds-0 kernel: [<ffffffffc12b51b0>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
Apr 9 16:21:54 mds-0 kernel: [<ffffffff862c6321>] kthread+0xd1/0xe0
Apr 9 16:21:54 mds-0 kernel: [<ffffffff862c6250>] ? insert_kthread_work+0x40/0x40
Apr 9 16:21:54 mds-0 kernel: [<ffffffff8698ed1d>] ret_from_fork_nospec_begin+0x7/0x21
Apr 9 16:21:54 mds-0 kernel: [<ffffffff862c6250>] ? insert_kthread_work+0x40/0x40
Apr 9 16:21:54 mds-0 kernel: Mem-Info: |
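The jump from 64K to 128K follows from how large kmalloc() requests are served: the trace goes through kmalloc_order(), which hands out a power-of-two number of pages, and the failure is reported at order:5. A minimal sketch of that arithmetic follows, assuming 4 KiB pages (as on x86_64). The helper names are hypothetical and are not kernel or Lustre functions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages, as on the x86_64 MDS in the trace. */
#define SKETCH_PAGE_SIZE 4096UL

/* Hypothetical helper: smallest order such that
 * (SKETCH_PAGE_SIZE << order) >= size. */
static unsigned int sketch_alloc_order(size_t size)
{
	unsigned int order = 0;

	assert(size > 0);
	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Hypothetical helper: bytes consumed by a power-of-two page
 * allocation large enough to hold size bytes. */
static size_t sketch_alloc_bytes(size_t size)
{
	return SKETCH_PAGE_SIZE << sketch_alloc_order(size);
}
```

With these helpers, a bare 64K value fits exactly in an order-4 allocation (16 pages), while 64K plus any header and name bytes (for example 64 extra bytes, an illustrative figure) needs order 5, i.e. 32 pages or 128K. That matches the order:5 in the failure message.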
| Comments |
| Comment by Peter Jones [ 13/Apr/21 ] |
|
Lai, could you please look into this one? Peter |
| Comment by Andreas Dilger [ 08/May/21 ] |
|
I was looking into this briefly, and it makes sense to split the cache allocation into two parts if "size > PAGE_SIZE". For the small xattr/common case, having a single allocation is more efficient, but in case of a large xattr just the value part should be allocated with a separate OBD_ALLOC_LARGE(). The data struct already tracks namelen, buflen separately, and has a separate pointer to the value (which can be inline for small xattrs and point to a separate buffer for large xattrs). |
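The split described above might look roughly like the following userspace sketch. It is not the merged patch: the struct, its field names, and the helper functions are hypothetical, SKETCH_PAGE_SIZE assumes 4 KiB pages, and calloc()/malloc() stand in for the kernel-side allocation macros such as OBD_ALLOC_LARGE().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Hypothetical cache entry: name and value lengths are tracked
 * separately, and value points either into buf[] or elsewhere. */
struct sketch_xattr_entry {
	size_t	namelen;
	size_t	buflen;
	char	*value;		/* inline (in buf) or separate buffer */
	int	value_separate;	/* nonzero if value must be freed */
	char	buf[];		/* name + NUL, then small values only */
};

static struct sketch_xattr_entry *
sketch_entry_alloc(const char *name, const void *val, size_t buflen)
{
	struct sketch_xattr_entry *e;
	size_t namelen, inline_len;
	int separate = buflen > SKETCH_PAGE_SIZE;

	assert(name != NULL);
	namelen = strlen(name);
	/* Large values are left out of the entry's own allocation. */
	inline_len = namelen + 1 + (separate ? 0 : buflen);

	e = calloc(1, sizeof(*e) + inline_len);
	if (e == NULL)
		return NULL;
	e->namelen = namelen;
	e->buflen = buflen;
	e->value_separate = separate;
	memcpy(e->buf, name, namelen + 1);

	if (separate) {
		/* Separate block: the request is only buflen bytes. */
		e->value = malloc(buflen);
		if (e->value == NULL) {
			free(e);
			return NULL;
		}
	} else {
		/* Common case: value lives right after the name. */
		e->value = e->buf + namelen + 1;
	}
	if (val != NULL && buflen > 0)
		memcpy(e->value, val, buflen);
	return e;
}

static void sketch_entry_free(struct sketch_xattr_entry *e)
{
	if (e == NULL)
		return;
	if (e->value_separate)
		free(e->value);
	free(e);
}
```

In this sketch a 64K value costs one small allocation for the entry and name plus one 64K allocation for the value, instead of a single request just over 64K that rounds up to 128K. Small xattrs keep the single-allocation layout, which is the more efficient path for the common case.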
| Comment by Gerrit Updater [ 19/May/21 ] |
|
Lai Siyao (lai.siyao@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43736 |
| Comment by Gerrit Updater [ 08/Jun/21 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/43736/ |
| Comment by Peter Jones [ 09/Jun/21 ] |
|
Landed for 2.15 |