[LU-823] Lustre breaks cgroups accounting Created: 03/Nov/11  Updated: 24/Nov/22  Resolved: 10/Apr/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.0, Lustre 2.2.0
Fix Version/s: Lustre 2.4.0

Type: Bug Priority: Minor
Reporter: Christopher Morrone Assignee: Niu Yawei (Inactive)
Resolution: Won't Fix Votes: 0
Labels: None
Environment:

RHEL 6.1, 6.2 (possibly earlier), other kernels with cgroups support


Issue Links:
Related
Severity: 3
Rank (Obsolete): 6527

 Description   

Lustre implements its own copy of truncate_complete_page() in lustre_patchless_compat.h to allow the client to build against an unpatched kernel. Unfortunately, this means that Lustre can break the kernel if it doesn't keep its copy in sync.

Lustre's copy of truncate_complete_page() is out of date and breaks, at a minimum, Linux's cgroup accounting. To work around this problem we have decided to temporarily patch our kernel so that it exports truncate_complete_page(), which allows Lustre to use the kernel's function.
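To illustrate the failure mode, here is a minimal user-space toy model, not kernel code. Every toy_* name is hypothetical and exists only for this sketch. It assumes the accounting breaks because the stale copy removes a page from the cache without performing the uncharge step (the call that a later comment in this ticket names as mem_cgroup_uncharge_cache_page).

```c
#include <assert.h>
#include <stddef.h>

/* Toy user-space model, NOT kernel code: every name here is hypothetical. */
struct toy_cgroup {
    long charged_pages;        /* pages currently charged to this group */
};

struct toy_page {
    struct toy_cgroup *owner;  /* group charged for the page, or NULL */
    int in_cache;              /* 1 while the page sits in the page cache */
};

/* Adding a page to the cache charges it to the owning group. */
static void toy_add_to_cache(struct toy_page *pg, struct toy_cgroup *cg)
{
    pg->owner = cg;
    pg->in_cache = 1;
    cg->charged_pages++;
}

/* Stand-in for an up-to-date removal path: unlink the page and uncharge it. */
static void toy_truncate_current(struct toy_page *pg)
{
    assert(pg->in_cache);
    pg->in_cache = 0;
    pg->owner->charged_pages--;  /* models the uncharge step */
    pg->owner = NULL;
}

/* Stand-in for a stale private copy: unlinks the page but never uncharges. */
static void toy_truncate_stale(struct toy_page *pg)
{
    assert(pg->in_cache);
    pg->in_cache = 0;
    /* Missing uncharge: the group's counter is left one page too high. */
}
```

In this model, every page removed through toy_truncate_stale() stays charged to its group indefinitely, so the group's counter drifts upward even though the page has left the cache.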

Obviously the quick-and-dirty fix for Lustre is to add additional autoconf checks and update its copy of truncate_complete_page() and other associated functions. But that whole approach is pretty unsettling.

I would first like to check if there is some other similar function that the kernel exports now that we can start using. Failing that, perhaps we can borrow Brian's trick from ZFS for using a symbol that hasn't been exported.
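The ticket does not describe the ZFS trick, so the following is only a guess at its general shape: resolve the function's address by name once at initialization and call it through a function pointer. This is a user-space toy in which a small static table stands in for the kernel's symbol table; all toy_* names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Toy user-space model, NOT kernel code: every name here is hypothetical. */
typedef int (*toy_fn)(int);

/* Stands in for a function that exists but is not exported to modules. */
static int toy_hidden_impl(int x)
{
    return x + 1;
}

struct toy_sym {
    const char *name;
    toy_fn addr;
};

/* Stands in for a name-to-address symbol table. */
static const struct toy_sym toy_symtab[] = {
    { "toy_hidden_impl", toy_hidden_impl },
};

/* Look a symbol up by name; returns NULL if it is not present. */
static toy_fn toy_lookup(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(toy_symtab) / sizeof(toy_symtab[0]); i++)
        if (strcmp(toy_symtab[i].name, name) == 0)
            return toy_symtab[i].addr;
    return NULL;
}

/* Resolved once at init; callers go through this pointer afterwards. */
static toy_fn toy_resolved;

/* Returns 0 on success, -1 if the symbol could not be found. */
static int toy_init(void)
{
    toy_resolved = toy_lookup("toy_hidden_impl");
    return toy_resolved ? 0 : -1;
}
```

The point of the sketch is the failure path: if the lookup fails, initialization must fail cleanly rather than leave a NULL pointer to be called later.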



 Comments   
Comment by Peter Jones [ 03/Nov/11 ]

Chris

Could you check whether the tip of master still exhibits this problem? Oleg wonders whether this landing may have helped - http://git.whamcloud.com/?p=fs/lustre-release.git;a=commit;h=1515e409cc57af5eaef809eee6d8f8d6725d092b

Peter

Comment by Christopher Morrone [ 03/Nov/11 ]

I think that patch's commit comment could have been more detailed. Why is it ok to switch from calling truncate_complete_page() to calling remove_from_page_cache() when remove_from_page_cache() is only one of several things that truncate_complete_page() does?

But unfortunately, RHEL 6.2 doesn't export either delete_from_page_cache or remove_from_page_cache, so that patch doesn't address the problem.

Comment by Christopher Morrone [ 03/Nov/11 ]

The upstream Linux commit (a52116aba5b3eed0ee41f70b794cc1937acd5cb8) to export remove_from_page_cache is just a one-liner that doesn't do anything else. We are going to take a stab at getting Red Hat to cherry-pick that into the RHEL 6.2 kernel, but they are close to freezing the kernel so we probably shouldn't hold our breath.

In any event, it would not help folks using cgroups on earlier RHEL6 releases.

Comment by Christopher Morrone [ 03/Nov/11 ]

Oh, never mind my comment about replacing truncate_complete_page with the other calls. Now I see how the functions nest.

But the problem still remains: no call to mem_cgroup_uncharge_cache_page

And isn't it incorrect to use cfs_* lock functions when the kernel is using the normal kernel locking functions on the same locks? Granted, the cfs_* functions are just #defines to the kernel functions, but it doesn't seem correct to use cfs_* functions when these are not cfs_* locks.

Comment by Andreas Dilger [ 03/Nov/11 ]

I agree - cfs_* wrappers shouldn't be used on kernel structures. I've noticed this in a few places.

Comment by Peter Jones [ 07/Nov/11 ]

Niu

Could you please look into what changes are needed here?

Thanks

Peter

Comment by Niu Yawei (Inactive) [ 08/Nov/11 ]

truncate_inode_pages_range() can serve a similar purpose to truncate_complete_page(), but it is not as efficient as calling truncate_complete_page() directly, since it looks up and locks the page again internally. I think that's why we didn't use it in the very beginning. Given that remove_from_page_cache() will be exported in later kernels, I don't think it's wise to make the (quite a few) changes needed to use truncate_inode_pages_range().

If users really want both cgroups and a patchless client, I think we have to adopt the hack of using unexported symbols (Chris, could you provide the details of this?). Otherwise, we can document that the patchless client doesn't support cgroups on those kernels (early 2.6.32).

Comment by Niu Yawei (Inactive) [ 10/Apr/12 ]

Chris, are you ok with my previous comment? That is, declare that cgroups aren't supported on the patchless client with early 2.6.32 kernels, and use remove_from_page_cache() whenever it's exported in later kernel versions. Thanks.
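As a rough sketch of that proposal, the choice could be made at build time from a configure result. The macro TOY_HAVE_REMOVE_FROM_PAGE_CACHE and all toy_* names are hypothetical and are not Lustre's actual autoconf symbols; this is plain user-space C that only models the decision.

```c
/* Hypothetical configure result; a real build would generate this value
 * from a check against the target kernel instead of hard-coding it. */
#ifndef TOY_HAVE_REMOVE_FROM_PAGE_CACHE
#define TOY_HAVE_REMOVE_FROM_PAGE_CACHE 1
#endif

enum toy_removal_path {
    TOY_KERNEL_EXPORT,  /* call the kernel's exported function */
    TOY_PRIVATE_COPY    /* fall back to the private copy */
};

/* Pick the removal path that the build-time check allows. */
static enum toy_removal_path toy_pick_removal_path(void)
{
#if TOY_HAVE_REMOVE_FROM_PAGE_CACHE
    return TOY_KERNEL_EXPORT;
#else
    return TOY_PRIVATE_COPY;
#endif
}

/* Under the proposal, cgroup accounting is only declared supported
 * when the kernel's own function is used. */
static int toy_cgroup_accounting_supported(void)
{
    return toy_pick_removal_path() == TOY_KERNEL_EXPORT;
}
```

Building with -DTOY_HAVE_REMOVE_FROM_PAGE_CACHE=0 selects the fallback branch, which corresponds to the "document it as unsupported" case for early 2.6.32 kernels.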

Comment by Christopher Morrone [ 10/Apr/12 ]

I suppose I don't care too much since we decided to backport the kernel export.

This business of copying kernel functions into lustre and hoping they are correct is very very ugly. But since this particular problem will go away in the future with newer kernels that export more useful symbols, I think we just leave it as it is for now.

Comment by Peter Jones [ 10/Apr/12 ]

ok thanks Chris

Comment by Mark Hills [ 11/Apr/12 ]

FYI, I think this is a straight duplicate of LU-620
