[LU-17183] sanity.sh test_411b: cgroups OOM on ARM Created: 11/Oct/23  Updated: 17/Nov/23  Resolved: 17/Nov/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Tim Day Assignee: Qian Yingjin
Resolution: Duplicate Votes: 0
Labels: arm

Issue Links:
Related
is related to LU-17151 sanity: test_411b Error: '(3) failed ... Reopened
is related to LU-16713 Writeback and commit pages under memo... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The 411b test added in https://review.whamcloud.com/c/fs/lustre-release/+/50544/ regularly ran into OOM issues before the memory limit was increased significantly in https://review.whamcloud.com/c/fs/lustre-release/+/52610.

 

The memory required for this test to pass consistently on ARM is around 3x what it needs on x86. This seems to suggest an issue with cgroups on ARM, either in Lustre or in the kernel.
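When comparing memory limits across architectures, it can help to confirm that a failure really was a memcg OOM kill. The following is a minimal sketch, assuming the test cgroup is on cgroup v2, where the kernel exposes a `memory.events` file with `oom` and `oom_kill` counters. The helper names and the cgroup directory are hypothetical and are not part of sanity.sh; on cgroup v1 the counters live elsewhere.

```python
from pathlib import Path


def parse_memory_events(text: str) -> dict[str, int]:
    """Parse the 'key value' lines of a cgroup v2 memory.events file."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not a simple "<name> <count>" pair.
        if len(parts) == 2 and parts[1].isdigit():
            events[parts[0]] = int(parts[1])
    return events


def oom_kills(cgroup_dir: str) -> int:
    """Return the oom_kill counter for a cgroup directory (path is an assumption)."""
    text = Path(cgroup_dir, "memory.events").read_text()
    return parse_memory_events(text).get("oom_kill", 0)


if __name__ == "__main__":
    # Sample content in the memory.events format; the values are illustrative only.
    sample = "low 0\nhigh 0\nmax 12\noom 1\noom_kill 1\n"
    print(parse_memory_events(sample))
```

Reading the counter before and after a run on both ARM and x86, with the same limit, would show whether the extra memory is needed to avoid kills or only to avoid hitting `max`.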



 Comments   
Comment by Patrick Farrell [ 11/Oct/23 ]

Thanks, Tim!

Comment by Xing Huang [ 28/Oct/23 ]

2023-10-28: To be investigated.

Comment by Li Xi [ 17/Nov/23 ]

After applying the patch from LU-17151, the problem is gone.

Comment by Qian Yingjin [ 17/Nov/23 ]

After the patch in LU-17151 was merged into the master branch with much larger memory limits on memcg, the failure on ARM no longer happens.

 
