
Kernel freeze allocating more memory than there is RAM

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version: Lustre 2.15.0
    • Affects Versions: Lustre 2.2.0, Lustre 2.3.0, Lustre 2.4.0, Lustre 2.1.3, Lustre 1.8.8
    • Labels: None
    • Severity: 3
    • Rank: 4350

    Description

      While working with router buffers, I set the number of large buffers beyond the amount of memory assigned to the VM running Lustre (number of large buffers: 1024; memory: 1 GB). The VM froze with all 3 virtual CPUs running at 100%.

      Looking deeper into this, I found that the Linux memory allocation system keeps trying to free up memory to satisfy the request. However, even after waiting 15 minutes, the VM did not "unfreeze".

      I changed the default flags we use for memory allocation to include __GFP_NORETRY to stop the memory allocator from looping. When re-running the above test, I found the system no longer froze but returned -ENOMEM to the caller as expected.

      This bug is to track a discussion of whether we should start using __GFP_NORETRY and, if so, how widely.

      Attachments

        Activity

          [LU-2084] Kernel freeze allocating more memory than there is RAM
          pjones Peter Jones added a comment -

          Landed for 2.15


          gerrit Gerrit Updater added a comment -

          "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45174/
          Subject: LU-2084 lnet: don't retry allocating router buffers
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 3038917f12a53b059473db172f5126136e20abc0

          gerrit Gerrit Updater added a comment -

          "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/45174
          Subject: LU-2084 lnet: don't retry allocating router buffers
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: ebd97d4585eea1aa7717f555a52dc24bcfa1885e

          adilger Andreas Dilger added a comment -

          As much as we could wish everyone using Lustre understood it as well as the developers do, I don't think that is at all realistic. Users need to be told that something they are trying to do is unreasonable, rather than having it cause failures or hang/crash the node. Having a check like the following seems reasonable:

                  if (router_buffer_pages > cfs_num_physpages * 7 / 8) {
                          CERROR("too much router memory requested: max %u\n",
                                 cfs_num_physpages * 7 / 8);
                          RETURN(-EINVAL);
                  }

          with allowances for printing the message in proper units, etc. We still need to keep some memory free for other things, which a bare -ENOMEM failure would not guarantee.

          doug Doug Oucharek (Inactive) added a comment -

          Good point.

          Ok, I can add CFS_ALLOC_NORETRY to our own set of memory allocation flags and map it to __GFP_NORETRY when present. This way it can be added on a case-by-case basis. I will only add this flag when allocating router buffers.

          isaac Isaac Huang (Inactive) added a comment -

          I tend to think __GFP_NORETRY is sufficient. On dedicated routers, where could the VM free much memory from?

          doug Doug Oucharek (Inactive) added a comment -

          This becomes more complicated when looking ahead to the Dynamic LNet Config project, which will make the router buffer pools changeable at runtime. With the code as it is today, if a user tells a running router to increase the size of a pool beyond available memory, the router will lock up for potentially hours. That is unacceptable.

          If we use __GFP_NORETRY, it may return -ENOMEM in cases where memory could have been freed to satisfy the request. However, I would rather see that than a live router lockup.

          Checking ahead of time whether there is enough RAM available does not sound easy given how the Linux memory manager works. Also, I feel this would be doing the OS's job for it.

          I heard that work was done on the memory manager in the Linux 3.x series to address these sorts of issues. None of it was back-ported to 2.6.

          isaac Isaac Huang (Inactive) added a comment -

          1. I think __GFP_NORETRY is reasonable for router buffers. Routers should be dedicated nodes where nothing else is running - i.e. there is nothing like dirty pages to be flushed or idle process pages to be swapped out, so it makes little sense to make the VM retry.

          2. I don't think we should make this foolproof by limiting large_router_buffers. System administrators should understand what large_router_buffers does; if they ask for too much, they are asking for trouble and should end up in trouble. Such a failure happens only once, at router startup, and that router would be avoided by clients and servers via their router pingers, so the consequences should not be catastrophic. The admin should then notice it and learn the lesson.

          adilger Andreas Dilger added a comment -

          Doug, wouldn't it make sense to limit the number of router buffers to some amount less than the total amount of RAM? Using __GFP_NORETRY in a blanket fashion seems like it could cause gratuitous system failures in cases where memory is low but the allocation is not absurd like in your case.

          keith Keith Mannthey (Inactive) added a comment -

          Yes, working to keep the system out of OOM is a much better user experience.

          cfs_alloc_flags_to_gfp seems to be pretty low-level; I would think a huge amount of code would be affected. Are you using cfs_alloc(size_t nr_bytes, u_int32_t flags)? You could pass __GFP_NORETRY down to your specific allocation.

          What are you seeing as your -ENOMEM indication?

          People

            Assignee: adilger Andreas Dilger
            Reporter: doug Doug Oucharek (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: