[LU-2139] Tracking unstable pages - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Fixed
Priority: Major
Fix Version/s: Lustre 2.6.0
Affects Version/s: Lustre 2.4.0
Labels:
- JL
- MB
Environment:
2_3_49_92_1-llnl

Severity:
3
Rank (Obsolete):
3127

Description

We've been seeing strange caching behavior on our PPC IO nodes, eventually resulting in OOM events. This is particularly harmful for us because there are critical system components running in user space on these nodes, forcing us to run with "panic_on_oom" enabled.

We see a large amount of "Active File" pages as reported by /proc/vmstat and /proc/meminfo which spikes during Lustre IOR jobs. For the test I am running that is unusual since I'm not running any executables out of Lustre, it should only be "inactive" IOR data accumulating in the page cache as a result of the Lustre IO. The really strange thing is, prior to testing Orion rebased code, "Active Files" would sometimes stay low (in the 100's Meg range) and sometimes it would grow very large (in the 5 Gig range). It's hard to tell if the variation still exists in the rebased code because the OOM events are hitting more frequently, basically every time I run an IOR.

We also see a large amount of "Inactive File" pages which we believe should be limited by the patch we carry from ~~LU-744~~, but doesn't seem to be the case:

commit 98400981e6d6e5707233be2c090e4227a77e2c46
Author: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Date:   Tue May 15 20:11:37 2012 -0700

    LU-744 osc: add lru pages management - new RPC 
    
    Add a cache management at osc layer, this way we can control how much
    memory can be used to cache lustre pages and avoid complex solution
    as what we did in b1_8.
    
    In this patch, admins can set how much memory will be used for caching
    lustre pages per file system. A self-adapative algorithm is used to
    balance those budget among OSCs.
    
    Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
    Change-Id: I76c840aef5ca9a3a4619f06fcaee7de7f95b05f5
    Revision-Id: 21

From what I can tell, Lustre is trying to limit the cache to the value we are setting 4G. When I dump the lustre page cache I roughly see 4G worth of pages, but the number of pages listed does not reflect the values seen in vmstat and meminfo.

So I have a few questions which I'd like to get an answer to:

 1. Why are Lustre pages being marked as "referenced" and moved to the 
    Active list in the first place? Without any running executables
    coming from Lustre I would not expect this to happen.

 2. Why more "Inactive File" pages are accumulating on the system past
    the 4G limit we are trying to set within Lustre?

 3. Why these "Inactive File" pages are unable to be reclaimed  when we
    hit a low memory situation? Ultimately resulting in an out of memory
    event and panic_on_oom triggering. This _might_ be related to (1) 
    above.

I added a systemtap script to disable the panic_on_oom flag and dump the Lustre page cache, /proc/vmstat, and /proc/meminfo file to try and gain some understanding into the problem. I'll upload those files as attachments in case they prove useful.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

fio-vulcanio65-ib0-2012-09-20-2.png
8 kB
20/Sep/12 5:25 PM
fio-vulcanio65-ib0-2012-09-20-2.tar.gz
62 kB
20/Sep/12 5:25 PM
fio-vulcanio65-ib0-2012-09-20-3.png
8 kB
20/Sep/12 5:25 PM
fio-vulcanio65-ib0-2012-09-20-3.tar.gz
57 kB
20/Sep/12 5:25 PM
isolate_lru_page.diff
2 kB
14/Sep/12 8:42 PM
vulcanio10-ib0-1347641093961.lpagecache.bz2
842 kB
14/Sep/12 2:05 PM
vulcanio10-ib0-1347641093961.meminfo
0.9 kB
14/Sep/12 2:05 PM
vulcanio10-ib0-1347641093961.vmstat
1 kB
14/Sep/12 2:05 PM
vulcanio18.tar.gz
12 kB
17/Sep/12 7:35 PM
vulcanio18-ib0-2012-09-17-0.eps
20 kB
17/Sep/12 7:35 PM
vulcanio18-ib0-2012-09-17-0.png
8 kB
18/Sep/12 1:17 PM
vulcanio3-ib0-2012-09-18-0.png
8 kB
18/Sep/12 7:56 PM
vulcanio3-ib0-2012-09-18-0.tar.gz
13 kB
18/Sep/12 7:56 PM

Issue Links

is blocked by

LU-3274 osc_cache.c:1774:osc_dec_unstable_pages()) ASSERTION( atomic_read(&cli->cl_cache->ccc_unstable_nr) >= 0 ) failed

Resolved

LU-3277 LU-2139 may cause the performance regression

Resolved

is duplicated by

LU-70 Lustre client OOM with async journal enabled

Resolved

is related to

LU-5483 recovery-mds-scale test failover_mds: oom failure on client

Reopened

LU-2576 Hangs in osc_enter_cache due to dirty pages not being flushed

Resolved

LU-3321 2.x single thread/process throughput degraded from 1.8

Resolved

LU-3277 LU-2139 may cause the performance regression

Resolved

LU-3910 Interop 2.4.0<->2.5 failure on test suite parallel-scale-nfsv4 test_iorssf: MDS OOM

Resolved

(3 is related to)

Activity

[LU-2139] Tracking unstable pages

Jian Yu added a comment - 07/Nov/14 8:27 AM

Here are the back-ported patches for Lustre b2_5 branch:

Jian Yu added a comment - 07/Nov/14 8:27 AM Here are the back-ported patches for Lustre b2_5 branch: http://review.whamcloud.com/12604 (from http://review.whamcloud.com/6284 ) http://review.whamcloud.com/12605 (from http://review.whamcloud.com/4374 ) http://review.whamcloud.com/12606 (from http://review.whamcloud.com/4375 ) http://review.whamcloud.com/12612 (from http://review.whamcloud.com/5935 )

Peter Jones added a comment - 09/Dec/13 2:26 PM

All patches now landed to master

Peter Jones added a comment - 09/Dec/13 2:26 PM All patches now landed to master

Christopher Morrone (Inactive) added a comment - 12/Nov/13 10:10 PM

Remaining:

http://review.whamcloud.com/5935

Christopher Morrone (Inactive) added a comment - 12/Nov/13 10:10 PM Remaining: http://review.whamcloud.com/5935

Peter Jones added a comment - 03/Oct/13 2:15 PM

Prakash

Thanks for the refresh. Yes, we will review these patches shortly

Peter

Peter Jones added a comment - 03/Oct/13 2:15 PM Prakash Thanks for the refresh. Yes, we will review these patches shortly Peter

Prakash Surya (Inactive) added a comment - 02/Oct/13 11:42 PM

I've just refreshed these three patches onto HEAD of master:

1) http://review.whamcloud.com/6284
2) http://review.whamcloud.com/4374
3) http://review.whamcloud.com/4375

It would be nice to get some feedback on them, we've been running with previous versions of these patches on Sequoia/Grove for nearly a year now.

Prakash Surya (Inactive) added a comment - 02/Oct/13 11:42 PM I've just refreshed these three patches onto HEAD of master: 1) http://review.whamcloud.com/6284 2) http://review.whamcloud.com/4374 3) http://review.whamcloud.com/4375 It would be nice to get some feedback on them, we've been running with previous versions of these patches on Sequoia/Grove for nearly a year now.

Peter Jones added a comment - 02/Oct/13 7:55 PM

This remains a support priority to get these patches refreshed and landed but the patches are not presently ready for consideration to include in 2.5.0

Peter Jones added a comment - 02/Oct/13 7:55 PM This remains a support priority to get these patches refreshed and landed but the patches are not presently ready for consideration to include in 2.5.0

Peter Jones added a comment - 22/Sep/13 2:04 PM

Lai

Could you please take care of refreshing Jinshan's patches so they are suitable for landing to master?

Thanks

Peter

Peter Jones added a comment - 22/Sep/13 2:04 PM Lai Could you please take care of refreshing Jinshan's patches so they are suitable for landing to master? Thanks Peter

Oleg Drokin added a comment - 05/May/13 4:50 AM - edited

Hit assertion introduced by first patch in the series:
"LustreError: 17404:0:(osc_cache.c:1774:osc_dec_unstable_pages()) ASSERTION( atomic_read(&cli->cl_cache->ccc_unstable_nr) >= 0 ) failed"

See ~~LU-3274~~ for more details.

Oleg Drokin added a comment - 05/May/13 4:50 AM - edited Hit assertion introduced by first patch in the series: "LustreError: 17404:0:(osc_cache.c:1774:osc_dec_unstable_pages()) ASSERTION( atomic_read(&cli->cl_cache->ccc_unstable_nr) >= 0 ) failed" See LU-3274 for more details.

Prakash Surya (Inactive) added a comment - 19/Apr/13 3:20 PM

efocht Please make sure you are using patch-set 30 of http://review.whamcloud.com/4245. The earlier patch-sets had deficiencies in them, and patch-set 29 specifically had a bug in it causing umounts to hang in ll_put_super (which looks like the problem you are having).

Prakash Surya (Inactive) added a comment - 19/Apr/13 3:20 PM efocht Please make sure you are using patch-set 30 of http://review.whamcloud.com/4245 . The earlier patch-sets had deficiencies in them, and patch-set 29 specifically had a bug in it causing umounts to hang in ll_put_super (which looks like the problem you are having).

Erich Focht added a comment - 17/Apr/13 8:12 AM

Applied the patch stack (4245, 4374, 4375) to a 2.6.32-358.0.1.el6.x86_64 kernel. Getting soft lockups when unmounting. The servers run Lustre 2.1.5.

Pid: 3792, comm: umount Not tainted 2.6.32-358.0.1.el6.x86_64 #1 Supermicro X9DRT/X9DRT
RIP: 0010:[<ffffffffa0b65e2c>]  [<ffffffffa0b65e2c>] ll_put_super+0x10c/0x510 [lustre]
RSP: 0018:ffff881016853d28  EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff881016853e28 RCX: ffff881027b6c110
RDX: ffff88102d7d1540 RSI: 000000000000005a RDI: ffff88104a73fc00
RBP: ffffffff8100bb8e R08: 0000000000000000 R09: ffff881049e980c0
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff881016853d98 R14: ffff88106ebfba00 R15: ffff881027afa044
FS:  00002ab82d4c7740(0000) GS:ffff88089c520000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002ab82d167360 CR3: 000000102d0ca000 CR4: 00000000000407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process umount (pid: 3792, threadinfo ffff881016852000, task ffff88102d7d1540)
Stack:
 ffffffff81fcb700 01ff88086e17b148 ffff88102c932400 ffff88102c932000
<d> ffff88102d7d1540 ffff88106ebfba00 ffff88086e17b350 ffff88086e17b138
<d> ffff881016853d88 ffffffff8119cccf ffff88086e17b138 ffff88086e17b138
Call Trace:
 [<ffffffff8119cccf>] ? destroy_inode+0x2f/0x60
 [<ffffffff8119d19c>] ? dispose_list+0xfc/0x120
 [<ffffffff8119d596>] ? invalidate_inodes+0xf6/0x190
 [<ffffffff8118334b>] ? generic_shutdown_super+0x5b/0xe0
 [<ffffffff81183436>] ? kill_anon_super+0x16/0x60
 [<ffffffffa06e82ea>] ? lustre_kill_super+0x4a/0x60 [obdclass]
 [<ffffffff81183bd7>] ? deactivate_super+0x57/0x80
 [<ffffffff811a1c4f>] ? mntput_no_expire+0xbf/0x110
 [<ffffffff811a26bb>] ? sys_umount+0x7b/0x3a0
 [<ffffffff810863b1>] ? sigprocmask+0x71/0x110
 [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
Code: c0 4c 89 a5 10 ff ff ff 4c 8d b5 40 ff ff ff 48 89 95 20 ff ff ff 49 89 cd 49 89 d4 0f 85 ad 00 00 00 45 85 ff 0f 85 a4 00 00 00 <8b> 83 0c 01 00 00 85 c0 74 f6 4c 89 f7 e8 72 f3 a4 ff 4c 89 f6
Call Trace:
 [<ffffffffa0b65dc7>] ? ll_put_super+0xa7/0x510 [lustre]
 [<ffffffff8119cccf>] ? destroy_inode+0x2f/0x60
 [<ffffffff8119d19c>] ? dispose_list+0xfc/0x120
 [<ffffffff8119d596>] ? invalidate_inodes+0xf6/0x190
 [<ffffffff8118334b>] ? generic_shutdown_super+0x5b/0xe0
 [<ffffffff81183436>] ? kill_anon_super+0x16/0x60
 [<ffffffffa06e82ea>] ? lustre_kill_super+0x4a/0x60 [obdclass]
 [<ffffffff81183bd7>] ? deactivate_super+0x57/0x80
 [<ffffffff811a1c4f>] ? mntput_no_expire+0xbf/0x110
 [<ffffffff811a26bb>] ? sys_umount+0x7b/0x3a0
 [<ffffffff810863b1>] ? sigprocmask+0x71/0x110
 [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
BUG: soft lockup - CPU#25 stuck for 67s! [umount:3792]

Erich Focht added a comment - 17/Apr/13 8:12 AM Applied the patch stack (4245, 4374, 4375) to a 2.6.32-358.0.1.el6.x86_64 kernel. Getting soft lockups when unmounting. The servers run Lustre 2.1.5. Pid: 3792, comm: umount Not tainted 2.6.32-358.0.1.el6.x86_64 #1 Supermicro X9DRT/X9DRT RIP: 0010:[<ffffffffa0b65e2c>] [<ffffffffa0b65e2c>] ll_put_super+0x10c/0x510 [lustre] RSP: 0018:ffff881016853d28 EFLAGS: 00000246 RAX: 0000000000000000 RBX: ffff881016853e28 RCX: ffff881027b6c110 RDX: ffff88102d7d1540 RSI: 000000000000005a RDI: ffff88104a73fc00 RBP: ffffffff8100bb8e R08: 0000000000000000 R09: ffff881049e980c0 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff881016853d98 R14: ffff88106ebfba00 R15: ffff881027afa044 FS: 00002ab82d4c7740(0000) GS:ffff88089c520000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00002ab82d167360 CR3: 000000102d0ca000 CR4: 00000000000407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process umount (pid: 3792, threadinfo ffff881016852000, task ffff88102d7d1540) Stack: ffffffff81fcb700 01ff88086e17b148 ffff88102c932400 ffff88102c932000 <d> ffff88102d7d1540 ffff88106ebfba00 ffff88086e17b350 ffff88086e17b138 <d> ffff881016853d88 ffffffff8119cccf ffff88086e17b138 ffff88086e17b138 Call Trace: [<ffffffff8119cccf>] ? destroy_inode+0x2f/0x60 [<ffffffff8119d19c>] ? dispose_list+0xfc/0x120 [<ffffffff8119d596>] ? invalidate_inodes+0xf6/0x190 [<ffffffff8118334b>] ? generic_shutdown_super+0x5b/0xe0 [<ffffffff81183436>] ? kill_anon_super+0x16/0x60 [<ffffffffa06e82ea>] ? lustre_kill_super+0x4a/0x60 [obdclass] [<ffffffff81183bd7>] ? deactivate_super+0x57/0x80 [<ffffffff811a1c4f>] ? mntput_no_expire+0xbf/0x110 [<ffffffff811a26bb>] ? sys_umount+0x7b/0x3a0 [<ffffffff810863b1>] ? sigprocmask+0x71/0x110 [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b Code: c0 4c 89 a5 10 ff ff ff 4c 8d b5 40 ff ff ff 48 89 95 20 ff ff ff 49 89 cd 49 89 d4 0f 85 ad 00 00 00 45 85 ff 0f 85 a4 00 00 00 <8b> 83 0c 01 00 00 85 c0 74 f6 4c 89 f7 e8 72 f3 a4 ff 4c 89 f6 Call Trace: [<ffffffffa0b65dc7>] ? ll_put_super+0xa7/0x510 [lustre] [<ffffffff8119cccf>] ? destroy_inode+0x2f/0x60 [<ffffffff8119d19c>] ? dispose_list+0xfc/0x120 [<ffffffff8119d596>] ? invalidate_inodes+0xf6/0x190 [<ffffffff8118334b>] ? generic_shutdown_super+0x5b/0xe0 [<ffffffff81183436>] ? kill_anon_super+0x16/0x60 [<ffffffffa06e82ea>] ? lustre_kill_super+0x4a/0x60 [obdclass] [<ffffffff81183bd7>] ? deactivate_super+0x57/0x80 [<ffffffff811a1c4f>] ? mntput_no_expire+0xbf/0x110 [<ffffffff811a26bb>] ? sys_umount+0x7b/0x3a0 [<ffffffff810863b1>] ? sigprocmask+0x71/0x110 [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b BUG: soft lockup - CPU#25 stuck for 67s! [umount:3792]

People

Assignee:: Lai Siyao

Reporter:: Prakash Surya (Inactive)

Votes:: 0 Vote for this issue

Watchers:: 19 Start watching this issue

Dates

Created:: 14/Sep/12 2:05 PM

Updated:: 28/Aug/24 7:38 PM

Resolved:: 09/Dec/13 2:26 PM