[LU-6842] Seeking the option to hook cl_page LRU up to kernel cache shrinker Created: 13/Jul/15  Updated: 08/Oct/19  Resolved: 09/Oct/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.8.0

Type: Improvement Priority: Minor
Reporter: Jinshan Xiong (Inactive) Assignee: Zhenyu Xu
Resolution: Fixed Votes: 0
Labels: llnl

Attachments: Hyperion Performance LU-6842.xlsx
Issue Links:
Related
is related to LU-5841 Lustre 2.4.2 MDS, hitting OOM errors Resolved
Rank (Obsolete): 9223372036854775807

 Description   

The purpose is to let the client cache up to max_cached_mb on the client side, but consume less when memory is under pressure, by hooking the cl_page LRU up to the kernel cache shrinker.



 Comments   
Comment by Peter Jones [ 14/Jul/15 ]

Bobijam

Could you please investigate what is possible here?

Peter

Comment by Zhenyu Xu [ 16/Jul/15 ]

So this shrinker just reduces ccc_lru_max, and of course shrinks the LRU if ccc_lru_left is not enough. I'm wondering when ccc_lru_max should be gradually restored once the memory is no longer under such pressure?

Comment by Gerrit Updater [ 17/Jul/15 ]

Bobi Jam (bobijam@hotmail.com) uploaded a new patch: http://review.whamcloud.com/15630
Subject: LU-6842 clio: limit max cl_page LRU thru shrinker
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: a7cf7b259c93d0f30f390b142878f137d7791ba4

Comment by Jinshan Xiong (Inactive) [ 17/Jul/15 ]

So this shrinker just reduces ccc_lru_max, and of course shrinks the LRU if ccc_lru_left is not enough. I'm wondering when ccc_lru_max should be gradually restored once the memory is no longer under such pressure?

Sorry for the delayed response.

This is not about reducing ccc_lru_max, which we don't need to adjust in this case. The meaning of ccc_lru_max should really be 'cache up to this many pages if memory is plentiful'. When the system is under memory pressure, Lustre shouldn't cache that much at all.

We should register a cache shrinker at the OSC layer to get rid of some pages under memory pressure. Take a look at osc_lru_reclaim() for reference; here, instead, we are going to iterate over each individual client_obd and destroy pages from them according to policies such as time of last use, number of pages cached, etc.

Comment by Zhenyu Xu [ 22/Jul/15 ]

patch updated.

Comment by Cliff White (Inactive) [ 19/Aug/15 ]

Test results from Hyperion

Comment by Patrick Farrell (Inactive) [ 19/Aug/15 ]

Cliff - Can you explain the test results a bit more? What's memhog, for example? Is all the testing with the patch installed? If so, are there non-patched results somewhere to compare to?

Comment by Jinshan Xiong (Inactive) [ 20/Aug/15 ]

memhog is a test program under $LUSTRE/lustre/test. It simply allocates a huge amount of memory so that we can check whether Lustre behaves well under memory pressure.

Comment by Cliff White (Inactive) [ 20/Aug/15 ]

Added 2.7.56 results for comparison

Comment by Gerrit Updater [ 09/Oct/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15630/
Subject: LU-6842 clio: add cl_page LRU shrinker
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 888a3141e72a25bef8daf822325b4295e5a0d5e8

Comment by Peter Jones [ 09/Oct/15 ]

Landed for 2.8

Comment by Andrew Perepechko [ 08/Oct/19 ]

bobijam,

+#ifndef HAVE_SHRINKER_COUNT
+static int osc_cache_shrink(SHRINKER_ARGS(sc, nr_to_scan, gfp_mask))
+{
+       struct shrink_control scv = {
+               .nr_to_scan = shrink_param(sc, nr_to_scan),
+               .gfp_mask   = shrink_param(sc, gfp_mask)
+       };
+#if !defined(HAVE_SHRINKER_WANT_SHRINK_PTR) && !defined(HAVE_SHRINK_CONTROL)
+       struct shrinker *shrinker = NULL;
+#endif
+
+       (void)osc_cache_shrink_scan(shrinker, &scv);
+
+       return osc_cache_shrink_count(shrinker, &scv);
+}
+#endif

Is there any particular reason to return the value from osc_cache_shrink_count() instead of osc_cache_shrink_scan() itself?

         kswapd0-43    [003] ....  2885.831749: mm_shrink_slab_start: osc_cache_shrink+0x0/0x60 [osc] ffff8800cd67c1c0: objects to shrink 56 gfp_flags GFP_KERNEL pgs_scanned 222 lru_pgs 665761 cache items 645137 delta 430 total_scan 486
         kswapd0-43    [003] d...  2885.831751: r_osc_cache_shrink_scan_0: (osc_cache_shrink+0x36/0x60 [osc] <- osc_cache_shrink_scan) arg1=0x0
         kswapd0-43    [003] d...  2885.831752: r_osc_cache_shrink_0: (shrink_slab+0x15c/0x340 <- osc_cache_shrink) arg1=0x9d811
         kswapd0-43    [003] d...  2885.832371: r_osc_cache_shrink_scan_0: (osc_cache_shrink+0x36/0x60 [osc] <- osc_cache_shrink_scan) arg1=0x80
         kswapd0-43    [003] d...  2885.832374: r_osc_cache_shrink_0: (shrink_slab+0x175/0x340 <- osc_cache_shrink) arg1=0x9d791
         kswapd0-43    [003] d...  2885.832377: r_osc_cache_shrink_scan_0: (osc_cache_shrink+0x36/0x60 [osc] <- osc_cache_shrink_scan) arg1=0x0
         kswapd0-43    [003] d...  2885.832378: r_osc_cache_shrink_0: (shrink_slab+0x15c/0x340 <- osc_cache_shrink) arg1=0x9d791
         kswapd0-43    [003] d...  2885.833002: r_osc_cache_shrink_scan_0: (osc_cache_shrink+0x36/0x60 [osc] <- osc_cache_shrink_scan) arg1=0x80
         kswapd0-43    [003] d...  2885.833004: r_osc_cache_shrink_0: (shrink_slab+0x175/0x340 <- osc_cache_shrink) arg1=0x9d711
         kswapd0-43    [003] d...  2885.833008: r_osc_cache_shrink_scan_0: (osc_cache_shrink+0x36/0x60 [osc] <- osc_cache_shrink_scan) arg1=0x0
         kswapd0-43    [003] d...  2885.833009: r_osc_cache_shrink_0: (shrink_slab+0x15c/0x340 <- osc_cache_shrink) arg1=0x9d711
         kswapd0-43    [003] d...  2885.833569: r_osc_cache_shrink_scan_0: (osc_cache_shrink+0x36/0x60 [osc] <- osc_cache_shrink_scan) arg1=0x80
         kswapd0-43    [003] d...  2885.833571: r_osc_cache_shrink_0: (shrink_slab+0x175/0x340 <- osc_cache_shrink) arg1=0x9d691
         kswapd0-43    [003] ....  2885.833573: mm_shrink_slab_end: osc_cache_shrink+0x0/0x60 [osc] ffff8800cd67c1c0: unused scan count 56 new scan count 102 total_scan 46 last shrinker return val 644753

It seems that in this scenario vmscan requested three scans of 128 objects each, and osc_cache_shrink_scan() reported three times that 128 objects were freed. However, the shrinker returned not 0x80; 0x80; 0x80, but 0x9d791; 0x9d711; 0x9d691, reported as "last shrinker return val 644753".

P.S. arg1 is $retval for osc_cache* retprobes.

Generated at Sat Feb 10 02:03:44 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.