
Seeking the option to hook cl_page LRU up to kernel cache shrinker

Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.8.0
    • None

    Description

      The purpose is for the client to cache at most max_cached_mb on the client side, but when memory is under pressure it should consume less by hooking the cl_page LRU up to the kernel cache shrinker.

      Attachments

        Issue Links

          Activity

            [LU-6842] Seeking the option to hook cl_page LRU up to kernel cache shrinker
            panda Andrew Perepechko added a comment - - edited

            bobijam,

            +#ifndef HAVE_SHRINKER_COUNT
            +static int osc_cache_shrink(SHRINKER_ARGS(sc, nr_to_scan, gfp_mask))
            +{
            +       struct shrink_control scv = {
            +               .nr_to_scan = shrink_param(sc, nr_to_scan),
            +               .gfp_mask   = shrink_param(sc, gfp_mask)
            +       };
            +#if !defined(HAVE_SHRINKER_WANT_SHRINK_PTR) && !defined(HAVE_SHRINK_CONTROL)
            +       struct shrinker *shrinker = NULL;
            +#endif
            +
            +       (void)osc_cache_shrink_scan(shrinker, &scv);
            +
            +       return osc_cache_shrink_count(shrinker, &scv);
            +}
            +#endif
            

            Is there any particular reason to return the value from osc_cache_shrink_count() instead of osc_cache_shrink_scan() itself?

                     kswapd0-43    [003] ....  2885.831749: mm_shrink_slab_start: osc_cache_shrink+0x0/0x60 [osc] ffff8800cd67c1c0: objects to shrink 56 gfp_flags GFP_KERNEL pgs_scanned 222 lru_pgs 665761 cache items 645137 delta 430 total_scan 486
                     kswapd0-43    [003] d...  2885.831751: r_osc_cache_shrink_scan_0: (osc_cache_shrink+0x36/0x60 [osc] <- osc_cache_shrink_scan) arg1=0x0
                     kswapd0-43    [003] d...  2885.831752: r_osc_cache_shrink_0: (shrink_slab+0x15c/0x340 <- osc_cache_shrink) arg1=0x9d811
                     kswapd0-43    [003] d...  2885.832371: r_osc_cache_shrink_scan_0: (osc_cache_shrink+0x36/0x60 [osc] <- osc_cache_shrink_scan) arg1=0x80
                     kswapd0-43    [003] d...  2885.832374: r_osc_cache_shrink_0: (shrink_slab+0x175/0x340 <- osc_cache_shrink) arg1=0x9d791
                     kswapd0-43    [003] d...  2885.832377: r_osc_cache_shrink_scan_0: (osc_cache_shrink+0x36/0x60 [osc] <- osc_cache_shrink_scan) arg1=0x0
                     kswapd0-43    [003] d...  2885.832378: r_osc_cache_shrink_0: (shrink_slab+0x15c/0x340 <- osc_cache_shrink) arg1=0x9d791
                     kswapd0-43    [003] d...  2885.833002: r_osc_cache_shrink_scan_0: (osc_cache_shrink+0x36/0x60 [osc] <- osc_cache_shrink_scan) arg1=0x80
                     kswapd0-43    [003] d...  2885.833004: r_osc_cache_shrink_0: (shrink_slab+0x175/0x340 <- osc_cache_shrink) arg1=0x9d711
                     kswapd0-43    [003] d...  2885.833008: r_osc_cache_shrink_scan_0: (osc_cache_shrink+0x36/0x60 [osc] <- osc_cache_shrink_scan) arg1=0x0
                     kswapd0-43    [003] d...  2885.833009: r_osc_cache_shrink_0: (shrink_slab+0x15c/0x340 <- osc_cache_shrink) arg1=0x9d711
                     kswapd0-43    [003] d...  2885.833569: r_osc_cache_shrink_scan_0: (osc_cache_shrink+0x36/0x60 [osc] <- osc_cache_shrink_scan) arg1=0x80
                     kswapd0-43    [003] d...  2885.833571: r_osc_cache_shrink_0: (shrink_slab+0x175/0x340 <- osc_cache_shrink) arg1=0x9d691
                     kswapd0-43    [003] ....  2885.833573: mm_shrink_slab_end: osc_cache_shrink+0x0/0x60 [osc] ffff8800cd67c1c0: unused scan count 56 new scan count 102 total_scan 46 last shrinker return val 644753
            

            It seems that in this scenario vmscan requested a scan of 128 objects three times, and osc_cache_shrink_scan() reported three times that 128 objects were freed. However, the shrinker returned not 0x80; 0x80; 0x80, but 0x9d791; 0x9d711; 0x9d691, reported as "last shrinker return val 644753".

            P.S. arg1 is $retval for osc_cache* retprobes.
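
            For context on the two return values: on kernels that provide the split shrinker API (the HAVE_SHRINKER_COUNT case), the count and scan callbacks are registered separately and the VM treats their results differently: count_objects() reports how many freeable objects remain in the cache, while scan_objects() reports how many were actually freed (or SHRINK_STOP). A minimal sketch under those assumptions follows; the osc_cache_lru_pages counter and the osc_lru_drop() helper are illustrative placeholders, not the actual Lustre code.

                #include <linux/atomic.h>
                #include <linux/shrinker.h>

                /* Illustrative placeholders, not the actual Lustre symbols. */
                static atomic_long_t osc_cache_lru_pages = ATOMIC_LONG_INIT(0);
                static unsigned long osc_lru_drop(unsigned long nr_to_scan);

                static unsigned long osc_cache_shrink_count(struct shrinker *sk,
                                                            struct shrink_control *sc)
                {
                        /* Report how many freeable objects the cache still holds. */
                        return atomic_long_read(&osc_cache_lru_pages);
                }

                static unsigned long osc_cache_shrink_scan(struct shrinker *sk,
                                                           struct shrink_control *sc)
                {
                        /* Free up to sc->nr_to_scan pages and report how many were
                         * actually released; SHRINK_STOP means nothing could be freed. */
                        unsigned long freed = osc_lru_drop(sc->nr_to_scan);

                        return freed ? freed : SHRINK_STOP;
                }

                static struct shrinker osc_cache_shrinker = {
                        .count_objects = osc_cache_shrink_count,
                        .scan_objects  = osc_cache_shrink_scan,
                        .seeks         = DEFAULT_SEEKS,
                };

                /* Registered with register_shrinker(&osc_cache_shrinker) at setup and
                 * removed with unregister_shrinker() at cleanup (the exact
                 * register_shrinker() signature varies by kernel version). */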

            pjones Peter Jones added a comment -

            Landed for 2.8


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15630/
            Subject: LU-6842 clio: add cl_page LRU shrinker
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 888a3141e72a25bef8daf822325b4295e5a0d5e8

            cliffw Cliff White (Inactive) added a comment -

            Added 2.7.56 results for comparison

            jay Jinshan Xiong (Inactive) added a comment -

            memhog is a test program under $LUSTRE/lustre/test. It just allocates a huge amount of memory so that we can investigate whether Lustre behaves well under memory pressure.
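
            Roughly speaking, such a memory hog can be as small as the sketch below (hypothetical userspace C, not the actual utility shipped with Lustre): allocate the requested amount of anonymous memory and touch every page so it becomes resident, which pushes the system into reclaim and makes the registered shrinkers run.

                /* memhog-style sketch: grab <MiB> of memory and keep it resident. */
                #include <stdio.h>
                #include <stdlib.h>
                #include <unistd.h>

                int main(int argc, char **argv)
                {
                        size_t mib = argc > 1 ? strtoul(argv[1], NULL, 10) : 1024;
                        size_t len = mib << 20;
                        long page = sysconf(_SC_PAGESIZE);
                        char *buf = malloc(len);

                        if (!buf) {
                                perror("malloc");
                                return 1;
                        }

                        /* Touch one byte per page so every page is actually faulted in. */
                        for (size_t off = 0; off < len; off += page)
                                buf[off] = 1;

                        printf("holding %zu MiB, press Ctrl-C to release\n", mib);
                        pause();
                        return 0;
                }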

            paf Patrick Farrell (Inactive) added a comment -

            Cliff - Can you explain the test results a bit more? What's memhog, for example? Is all the testing with the patch installed? If so, are there non-patched results somewhere to compare to?

            cliffw Cliff White (Inactive) added a comment -

            Test results from Hyperion
            bobijam Zhenyu Xu added a comment -

            patch updated.


            jay Jinshan Xiong (Inactive) added a comment -

            So this shrinker just reduces ccc_lru_max, and of course shrinks the LRU if ccc_lru_left is not enough. I'm wondering when ccc_lru_max should be restored gradually once the memory is no longer under such pressure?

            Sorry for the delayed response.

            This is not about reducing ccc_lru_max, which we don't need to adjust in this case. Actually, the meaning of ccc_lru_max should be "cache this many pages if memory is plentiful". When system memory is under pressure, Lustre shouldn't cache that much at all.

            We should register a cache shrinker at the OSC layer to get rid of some pages under memory pressure. Take a look at osc_lru_reclaim() for reference; instead of that approach, we are going to iterate over each individual client_obd and destroy pages from them according to policies such as last use, number of pages cached, etc.
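
            A rough sketch of the reclaim walk described above, iterating the per-OSC client_obd structures and asking each to drop part of its cl_page LRU; the osc_shrink_list head, the cl_shrink_linkage member and the osc_lru_drop_pages() helper are hypothetical names used only for illustration, and real code would also need locking and a reference on each client_obd while it is being shrunk.

                #include <linux/list.h>

                /* Hypothetical: a global list of active OSCs, each client_obd
                 * linked in through a cl_shrink_linkage list_head member. */
                static LIST_HEAD(osc_shrink_list);

                static unsigned long osc_cache_reclaim_pages(unsigned long nr_to_scan)
                {
                        struct client_obd *cli;
                        unsigned long freed = 0;

                        list_for_each_entry(cli, &osc_shrink_list, cl_shrink_linkage) {
                                if (freed >= nr_to_scan)
                                        break;
                                /* Hypothetical per-OSC helper: release up to N idle
                                 * pages from this client's LRU, least recently used
                                 * first, and return how many were actually freed. */
                                freed += osc_lru_drop_pages(cli, nr_to_scan - freed);
                        }

                        return freed;
                }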

            gerrit Gerrit Updater added a comment -

            Bobi Jam (bobijam@hotmail.com) uploaded a new patch: http://review.whamcloud.com/15630
            Subject: LU-6842 clio: limit max cl_page LRU thru shrinker
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: a7cf7b259c93d0f30f390b142878f137d7791ba4

            People

              bobijam Zhenyu Xu
              jay Jinshan Xiong (Inactive)
              Votes: 0
              Watchers: 9

              Dates

                Created:
                Updated:
                Resolved: