make lu_object cache size adjustable

Details

    • Improvement
    • Resolution: Won't Fix
    • Minor
    • None
    • None
    • None
    • 4900

    Description

      The lu_object cache is sized to consume 20% of total memory. This limits a single node to about 200 client mounts. We should make the size adjustable so that customers can configure it to suit their needs.

          Activity

            [LU-569] make lu_object cache size adjustable

            Integrated in lustre-master » x86_64,client,el6,inkernel #285
            LU-569: Make lu_object cache size adjustable

            Oleg Drokin : c8d7c99ec50c81a33eea43ed1c535fa4d65cef23
            Files :

            • lustre/obdclass/lu_object.c
            hudson Build Master (Inactive) added a comment

            Integrated in lustre-master » x86_64,client,ubuntu1004,inkernel #285
            LU-569: Make lu_object cache size adjustable

            Oleg Drokin : c8d7c99ec50c81a33eea43ed1c535fa4d65cef23
            Files :

            • lustre/obdclass/lu_object.c
            hudson Build Master (Inactive) added a comment

            Integrated in lustre-master » i686,client,el6,inkernel #285
            LU-569: Make lu_object cache size adjustable

            Oleg Drokin : c8d7c99ec50c81a33eea43ed1c535fa4d65cef23
            Files :

            • lustre/obdclass/lu_object.c
            hudson Build Master (Inactive) added a comment

            Integrated in lustre-master » x86_64,server,el5,ofa #285
            LU-569: Make lu_object cache size adjustable

            Oleg Drokin : c8d7c99ec50c81a33eea43ed1c535fa4d65cef23
            Files :

            • lustre/obdclass/lu_object.c
            hudson Build Master (Inactive) added a comment

            Integrated in lustre-master » i686,server,el6,inkernel #285
            LU-569: Make lu_object cache size adjustable

            Oleg Drokin : c8d7c99ec50c81a33eea43ed1c535fa4d65cef23
            Files :

            • lustre/obdclass/lu_object.c
            hudson Build Master (Inactive) added a comment

            Integrated in lustre-master » x86_64,client,sles11,inkernel #285
            LU-569: Make lu_object cache size adjustable

            Oleg Drokin : c8d7c99ec50c81a33eea43ed1c535fa4d65cef23
            Files :

            • lustre/obdclass/lu_object.c
            hudson Build Master (Inactive) added a comment

            I'll use this patch for IR test only.

            jay Jinshan Xiong (Inactive) added a comment

            Yes, they should be close, but that doesn't matter if they are handled by different threads on different CPUs, instead of hogging one thread on one CPU for seconds.
            We want to rehash (or grow the hash table) only because we don't want to allocate a huge number of big hash tables. For example, obd_class::exp_lock_hash can hold tens of thousands of locks, although in most cases there shouldn't be that many. So we should allocate a small hash table when the export is initialized and grow it only if necessary. Since we can have hundreds of thousands of exports on a server, this will save a lot of memory.

            BTW: although it is not fully tested, I remember that the new cfs_hash can also support a non-blocking "shrink" of the hash table. We should probably test and enable it in the future.

            liang Liang Zhen (Inactive) added a comment (edited)

            It will help, but if you are using an evenly distributed hash function, I would expect the times at which each first-level bucket needs to be rehashed to be really close together.
            BTW, I just don't understand the intention of the rehashing feature.

            jay Jinshan Xiong (Inactive) added a comment

            This is somewhat off-topic, but I think we can improve cfs_hash in the future so that it supports rehash-in-bucket:

            • the user can (optionally) provide two levels of hash functions
              • the first is the bucket hash (each bucket has one lock and N entries (hlist_head))
              • the second is the entry hash inside a bucket (it hashes an element to an hlist_head in that bucket)
            • rehash happens only within each bucket
              • better scalability, because we don't rehash the whole hash table in one batch
              • no elements move between buckets, so we don't need an rwlock or a lock dance for bucket locking
            liang Liang Zhen (Inactive) added a comment
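
            The rehash-in-bucket idea in the comment above can be illustrated with a minimal user-space sketch. This is not cfs_hash code: every name here (struct bucket, bucket_grow, table_insert, and so on) is hypothetical, the two hash functions and the growth threshold are arbitrary choices, locking is omitted, and allocation failure simply aborts. It only shows that when the first-level hash is fixed, growing one bucket never moves elements into another bucket.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
        unsigned int key;
        struct node *next;
};

/* One first-level bucket; a real implementation would keep one lock here. */
struct bucket {
        struct node **heads;    /* second-level entry lists */
        unsigned int nheads;    /* always a power of two */
        unsigned int count;     /* elements stored in this bucket */
};

struct table {
        struct bucket *bkts;
        unsigned int nbkts;     /* fixed, so elements never change bucket */
};

/* Allocation failure aborts, to keep the sketch short. */
static void *xcalloc(size_t n, size_t size)
{
        void *p = calloc(n, size);
        if (p == NULL)
                abort();
        return p;
}

/* First-level hash: selects the bucket. */
static unsigned int bucket_hash(const struct table *t, unsigned int key)
{
        return key % t->nbkts;
}

/* Second-level hash: selects the list inside one bucket. */
static unsigned int entry_hash(const struct bucket *b, unsigned int key)
{
        return ((key * 2654435761u) >> 16) & (b->nheads - 1);
}

static void table_init(struct table *t, unsigned int nbkts, unsigned int nheads)
{
        unsigned int i;
        assert(nbkts > 0 && nheads > 0 && (nheads & (nheads - 1)) == 0);
        t->bkts = xcalloc(nbkts, sizeof(*t->bkts));
        t->nbkts = nbkts;
        for (i = 0; i < nbkts; i++) {
                t->bkts[i].heads = xcalloc(nheads, sizeof(struct node *));
                t->bkts[i].nheads = nheads;
        }
}

/* Double the lists of one bucket only; other buckets are untouched. */
static void bucket_grow(struct bucket *b)
{
        struct node **old = b->heads;
        unsigned int i, nold = b->nheads;
        b->heads = xcalloc(nold * 2, sizeof(*b->heads));
        b->nheads = nold * 2;
        for (i = 0; i < nold; i++) {
                struct node *n = old[i];
                while (n != NULL) {
                        struct node *next = n->next;
                        unsigned int h = entry_hash(b, n->key);
                        n->next = b->heads[h];
                        b->heads[h] = n;
                        n = next;
                }
        }
        free(old);
}

static void table_insert(struct table *t, unsigned int key)
{
        struct bucket *b = &t->bkts[bucket_hash(t, key)];
        struct node *n = xcalloc(1, sizeof(*n));
        unsigned int h;
        /* Grow once this bucket averages two elements per list. */
        if (b->count >= 2 * b->nheads)
                bucket_grow(b);
        h = entry_hash(b, key);
        n->key = key;
        n->next = b->heads[h];
        b->heads[h] = n;
        b->count++;
}

static int table_lookup(const struct table *t, unsigned int key)
{
        const struct bucket *b = &t->bkts[bucket_hash(t, key)];
        const struct node *n;
        for (n = b->heads[entry_hash(b, key)]; n != NULL; n = n->next)
                if (n->key == key)
                        return 1;
        return 0;
}
```

            In this sketch each bucket grows independently as its own element count rises, so the work of a rehash is bounded by the size of one bucket, and a per-bucket lock would be enough to protect it.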

            If the entries can be as small as 4096, I think that is absolutely fine.

            I don't know exactly how much memory it consumes, since it is prorated by memory size, but after I changed lu_cache_percent from 20 to 1, I could mount 1K mount points; it used to be 200 at most.

            jay Jinshan Xiong (Inactive) added a comment
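
            To make "prorated by memory size" concrete, here is a rough sketch of how a percentage setting such as lu_cache_percent could translate into a hash-table size. The ticket does not quote the formula used in lustre/obdclass/lu_object.c, so the helper name cache_hash_bits, its parameters, and the "one unit per KiB of budget" rule are all assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of hash bits for a cache budget that is
 * `percent` of `total_pages` pages, counting one unit per KiB of budget.
 * This is an illustrative assumption, not the formula from lu_object.c.
 */
static unsigned int cache_hash_bits(uint64_t total_pages,
                                    unsigned int page_size,
                                    unsigned int percent)
{
        uint64_t budget;
        unsigned int bits;

        assert(percent >= 1 && percent <= 100);
        assert(page_size >= 1024);

        /* Budget in KiB: the requested percentage of total memory. */
        budget = total_pages / 100 * percent * (page_size / 1024);

        /* Smallest power of two that is not below the budget. */
        for (bits = 1; bits < 63 && (1ULL << bits) < budget; bits++)
                ;
        return bits;
}
```

            Under these assumptions, a node with 16 GiB of RAM and 4 KiB pages (4194304 pages) gets 22 bits at 20 percent and 18 bits at 1 percent, that is, a table 16 times smaller for each mount. Whether the real code follows the same proportions would have to be checked against lu_object.c.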

            People

              jay Jinshan Xiong (Inactive)
              jay Jinshan Xiong (Inactive)