
[LU-4856] osc_lru_reserve()) ASSERTION( atomic_read(cli->cl_lru_left) >= 0 ) failed

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.7.0
    • Affects Version/s: Lustre 2.5.0, Lustre 2.6.0, Lustre 2.4.2
    • 3
    • 13394

    Description

      The atomic_t used to count LRU entries is overflowing on systems with large memory configurations:

      LustreError: 22141:0:(osc_page.c:892:osc_lru_reserve()) ASSERTION(atomic_read(cli->cl_lru_left) >= 0 ) failed:

      PID: 54214 TASK: ffff88fdef4e4100 CPU: 40 COMMAND: "cat"
      #3 [ffff88fdf0823900] lbug_with_loc at ffffffffa07fedc3 [libcfs]
      #4 [ffff88fdf0823920] osc_lru_reserve at ffffffffa0c2a28a [osc]
      #5 [ffff88fdf08239a0] cl_page_alloc at ffffffffa09a7122 [obdclass]
      #6 [ffff88fdf08239e0] cl_page_find0 at ffffffffa09a742d [obdclass]
      #7 [ffff88fdf0823a40] lov_page_init_raid0 at ffffffffa0cc0f21 [lov]
      #8 [ffff88fdf0823aa0] cl_page_alloc at ffffffffa09a7122 [obdclass]
      #9 [ffff88fdf0823ae0] cl_page_find0 at ffffffffa09a742d [obdclass]
      #10 [ffff88fdf0823b40] ll_cl_init at ffffffffa0d74123 [lustre]
      #11 [ffff88fdf0823bd0] ll_readpage at ffffffffa0d74485 [lustre]
      #12 [ffff88fdf0823c00] do_generic_file_read at ffffffff810fa39e
      #13 [ffff88fdf0823c80] generic_file_aio_read at ffffffff810fad4c
      #14 [ffff88fdf0823d40] vvp_io_read_start at ffffffffa0da2fb0 [lustre]
      #15 [ffff88fdf0823da0] cl_io_start at ffffffffa09af979 [obdclass]
      #16 [ffff88fdf0823dd0] cl_io_loop at ffffffffa09b3d33 [obdclass]
      #17 [ffff88fdf0823e00] ll_file_io_generic at ffffffffa0d49c32 [lustre]
      #18 [ffff88fdf0823e70] ll_file_aio_read at ffffffffa0d4a3b3 [lustre]
      #19 [ffff88fdf0823ec0] ll_file_read at ffffffffa0d4aec3 [lustre]
      #20 [ffff88fdf0823f10] vfs_read at ffffffff8115b237
      #21 [ffff88fdf0823f40] sys_read at ffffffff8115b3a3

      In this case, the atomic_t (signed int) held:
      crash> pd (int)0xffff943de11780fc
      $10 = -1506317746

      We've triggered this specific problem with configurations down to 11TB of physmem. A 10.5TB system can cat a small file without crashing.

      I noticed several other cases where page counts are handled as a signed int, and suspect anything more than 4TB is problematic. The kernel itself consistently uses unsigned long for page counts on all architectures.
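      For scale, a minimal user-space sketch of the arithmetic (illustrative code only, not Lustre source; assumes 4KB pages): 11TB of memory holds roughly 2.9 billion 4KB pages, while a signed 32-bit counter tops out at INT_MAX, roughly 2.1 billion, so a page count kept in an int wraps negative on such a machine.

      /* Illustrative only -- not Lustre code. Shows why a signed 32-bit
       * page counter cannot hold the page count of an 11TB system. */
      #include <limits.h>
      #include <stdio.h>

      int main(void)
      {
              unsigned long long mem_bytes = 11ULL << 40;     /* 11TB           */
              unsigned long long npages = mem_bytes >> 12;    /* 4KB pages      */

              printf("pages in 11TB: %llu\n", npages);        /* 2952790016     */
              printf("INT_MAX:       %d\n", INT_MAX);         /* 2147483647     */
              printf("as signed int: %d\n", (int)npages);     /* wraps negative */
              return 0;
      }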

      Attachments

        Activity

          [LU-4856] osc_lru_reserve()) ASSERTION( atomic_read(cli->cl_lru_left) >= 0 ) failed

          gerrit Gerrit Updater added a comment -

          Grégoire Pichon (gregoire.pichon@bull.net) uploaded a new patch: http://review.whamcloud.com/16697
          Subject: LU-4856 misc: Reduce exposure to overflow on page counters.
          Project: fs/lustre-release
          Branch: b2_5
          Current Patch Set: 1
          Commit: f14f45c4e52246efe2c478b87c703705a30b3774


          manish Manish Patel (Inactive) added a comment -

          Hi,

          We are seeing the same issue with SLES 11 SP3, using Lustre version 2.4.3.

          The Lustre client is installed on a 2048-core SGI UV1000 running:

           cat /etc/SuSE-release
          
          SUSE Linux Enterprise Server 11 (x86_64)
          VERSION = 11
          PATCHLEVEL = 3
          hungabee:~ # lsb_release -a
          LSB Version: core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-x86_64:core-3.2-x86_64:core-4.0-x86_64:desktop-4.0-amd64:desktop-4.0-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics-3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:graphics-4.0-noarch
          Distributor ID: SUSE LINUX
          Description: SUSE Linux Enterprise Server 11 (x86_64)
          Release: 11
          Codename: n/a
          hungabee:~ # rpm -qa | egrep "(lustre|ofed)"
          lustre-client-modules-2.4.3-3.0.101_0.29_default
          ofed-doc-1.5.4.1-0.11.5
          ofed-1.5.4.1-0.11.5
          ofed-kmp-trace-1.5.4.1_3.0.76_0.11-0.11.5
          ofed-kmp-default-1.5.4.1_3.0.76_0.11-0.11.5
          lustre-client-2.4.3-3.0.101_0.29_default
          

          So can we have a backport patch for the Lustre client v2.4.3? Also, is this patch included in the 2.5.x branches? If not, can we have a backport patch for the 2.5 branch too?

          Thank You,
          Manish

          yujian Jian Yu added a comment -

          Here is the patch for the master branch to resolve the issue that, if "val" is larger than 2^32 on a 32-bit system, the code in proc_max_dirty_pages_in_mb() may truncate "val" when assigning it to obd_max_dirty_pages: http://review.whamcloud.com/12269/
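          As an illustration of the truncation hazard described above (a sketch with stand-in names, not the actual proc_max_dirty_pages_in_mb() code): when a 64-bit "val" is assigned to a 32-bit counter, the high bits are silently dropped.

          /* Illustrative sketch only; "max_dirty_pages" stands in for a
           * 32-bit counter such as obd_max_dirty_pages on a 32-bit build. */
          #include <stdio.h>

          static unsigned int max_dirty_pages;

          int main(void)
          {
                  unsigned long long val = 1ULL << 33;    /* larger than 2^32        */

                  max_dirty_pages = val;                  /* high bits silently lost */
                  printf("requested=%llu stored=%u\n", val, max_dirty_pages);
                  return 0;
          }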

          pjones Peter Jones added a comment -

          Landed for 2.7


          jfc John Fuchs-Chesney (Inactive) added a comment -

          Made it through autotest.
          ~ jfc.


          schamp Stephen Champion added a comment -

          http://review.whamcloud.com/#/c/10537/7 eliminates the lprocfs_..long functions entirely.


          schamp Stephen Champion added a comment -

          In the middle of it right now. Had to rebase to master again.

          I am hesitant to simply #define the lprocfs_.._long functions to the _u64 functions, as sign conversion hazards might catch unsuspecting users. Seems like a great way to introduce very obscure bugs.

          I think I can eliminate the introduction of the long functions by using the _64 functions in the cases where my patch was using them.
          This does force 32-bit systems to unnecessarily use 64-bit types, but not in critical paths. This is what I have started on.
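          A hedged sketch of the sign-conversion hazard mentioned above (the helper names below are invented, not the real lprocfs_* functions): if the old signed-long interface is simply #define'd onto a u64 variant, a caller passing -1 (e.g. to mean "unlimited") silently gets a huge positive value.

          /* Illustrative only; show_long()/show_u64() are made-up names,
           * not the real lprocfs helpers. */
          #include <inttypes.h>
          #include <stdint.h>
          #include <stdio.h>

          static void show_u64(uint64_t v)
          {
                  printf("value = %" PRIu64 "\n", v);
          }

          /* Aliasing the old signed-long interface onto the u64 one: */
          #define show_long(v) show_u64(v)

          int main(void)
          {
                  long limit = -1;        /* caller means "unlimited"    */

                  show_long(limit);       /* prints 18446744073709551615 */
                  return 0;
          }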


          jfc John Fuchs-Chesney (Inactive) added a comment -

          Stephen,
          Can you please review the comments on patch set 5?

          Thanks,
          ~ jfc.


          schamp Stephen Champion added a comment -

          http://review.whamcloud.com/#/c/10537/5 is confirmed as resolving this problem on a 32TB system.
          I also ran sanity and sanityn without serious failure.


          schamp Stephen Champion added a comment -

          Revision 4 of http://review.whamcloud.com/#/c/10537/ is tested, working.

          1. rpm -q lustre-client
            lustre-client-2.6.51-3.0.101_0.35_default_gc69b1a0
          2. grep ^processor /proc/cpuinfo | wc -l
            3072
          3. grep ^MemTotal /proc/meminfo
            MemTotal: 32825421388 kB
          4. mount -t lustre mds1-esa@tcp0:/esa-uv /mnt/esa-uv
          5. cd /mnt/esa-uv/schamp
          6. ls -l
            total 3145740
            -rw-r--r-- 1 schamp sgiemp_00 1073741824 Sep 4 14:36 foo.1
            -rw-r--r-- 1 schamp sgiemp_00 1073741824 Sep 4 15:20 foo.2
          7. cp foo.2 foo.3
          8. ls -l
            total 3145740
            -rw-r--r-- 1 schamp sgiemp_00 1073741824 Sep 4 14:36 foo.1
            -rw-r--r-- 1 schamp sgiemp_00 1073741824 Sep 4 15:20 foo.2
            -rw-r--r-- 1 root root 1073741824 Sep 4 18:24 foo.3

          People

            yujian Jian Yu
            schamp Stephen Champion
            Votes: 0
            Watchers: 10
