
[LU-10319] recovery-random-scale, test_fail_client_mds: test_fail_client_mds returned 4

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Minor
    • Affects Version/s: Lustre 2.11.0, Lustre 2.10.2
    • Environment: onyx, failover
      servers: sles12sp3, ldiskfs, branch b2_10, v2.10.2.RC1, b50
      clients: sles12sp3, branch b2_10, v2.10.2.RC1, b50
    • Severity: 3

    Description

      This impacts the SLES client that runs the dd load during failover recovery tests.

      Note: SLES out-of-memory was first seen with LU-9601.

      recovery-mds-scale: https://testing.hpdd.intel.com/test_sets/c95ce2ce-d41a-11e7-9840-52540065bddc

      Note: LBUG/LASSERT (LU-10221) was also seen during the first recovery test run in the failover group (recovery-mds-scale).

      From the client console (vm3):

      [ 2075.737415] jbd2/vda1-8 invoked oom-killer: gfp_mask=0x1420848(GFP_NOFS|__GFP_NOFAIL|__GFP_HARDWALL|__GFP_MOVABLE), nodemask=0, order=0, oom_score_adj=0
      

      followed by a core dump.

      recovery-random-scale: https://testing.hpdd.intel.com/test_sets/c9603c80-d41a-11e7-9840-52540065bddc
      recovery-double-scale: https://testing.hpdd.intel.com/test_sets/c963786e-d41a-11e7-9840-52540065bddc

      The next two recovery tests run in the failover group (recovery-random-scale, recovery-double-scale) have page allocation failures:

      From the client console (vm3):

      [  960.559009] swapper/0: page allocation failure: order:0, mode:0x1080020(GFP_ATOMIC)
      [  960.559012] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           OE   N  4.4.92-6.18-default #1
      [  960.559013] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  960.559016]  0000000000000000 ffffffff813211b0 0000000000000000 ffff88007fc03d00
      [  960.559018]  ffffffff81196022 0108002000000030 0000000000000000 0000000000000400
      [  960.559019]  ffff88007fc15f00 ffff88007fc03d28 ffff88007fc15fb8 ffff88007fc15f00
      [  960.559019] Call Trace:
      [  960.559056]  [<ffffffff81019b19>] dump_trace+0x59/0x310
      [  960.559059]  [<ffffffff81019eba>] show_stack_log_lvl+0xea/0x170
      [  960.559064]  [<ffffffff8101ac41>] show_stack+0x21/0x40
      [  960.559075]  [<ffffffff813211b0>] dump_stack+0x5c/0x7c
      [  960.559087]  [<ffffffff81196022>] warn_alloc_failed+0xe2/0x150
      [  960.559091]  [<ffffffff81196497>] __alloc_pages_nodemask+0x407/0xb80
      [  960.559093]  [<ffffffff81196d4a>] __alloc_page_frag+0x10a/0x120
      [  960.559104]  [<ffffffff81502e82>] __napi_alloc_skb+0x82/0xd0
      [  960.559110]  [<ffffffffa02b6334>] cp_rx_poll+0x1b4/0x540 [8139cp]
      [  960.559122]  [<ffffffff81511ae7>] net_rx_action+0x157/0x360
      [  960.559133]  [<ffffffff810826d2>] __do_softirq+0xe2/0x2e0
      [  960.559136]  [<ffffffff81082b8a>] irq_exit+0xfa/0x110
      [  960.559149]  [<ffffffff8160ce71>] do_IRQ+0x51/0xd0
      [  960.559152]  [<ffffffff8160ad0c>] common_interrupt+0x8c/0x8c
      [  960.560535] DWARF2 unwinder stuck at ret_from_intr+0x0/0x1b
      

      and

      [  138.384058] Leftover inexact backtrace:
      

      (many instances of this follow the page allocation traces)

      followed by core dumps.
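
      For reference, one way to see whether the client is exhausting low-order free pages around the time of these failures is to sample /proc/meminfo and /proc/buddyinfo during the run (the interval and the fields grepped below are only an example):

      # sample free memory and per-order free page lists every few seconds
      while true; do
          date
          grep -E 'MemFree|Cached|Slab|SReclaimable|SUnreclaim' /proc/meminfo
          cat /proc/buddyinfo
          sleep 5
      done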

Activity

adilger Andreas Dilger added a comment -

I think we need to get some information about what is consuming the memory here, either from the crash dump or by running "slabtop" and "watch cat /proc/meminfo" to see where all the memory is going. I suspect something is wrong with CLIO memory management if it can't handle a write to a single large file (e.g. is a single DLM lock for the whole file pinning all of the pages, so that they won't be freed until the lock is cancelled?).
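
A rough sketch of such a capture, logging periodically to files on the client during the dd load (log paths and interval are only examples):

      # log slab consumers and overall memory while the load runs
      while sleep 10; do
          echo "=== $(date) ===" >> /tmp/meminfo.log
          cat /proc/meminfo     >> /tmp/meminfo.log
          slabtop -o -s c | head -25 >> /tmp/slabtop.log
      done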
            green Oleg Drokin added a comment -

For the vmcore to be useful we also need a pointer to the kernel-debuginfo RPM and a pointer to the Lustre build, so we can get the files with symbols and load the dump in the crash tool.
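
For example, with the matching SLES kernel-default-debuginfo package installed, loading the dump would look roughly like this (the dump directory and Lustre debuginfo paths are illustrative, not taken from an actual run):

      # kernel version matches the one in the console log (4.4.92-6.18-default)
      crash /usr/lib/debug/boot/vmlinux-4.4.92-6.18-default.debug /var/crash/<dump-dir>/vmcore

      # inside crash, load symbols for a Lustre module from the matching build
      crash> mod -s ptlrpc <path-to-lustre-debuginfo>/ptlrpc.ko.debug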

jcasper James Casper (Inactive) added a comment -

Per Oleg: [Our SUSE contact] spoke with an mm guy who suggested tweaking some proc parameters, namely vm.min_free_kbytes.
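
For reference, checking and adjusting that parameter looks like this (the value shown is only an example, not a recommendation from SUSE):

      # check the current reserve
      sysctl vm.min_free_kbytes
      # raise it for the current boot; persist via /etc/sysctl.d if needed
      sysctl -w vm.min_free_kbytes=65536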

jcasper James Casper (Inactive) added a comment -

I confirmed with top that dd is using less than 1% of memory (but lots of CPU cycles).
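
(A quick way to cross-check the dd footprint outside of top, purely for illustration:)

      # resident set size and %mem of the dd process
      ps -C dd -o pid,rss,pmem,vsz,args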
            jcasper James Casper (Inactive) added a comment - edited

            Just looked at a system running recovery-mds-scale:

            trevis-37vm3:/mnt/lustre/d0.dd-trevis-37vm3 # ls -al
            total 2756636
            drwxr-xr-x 2 root root       4096 Dec  6 13:41 .
            drwxr-xr-x 5 root root       4096 Dec  6 13:42 ..
            -rw-r--r-- 1 root root 3032481792 Dec  6 13:42 dd-file
            

            So the dd client is working with a single large file. But memory may be freed after each 4K transfer.

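
For context, the client load here is a single dd writing 4 KB blocks into one large file; an approximation of the command is below (the count and path are illustrative, not taken from the test framework's dd load script):

      # single writer filling one large file in 4 KB blocks
      dd if=/dev/zero of=/mnt/lustre/d0.dd-$(hostname)/dd-file bs=4k count=1000000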

hongchao.zhang Hongchao Zhang added a comment -

This could be a duplicate of LU-10221; the symptom is similar.
            pjones Peter Jones added a comment -

            Hongchao

            Is this a distinct issue from LU-10221?

            Peter


People

  Assignee: hongchao.zhang Hongchao Zhang
  Reporter: jcasper James Casper (Inactive)