Lustre / LU-12594

client wedged trying to free memory

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Critical
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.10.6
    • Labels: None
    • Severity: 2

    Description

      Many clients are getting wedged while trying to free memory. Example backtrace:

      PID: 3470   TASK: ffff8817cfa48cc0  CPU: 0   COMMAND: "agetty"
       #0 [ffff8817dec03a80] machine_kexec at ffffffff81059c5c
       #1 [ffff8817dec03ad0] __crash_kexec at ffffffff81119dea
       #2 [ffff8817dec03b90] crash_kexec at ffffffff81119ebc
       #3 [ffff8817dec03ba0] kdb_kdump_check at ffffffff81142566
       #4 [ffff8817dec03ba8] kdb_main_loop at ffffffff81142792
       #5 [ffff8817dec03be0] kdb_stub at ffffffff811455b8
       #6 [ffff8817dec03c18] kgdb_cpu_enter at ffffffff8113b5fa
       #7 [ffff8817dec03ce8] __kgdb_notify at ffffffff8105f4cc
       #8 [ffff8817dec03d00] kgdb_ll_trap at ffffffff8105f598
       #9 [ffff8817dec03d28] do_int3 at ffffffff81017d0e
      #10 [ffff8817dec03d40] int3 at ffffffff816210a8
      #11 [ffff8817dec03dc8] kgdb_breakpoint at ffffffff8113ad70
      #12 [ffff8817dec03df0] __handle_sysrq at ffffffff81423db2
      #13 [ffff8817dec03e18] serial8250_rx_chars at ffffffff8143c2bc
      #14 [ffff8817dec03e48] serial8250_handle_irq at ffffffff8143d295
      #15 [ffff8817dec03e78] serial8250_default_handle_irq at ffffffff8143d364
      #16 [ffff8817dec03e90] serial8250_interrupt at ffffffff814382fd
      #17 [ffff8817dec03ed0] __handle_irq_event_percpu at ffffffff810dc6bc
      #18 [ffff8817dec03f18] handle_irq_event_percpu at ffffffff810dc840
      #19 [ffff8817dec03f38] handle_irq_event at ffffffff810dc8a6
      #20 [ffff8817dec03f58] handle_edge_irq at ffffffff810dfb5e
      #21 [ffff8817dec03f78] handle_irq at ffffffff81019a9c
      #22 [ffff8817dec03f80] do_IRQ at ffffffff81621f78
      --- <IRQ stack> ---
      #23 [ffff8817ae7378b0] ret_from_intr at ffffffff8161e9c7
          [exception RIP: unknown or invalid address]
          RIP: 0000000000040000  RSP: ffffffffa12fe0c0  RFLAGS: 00f00001
          RAX: ffff8817ae737b50  RBX: ffff8817de0435c0  RCX: 0000000000000000
          RDX: 0000000000000000  RSI: 0000000000000000  RDI: ffff8817dec1afc0
          RBP: ffffffff82367440   R8: 0000000000000080   R9: 0000000000000000
          R10: ffff88183ffd5d80  R11: 0000000000000000  R12: ffffffff811a2758
          R13: cccccccccccccccd  R14: ffff8817ae737950  R15: ffff8817ae737948
          ORIG_RAX: 0000000000000000  CS: ffff8817df5dafc0  SS: ffffffffffffffcc
      bt: WARNING: possibly bogus exception frame
      #24 [ffff8817ae737958] native_queued_spin_lock_slowpath at ffffffff810cd10f
      #25 [ffff8817ae737988] queued_spin_lock_slowpath at ffffffff811946d2
      #26 [ffff8817ae737990] osc_cache_shrink_count at ffffffffa12c6b61 [osc]
      #27 [ffff8817ae737998] shrink_slab at ffffffff811a7b12
      #28 [ffff8817ae737a68] shrink_zone at ffffffff811ac2be
      #29 [ffff8817ae737ad0] do_try_to_free_pages at ffffffff811ac63d
      #30 [ffff8817ae737b48] try_to_free_pages at ffffffff811ac9ea
      #31 [ffff8817ae737bb8] __alloc_pages_nodemask at ffffffff8119e9cb
      #32 [ffff8817ae737ca8] alloc_pages_current at ffffffff811e72cf
      #33 [ffff8817ae737ce0] filemap_fault at ffffffff81198bf0
      #34 [ffff8817ae737d40] __do_fault at ffffffff811c1ad7
      #35 [ffff8817ae737da0] handle_pte_fault at ffffffff811c535e
      #36 [ffff8817ae737e78] handle_mm_fault at ffffffff811c79ca
      #37 [ffff8817ae737ec0] __do_page_fault at ffffffff81068df7
      #38 [ffff8817ae737f28] do_page_fault at ffffffff810690cb
      #39 [ffff8817ae737f50] page_fault at ffffffff81621342
          RIP: 00007ffff7b1b953  RSP: 00007fffffffb1f8  RFLAGS: 00010246
          RAX: 0000000000000001  RBX: 00007fffffffb210  RCX: 00007ffff7b1b953
          RDX: 0000000000000000  RSI: 00007fffffffb260  RDI: 0000000000000005
          RBP: 00007fffffffb360   R8: 0000000000000000   R9: 0000000000610680
          R10: 0000000000000000  R11: 0000000000000246  R12: 0000000000000000
          R13: 0000000000000004  R14: 00007fffffffb260  R15: 0000000000000d8e
          ORIG_RAX: ffffffffffffffff  CS: 0033  SS: 002b
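
      Since many clients are affected, it may help to check whether other dumps show the same call chain as frames #24 to #30 above (page reclaim entering `osc_cache_shrink_count` and spinning in the spinlock slow path). The following is only a rough sketch, assuming the `bt` output has been saved as text. The helper names `parse_frames` and `has_signature`, and the choice of functions in `RECLAIM_SPIN_SIGNATURE`, are illustrative assumptions and not part of Lustre or crash.

```python
import re

# Matches crash "bt" frame lines such as:
#   #26 [ffff8817ae737990] osc_cache_shrink_count at ffffffffa12c6b61 [osc]
FRAME_RE = re.compile(
    r"^\s*#(?P<num>\d+)\s+\[(?P<sp>[0-9a-f]+)\]\s+(?P<func>\S+)\s+at\s+"
    r"(?P<addr>[0-9a-f]+)(?:\s+\[(?P<module>[^\]]+)\])?\s*$"
)


def parse_frames(bt_text):
    """Return (frame number, function, module or None) for each frame line."""
    frames = []
    for line in bt_text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            frames.append((int(m["num"]), m["func"], m["module"]))
    return frames


def has_signature(frames, signature):
    """True if the functions in `signature` occur in `frames` in that order
    (innermost first); other frames may sit in between."""
    funcs = iter(func for _, func, _ in frames)
    # `name in funcs` consumes the iterator, so order is enforced.
    return all(name in funcs for name in signature)


# Assumed signature, taken from frames #24-#30 of the backtrace above.
RECLAIM_SPIN_SIGNATURE = (
    "native_queued_spin_lock_slowpath",
    "osc_cache_shrink_count",
    "shrink_slab",
    "try_to_free_pages",
)

# Frames #24-#30 copied from the backtrace in this ticket.
SAMPLE = """\
 #24 [ffff8817ae737958] native_queued_spin_lock_slowpath at ffffffff810cd10f
 #25 [ffff8817ae737988] queued_spin_lock_slowpath at ffffffff811946d2
 #26 [ffff8817ae737990] osc_cache_shrink_count at ffffffffa12c6b61 [osc]
 #27 [ffff8817ae737998] shrink_slab at ffffffff811a7b12
 #28 [ffff8817ae737a68] shrink_zone at ffffffff811ac2be
 #29 [ffff8817ae737ad0] do_try_to_free_pages at ffffffff811ac63d
 #30 [ffff8817ae737b48] try_to_free_pages at ffffffff811ac9ea
"""

if __name__ == "__main__":
    frames = parse_frames(SAMPLE)
    print("matches reclaim/spinlock signature:",
          has_signature(frames, RECLAIM_SPIN_SIGNATURE))
```

      A match only says that the task was in the same code path when the dump was taken. It does not show which task holds the lock or why it is not being released.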
      

          People

            Assignee: Andreas Dilger (adilger)
            Reporter: Mahmoud Hanafi (mhanafi)