LU-5717: Deadlock in nrs_tbf_timer_cb

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.8.0
    • Affects Version/s: Lustre 2.6.0
    • Severity: 3
    • Rank: 16034

    Description

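      When the TBF (Token Bucket Filter) NRS policy is enabled, the following deadlock can be triggered when the system is under heavy load. The watchdog reports a hard lockup on CPU 0 while nrs_tbf_timer_cb is spinning in _spin_lock: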

      <0>Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 0
      <4>Pid: 24831, comm: ll_ost_io00_074 Not tainted 2.6.32-431.23.3.el6_lustre.2.5.24.ddn3.x86_64 #1
      <4>Call Trace:
      <4> <NMI> [<ffffffff8152896c>] ? panic+0xa7/0x16f
      <4> [<ffffffff81014969>] ? sched_clock+0x9/0x10
      <4> [<ffffffff810e67fd>] ? watchdog_overflow_callback+0xcd/0xd0
      <4> [<ffffffff8111c707>] ? __perf_event_overflow+0xa7/0x240
      <4> [<ffffffff8101d93d>] ? x86_perf_event_set_period+0xdd/0x170
      <4> [<ffffffff8111ccd4>] ? perf_event_overflow+0x14/0x20
      <4> [<ffffffff81022d87>] ? intel_pmu_handle_irq+0x187/0x2f0
      <4> [<ffffffff8152e646>] ? kprobe_exceptions_notify+0x16/0x430
      <4> [<ffffffff8152d1b9>] ? perf_event_nmi_handler+0x39/0xb0
      <4> [<ffffffff8152ec75>] ? notifier_call_chain+0x55/0x80
      <4> [<ffffffffa08517c0>] ? nrs_tbf_timer_cb+0x0/0x60 [ptlrpc]
      <4> [<ffffffff8152ecda>] ? atomic_notifier_call_chain+0x1a/0x20
      <4> [<ffffffff810a11de>] ? notify_die+0x2e/0x30
      <4> [<ffffffff8152c93b>] ? do_nmi+0x1bb/0x340
      <4> [<ffffffff8152c200>] ? nmi+0x20/0x30
      <4> [<ffffffffa08517c0>] ? nrs_tbf_timer_cb+0x0/0x60 [ptlrpc]
      <4> [<ffffffff8152ba6e>] ? _spin_lock+0x1e/0x30
      <4> <<EOE>> <IRQ> [<ffffffffa08517ea>] ? nrs_tbf_timer_cb+0x2a/0x60 [ptlrpc]
      <4> [<ffffffff8109f6be>] ? __run_hrtimer+0x8e/0x1a0
      <4> [<ffffffff810a6a9f>] ? ktime_get_update_offsets+0x4f/0xd0
      <4> [<ffffffff8109fa26>] ? hrtimer_interrupt+0xe6/0x260
      <4> [<ffffffff81031f1d>] ? local_apic_timer_interrupt+0x3d/0x70
      <4> [<ffffffff81532805>] ? smp_apic_timer_interrupt+0x45/0x60
      <4> [<ffffffff8100bb93>] ? apic_timer_interrupt+0x13/0x20
      <4> <EOI> [<ffffffffa084648a>] ? nrs_resource_get_safe+0x4a/0x100 [ptlrpc]
      <4> [<ffffffffa0848a98>] ? ptlrpc_nrs_req_initialize+0x38/0x90 [ptlrpc]
      <4> [<ffffffffa080ef41>] ? ptlrpc_server_handle_req_in+0x901/0xcd0 [ptlrpc]
      <4> [<ffffffffa0815f0c>] ? ptlrpc_main+0x9ec/0x1990 [ptlrpc]
      <4> [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
      <4> [<ffffffff8152907e>] ? thread_return+0x4e/0x760
      <4> [<ffffffffa0815520>] ? ptlrpc_main+0x0/0x1990 [ptlrpc]
      <4> [<ffffffff8109abf6>] ? kthread+0x96/0xa0
      <4> [<ffffffff8100c20a>] ? child_rip+0xa/0x20
      <4> [<ffffffff8109ab60>] ? kthread+0x0/0xa0
      <4> [<ffffffff8100c200>] ? child_rip+0x0/0x20

            People

              Assignee: Niu Yawei (Inactive)
              Reporter: Li Xi (Inactive)