LU-5052 (Lustre)

threads stuck in jbd2_journal_start

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • None
    • Lustre 2.4.1
    • None
    • lustre: 2.1.5
      kernel: 2.6.32-279.19.1.el6.20130516.x86_64.lustre215
      build: 2nasS_ofed154

      SRC at https://github.com/jlan/lustre-nas
    • 3
    • 13955

    Description

      The MDS builds up a high load with no CPU activity, and Lustre dumps call traces to the console. (This looks like a duplicate of LU-4794. If so, please advise when the patch will land.)

      Attached is the full stack trace for all threads.

      INFO: task ldlm_cn_00:6299 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      ldlm_cn_00    D 000000000000001a     0  6299      2 0x00000080
       ffff881ec525db30 0000000000000046 0000000000000000 ffffffff8129507e
       ffff881ec525dad0 00000000dcd2dc2e ffff881fb0bd8d00 ffff881ec525dad0
       ffff881fafe73098 ffff881ec525dfd8 000000000000fc40 ffff881fafe73098
      Call Trace:
       [<ffffffff8129507e>] ? number+0x2ee/0x320
       [<ffffffffa055c14a>] start_this_handle+0x27a/0x4a0 [jbd2]
       [<ffffffff8108ff00>] ? autoremove_wake_function+0x0/0x40
       [<ffffffffa055c570>] jbd2_journal_start+0xd0/0x110 [jbd2]
       [<ffffffffa08e6338>] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs]
       [<ffffffffa072c017>] fsfilt_ldiskfs_start+0x77/0x5e0 [fsfilt_ldiskfs]
       [<ffffffffa07a9ac0>] llog_origin_handle_cancel+0x4b0/0xd70 [ptlrpc]
       [<ffffffffa076f71f>] ldlm_cancel_handler+0x1bf/0x5e0 [ptlrpc]
       [<ffffffffa079fb4e>] ptlrpc_main+0xc4e/0x1a40 [ptlrpc]
       [<ffffffffa079ef00>] ? ptlrpc_main+0x0/0x1a40 [ptlrpc]
       [<ffffffff8100c0ca>] child_rip+0xa/0x20
       [<ffffffffa079ef00>] ? ptlrpc_main+0x0/0x1a40 [ptlrpc]
       [<ffffffffa079ef00>] ? ptlrpc_main+0x0/0x1a40 [ptlrpc]
       [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
      INFO: task ldlm_cb_00:6302 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      ldlm_cb_00    D 0000000000000002     0  6302      2 0x00000080
       ffff881ec5265b20 0000000000000046 0000000000000000 000000ab00000000
       ffff881ec5265b50 ffffffff8129507e 3634333236363330 3134363536363336
       ffff881ec5263af8 ffff881ec5265fd8 000000000000fc40 ffff881ec5263af8
      Call Trace:
       [<ffffffff8129507e>] ? number+0x2ee/0x320
       [<ffffffff8151ecc5>] rwsem_down_failed_common+0x95/0x1d0
       [<ffffffff8151ee23>] rwsem_down_write_failed+0x23/0x30
       [<ffffffff812992f3>] call_rwsem_down_write_failed+0x13/0x20
       [<ffffffff8151e322>] ? down_write+0x32/0x40
       [<ffffffffa09d543e>] dqacq_handler+0x35e/0xd20 [lquota]
       [<ffffffffa07b8486>] ? __req_capsule_get+0x176/0x750 [ptlrpc]
       [<ffffffffa07921e0>] ? lustre_swab_qdata+0x0/0x30 [ptlrpc]
       [<ffffffffa075e1d8>] target_handle_dqacq_callback+0x668/0xb90 [ptlrpc]
       [<ffffffffa09d50e0>] ? dqacq_handler+0x0/0xd20 [lquota]
       [<ffffffffa076df87>] ldlm_callback_handler+0xa17/0x1ff0 [ptlrpc]
       [<ffffffffa0503ea1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
       [<ffffffffa04ff4a4>] ? libcfs_id2str+0x74/0xb0 [libcfs]
       [<ffffffffa079fb4e>] ptlrpc_main+0xc4e/0x1a40 [ptlrpc]
       [<ffffffffa079ef00>] ? ptlrpc_main+0x0/0x1a40 [ptlrpc]
       [<ffffffff8100c0ca>] child_rip+0xa/0x20
       [<ffffffffa079ef00>] ? ptlrpc_main+0x0/0x1a40 [ptlrpc]
       [<ffffffffa079ef00>] ? ptlrpc_main+0x0/0x1a40 [ptlrpc]
       [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
      INFO: task ldlm_cb_01:6303 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      ldlm_cb_01    D 000000000000000d     0  6303      2 0x00000080
       ffff881ec5267b20 0000000000000046 0000000000000000 000000ab00000000
       ffff881ec5267b50 ffffffff8129507e ffff881ec5267ad0 000000005c2ae174
       ffff881ec5263098 ffff881ec5267fd8 000000000000fc40 ffff881ec5263098
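
A minimal sketch, not part of the original report, of how hung-task messages like the ones above could be condensed into one list of reliable frames per blocked thread. The function name `summarize_hung_tasks` and both regular expressions are assumptions for illustration, written against the console format shown in this ticket; other kernel versions may print frames differently.

```python
import re
from collections import OrderedDict

# Header printed by the kernel hung-task detector for each blocked task.
TASK_RE = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) blocked for more than \d+ seconds"
)
# One stack frame. A leading "? " marks a frame the unwinder considers
# unreliable; the trailing "[module]" is absent for core kernel symbols.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\? )?(?P<sym>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<mod>\w+)\])?"
)

def summarize_hung_tasks(log_text):
    """Return an ordered map of 'name:pid' -> [(symbol, module), ...].

    Only reliable frames are kept, innermost first, so the first entry
    is the function the thread is actually sleeping in.
    """
    tasks = OrderedDict()
    current = None
    for line in log_text.splitlines():
        header = TASK_RE.search(line)
        if header:
            current = "{}:{}".format(header.group("name"), header.group("pid"))
            tasks[current] = []
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None and not frame.group("unreliable"):
            tasks[current].append((frame.group("sym"), frame.group("mod")))
    return tasks

if __name__ == "__main__":
    import sys
    # Usage (hypothetical): python summarize.py < console.log
    for task, frames in summarize_hung_tasks(sys.stdin.read()).items():
        top = frames[0] if frames else ("<no frames captured>", None)
        print("{:20s} {} [{}]".format(task, top[0], top[1] or "kernel"))
```

Grouping by the innermost reliable frame makes it easier to see which threads are waiting in `start_this_handle` for a journal handle and which are waiting on a semaphore, which is the kind of comparison needed when checking a trace against another ticket.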
      

      Attachments

        1. service200.gz
          142 kB
          Mahmoud Hanafi

            People

              bobijam Zhenyu Xu
              mhanafi Mahmoud Hanafi