Lustre / LU-1087

mdt thread spinning out of control

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Critical
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.1.0
    • Labels: None
    • Environment: lustre 2.1.0-21chaos (github.com/chaos/lustre)
    • Severity: 2
    • Rank (Obsolete): 6465

    Description

      On one of our filesystems that was upgraded to Lustre 2.1 last week, the MDS has one thread that appears to be stuck spinning:

      13711 root      20   0     0    0    0 R 99.9  0.0   6289:28 mdt_253
      

      Using sysrq to dump the stack (since crash can't trace a live process):

      2012-02-09 16:18:39 Call Trace:
      2012-02-09 16:18:39  [<ffffffffa06de440>] ? ldlm_cancel_locks_for_export_cb+0x0/0xd0 [ptlrpc]
      2012-02-09 16:18:39  [<ffffffffa06de440>] ? ldlm_cancel_locks_for_export_cb+0x0/0xd0 [ptlrpc]
      2012-02-09 16:18:39  [<ffffffffa0498f46>] cfs_hash_for_each_empty+0xc6/0x170 [libcfs]
      2012-02-09 16:18:39  [<ffffffffa06db15f>] ldlm_cancel_locks_for_export+0x1f/0x30 [ptlrpc]
      2012-02-09 16:18:39  [<ffffffffa06e7c94>] server_disconnect_export+0x64/0x170 [ptlrpc]
      2012-02-09 16:18:39  [<ffffffffa0c50dcb>] mdt_obd_disconnect+0x4b/0xd50 [mdt]
      2012-02-09 16:18:39  [<ffffffffa071681e>] ? lustre_pack_reply_flags+0xae/0x1b0 [ptlrpc]
      2012-02-09 16:18:39  [<ffffffffa0716931>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
      2012-02-09 16:18:39  [<ffffffffa06e4d4a>] target_handle_disconnect+0x16a/0x3a0 [ptlrpc]
      2012-02-09 16:18:39  [<ffffffffa0c4a805>] mdt_disconnect+0x35/0x100 [mdt]
      2012-02-09 16:18:39  [<ffffffffa0c4674d>] mdt_handle_common+0x73d/0x12b0 [mdt]
      2012-02-09 16:18:39  [<ffffffffa0717334>] ? lustre_msg_get_transno+0x54/0x90 [ptlrpc]
      2012-02-09 16:18:39  [<ffffffffa0c47395>] mdt_regular_handle+0x15/0x20 [mdt]
      2012-02-09 16:18:39  [<ffffffffa0723181>] ptlrpc_main+0xcd1/0x1690 [ptlrpc]
      2012-02-09 16:18:39  [<ffffffffa07224b0>] ? ptlrpc_main+0x0/0x1690 [ptlrpc]
      2012-02-09 16:18:39  [<ffffffff8100c14a>] child_rip+0xa/0x20
      2012-02-09 16:18:39  [<ffffffffa07224b0>] ? ptlrpc_main+0x0/0x1690 [ptlrpc]
      2012-02-09 16:18:39  [<ffffffffa07224b0>] ? ptlrpc_main+0x0/0x1690 [ptlrpc]
      2012-02-09 16:18:39  [<ffffffff8100c140>] ? child_rip+0x0/0x20
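
      In a kernel call trace like the one above, frames marked with "?" are addresses found on the stack that the unwinder could not confirm as part of the call chain. Filtering them out makes the actual path easier to read: here it runs from ptlrpc_main down through mdt_obd_disconnect to cfs_hash_for_each_empty. The following is a minimal sketch of such a filter, not part of the original report; the names FRAME_RE, reliable_frames, and SAMPLE are hypothetical, and SAMPLE is a five-line excerpt of the trace.

```python
import re

# Matches one frame line of the trace, for example:
#   "... [<ffffffffa0498f46>] cfs_hash_for_each_empty+0xc6/0x170 [libcfs]"
# The optional "?" marks a frame the unwinder could not verify.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def reliable_frames(trace_text):
    """Return (symbol, module) pairs for frames not marked with '?'.

    Frames without a module suffix belong to the core kernel and are
    reported with the module name "kernel" (a labelling choice made here).
    """
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.search(line)
        if match is None or match.group("unreliable"):
            continue
        frames.append((match.group("symbol"), match.group("module") or "kernel"))
    return frames


# Excerpt of the trace from this report (innermost frame first).
SAMPLE = """\
2012-02-09 16:18:39  [<ffffffffa06de440>] ? ldlm_cancel_locks_for_export_cb+0x0/0xd0 [ptlrpc]
2012-02-09 16:18:39  [<ffffffffa0498f46>] cfs_hash_for_each_empty+0xc6/0x170 [libcfs]
2012-02-09 16:18:39  [<ffffffffa06db15f>] ldlm_cancel_locks_for_export+0x1f/0x30 [ptlrpc]
2012-02-09 16:18:39  [<ffffffffa06e7c94>] server_disconnect_export+0x64/0x170 [ptlrpc]
2012-02-09 16:18:39  [<ffffffff8100c14a>] child_rip+0xa/0x20
"""

if __name__ == "__main__":
    for symbol, module in reliable_frames(SAMPLE):
        print(f"{symbol} [{module}]")
```

      Run against the full console log, the same function lists only the confirmed frames, innermost first.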
      

            People

              Assignee: Lai Siyao (laisiyao)
              Reporter: Christopher Morrone (morrone) (Inactive)