[LU-4335] MDS hangs due to mdt thread hung/inactive

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Critical
    • None
    • Affects Version/s: Lustre 2.1.5
    • None
    • 3
    • 11860

    Description

      MDT threads are reported as inactive for >200s. The MDS backed up and required a reboot, but after the reboot the MDS hung again and required another reboot and mounting with abort recovery.

      Uploading these files to the FTP site:
      lustre-log.1385580907.6742.gz <- initial hang
      lustre-log.1385589491.8362.gz <- recovery hang

      We have a crashdump for the recovery hang.

      — initial hang —
      Nov 27 10:30:14 nbp8-mds1 kernel: Lustre: 5687:0:(ldlm_lib.c:952:target_handle_connect()) MGS: connection from 377ef706-73ec-d593-7c7e-ac55fd582ec2@10.151.41.73@o2ib t0 exp (null) cur 1385577014 last 0
      Nov 27 10:30:14 nbp8-mds1 kernel: Lustre: 6643:0:(ldlm_lib.c:952:target_handle_connect()) nbp8-MDT0000: connection from c2dc3e1e-9ec6-88a0-ddcb-182e74734295@10.151.41.73@o2ib t0 exp (null) cur 1385577014 last 0
      Nov 27 10:30:53 nbp8-mds1 kernel: Lustre: nbp8-MDT0000: haven't heard from client fd1a318e-556c-0397-a95e-9d2ed1998bc0 (at 10.151.41.73@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883f83c7fc00, cur 1385577053 expire 1385576903 last 1385576826
      Nov 27 10:30:53 nbp8-mds1 kernel: Lustre: MGS: haven't heard from client 581c894b-1381-7f25-567b-57c83bbae311 (at 10.151.41.73@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883fa1722c00, cur 1385577053 expire 1385576903 last 1385576826
      Nov 27 10:34:25 nbp8-mds1 kernel: LustreError: 5515:0:(o2iblnd_cb.c:2992:kiblnd_check_txs_locked()) Timed out tx: active_txs, 1 seconds
      Nov 27 10:34:25 nbp8-mds1 kernel: LustreError: 5515:0:(o2iblnd_cb.c:3055:kiblnd_check_conns()) Timed out RDMA with 10.151.32.5@o2ib (152): c: 6, oc: 0, rc: 8
      Nov 27 10:34:52 nbp8-mds1 kernel: Lustre: 5687:0:(ldlm_lib.c:952:target_handle_connect()) MGS: connection from 243f629c-ac60-5881-7a9e-e96a02c21f7d@10.151.32.5@o2ib t0 exp (null) cur 1385577292 last 0
      Nov 27 10:35:00 nbp8-mds1 kernel: LustreError: 5996:0:(quota_ctl.c:330:client_quota_ctl()) ptlrpc_queue_wait failed, rc: -3
      Nov 27 10:35:00 nbp8-mds1 kernel: LustreError: 5996:0:(quota_ctl.c:330:client_quota_ctl()) Skipped 311 previous similar messages
      Nov 27 10:35:39 nbp8-mds1 kernel: Lustre: nbp8-MDT0000: haven't heard from client 3b493920-c724-792f-5966-cf83ffa67f75 (at 10.151.32.5@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881e3176fc00, cur 1385577339 expire 1385577189 last 1385577112
      Nov 27 11:01:27 nbp8-mds1 kernel: LustreError: 5515:0:(o2iblnd_cb.c:2992:kiblnd_check_txs_locked()) Timed out tx: active_txs, 1 seconds
      Nov 27 11:01:27 nbp8-mds1 kernel: LustreError: 5515:0:(o2iblnd_cb.c:3055:kiblnd_check_conns()) Timed out RDMA with 10.151.27.18@o2ib (162): c: 7, oc: 0, rc: 8
      Nov 27 11:03:43 nbp8-mds1 kernel: Lustre: 5686:0:(ldlm_lib.c:952:target_handle_connect()) MGS: connection from 236f42d4-1a48-7ef0-79e8-c65ae40bc796@10.151.27.18@o2ib t0 exp (null) cur 1385579023 last 0
      Nov 27 11:03:43 nbp8-mds1 kernel: Lustre: 5686:0:(ldlm_lib.c:952:target_handle_connect()) Skipped 1 previous similar message
      Nov 27 11:03:43 nbp8-mds1 kernel: Lustre: 7068:0:(ldlm_lib.c:952:target_handle_connect()) nbp8-MDT0000: connection from 5b0af5d5-d93c-d433-ec5c-7b17cd82c746@10.151.27.18@o2ib t0 exp (null) cur 1385579023 last 0
      Nov 27 11:35:07 nbp8-mds1 kernel: Lustre: Service thread pid 6742 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
      Nov 27 11:35:07 nbp8-mds1 kernel: Pid: 6742, comm: mdt_137
      Nov 27 11:35:16 nbp8-mds1 kernel:
      Nov 27 11:35:16 nbp8-mds1 kernel: Call Trace:
      Nov 27 11:35:16 nbp8-mds1 kernel: [<ffffffffa04f819b>] ? cfs_set_ptldebug_header+0x2b/0xc0 [libcfs]
      Nov 27 11:35:16 nbp8-mds1 kernel: [<ffffffffa04f960e>] cfs_waitq_wait+0xe/0x10 [libcfs]
      Nov 27 11:35:16 nbp8-mds1 kernel: [<ffffffffa0a9f6de>] qos_statfs_update+0x7fe/0xa70 [lov]
      Nov 27 11:35:16 nbp8-mds1 kernel: [<ffffffff8110e42e>] ? find_get_page+0x1e/0xa0
      Nov 27 11:35:16 nbp8-mds1 kernel: [<ffffffff8105fab0>] ? default_wake_function+0x0/0x20
      Nov 27 11:35:16 nbp8-mds1 kernel: [<ffffffffa0aa00fd>] alloc_qos+0x1ad/0x21a0 [lov]
      Nov 27 11:35:16 nbp8-mds1 kernel: [<ffffffffa0aa5fdf>] ? lsm_alloc_plain+0xff/0x930 [lov]
      Nov 27 11:35:16 nbp8-mds1 kernel: [<ffffffffa0aa306c>] qos_prep_create+0x1ec/0x2380 [lov]
      Nov 27 11:35:22 nbp8-mds1 kernel: [<ffffffffa0a9c63a>] lov_prep_create_set+0xea/0x390 [lov]
      Nov 27 11:35:22 nbp8-mds1 kernel: [<ffffffffa0a84b0c>] lov_create+0x1ac/0x1400 [lov]
      Nov 27 11:35:22 nbp8-mds1 kernel: [<ffffffffa0d8b0d6>] ? mdd_get_md+0x96/0x2f0 [mdd]
      Nov 27 11:35:22 nbp8-mds1 kernel: [<ffffffffa0ea2f13>] ? osd_object_read_unlock+0x53/0xa0 [osd_ldiskfs]
      Nov 27 11:35:22 nbp8-mds1 kernel: [<ffffffffa0dab916>] ? mdd_read_unlock+0x26/0x30 [mdd]
      Nov 27 11:35:22 nbp8-mds1 kernel: [<ffffffffa0d8f90e>] mdd_lov_create+0x9ee/0x1ba0 [mdd]
      Nov 27 11:35:22 nbp8-mds1 kernel: [<ffffffffa0da1871>] mdd_create+0xf81/0x1a90 [mdd]
      Nov 27 11:35:22 nbp8-mds1 kernel: [<ffffffffa0ea9df3>] ? osd_oi_lookup+0x83/0x110 [osd_ldiskfs]
      Nov 27 11:35:22 nbp8-mds1 kernel: [<ffffffffa0ea456c>] ? osd_object_init+0xdc/0x3e0 [osd_ldiskfs]
      Nov 27 11:35:22 nbp8-mds1 kernel: [<ffffffffa0eda3f7>] cml_create+0x97/0x250 [cmm]
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffffa0e165e1>] ? mdt_version_get_save+0x91/0xd0 [mdt]
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffffa0e2c06e>] mdt_reint_open+0x1aae/0x28a0 [mdt]
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffffa078f724>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc]
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffffa0da456e>] ? md_ucred+0x1e/0x60 [mdd]
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffffa0e14c81>] mdt_reint_rec+0x41/0xe0 [mdt]
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffffa0e0bed4>] mdt_reint_internal+0x544/0x8e0 [mdt]
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffffa0e0c53d>] mdt_intent_reint+0x1ed/0x530 [mdt]
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffffa0e0ac09>] mdt_intent_policy+0x379/0x690 [mdt]
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffffa074b351>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc]
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffffa07711ad>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc]
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffffa0e0b586>] mdt_enqueue+0x46/0x130 [mdt]
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffffa0e00772>] mdt_handle_common+0x932/0x1750 [mdt]
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffffa0e01665>] mdt_regular_handle+0x15/0x20 [mdt]
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffffa079fb4e>] ptlrpc_main+0xc4e/0x1a40 [ptlrpc]
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffffa079ef00>] ? ptlrpc_main+0x0/0x1a40 [ptlrpc]
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffff8100c0ca>] child_rip+0xa/0x20
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffffa079ef00>] ? ptlrpc_main+0x0/0x1a40 [ptlrpc]
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffffa079ef00>] ? ptlrpc_main+0x0/0x1a40 [ptlrpc]
      Nov 27 11:35:23 nbp8-mds1 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
      Nov 27 11:35:23 nbp8-mds1 kernel:
      Nov 27 11:35:28 nbp8-mds1 kernel: LustreError: dumping log to /tmp/lustre-log.1385580907.6742
      Nov 27 11:35:28 nbp8-mds1 kernel: Lustre: Service thread pid 6645 was inactive for 200.01s. The thread might be hung, or it might only be slow and will resume l

      — after reboot hang —
      Nov 27 13:57:46 nbp8-mds1 kernel: LustreError: 6771:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 14 previous similar messages
      Nov 27 13:58:11 nbp8-mds1 kernel: Lustre: Service thread pid 8362 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
      Nov 27 13:58:11 nbp8-mds1 kernel: Pid: 8362, comm: mdt_454
      Nov 27 13:58:14 nbp8-mds1 kernel:
      Nov 27 13:58:14 nbp8-mds1 kernel: Call Trace:
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffff8151d552>] schedule_timeout+0x192/0x2e0
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffff8107bf80>] ? process_timeout+0x0/0x10
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa0764c60>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa04f95e1>] cfs_waitq_timedwait+0x11/0x20 [libcfs]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa0768d0d>] ldlm_completion_ast+0x48d/0x720 [ptlrpc]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffff8105fab0>] ? default_wake_function+0x0/0x20
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa0768506>] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa0768880>] ? ldlm_completion_ast+0x0/0x720 [ptlrpc]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa0df9e60>] ? mdt_blocking_ast+0x0/0x2a0 [mdt]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa0dfd2a0>] mdt_object_lock+0x320/0xb70 [mdt]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa0df9e60>] ? mdt_blocking_ast+0x0/0x2a0 [mdt]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa0768880>] ? ldlm_completion_ast+0x0/0x720 [ptlrpc]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa0e0dc62>] mdt_getattr_name_lock+0xe22/0x1880 [mdt]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa078eb1d>] ? lustre_msg_buf+0x5d/0x60 [ptlrpc]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa07b8486>] ? __req_capsule_get+0x176/0x750 [ptlrpc]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa0790da4>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa0e0ec1d>] mdt_intent_getattr+0x2cd/0x4a0 [mdt]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa0e0ac09>] mdt_intent_policy+0x379/0x690 [mdt]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa074b351>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa07711ad>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa0e0b586>] mdt_enqueue+0x46/0x130 [mdt]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa0e00772>] mdt_handle_common+0x932/0x1750 [mdt]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa0e01665>] mdt_regular_handle+0x15/0x20 [mdt]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa079fb4e>] ptlrpc_main+0xc4e/0x1a40 [ptlrpc]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa079ef00>] ? ptlrpc_main+0x0/0x1a40 [ptlrpc]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffff8100c0ca>] child_rip+0xa/0x20
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa079ef00>] ? ptlrpc_main+0x0/0x1a40 [ptlrpc]
      Nov 27 13:58:14 nbp8-mds1 kernel: [<ffffffffa079ef00>] ? ptlrpc_main+0x0/0x1a40 [ptlrpc]
      Nov 27 13:58:15 nbp8-mds1 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
      Nov 27 13:58:15 nbp8-mds1 kernel:
      Nov 27 13:58:15 nbp8-mds1 kernel: LustreError: dumping log to /tmp/lustre-log.1385589491.8362
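The "Service thread pid N was inactive" watchdog lines are the quickest index into which mdt threads to examine, and each is paired with a "dumping log to /tmp/lustre-log.&lt;time&gt;.&lt;pid&gt;" line naming the matching debug log. A minimal sketch, not part of the ticket, for pulling those pairs out of a console log (sample lines below are trimmed copies of the traces above):

```python
import re

# Sample lines, trimmed from the console log in this ticket.
LOG = """\
Nov 27 11:35:07 nbp8-mds1 kernel: Lustre: Service thread pid 6742 was inactive for 200.00s.
Nov 27 11:35:28 nbp8-mds1 kernel: LustreError: dumping log to /tmp/lustre-log.1385580907.6742
Nov 27 13:58:11 nbp8-mds1 kernel: Lustre: Service thread pid 8362 was inactive for 200.00s.
Nov 27 13:58:15 nbp8-mds1 kernel: LustreError: dumping log to /tmp/lustre-log.1385589491.8362
"""

INACTIVE = re.compile(r"Service thread pid (\d+) was inactive for ([\d.]+)s")
DUMP = re.compile(r"dumping log to (\S+)")

def triage(log_text):
    """Return hung (pid, seconds) pairs and the lustre-log files dumped for them."""
    hung = [(int(pid), float(sec)) for pid, sec in INACTIVE.findall(log_text)]
    return hung, DUMP.findall(log_text)

hung, dumps = triage(LOG)
print(hung)   # [(6742, 200.0), (8362, 200.0)]
print(dumps)
```

The pids recovered this way (6742 for the initial hang, 8362 for the recovery hang) are the same ones named in the uploaded lustre-log files.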

Attachments

Issue Links

Activity

[LU-4335] MDS hangs due to mdt thread hung/inactive
            pjones Peter Jones added a comment -

            ok thanks Mahmoud!


            mhanafi Mahmoud Hanafi added a comment -

            Please close. We are running 2.4.3 and have not seen the issue.

            mhanafi Mahmoud Hanafi added a comment -

            We have seen this at least once on nbp7.

            jaylan Jay Lan (Inactive) added a comment -

            If cfs_atomic_inc(&set->set_completes) is supposed to fix the problem, as Liang Zhen commented in LU-4733, the fix is also in 2.4.0.

            Have we seen this problem on nbp7, which runs a 2.4.1 server, Mahmoud?

            liang Liang Zhen (Inactive) added a comment -

            FYI, please check my comment on LU-4733.

            niu Niu Yawei (Inactive) added a comment -

            "twelve processes in spin_lock in lnet code."

            Right, it looks quite similar to the trace in LU-4195. Did you try to check your network as Amir suggested?

            jaylan Jay Lan (Inactive) added a comment -

            twelve processes in spin_lock in lnet code.

            mhanafi Mahmoud Hanafi added a comment -

            We don't have full backtraces for all the tasks at the initial hang, but I am attaching the full backtraces for the hang after recovery.

            attach: bta_afterrecover.gz

            niu Niu Yawei (Inactive) added a comment -

            I didn't see anything wrong in the log files, but these messages in the crash dump are suspicious:

            Nov 27 11:01:27 nbp8-mds1 kernel: LustreError: 5515:0:(o2iblnd_cb.c:2992:kiblnd_check_txs_locked()) Timed out tx: active_txs, 1 seconds
            Nov 27 11:01:27 nbp8-mds1 kernel: LustreError: 5515:0:(o2iblnd_cb.c:3055:kiblnd_check_conns()) Timed out RDMA with 10.151.27.18@o2ib (162): c: 7, oc: 0, rc: 8

            It looks like there is a network problem in your system. Could it be related to LU-4195?
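The kiblnd messages name the unreachable peers directly, so a first triage step is simply counting "Timed out RDMA with &lt;nid&gt;" events per NID and checking those nodes' fabric health. A small sketch of mine, not part of the ticket (sample lines copied from the console log above):

```python
import re
from collections import Counter

# Sample lines copied from the console log in this ticket.
LOG = """\
Nov 27 10:34:25 nbp8-mds1 kernel: LustreError: 5515:0:(o2iblnd_cb.c:3055:kiblnd_check_conns()) Timed out RDMA with 10.151.32.5@o2ib (152): c: 6, oc: 0, rc: 8
Nov 27 11:01:27 nbp8-mds1 kernel: LustreError: 5515:0:(o2iblnd_cb.c:3055:kiblnd_check_conns()) Timed out RDMA with 10.151.27.18@o2ib (162): c: 7, oc: 0, rc: 8
"""

RDMA_TIMEOUT = re.compile(r"Timed out RDMA with (\S+)")

def suspect_peers(log_text):
    """Count RDMA-timeout events per peer NID."""
    return Counter(RDMA_TIMEOUT.findall(log_text))

peers = suspect_peers(LOG)
print(peers)
```

Both timed-out peers here (10.151.32.5 and 10.151.27.18) are the same clients that were evicted with "haven't heard from client" shortly after, which is consistent with a network rather than MDS-side fault.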

            Here is the console log of the same system hanging after recovery. One thing to note: we have turned quotas off on this filesystem.

            # lfs quota -u mhanafi /nobackupp8
            user quotas are not enabled.
            # lfs quota -g css /nobackupp8
            group quotas are not enabled.

            Lustre: 6851:0:(ldlm_lib.c:952:target_handle_connect()) nbp8-MDT0000: connection from 87e9617b-6e16-e700-13c0-c4c0d2508a27@10.151.32.180@o2ib recovering/t181073412670 exp (null) cur 1385588905 last 0
            Lustre: 6851:0:(ldlm_lib.c:952:target_handle_connect()) Skipped 1518 previous similar messages
            Lustre: nbp8-MDT0000: Denying connection for new client 10.151.46.187@o2ib (at 24819010-1767-a13e-301e-056fd24f419a), waiting for 209 clients in recovery for 10:10
            Lustre: Skipped 1516 previous similar messages
            Lustre: nbp8-MDT0000: disconnecting 1643 stale clients
            Lustre: nbp8-MDT0000: sending delayed replies to recovered clients
            Lustre: MDS mdd_obd-nbp8-MDT0000: nbp8-OST003c_UUID now active, resetting orphans
            Lustre: MDS mdd_obd-nbp8-MDT0000: nbp8-OST0043_UUID now active, resetting orphans
            Lustre: Skipped 19 previous similar messages
            LustreError: 7607:0:(quota_ctl.c:330:client_quota_ctl()) ptlrpc_queue_wait failed, rc: -3
            LustreError: 7613:0:(quota_ctl.c:330:client_quota_ctl()) ptlrpc_queue_wait failed, rc: -3
            LustreError: 7607:0:(quota_ctl.c:330:client_quota_ctl()) Skipped 1217 previous similar messages
            LustreError: 7836:0:(quota_ctl.c:330:client_quota_ctl()) ptlrpc_queue_wait failed, rc: -3
            LustreError: 7836:0:(quota_ctl.c:330:client_quota_ctl()) Skipped 19290 previous similar messages
            LustreError: 7622:0:(quota_master.c:1727:qmaster_recovery_main()) mdd_obd-nbp8-MDT0000: qmaster recovery failed for uid 30242 rc:-3)
            LustreError: 7614:0:(quota_master.c:1727:qmaster_recovery_main()) mdd_obd-nbp8-MDT0000: qmaster recovery failed for uid 4127 rc:-3)
            LustreError: 7673:0:(quota_master.c:1727:qmaster_recovery_main()) mdd_obd-nbp8-MDT0000: qmaster recovery failed for uid 30193 rc:-3)
            LustreError: 7673:0:(quota_master.c:1727:qmaster_recovery_main()) Skipped 22 previous similar messages
            LustreError: 7818:0:(quota_ctl.c:330:client_quota_ctl()) ptlrpc_queue_wait failed, rc: -3
            LustreError: 7818:0:(quota_ctl.c:330:client_quota_ctl()) Skipped 43032 previous similar messages
            LustreError: 7902:0:(quota_master.c:1727:qmaster_recovery_main()) mdd_obd-nbp8-MDT0000: qmaster recovery failed for uid 11816 rc:-3)
            LustreError: 7902:0:(quota_master.c:1727:qmaster_recovery_main()) Skipped 143 previous similar messages
            LustreError: 6794:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1385589017, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff883fcadeb000 lock: ffff883f5fec5900/0x8702b298b8f52a74 lrc: 3/1,0 mode: --/PR res: 8885893120/2 bits 0x3 rrc: 11 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 6794 timeout: 0
            LustreError: 6826:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1385589017, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff883fcadeb000 lock: ffff883f8ae696c0/0x8702b298b8f52cb9 lrc: 3/1,0 mode: --/PR res: 8885893120/2 bits 0x3 rrc: 11 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 6826 timeout: 0
            LustreError: 6826:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 1 previous similar message
            LustreError: 6794:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 62 previous similar messages
            LustreError: dumping log to /tmp/lustre-log.1385589318.6876
            LustreError: 6982:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1385589018, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff883fcadeb000 lock: ffff883f77321000/0x8702b298b8f6ba77 lrc: 3/1,0 mode: --/PR res: 77032705/3805586675 bits 0x3 rrc: 913 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 6982 timeout: 0
            LustreError: 6982:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 7 previous similar messages
            LustreError: 7028:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1385589019, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff883fcadeb000 lock: ffff883f76160d80/0x8702b298b8f9b156 lrc: 3/1,0 mode: --/PR res: 77032705/3805586675 bits 0x3 rrc: 913 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 7028 timeout: 0
            LustreError: 7028:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 13 previous similar messages
            LustreError: 6989:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1385589021, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff883fcadeb000 lock: ffff883f610786c0/0x8702b298b8ff4cbf lrc: 3/1,0 mode: --/PR res: 77032705/3805586675 bits 0x3 rrc: 913 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 6989 timeout: 0
            LustreError: 6989:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 20 previous similar messages
            LustreError: 7098:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1385589025, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff883fcadeb000 lock: ffff883fead1c480/0x8702b298b90a8a36 lrc: 3/1,0 mode: --/PR res: 77032705/3805586675 bits 0x3 rrc: 913 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 7098 timeout: 0
            LustreError: 7098:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 12 previous similar messages
            LustreError: 6235:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1385589036, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff883fcadeb000 lock: ffff883f76a1c900/0x8702b298b929da01 lrc: 3/1,0 mode: --/PR res: 77032705/3805586675 bits 0x3 rrc: 913 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 6235 timeout: 0
            LustreError: 6235:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 9 previous similar messages
            LustreError: 6884:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1385589052, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff883fcadeb000 lock: ffff881e3cc1cb40/0x8702b298b94d4e22 lrc: 3/1,0 mode: --/PR res: 77032705/3805586675 bits 0x3 rrc: 915 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 6884 timeout: 0
            LustreError: 6884:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 13 previous similar messages
            LustreError: 6961:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1385589094, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff883fcadeb000 lock: ffff883f81faf000/0x8702b298b94e7b6f lrc: 3/1,0 mode: --/PR res: 77032705/3805586675 bits 0x3 rrc: 931 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 6961 timeout: 0
            LustreError: 6961:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 167 previous similar messages
            Lustre: 8467:0:(ldlm_lib.c:952:target_handle_connect()) nbp8-MDT0000: connection from 29c09c23-d6c1-7349-2728-df8c815d4c8a@10.151.34.98@o2ib t181060241980 exp (null) cur 1385589462 last 0
            Lustre: 8467:0:(ldlm_lib.c:952:target_handle_connect()) Skipped 1490 previous similar messages
            LustreError: 6771:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1385589166, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff883fcadeb000 lock: ffff881e4ac88d80/0x8702b298b94ec86f lrc: 3/1,0 mode: --/PR res: 77032705/3805586675 bits 0x3 rrc: 939 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 6771 timeout: 0
            LustreError: 6771:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 14 previous similar messages
            Lustre: Service thread pid 8362 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
            Pid: 8362, comm: mdt_454

            LustreError: 6400:0:(mdt_open.c:1314:mdt_reint_open()) @@@ OPEN & CREAT not in open replay. req@ffff881e44969400 x1449477118027704/t0(180696561882) o101->78e59f5c-45f5-0184-142c-ed51e4a48000@10.151.50.23@o2ib:0/0 lens 712/4936 e 0 to 0 dl 1385589463 ref 1 fl Interpret:/4/0 rc 0/0
            LustreError: 6400:0:(mdt_open.c:1314:mdt_reint_open()) Skipped 8446 previous similar messages
            Lustre: nbp8-MDT0000: Denying connection for new client 10.151.40.177@o2ib (at bdedff9d-8f09-576d-7f58-9d0094abc313), waiting for 965 clients in recovery for 14:12
            Lustre: Skipped 137 previous similar messages
            Lustre: 6342:0:(ldlm_lib.c:952:target_handle_connect()) nbp8-MDT0000: connection from beb87751-b41b-41c0-bfcb-75b4366a7947@10.153.1.48@o2ib233 recovering/t0 exp ffff883fb3381000 cur 1385588713 last 1385588655
            Lustre: 6342:0:(ldlm_lib.c:952:target_handle_connect()) Skipped 14289 previous similar messages
            LustreError: 6367:0:(mdt_open.c:1314:mdt_reint_open()) @@@ OPEN & CREAT not in open replay. req@ffff881e414c9c00 x1449981538622728/t0(180476772260) o101->816e39dc-d0f1-19d4-c94f-fb65f7237bef@10.151.44.134@o2ib:0/0 lens 712/4936 e 0 to 0 dl 1385589479 ref 1 fl Interpret:/4/0 rc 0/0
            LustreError: 6367:0:(mdt_open.c:1314:mdt_reint_open()) Skipped 15806 previous similar messages
            Lustre: nbp8-MDT0000: Denying connection for new client 10.151.47.216@o2ib (at 101285a6-9cd9-2721-9df7-40da5c311017), waiting for 649 clients in recovery for 13:56
            Lustre: Skipped 159 previous similar messages
            LustreError: 6371:0:(mdt_open.c:1314:mdt_reint_open()) @@@ OPEN & CREAT not in open replay. req@ffff883f60a91800 x1452353705630036/t0(180790013583) o101->89a6c2ec-6b29-6fd2-59c8-8f63a0841cdf@10.151.18.205@o2ib:0/0 lens 744/4936 e 0 to 0 dl 1385589511 ref 1 fl Interpret:/4/0 rc 0/0
            LustreError: 6371:0:(mdt_open.c:1314:mdt_reint_open()) Skipped 22830 previous similar messages
            Lustre: nbp8-MDT0000: Denying connection for new client 10.151.49.135@o2ib (at 36f753ed-36ea-320f-74ee-115d64eccd14), waiting for 1197 clients in recovery for 13:23
            Lustre: Skipped 332 previous similar messages
            Lustre: 7000:0:(ldlm_lib.c:952:target_handle_connect()) nbp8-MDT0000: connection from 101285a6-9cd9-2721-9df7-40da5c311017@10.151.47.216@o2ib recovering/t180859536925 exp (null) cur 1385588777 last 0
            Lustre: 7000:0:(ldlm_lib.c:952:target_handle_connect()) Skipped 10109 previous similar messages
            LustreError: 6316:0:(mdt_open.c:1314:mdt_reint_open()) @@@ OPEN & CREAT not in open replay. req@ffff881e3abeb400 x1452345264923640/t0(180596528255) o101->d30b2de1-2db4-65f9-e7db-5607bcfd9cd0@10.151.19.38@o2ib:0/0 lens 744/4936 e 0 to 0 dl 1385589575 ref 1 fl Interpret:/4/0 rc 0/0
            LustreError: 6316:0:(mdt_open.c:1314:mdt_reint_open()) Skipped 138328 previous similar messages
            Lustre: nbp8-MDT0000: Denying connection for new client 10.151.29.245@o2ib (at 8e68d8e7-cd33-7f68-d453-41814d65f55a), waiting for 784 clients in recovery for 12:18
            Lustre: Skipped 867 previous similar messages
            LustreError: 6771:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 14 previous similar messages
            Lustre: Service thread pid 8362 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
            Pid: 8362, comm: mdt_454

            Call Trace:
            [<ffffffff8151d552>] schedule_timeout+0x192/0x2e0
            [<ffffffff8107bf80>] ? process_timeout+0x0/0x10
            [<ffffffffa0764c60>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc]
            [<ffffffffa04f95e1>] cfs_waitq_timedwait+0x11/0x20 [libcfs]
            [<ffffffffa0768d0d>] ldlm_completion_ast+0x48d/0x720 [ptlrpc]
            [<ffffffff8105fab0>] ? default_wake_function+0x0/0x20
            [<ffffffffa0768506>] ldlm_cli_enqueue_local+0x1e6/0x560 [ptlrpc]
            [<ffffffffa0768880>] ? ldlm_completion_ast+0x0/0x720 [ptlrpc]
            [<ffffffffa0df9e60>] ? mdt_blocking_ast+0x0/0x2a0 [mdt]
            [<ffffffffa0dfd2a0>] mdt_object_lock+0x320/0xb70 [mdt]
            [<ffffffffa0df9e60>] ? mdt_blocking_ast+0x0/0x2a0 [mdt]
            [<ffffffffa0768880>] ? ldlm_completion_ast+0x0/0x720 [ptlrpc]
            [<ffffffffa0e0dc62>] mdt_getattr_name_lock+0xe22/0x1880 [mdt]
            [<ffffffffa078eb1d>] ? lustre_msg_buf+0x5d/0x60 [ptlrpc]
            [<ffffffffa07b8486>] ? __req_capsule_get+0x176/0x750 [ptlrpc]
            [<ffffffffa0790da4>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc]
            [<ffffffffa0e0ec1d>] mdt_intent_getattr+0x2cd/0x4a0 [mdt]
            [<ffffffffa0e0ac09>] mdt_intent_policy+0x379/0x690 [mdt]
            [<ffffffffa074b351>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc]
            [<ffffffffa07711ad>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc]
            [<ffffffffa0e0b586>] mdt_enqueue+0x46/0x130 [mdt]
            [<ffffffffa0e00772>] mdt_handle_common+0x932/0x1750 [mdt]
            [<ffffffffa0e01665>] mdt_regular_handle+0x15/0x20 [mdt]
            [<ffffffffa079fb4e>] ptlrpc_main+0xc4e/0x1a40 [ptlrpc]
            [<ffffffffa079ef00>] ? ptlrpc_main+0x0/0x1a40 [ptlrpc]
            [<ffffffff8100c0ca>] child_rip+0xa/0x20
            [<ffffffffa079ef00>] ? ptlrpc_main+0x0/0x1a40 [ptlrpc]
            [<ffffffffa079ef00>] ? ptlrpc_main+0x0/0x1a40 [ptlrpc]
            [<ffffffff8100c0c0>] ? child_rip+0x0/0x20

            LustreError: dumping log to /tmp/lustre-log.1385589491.8362
            Lustre: Service thread pid 8370 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
            Pid: 8370, comm: mdt_459

            mhanafi Mahmoud Hanafi added a comment -

            Here is the console log of the same system hanging after recovery. One thing to note: we have turned quota off on this filesystem.

            lfs quota -u mhanafi /nobackupp8
            user quotas are not enabled.
            lfs quota -g css /nobackupp8
            group quotas are not enabled.

            LustreError: 6400:0:(mdt_open.c:1314:mdt_reint_open()) @@@ OPEN & CREAT not in open replay. req@ffff881e44969400 x1449477118027704/t0(180696561882) o101->78e59f5c-45f5-0184-142c-ed51e4a48000@10.151.50.23@o2ib:0/0 lens 712/4936 e 0 to 0 dl 1385589463 ref 1 fl Interpret:/4/0 rc 0/0
            LustreError: 6400:0:(mdt_open.c:1314:mdt_reint_open()) Skipped 8446 previous similar messages
            Lustre: nbp8-MDT0000: Denying connection for new client 10.151.40.177@o2ib (at bdedff9d-8f09-576d-7f58-9d0094abc313), waiting for 965 clients in recovery for 14:12
            Lustre: Skipped 137 previous similar messages
            Lustre: 6342:0:(ldlm_lib.c:952:target_handle_connect()) nbp8-MDT0000: connection from beb87751-b41b-41c0-bfcb-75b4366a7947@10.153.1.48@o2ib233 recovering/t0 exp ffff883fb3381000 cur 1385588713 last 1385588655
            Lustre: 6342:0:(ldlm_lib.c:952:target_handle_connect()) Skipped 14289 previous similar messages
            LustreError: 6367:0:(mdt_open.c:1314:mdt_reint_open()) @@@ OPEN & CREAT not in open replay. req@ffff881e414c9c00 x1449981538622728/t0(180476772260) o101->816e39dc-d0f1-19d4-c94f-fb65f7237bef@10.151.44.134@o2ib:0/0 lens 712/4936 e 0 to 0 dl 1385589479 ref 1 fl Interpret:/4/0 rc 0/0
            LustreError: 6367:0:(mdt_open.c:1314:mdt_reint_open()) Skipped 15806 previous similar messages
            Lustre: nbp8-MDT0000: Denying connection for new client 10.151.47.216@o2ib (at 101285a6-9cd9-2721-9df7-40da5c311017), waiting for 649 clients in recovery for 13:56
            Lustre: Skipped 159 previous similar messages
            LustreError: 6371:0:(mdt_open.c:1314:mdt_reint_open()) @@@ OPEN & CREAT not in open replay. req@ffff883f60a91800 x1452353705630036/t0(180790013583) o101->89a6c2ec-6b29-6fd2-59c8-8f63a0841cdf@10.151.18.205@o2ib:0/0 lens 744/4936 e 0 to 0 dl 1385589511 ref 1 fl Interpret:/4/0 rc 0/0
            LustreError: 6371:0:(mdt_open.c:1314:mdt_reint_open()) Skipped 22830 previous similar messages
            Lustre: nbp8-MDT0000: Denying connection for new client 10.151.49.135@o2ib (at 36f753ed-36ea-320f-74ee-115d64eccd14), waiting for 1197 clients in recovery for 13:23
            Lustre: Skipped 332 previous similar messages
            Lustre: 7000:0:(ldlm_lib.c:952:target_handle_connect()) nbp8-MDT0000: connection from 101285a6-9cd9-2721-9df7-40da5c311017@10.151.47.216@o2ib recovering/t180859536925 exp (null) cur 1385588777 last 0
            Lustre: 7000:0:(ldlm_lib.c:952:target_handle_connect()) Skipped 10109 previous similar messages
            LustreError: 6316:0:(mdt_open.c:1314:mdt_reint_open()) @@@ OPEN & CREAT not in open replay. req@ffff881e3abeb400 x1452345264923640/t0(180596528255) o101->d30b2de1-2db4-65f9-e7db-5607bcfd9cd0@10.151.19.38@o2ib:0/0 lens 744/4936 e 0 to 0 dl 1385589575 ref 1 fl Interpret:/4/0 rc 0/0
            LustreError: 6316:0:(mdt_open.c:1314:mdt_reint_open()) Skipped 138328 previous similar messages
            Lustre: nbp8-MDT0000: Denying connection for new client 10.151.29.245@o2ib (at 8e68d8e7-cd33-7f68-d453-41814d65f55a), waiting for 784 clients in recovery for 12:18
            Lustre: Skipped 867 previous similar messages
            Lustre: 6851:0:(ldlm_lib.c:952:target_handle_connect()) nbp8-MDT0000: connection from 87e9617b-6e16-e700-13c0-c4c0d2508a27@10.151.32.180@o2ib recovering/t181073412670 exp (null) cur 1385588905 last 0
            Lustre: 6851:0:(ldlm_lib.c:952:target_handle_connect()) Skipped 1518 previous similar messages
            Lustre: nbp8-MDT0000: Denying connection for new client 10.151.46.187@o2ib (at 24819010-1767-a13e-301e-056fd24f419a), waiting for 209 clients in recovery for 10:10
            Lustre: Skipped 1516 previous similar messages
            Lustre: nbp8-MDT0000: disconnecting 1643 stale clients
            Lustre: nbp8-MDT0000: sending delayed replies to recovered clients
            Lustre: MDS mdd_obd-nbp8-MDT0000: nbp8-OST003c_UUID now active, resetting orphans
            Lustre: MDS mdd_obd-nbp8-MDT0000: nbp8-OST0043_UUID now active, resetting orphans
            Lustre: Skipped 19 previous similar messages
            LustreError: 7607:0:(quota_ctl.c:330:client_quota_ctl()) ptlrpc_queue_wait failed, rc: -3
            LustreError: 7613:0:(quota_ctl.c:330:client_quota_ctl()) ptlrpc_queue_wait failed, rc: -3
            LustreError: 7607:0:(quota_ctl.c:330:client_quota_ctl()) Skipped 1217 previous similar messages
            LustreError: 7836:0:(quota_ctl.c:330:client_quota_ctl()) ptlrpc_queue_wait failed, rc: -3
            LustreError: 7836:0:(quota_ctl.c:330:client_quota_ctl()) Skipped 19290 previous similar messages
            LustreError: 7622:0:(quota_master.c:1727:qmaster_recovery_main()) mdd_obd-nbp8-MDT0000: qmaster recovery failed for uid 30242 rc:-3)
            LustreError: 7614:0:(quota_master.c:1727:qmaster_recovery_main()) mdd_obd-nbp8-MDT0000: qmaster recovery failed for uid 4127 rc:-3)
            LustreError: 7673:0:(quota_master.c:1727:qmaster_recovery_main()) mdd_obd-nbp8-MDT0000: qmaster recovery failed for uid 30193 rc:-3)
            LustreError: 7673:0:(quota_master.c:1727:qmaster_recovery_main()) Skipped 22 previous similar messages
            LustreError: 7818:0:(quota_ctl.c:330:client_quota_ctl()) ptlrpc_queue_wait failed, rc: -3
            LustreError: 7818:0:(quota_ctl.c:330:client_quota_ctl()) Skipped 43032 previous similar messages
            LustreError: 7902:0:(quota_master.c:1727:qmaster_recovery_main()) mdd_obd-nbp8-MDT0000: qmaster recovery failed for uid 11816 rc:-3)
            LustreError: 7902:0:(quota_master.c:1727:qmaster_recovery_main()) Skipped 143 previous similar messages
            LustreError: 6794:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1385589017, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff883fcadeb000 lock: ffff883f5fec5900/0x8702b298b8f52a74 lrc: 3/1,0 mode: --/PR res: 8885893120/2 bits 0x3 rrc: 11 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 6794 timeout: 0
            LustreError: 6826:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1385589017, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-ffff883fcadeb000 lock: ffff883f8ae696c0/0x8702b298b8f52cb9 lrc: 3/1,0 mode: --/PR res: 8885893120/2 bits 0x3 rrc: 11 type: IBT flags: 0x4004000 remote: 0x0 expref: -99 pid: 6826 timeout: 0
            LustreError: 6826:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 1 previous similar message
            LustreError: 6794:0:(ldlm_request.c:91:ldlm_expired_completion_wait()) Skipped 62 previous similar messages
            LustreError: dumping log to /tmp/lustre-log.1385589318.6876

            [The rest of the pasted log duplicates the excerpt above: the ldlm_expired_completion_wait timeouts on res 77032705/3805586675, the stack trace for service thread pid 8362 (mdt_454), the dump to /tmp/lustre-log.1385589491.8362, and the inactive-thread message for pid 8370 (mdt_459).]
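            For triage, the blocked service threads can be grouped by the DLM resource they are waiting on; in the log above, nearly every ldlm_expired_completion_wait message references res 77032705/3805586675 (rrc climbing past 900), with only a few early timeouts on res 8885893120/2. A minimal sketch of a helper script that tallies these from a console log (hypothetical tooling, not part of Lustre):

            ```python
            import re
            from collections import Counter

            # Match the DLM resource identifier in "lock timed out" messages
            # emitted by ldlm_expired_completion_wait(), e.g.
            # "... mode: --/PR res: 77032705/3805586675 bits 0x3 ..."
            LOCK_TIMEOUT_RE = re.compile(
                r"ldlm_expired_completion_wait\(\).*?res: (\d+/\d+)"
            )

            def contended_resources(log_lines):
                """Count how many timed-out lock waits reference each resource."""
                counts = Counter()
                for line in log_lines:
                    m = LOCK_TIMEOUT_RE.search(line)
                    if m:
                        counts[m.group(1)] += 1
                return counts

            if __name__ == "__main__":
                import sys
                # Usage: python tally_ldlm_timeouts.py < console.log
                for res, n in contended_resources(sys.stdin).most_common():
                    print(f"{n:6d}  res: {res}")
            ```

            Run against the full console log (the "Skipped N previous similar messages" lines are not counted, so real totals are higher), a single dominant resource in the output points at one contended object rather than a systemic lock-service failure.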

            People

              Assignee: niu Niu Yawei (Inactive)
              Reporter: mhanafi Mahmoud Hanafi
              Votes: 1
              Watchers: 8
