Aug 20 09:51:09 atlas-oss1c7.ccs.ornl.gov kernel: [3693273.756268] Lustre: atlas1-OST0016: haven't heard from client ea01e44d-f562-e1e9-0228-5b63fad9427b (at 10.36.202.172@o2ib) in 1352 seconds. I think it's dead, and I am evicting it. exp ffff88102d1fec00, cur 1408542669 expire 1408541769 last 1408541317
Aug 20 09:51:09 atlas-oss1c7.ccs.ornl.gov kernel: [3693273.825765] Lustre: Skipped 17 previous similar messages
Aug 20 09:51:09 atlas-oss1c7.ccs.ornl.gov kernel: [3693273.952287] Lustre: atlas1-OST0376: haven't heard from client ea01e44d-f562-e1e9-0228-5b63fad9427b (at 10.36.202.172@o2ib) in 1352 seconds. I think it's dead, and I am evicting it. exp ffff88102c992000, cur 1408542669 expire 1408541769 last 1408541317
Aug 20 09:51:09 atlas-oss1c7.ccs.ornl.gov kernel: [3693274.016013] Lustre: Skipped 1 previous similar message
Aug 20 09:53:39 atlas-oss1c7.ccs.ornl.gov kernel: [3693423.916671] Lustre: atlas1-OST0136: haven't heard from client ea01e44d-f562-e1e9-0228-5b63fad9427b (at 10.36.202.172@o2ib) in 1352 seconds. I think it's dead, and I am evicting it. exp ffff8810306a9400, cur 1408542819 expire 1408541919 last 1408541467
Aug 20 09:53:39 atlas-oss1c7.ccs.ornl.gov kernel: [3693423.982531] Lustre: Skipped 1 previous similar message
Aug 20 09:54:44 atlas-oss1c7.ccs.ornl.gov kernel: [3693488.931359] Lustre: atlas1-OST0256: haven't heard from client ea01e44d-f562-e1e9-0228-5b63fad9427b (at 10.36.202.172@o2ib) in 1417 seconds. I think it's dead, and I am evicting it. exp ffff88102f30e400, cur 1408542884 expire 1408541984 last 1408541467
Aug 20 09:56:32 atlas-oss1c7.ccs.ornl.gov kernel: [3693597.191787] Lustre: atlas1-OST01c6: haven't heard from client ea01e44d-f562-e1e9-0228-5b63fad9427b (at 10.36.202.172@o2ib) in 1525 seconds. I think it's dead, and I am evicting it. exp ffff88102e38d400, cur 1408542992 expire 1408542092 last 1408541467
Aug 20 10:05:50 atlas-oss1c7.ccs.ornl.gov kernel: [3694155.953582] INFO: task ldlm_cn03_002:92611 blocked for more than 120 seconds.
Aug 20 10:05:50 atlas-oss1c7.ccs.ornl.gov kernel: [3694155.970271] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 20 10:05:50 atlas-oss1c7.ccs.ornl.gov kernel: [3694155.999917] ldlm_cn03_002 D 000000000000000c     0 92611      2 0x00000000
Aug 20 10:05:50 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.020143]  ffff8807ce5d3ae0 0000000000000046 0000000000000030 ffff880bbca59430
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.049556]  ffff8807ce5d3a90 ffffffffa05320d7 10ff880700000010 ffffc9005374c000
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.070884]  ffff88081c3485f8 ffff8807ce5d3fd8 000000000000fb88 ffff88081c3485f8
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.099572] Call Trace:
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.101508]  [<ffffffffa05320d7>] ? htable_lookup+0xb7/0x1c0 [obdclass]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.129830]  [<ffffffffa05327af>] ? lu_object_find_at+0x8f/0x360 [obdclass]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.150556]  [<ffffffff8116879c>] ? __kmalloc+0x20c/0x220
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.170514]  [<ffffffff8150dc6e>] __mutex_lock_slowpath+0x13e/0x180
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.190574]  [<ffffffffa0bcfb47>] ? osd_xattr_get+0x97/0x2d0 [osd_ldiskfs]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.219638]  [<ffffffff8150db0b>] mutex_lock+0x2b/0x50
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.230922]  [<ffffffffa0bcb677>] osd_object_sync+0x127/0x190 [osd_ldiskfs]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.260010]  [<ffffffffa0c59c7b>] ofd_sync+0x36b/0x680 [ofd]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.279822]  [<ffffffffa0c32933>] ost_blocking_ast+0x633/0x10f0 [ost]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.300301]  [<ffffffffa06721cc>] ldlm_cancel_callback+0x6c/0x1a0 [ptlrpc]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.320736]  [<ffffffffa067235a>] ldlm_lock_cancel+0x5a/0x1e0 [ptlrpc]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.340978]  [<ffffffffa0695c94>] ldlm_request_cancel+0x254/0x410 [ptlrpc]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.369885]  [<ffffffffa0695f8d>] ldlm_handle_cancel+0x13d/0x240 [ptlrpc]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.390353]  [<ffffffffa0697fc8>] ldlm_cancel_handler+0x3f8/0x600 [ptlrpc]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.410970]  [<ffffffffa06cc568>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.440304]  [<ffffffffa03b85de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.460659]  [<ffffffffa03c9d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.481009]  [<ffffffffa06c38c9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.509747]  [<ffffffff81055cc3>] ? __wake_up+0x53/0x70
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.531038]  [<ffffffffa06cd8fe>] ptlrpc_main+0xace/0x1700 [ptlrpc]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.551266]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.579925]  [<ffffffff8100c0ca>] child_rip+0xa/0x20
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.591067]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.611470]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.639978]  [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.651551] INFO: task ldlm_cn00_003:44913 blocked for more than 120 seconds.
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.680239] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.701415] ldlm_cn00_003 D 0000000000000001     0 44913      2 0x00000000
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.730417]  ffff88004cd25ae0 0000000000000046 0000000000000000 ffff88004cd25aa4
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.752098]  ffff880000000000 ffff88087fe71100 ffff88089c496740 00000000000005fe
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.780982]  ffff8806d42b9058 ffff88004cd25fd8 000000000000fb88 ffff8806d42b9058
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.809864] Call Trace:
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.811939]  [<ffffffff8150dc6e>] __mutex_lock_slowpath+0x13e/0x180
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.839831]  [<ffffffffa0bcfb47>] ? osd_xattr_get+0x97/0x2d0 [osd_ldiskfs]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.860447]  [<ffffffff8150db0b>] mutex_lock+0x2b/0x50
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.880086]  [<ffffffffa0bcb677>] osd_object_sync+0x127/0x190 [osd_ldiskfs]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.900826]  [<ffffffffa0c59c7b>] ofd_sync+0x36b/0x680 [ofd]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.920888]  [<ffffffffa0c32933>] ost_blocking_ast+0x633/0x10f0 [ost]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.941157]  [<ffffffffa06721cc>] ldlm_cancel_callback+0x6c/0x1a0 [ptlrpc]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.970153]  [<ffffffffa067235a>] ldlm_lock_cancel+0x5a/0x1e0 [ptlrpc]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694156.990503]  [<ffffffffa0695c94>] ldlm_request_cancel+0x254/0x410 [ptlrpc]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.011262]  [<ffffffffa0695f8d>] ldlm_handle_cancel+0x13d/0x240 [ptlrpc]
Aug 20 10:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.040306]  [<ffffffffa0697fc8>] ldlm_cancel_handler+0x3f8/0x600 [ptlrpc]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.060760]  [<ffffffffa06cc568>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.090137]  [<ffffffffa03b85de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.110362]  [<ffffffffa03c9d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.130942]  [<ffffffffa06c38c9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.151373]  [<ffffffff81055cc3>] ? __wake_up+0x53/0x70
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.171163]  [<ffffffffa06cd8fe>] ptlrpc_main+0xace/0x1700 [ptlrpc]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.191445]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.211489]  [<ffffffff8100c0ca>] child_rip+0xa/0x20
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.231016]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.251453]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.271484]  [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.291161] INFO: task ldlm_cn03_013:115950 blocked for more than 120 seconds.
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.320403] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.341418] ldlm_cn03_013 D 000000000000000c     0 115950      2 0x00000000
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.370501]  ffff880bc8b5bae0 0000000000000046 0000000000000030 ffff880b47eba030
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.392238]  ffff880bc8b5ba90 ffffffffa05320d7 10ff880b00000010 ffffc9005374c000
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.421035]  ffff880b17301058 ffff880bc8b5bfd8 000000000000fb88 ffff880b17301058
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.442237] Call Trace:
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.451211]  [<ffffffffa05320d7>] ? htable_lookup+0xb7/0x1c0 [obdclass]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.480107]  [<ffffffffa05327af>] ? lu_object_find_at+0x8f/0x360 [obdclass]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.500672]  [<ffffffff8116879c>] ? __kmalloc+0x20c/0x220
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.520292]  [<ffffffff8150dc6e>] __mutex_lock_slowpath+0x13e/0x180
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.540728]  [<ffffffffa0bcfb47>] ? osd_xattr_get+0x97/0x2d0 [osd_ldiskfs]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.561226]  [<ffffffff8150db0b>] mutex_lock+0x2b/0x50
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.580710]  [<ffffffffa0bcb677>] osd_object_sync+0x127/0x190 [osd_ldiskfs]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.601566]  [<ffffffffa0c59c7b>] ofd_sync+0x36b/0x680 [ofd]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.621545]  [<ffffffffa0c32933>] ost_blocking_ast+0x633/0x10f0 [ost]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.650175]  [<ffffffffa06721cc>] ldlm_cancel_callback+0x6c/0x1a0 [ptlrpc]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.670999]  [<ffffffffa067235a>] ldlm_lock_cancel+0x5a/0x1e0 [ptlrpc]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.691193]  [<ffffffffa0695c94>] ldlm_request_cancel+0x254/0x410 [ptlrpc]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.720182]  [<ffffffffa0695f8d>] ldlm_handle_cancel+0x13d/0x240 [ptlrpc]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.740813]  [<ffffffffa0697fc8>] ldlm_cancel_handler+0x3f8/0x600 [ptlrpc]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.761345]  [<ffffffffa06cc568>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.790467]  [<ffffffffa03b85de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.810454]  [<ffffffffa03c9d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.831002]  [<ffffffffa06c38c9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.851377]  [<ffffffff81063b80>] ? default_wake_function+0x0/0x20
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.871559]  [<ffffffffa06cd8fe>] ptlrpc_main+0xace/0x1700 [ptlrpc]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.891701]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.920255]  [<ffffffff8100c0ca>] child_rip+0xa/0x20
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.931540]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.951778]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.980271]  [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694157.991590] INFO: task ldlm_cn02_009:115951 blocked for more than 120 seconds.
Aug 20 10:05:52 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.020890] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.041918] ldlm_cn02_009 D 000000000000000a     0 115951      2 0x00000000
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.070908]  ffff880ea34a5ae0 0000000000000046 0000000000000000 ffff880ea34a5aa4
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.092666]  ffff880e00000000 ffff88087fe72300 ffff88089c4d6740 0000000000000400
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.121286]  ffff880ea34a3af8 ffff880ea34a5fd8 000000000000fb88 ffff880ea34a3af8
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.142836] Call Trace:
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.151674]  [<ffffffff8150dc6e>] __mutex_lock_slowpath+0x13e/0x180
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.180427]  [<ffffffffa0bcfb47>] ? osd_xattr_get+0x97/0x2d0 [osd_ldiskfs]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.200935]  [<ffffffff8150db0b>] mutex_lock+0x2b/0x50
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.220614]  [<ffffffffa0bcb677>] osd_object_sync+0x127/0x190 [osd_ldiskfs]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.241495]  [<ffffffffa0c59c7b>] ofd_sync+0x36b/0x680 [ofd]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.261189]  [<ffffffffa0c32933>] ost_blocking_ast+0x633/0x10f0 [ost]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.281511]  [<ffffffffa06721cc>] ldlm_cancel_callback+0x6c/0x1a0 [ptlrpc]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.310648]  [<ffffffffa067235a>] ldlm_lock_cancel+0x5a/0x1e0 [ptlrpc]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.331034]  [<ffffffffa0695c94>] ldlm_request_cancel+0x254/0x410 [ptlrpc]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.351550]  [<ffffffffa0695f8d>] ldlm_handle_cancel+0x13d/0x240 [ptlrpc]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.380529]  [<ffffffffa0697fc8>] ldlm_cancel_handler+0x3f8/0x600 [ptlrpc]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.401064]  [<ffffffffa06cc568>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.430482]  [<ffffffffa03b85de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.450853]  [<ffffffffa03c9d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.471256]  [<ffffffffa06c38c9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.491744]  [<ffffffff81055cc3>] ? __wake_up+0x53/0x70
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.511476]  [<ffffffffa06cd8fe>] ptlrpc_main+0xace/0x1700 [ptlrpc]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.531813]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.552104]  [<ffffffff8100c0ca>] child_rip+0xa/0x20
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.571643]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.591983]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.612172]  [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.631835] INFO: task ldlm_cn00_009:116811 blocked for more than 120 seconds.
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.660772] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.681860] ldlm_cn00_009 D 0000000000000000     0 116811      2 0x00000000
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.710789]  ffff8800620d5ae0 0000000000000046 0000000000000000 ffff8800620d5aa4
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.732426]  ffff880000000000 ffff88087fe70f00 ffff8800446b6740 0000000000000400
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.761348]  ffff8806e8aac638 ffff8800620d5fd8 000000000000fb88 ffff8806e8aac638
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.782687] Call Trace:
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.791394]  [<ffffffff8150dc6e>] __mutex_lock_slowpath+0x13e/0x180
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.811708]  [<ffffffffa0bcfb47>] ? osd_xattr_get+0x97/0x2d0 [osd_ldiskfs]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.840622]  [<ffffffff8150db0b>] mutex_lock+0x2b/0x50
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.851942]  [<ffffffffa0bcb677>] osd_object_sync+0x127/0x190 [osd_ldiskfs]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.880860]  [<ffffffffa0c59c7b>] ofd_sync+0x36b/0x680 [ofd]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.900837]  [<ffffffffa0c32933>] ost_blocking_ast+0x633/0x10f0 [ost]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.921230]  [<ffffffffa06721cc>] ldlm_cancel_callback+0x6c/0x1a0 [ptlrpc]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.941799]  [<ffffffffa067235a>] ldlm_lock_cancel+0x5a/0x1e0 [ptlrpc]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.970662]  [<ffffffffa0695c94>] ldlm_request_cancel+0x254/0x410 [ptlrpc]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694158.991088]  [<ffffffffa0695f8d>] ldlm_handle_cancel+0x13d/0x240 [ptlrpc]
Aug 20 10:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.011663]  [<ffffffffa0697fc8>] ldlm_cancel_handler+0x3f8/0x600 [ptlrpc]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.040721]  [<ffffffffa06cc568>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.061680]  [<ffffffffa03b85de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.081788]  [<ffffffffa03c9d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.110736]  [<ffffffffa06c38c9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.131297]  [<ffffffff81055cc3>] ? __wake_up+0x53/0x70
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.150951]  [<ffffffffa06cd8fe>] ptlrpc_main+0xace/0x1700 [ptlrpc]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.171314]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.191501]  [<ffffffff8100c0ca>] child_rip+0xa/0x20
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.211003]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.231393]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.251564]  [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.271377] INFO: task ldlm_cn03_015:20661 blocked for more than 120 seconds.
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.292056] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.321523] ldlm_cn03_015 D 000000000000000c     0 20661      2 0x00000000
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.342068]  ffff880052b47ae0 0000000000000046 0000000000000030 ffff880971a19430
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.371585]  ffff880052b47a90 ffffffffa05320d7 10ff880000000010 ffffc9005374c000
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.393240]  ffff880c14d3b058 ffff880052b47fd8 000000000000fb88 ffff880c14d3b058
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.422070] Call Trace:
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.431254]  [<ffffffffa05320d7>] ? htable_lookup+0xb7/0x1c0 [obdclass]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.451779]  [<ffffffffa05327af>] ? lu_object_find_at+0x8f/0x360 [obdclass]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.472423]  [<ffffffff8116879c>] ? __kmalloc+0x20c/0x220
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.492140]  [<ffffffff8150dc6e>] __mutex_lock_slowpath+0x13e/0x180
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.512503]  [<ffffffffa0bcfb47>] ? osd_xattr_get+0x97/0x2d0 [osd_ldiskfs]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.541290]  [<ffffffff8150db0b>] mutex_lock+0x2b/0x50
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.560897]  [<ffffffffa0bcb677>] osd_object_sync+0x127/0x190 [osd_ldiskfs]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.581834]  [<ffffffffa0c59c7b>] ofd_sync+0x36b/0x680 [ofd]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.601769]  [<ffffffffa0c32933>] ost_blocking_ast+0x633/0x10f0 [ost]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.622010]  [<ffffffffa06721cc>] ldlm_cancel_callback+0x6c/0x1a0 [ptlrpc]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.650967]  [<ffffffffa067235a>] ldlm_lock_cancel+0x5a/0x1e0 [ptlrpc]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.671319]  [<ffffffffa0695c94>] ldlm_request_cancel+0x254/0x410 [ptlrpc]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.691898]  [<ffffffffa0695f8d>] ldlm_handle_cancel+0x13d/0x240 [ptlrpc]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.712467]  [<ffffffffa0697fc8>] ldlm_cancel_handler+0x3f8/0x600 [ptlrpc]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.741354]  [<ffffffffa06cc568>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.762398]  [<ffffffffa03b85de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.790997]  [<ffffffffa03c9d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.811790]  [<ffffffffa06c38c9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.841425]  [<ffffffff81055cc3>] ? __wake_up+0x53/0x70
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.861044]  [<ffffffffa06cd8fe>] ptlrpc_main+0xace/0x1700 [ptlrpc]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.881390]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.901620]  [<ffffffff8100c0ca>] child_rip+0xa/0x20
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.921053]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.941258]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.961631]  [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694159.981239] INFO: task ldlm_cn02_029:112717 blocked for more than 120 seconds.
Aug 20 10:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.001980] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.031458] ldlm_cn02_029 D 0000000000000009     0 112717      2 0x00000000
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.052000]  ffff880d41b3fae0 0000000000000046 0000000000000030 ffff880fa65ed830
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.081650]  ffff880d41b3fa90 ffffffffa05320d7 10ff880d00000010 ffffc9005374c000
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.103967]  ffff880ed4196638 ffff880d41b3ffd8 000000000000fb88 ffff880ed4196638
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.131922] Call Trace:
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.141124]  [<ffffffffa05320d7>] ? htable_lookup+0xb7/0x1c0 [obdclass]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.161738]  [<ffffffffa05327af>] ? lu_object_find_at+0x8f/0x360 [obdclass]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.182254]  [<ffffffff8116879c>] ? __kmalloc+0x20c/0x220
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.202045]  [<ffffffff8150dc6e>] __mutex_lock_slowpath+0x13e/0x180
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.222453]  [<ffffffffa0bcfb47>] ? osd_xattr_get+0x97/0x2d0 [osd_ldiskfs]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.251326]  [<ffffffff8150db0b>] mutex_lock+0x2b/0x50
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.262565]  [<ffffffffa0bcb677>] osd_object_sync+0x127/0x190 [osd_ldiskfs]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.291551]  [<ffffffffa0c59c7b>] ofd_sync+0x36b/0x680 [ofd]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.311556]  [<ffffffffa0c32933>] ost_blocking_ast+0x633/0x10f0 [ost]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.331976]  [<ffffffffa06721cc>] ldlm_cancel_callback+0x6c/0x1a0 [ptlrpc]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.352472]  [<ffffffffa067235a>] ldlm_lock_cancel+0x5a/0x1e0 [ptlrpc]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.381221]  [<ffffffffa0695c94>] ldlm_request_cancel+0x254/0x410 [ptlrpc]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.402059]  [<ffffffffa0695f8d>] ldlm_handle_cancel+0x13d/0x240 [ptlrpc]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.422584]  [<ffffffffa0697fc8>] ldlm_cancel_handler+0x3f8/0x600 [ptlrpc]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.451276]  [<ffffffffa06cc568>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.472513]  [<ffffffffa03b85de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.492869]  [<ffffffffa03c9d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.521452]  [<ffffffffa06c38c9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.542017]  [<ffffffff81055cc3>] ? __wake_up+0x53/0x70
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.561662]  [<ffffffffa06cd8fe>] ptlrpc_main+0xace/0x1700 [ptlrpc]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.582048]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.602406]  [<ffffffff8100c0ca>] child_rip+0xa/0x20
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.621673]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.641902]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.662241]  [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.681695] INFO: task ldlm_cn03_025:18502 blocked for more than 120 seconds.
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.702628] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.732123] ldlm_cn03_025 D 000000000000000f     0 18502      2 0x00000000
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.752643]  ffff8809cbf11ae0 0000000000000046 0000000000000030 ffff880caf2d0030
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.782282]  ffff8809cbf11a90 ffffffffa05320d7 10ff880900000010 ffffc9005374c000
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.803660]  ffff8809fd8cb098 ffff8809cbf11fd8 000000000000fb88 ffff8809fd8cb098
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.832548] Call Trace:
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.841987]  [<ffffffffa05320d7>] ? htable_lookup+0xb7/0x1c0 [obdclass]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.862364]  [<ffffffffa05327af>] ? lu_object_find_at+0x8f/0x360 [obdclass]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.882864]  [<ffffffff8116879c>] ? __kmalloc+0x20c/0x220
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.902494]  [<ffffffff8150dc6e>] __mutex_lock_slowpath+0x13e/0x180
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.922696]  [<ffffffffa0bcfb47>] ? osd_xattr_get+0x97/0x2d0 [osd_ldiskfs]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.951449]  [<ffffffff8150db0b>] mutex_lock+0x2b/0x50
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.962866]  [<ffffffffa0bcb677>] osd_object_sync+0x127/0x190 [osd_ldiskfs]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694160.991860]  [<ffffffffa0c59c7b>] ofd_sync+0x36b/0x680 [ofd]
Aug 20 10:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3694161.011681]  [<ffffffffa0c32933>] ost_blocking_ast+0x633/0x10f0 [ost]
Aug 20 10:05:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694161.032034]  [<ffffffffa06721cc>] ldlm_cancel_callback+0x6c/0x1a0 [ptlrpc]
Aug 20 10:05:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694161.052783]  [<ffffffffa067235a>] ldlm_lock_cancel+0x5a/0x1e0 [ptlrpc]
Aug 20 10:05:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694161.072948]  [<ffffffffa0695c94>] ldlm_request_cancel+0x254/0x410 [ptlrpc]
Aug 20 10:05:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694161.102067]  [<ffffffffa0695f8d>] ldlm_handle_cancel+0x13d/0x240 [ptlrpc]
Aug 20 10:05:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694161.122593]  [<ffffffffa0697fc8>] ldlm_cancel_handler+0x3f8/0x600 [ptlrpc]
Aug 20 10:05:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694161.151515]  [<ffffffffa06cc568>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
Aug 20 10:05:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694161.172256]  [<ffffffffa03b85de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Aug 20 10:05:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694161.192632]  [<ffffffffa03c9d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
Aug 20 10:05:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694161.221556]  [<ffffffffa06c38c9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
Aug 20 10:05:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694161.242086]  [<ffffffff81055cc3>] ? __wake_up+0x53/0x70
Aug 20 10:05:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694161.261951]  [<ffffffffa06cd8fe>] ptlrpc_main+0xace/0x1700 [ptlrpc]
Aug 20 10:05:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694161.282137]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694161.302114]  [<ffffffff8100c0ca>] child_rip+0xa/0x20
Aug 20 10:05:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694161.321748]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694161.342124]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:05:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694161.362173]  [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.426863] INFO: task ldlm_cn00_000:92600 blocked for more than 120 seconds.
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.447686] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.477296] ldlm_cn00_000 D 0000000000000002     0 92600      2 0x00000000
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.498085]  ffff8807ce5a1ae0 0000000000000046 0000000000000000 ffff880405355c30
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.527681]  ffff8807ce5a1a90 ffffffffa05320d7 10ff880700000010 ffffc9005374c000
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.549010]  ffff8808397ddab8 ffff8807ce5a1fd8 000000000000fb88 ffff8808397ddab8
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.577761] Call Trace:
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.579683]  [<ffffffffa05320d7>] ? htable_lookup+0xb7/0x1c0 [obdclass]
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.607561]  [<ffffffff8150dc6e>] __mutex_lock_slowpath+0x13e/0x180
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.627960]  [<ffffffffa0bcfb47>] ? osd_xattr_get+0x97/0x2d0 [osd_ldiskfs]
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.648623]  [<ffffffff8150db0b>] mutex_lock+0x2b/0x50
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.667966]  [<ffffffffa0bcb677>] osd_object_sync+0x127/0x190 [osd_ldiskfs]
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.688671]  [<ffffffffa0c59c7b>] ofd_sync+0x36b/0x680 [ofd]
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.708469]  [<ffffffffa0c32933>] ost_blocking_ast+0x633/0x10f0 [ost]
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.728688]  [<ffffffffa06721cc>] ldlm_cancel_callback+0x6c/0x1a0 [ptlrpc]
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.757805]  [<ffffffffa067235a>] ldlm_lock_cancel+0x5a/0x1e0 [ptlrpc]
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.778179]  [<ffffffffa0695c94>] ldlm_request_cancel+0x254/0x410 [ptlrpc]
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.798367]  [<ffffffffa0695f8d>] ldlm_handle_cancel+0x13d/0x240 [ptlrpc]
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.827257]  [<ffffffffa0697fc8>] ldlm_cancel_handler+0x3f8/0x600 [ptlrpc]
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.848033]  [<ffffffffa06cc568>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.877628]  [<ffffffffa03b85de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.897845]  [<ffffffffa03c9d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.918416]  [<ffffffffa06c38c9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.947303]  [<ffffffff81055cc3>] ? __wake_up+0x53/0x70
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.958726]  [<ffffffffa06cd8fe>] ptlrpc_main+0xace/0x1700 [ptlrpc]
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694281.979008]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.007322]  [<ffffffff8100c0ca>] child_rip+0xa/0x20
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.018468]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.038621]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.058668]  [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Aug 20 10:07:56 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.078452] INFO: task ldlm_cn00_001:92601 blocked for more than 120 seconds.
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.107554] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.128787] ldlm_cn00_001 D 0000000000000001     0 92601      2 0x00000000
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.157745]  ffff8807ce5a7ae0 0000000000000046 0000000000000000 ffff880566783430
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.179578]  ffff8807ce5a7a90 ffffffffa05320d7 10ff880700000010 ffffc9005374c000
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.208348]  ffff88081b4e8638 ffff8807ce5a7fd8 000000000000fb88 ffff88081b4e8638
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.229767] Call Trace:
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.238748]  [<ffffffffa05320d7>] ? htable_lookup+0xb7/0x1c0 [obdclass]
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.267402]  [<ffffffff8150dc6e>] __mutex_lock_slowpath+0x13e/0x180
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.287587]  [<ffffffffa0bcfb47>] ? osd_xattr_get+0x97/0x2d0 [osd_ldiskfs]
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.308198]  [<ffffffff8150db0b>] mutex_lock+0x2b/0x50
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.327883]  [<ffffffffa0bcb677>] osd_object_sync+0x127/0x190 [osd_ldiskfs]
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.348397]  [<ffffffffa0c59c7b>] ofd_sync+0x36b/0x680 [ofd]
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.368375]  [<ffffffffa0c32933>] ost_blocking_ast+0x633/0x10f0 [ost]
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.388418]  [<ffffffffa06721cc>] ldlm_cancel_callback+0x6c/0x1a0 [ptlrpc]
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.417483]  [<ffffffffa067235a>] ldlm_lock_cancel+0x5a/0x1e0 [ptlrpc]
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.437894]  [<ffffffffa0695c94>] ldlm_request_cancel+0x254/0x410 [ptlrpc]
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.458614]  [<ffffffffa0695f8d>] ldlm_handle_cancel+0x13d/0x240 [ptlrpc]
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.487508]  [<ffffffffa0697fc8>] ldlm_cancel_handler+0x3f8/0x600 [ptlrpc]
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.508286]  [<ffffffffa06cc568>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.537537]  [<ffffffffa03b85de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.557709]  [<ffffffffa03c9d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.578537]  [<ffffffffa06c38c9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.599046]  [<ffffffff81055cc3>] ? __wake_up+0x53/0x70
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.618727]  [<ffffffffa06cd8fe>] ptlrpc_main+0xace/0x1700 [ptlrpc]
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.638709]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.659200]  [<ffffffff8100c0ca>] child_rip+0xa/0x20
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.678775]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.698914]  [<ffffffffa06cce30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 10:07:57 atlas-oss1c7.ccs.ornl.gov kernel: [3694282.719199]  [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Aug 20 10:08:32 atlas-oss1c7.ccs.ornl.gov kernel: [3694317.118084] Lustre: atlas1-OST0376: haven't heard from client 74e30182-0842-d18e-1278-4ba71d9d829b (at 10.38.146.46@o2ib4) in 1352 seconds. I think it's dead, and I am evicting it. exp ffff880468fd2000, cur 1408543712 expire 1408542812 last 1408542360
Aug 20 10:11:02 atlas-oss1c7.ccs.ornl.gov kernel: [3694467.252297] Lustre: atlas1-OST0016: haven't heard from client 74e30182-0842-d18e-1278-4ba71d9d829b (at 10.38.146.46@o2ib4) in 1352 seconds. I think it's dead, and I am evicting it. exp ffff8804ecc31400, cur 1408543862 expire 1408542962 last 1408542510
Aug 20 10:11:02 atlas-oss1c7.ccs.ornl.gov kernel: [3694467.318566] Lustre: Skipped 1 previous similar message
Aug 20 10:15:21 atlas-oss1c7.ccs.ornl.gov kernel: [3694726.482545] Lustre: 115943:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-49), not sending early reply
Aug 20 10:15:21 atlas-oss1c7.ccs.ornl.gov kernel: [3694726.482547]   req@ffff880f7636dc00 x1476723491220900/t0(0) o103->2e7e88a4-8918-d2b2-5d24-75c6e7a4f623@11829@gni107:0/0 lens 296/224 e 1 to 0 dl 1408544126 ref 2 fl Interpret:H/0/0 rc 0/0
Aug 20 10:15:21 atlas-oss1c7.ccs.ornl.gov kernel: [3694726.566904] Lustre: 115943:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 7 previous similar messages
Aug 20 10:15:30 atlas-oss1c7.ccs.ornl.gov kernel: [3694736.023915] Lustre: 18503:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (654:4s); client may timeout.  req@ffff880f7636dc00 x1476723491220900/t0(0) o103->2e7e88a4-8918-d2b2-5d24-75c6e7a4f623@11829@gni107:0/0 lens 296/192 e 1 to 0 dl 1408544126 ref 1 fl Complete:H/0/0 rc 0/0
Aug 20 10:15:39 atlas-oss1c7.ccs.ornl.gov kernel: [3694744.489347] Lustre: 115951:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-49), not sending early reply
Aug 20 10:15:39 atlas-oss1c7.ccs.ornl.gov kernel: [3694744.489349]   req@ffff880c87b5f800 x1476723489115913/t0(0) o103->e0554721-49ad-ba80-ad83-142f74fdafc9@12513@gni105:0/0 lens 296/224 e 1 to 0 dl 1408544144 ref 2 fl Interpret:H/0/0 rc 0/0
Aug 20 10:15:39 atlas-oss1c7.ccs.ornl.gov kernel: [3694744.583651] Lustre: 115951:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 11 previous similar messages
Aug 20 10:16:00 atlas-oss1c7.ccs.ornl.gov kernel: [3694765.497247] Lustre: 115951:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-49), not sending early reply
Aug 20 10:16:00 atlas-oss1c7.ccs.ornl.gov kernel: [3694765.497249]   req@ffff88057cde3000 x1476723491228245/t0(0) o103->199057bb-e9b7-c9af-9248-a01cd0ea581f@10405@gni108:0/0 lens 296/224 e 1 to 0 dl 1408544165 ref 2 fl Interpret:H/0/0 rc 0/0
Aug 20 10:16:00 atlas-oss1c7.ccs.ornl.gov kernel: [3694765.581819] Lustre: 115951:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 6 previous similar messages
Aug 20 10:16:01 atlas-oss1c7.ccs.ornl.gov kernel: [3694766.304504] Lustre: 115947:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (654:35s); client may timeout.  req@ffff880ee7a8fc00 x1476723485995994/t0(0) o103->a1eb48ae-b4e7-6343-d282-a2f18b2521d4@10800@gni102:0/0 lens 296/192 e 1 to 0 dl 1408544126 ref 1 fl Complete:H/0/0 rc 0/0
Aug 20 10:16:05 atlas-oss1c7.ccs.ornl.gov kernel: [3694770.499130] Lustre: 115951:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-49), not sending early reply
Aug 20 10:16:05 atlas-oss1c7.ccs.ornl.gov kernel: [3694770.499132]   req@ffff881048bb5c00 x1476723485988098/t0(0) o103->9422134d-31f5-b92e-4afd-0a209cf39966@10900@gni105:0/0 lens 296/224 e 1 to 0 dl 1408544170 ref 2 fl Interpret:H/0/0 rc 0/0
Aug 20 10:16:05 atlas-oss1c7.ccs.ornl.gov kernel: [3694770.583597] Lustre: 115951:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 3 previous similar messages
Aug 20 10:16:09 atlas-oss1c7.ccs.ornl.gov kernel: [3694774.500628] Lustre: 115951:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-49), not sending early reply
Aug 20 10:16:09 atlas-oss1c7.ccs.ornl.gov kernel: [3694774.500630]   req@ffff880b2483f800 x1476723492259905/t0(0) o103->f0c6f14f-824a-5259-12b6-3f3f627e8193@10471@gni105:0/0 lens 296/224 e 1 to 0 dl 1408544174 ref 2 fl Interpret:H/0/0 rc 0/0
Aug 20 10:16:09 atlas-oss1c7.ccs.ornl.gov kernel: [3694774.585451] Lustre: 115951:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
Aug 20 10:16:09 atlas-oss1c7.ccs.ornl.gov kernel: [3694774.980093] Lustre: 92609:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (654:43s); client may timeout.  req@ffff880d976e2850 x1476723474477937/t0(0) o103->feb6b0c5-49cc-688b-8b26-6f4c8e7adaa3@10774@gni102:0/0 lens 296/192 e 1 to 0 dl 1408544126 ref 1 fl Complete:H/0/0 rc 0/0
Aug 20 10:16:09 atlas-oss1c7.ccs.ornl.gov kernel: [3694775.065100] Lustre: 92609:0:(service.c:2031:ptlrpc_server_handle_request()) Skipped 3 previous similar messages
Aug 20 10:16:22 atlas-oss1c7.ccs.ornl.gov kernel: [3694787.505539] Lustre: 115951:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-49), not sending early reply
Aug 20 10:16:22 atlas-oss1c7.ccs.ornl.gov kernel: [3694787.505541]   req@ffff8810727b3850 x1476723480891652/t0(0) o103->36dd124e-2d6b-c85e-462f-a01d8b8f6826@11049@gni105:0/0 lens 296/224 e 1 to 0 dl 1408544187 ref 2 fl Interpret:H/0/0 rc 0/0
Aug 20 10:16:22 atlas-oss1c7.ccs.ornl.gov kernel: [3694787.589869] Lustre: 115951:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 6 previous similar messages
Aug 20 10:16:35 atlas-oss1c7.ccs.ornl.gov kernel: [3694800.967398] Lustre: 53438:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (654:69s); client may timeout.  req@ffff8806c63dd000 x1476723486020324/t0(0) o103->acb2c1a2-843b-705b-8e3f-f0a2edb771e8@14468@gni110:0/0 lens 296/192 e 1 to 0 dl 1408544126 ref 1 fl Complete:H/0/0 rc 0/0
Aug 20 10:16:59 atlas-oss1c7.ccs.ornl.gov kernel: [3694824.519542] Lustre: 53438:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-112), not sending early reply
Aug 20 10:16:59 atlas-oss1c7.ccs.ornl.gov kernel: [3694824.519543]   req@ffff88093cd1c400 x1476723480747784/t0(0) o103->e4c2f9eb-ef82-cc67-e8cc-ec4c4a035f4f@10795@gni102:0/0 lens 296/224 e 1 to 0 dl 1408544224 ref 2 fl Interpret:H/0/0 rc 0/0
Aug 20 10:16:59 atlas-oss1c7.ccs.ornl.gov kernel: [3694824.604104] Lustre: 53438:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 7 previous similar messages
Aug 20 10:17:01 atlas-oss1c7.ccs.ornl.gov kernel: [3694826.957740] Lustre: 19185:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (654:95s); client may timeout.  req@ffff880378203000 x1476723484980087/t0(0) o103->39217ebc-658c-26f3-383f-259fa3e04955@16160@gni101:0/0 lens 296/192 e 1 to 0 dl 1408544126 ref 1 fl Complete:H/0/0 rc 0/0
Aug 20 10:17:01 atlas-oss1c7.ccs.ornl.gov kernel: [3694827.044614] Lustre: 19185:0:(service.c:2031:ptlrpc_server_handle_request()) Skipped 1 previous similar message
Aug 20 10:17:14 atlas-oss1c7.ccs.ornl.gov kernel: [3694839.915283] Lustre: 20676:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (654:108s); client may timeout.  req@ffff8804375dfc00 x1476723485791950/t0(0) o103->cf98f657-74ec-673d-46d5-91aeb7b2d3eb@14460@gni110:0/0 lens 296/192 e 1 to 0 dl 1408544126 ref 1 fl Complete:H/0/0 rc 0/0
Aug 20 10:17:14 atlas-oss1c7.ccs.ornl.gov kernel: [3694839.999910] Lustre: 20676:0:(service.c:2031:ptlrpc_server_handle_request()) Skipped 1 previous similar message
Aug 20 10:17:23 atlas-oss1c7.ccs.ornl.gov kernel: [3694848.590783] Lustre: 115942:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (654:117s); client may timeout.  req@ffff880421d7ec00 x1476723486014024/t0(0) o103->a1076393-c29b-89f2-98b7-75cdfdec2792@16252@gni110:0/0 lens 296/192 e 1 to 0 dl 1408544126 ref 1 fl Complete:H/0/0 rc 0/0
Aug 20 10:17:23 atlas-oss1c7.ccs.ornl.gov kernel: [3694848.682776] Lustre: 115942:0:(service.c:2031:ptlrpc_server_handle_request()) Skipped 6 previous similar messages
Aug 20 10:17:25 atlas-oss1c7.ccs.ornl.gov kernel: [3694850.554304] Lustre: 18526:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-112), not sending early reply
Aug 20 10:17:25 atlas-oss1c7.ccs.ornl.gov kernel: [3694850.554306]   req@ffff880fb4dd0000 x1476723491231174/t0(0) o103->e4a83e55-51f4-1c2b-2ef1-ef71e7fc82ed@12132@gni105:0/0 lens 296/224 e 1 to 0 dl 1408544250 ref 2 fl Interpret:H/0/0 rc 0/0
Aug 20 10:17:25 atlas-oss1c7.ccs.ornl.gov kernel: [3694850.643702] Lustre: 18526:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 10 previous similar messages
Aug 20 10:17:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694878.892823] Lustre: 53434:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (717:49s); client may timeout.  req@ffff8808b1865800 x1476723476551780/t0(0) o103->b656f95a-715c-3eb4-a194-5c7e3fd87ef4@10744@gni102:0/0 lens 296/192 e 1 to 0 dl 1408544224 ref 1 fl Complete:H/0/0 rc 0/0
Aug 20 10:17:53 atlas-oss1c7.ccs.ornl.gov kernel: [3694878.984074] Lustre: 53434:0:(service.c:2031:ptlrpc_server_handle_request()) Skipped 5 previous similar messages
Aug 20 10:18:02 atlas-oss1c7.ccs.ornl.gov kernel: [3694887.616325] Lustre: 115957:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-175), not sending early reply
Aug 20 10:18:02 atlas-oss1c7.ccs.ornl.gov kernel: [3694887.616327]   req@ffff88096ec57c00 x1476723487020226/t0(0) o103->f8cd4d27-2f30-b41f-7597-389d0ba90985@10958@gni105:0/0 lens 296/224 e 1 to 0 dl 1408544287 ref 2 fl Interpret:H/0/0 rc 0/0
Aug 20 10:18:02 atlas-oss1c7.ccs.ornl.gov kernel: [3694887.708301] Lustre: 115957:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 38 previous similar messages
Aug 20 10:18:32 atlas-oss1c7.ccs.ornl.gov kernel: [3694917.865847] Lustre: 115189:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (780:25s); client may timeout.  req@ffff88096ec57c00 x1476723487020226/t0(0) o103->f8cd4d27-2f30-b41f-7597-389d0ba90985@10958@gni105:0/0 lens 296/192 e 1 to 0 dl 1408544287 ref 1 fl Complete:H/0/0 rc 0/0
Aug 20 10:19:26 atlas-oss1c7.ccs.ornl.gov kernel: [3694971.599917] Lustre: 20659:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-246), not sending early reply
Aug 20 10:19:26 atlas-oss1c7.ccs.ornl.gov kernel: [3694971.599919]   req@ffff8801e8418850 x1476723475486354/t0(0) o103->21e51972-d29a-50ee-e1f4-6b07a1ec958c@10946@gni105:0/0 lens 296/224 e 1 to 0 dl 1408544371 ref 2 fl Interpret:H/0/0 rc 0/0
Aug 20 10:19:26 atlas-oss1c7.ccs.ornl.gov kernel: [3694971.689808] Lustre: 20659:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 6 previous similar messages
Aug 20 10:19:54 atlas-oss1c7.ccs.ornl.gov kernel: [3695000.140454] Lustre: 20668:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (717:166s); client may timeout.  req@ffff88010785dc00 x1476723481399590/t0(0) o103->7fdfae72-3a35-a90a-96a1-dd38833fb6c0@14758@gni110:0/0 lens 296/192 e 1 to 0 dl 1408544228 ref 1 fl Complete:H/0/0 rc 0/0
Aug 20 10:19:54 atlas-oss1c7.ccs.ornl.gov kernel: [3695000.230115] Lustre: 20668:0:(service.c:2031:ptlrpc_server_handle_request()) Skipped 8 previous similar messages
Aug 20 10:21:54 atlas-oss1c7.ccs.ornl.gov kernel: [3695119.887369] Lustre: atlas1-OST0136: Client 6baaf951-4332-ac39-2e19-216ebed3cd8c (at 12555@gni105) reconnecting
Aug 20 10:21:54 atlas-oss1c7.ccs.ornl.gov kernel: [3695119.887473] Lustre: atlas1-OST0136: Client 832ebc25-1539-51c6-04ec-dcfd0b6306e5 (at 12366@gni101) refused reconnection, still busy with 1 active RPCs
Aug 20 10:21:54 atlas-oss1c7.ccs.ornl.gov kernel: [3695119.956642] Lustre: Skipped 11 previous similar messages
Aug 20 10:22:04 atlas-oss1c7.ccs.ornl.gov kernel: [3695130.048289] Lustre: 19184:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (654:350s); client may timeout.  req@ffff880906114850 x1476723480741616/t0(0) o103->ed0e3251-1da8-c73d-684d-f72a58797433@12272@gni101:0/0 lens 296/192 e 1 to 0 dl 1408544174 ref 1 fl Complete:H/0/0 rc 0/0
Aug 20 10:22:04 atlas-oss1c7.ccs.ornl.gov kernel: [3695130.139700] Lustre: 19184:0:(service.c:2031:ptlrpc_server_handle_request()) Skipped 28 previous similar messages
Aug 20 10:22:07 atlas-oss1c7.ccs.ornl.gov kernel: [3695132.944025] Lustre: atlas1-OST0136: Client 9234a9df-571c-5e31-5dc2-78661e21775d (at 16008@gni110) reconnecting
Aug 20 10:22:07 atlas-oss1c7.ccs.ornl.gov kernel: [3695132.951490] Lustre: atlas1-OST0136: Client b9707f9a-fa10-749a-9b34-3990b4b47942 (at 16152@gni101) refused reconnection, still busy with 1 active RPCs
Aug 20 10:22:07 atlas-oss1c7.ccs.ornl.gov kernel: [3695132.951492] Lustre: Skipped 4 previous similar messages
Aug 20 10:22:07 atlas-oss1c7.ccs.ornl.gov kernel: [3695133.031123] Lustre: Skipped 8 previous similar messages
Aug 20 10:29:26 atlas-oss1c7.ccs.ornl.gov kernel: [3695572.427327] LustreError: 0:0:(ldlm_lockd.c:402:waiting_locks_callback()) ### lock callback timer expired after 416s: evicting client at 10866@gni102  ns: filter-atlas1-OST0136_UUID lock: ffff880f8d8bc000/0x33255120e81df64b lrc: 3/0,0 mode: PW/PW res: [0x5ab82d:0x0:0x0].0 rrc: 157 type: EXT [100663296->101801983] (req 100663296->101019647) flags: 0x20 nid: 10866@gni102 remote: 0x581cb06f5b3bde2a expref: 4 pid: 38378 timeout: 7988844139 lvb_type: 0
Aug 20 10:31:05 atlas-oss1c7.ccs.ornl.gov kernel: [3695671.092787] Lustre: atlas1-OST0136: Client 6baaf951-4332-ac39-2e19-216ebed3cd8c (at 12555@gni105) reconnecting
Aug 20 10:31:05 atlas-oss1c7.ccs.ornl.gov kernel: [3695671.124314] Lustre: Skipped 3 previous similar messages
Aug 20 10:31:18 atlas-oss1c7.ccs.ornl.gov kernel: [3695684.148333] Lustre: atlas1-OST0136: Client 9234a9df-571c-5e31-5dc2-78661e21775d (at 16008@gni110) reconnecting
Aug 20 10:31:18 atlas-oss1c7.ccs.ornl.gov kernel: [3695684.179690] Lustre: Skipped 8 previous similar messages
Aug 20 14:13:02 atlas-oss1c7.ccs.ornl.gov kernel: [3708992.657908] Lustre: atlas1-OST0376: haven't heard from client 0d6f96a0-f45f-71f0-3f92-0a975b76a439 (at 10.38.146.46@o2ib4) in 1352 seconds. I think it's dead, and I am evicting it. exp ffff8802ca888c00, cur 1408558382 expire 1408557482 last 1408557030
Aug 20 14:13:02 atlas-oss1c7.ccs.ornl.gov kernel: [3708992.726604] Lustre: Skipped 11 previous similar messages
Aug 20 14:13:40 atlas-oss1c7.ccs.ornl.gov kernel: [3709030.887867] Lustre: atlas1-OST0016: haven't heard from client 4f610aa6-885a-3c97-287b-bdcd82a41557 (at 10.38.146.45@o2ib4) in 1352 seconds. I think it's dead, and I am evicting it. exp ffff88102fce0c00, cur 1408558420 expire 1408557520 last 1408557068
Aug 20 14:13:40 atlas-oss1c7.ccs.ornl.gov kernel: [3709030.960338] Lustre: Skipped 1 previous similar message
Aug 20 15:20:58 atlas-oss1c7.ccs.ornl.gov kernel: [3713070.229412] Lustre: atlas1-OST0256: haven't heard from client b86cb9b2-cd3d-0a2d-8a9b-783ebde6fc41 (at 54@gni2) in 1352 seconds. I think it's dead, and I am evicting it. exp ffff8803513dd400, cur 1408562458 expire 1408561558 last 1408561106
Aug 20 15:20:58 atlas-oss1c7.ccs.ornl.gov kernel: [3713070.292861] Lustre: Skipped 11 previous similar messages
Aug 20 15:20:58 atlas-oss1c7.ccs.ornl.gov kernel: [3713070.312568] Lustre: atlas1-OST0256: haven't heard from client 837d8fc8-359f-d7c8-f412-8fcacb9eff81 (at 55@gni2) in 1352 seconds. I think it's dead, and I am evicting it. exp ffff880da299e400, cur 1408562458 expire 1408561558 last 1408561106
Aug 20 15:21:06 atlas-oss1c7.ccs.ornl.gov kernel: [3713078.410107] Lustre: atlas1-OST0376: haven't heard from client 3896816d-8b60-48b1-3fc1-b0ee22fb1d93 (at 3@gni2) in 1352 seconds. I think it's dead, and I am evicting it. exp ffff88102e659c00, cur 1408562466 expire 1408561558 last 1408561114
Aug 20 15:21:06 atlas-oss1c7.ccs.ornl.gov kernel: [3713078.475023] Lustre: Skipped 94 previous similar messages
Aug 20 15:21:10 atlas-oss1c7.ccs.ornl.gov kernel: [3713082.740005] Lustre: atlas1-OST0376: haven't heard from client d19a8020-c889-0005-b3d1-86d657c0c195 (at 6@gni2) in 1356 seconds. I think it's dead, and I am evicting it. exp ffff880b461d4c00, cur 1408562470 expire 1408561558 last 1408561114
Aug 20 15:21:19 atlas-oss1c7.ccs.ornl.gov kernel: [3713091.404544] Lustre: atlas1-OST0136: haven't heard from client 3896816d-8b60-48b1-3fc1-b0ee22fb1d93 (at 3@gni2) in 1365 seconds. I think it's dead, and I am evicting it. exp ffff880a006db000, cur 1408562479 expire 1408561570 last 1408561114
Aug 20 15:21:19 atlas-oss1c7.ccs.ornl.gov kernel: [3713091.471758] Lustre: Skipped 244 previous similar messages
Aug 20 15:21:28 atlas-oss1c7.ccs.ornl.gov kernel: [3713100.102677] Lustre: atlas1-OST0136: haven't heard from client d19a8020-c889-0005-b3d1-86d657c0c195 (at 6@gni2) in 1373 seconds. I think it's dead, and I am evicting it. exp ffff880da299ec00, cur 1408562487 expire 1408561570 last 1408561114
Aug 20 16:19:35 atlas-oss1c7.ccs.ornl.gov kernel: [3716588.655139] Lustre: atlas1-OST0256: haven't heard from client 27fbf3bd-4fcf-447a-57bb-8c0fe43451bd (at 10.38.146.27@o2ib4) in 1352 seconds. I think it's dead, and I am evicting it. exp ffff880a8f5bc400, cur 1408565975 expire 1408565075 last 1408564623
Aug 20 16:19:35 atlas-oss1c7.ccs.ornl.gov kernel: [3716588.727113] Lustre: Skipped 230 previous similar messages
Aug 20 16:20:10 atlas-oss1c7.ccs.ornl.gov kernel: [3716624.014835] Lustre: atlas1-OST0376: haven't heard from client 2b9e364b-fbf7-2e3e-c4f6-bd46f76ac497 (at 10.38.146.48@o2ib4) in 1352 seconds. I think it's dead, and I am evicting it. exp ffff880ed3d5fc00, cur 1408566010 expire 1408565110 last 1408564658
Aug 20 16:20:10 atlas-oss1c7.ccs.ornl.gov kernel: [3716624.080322] Lustre: Skipped 17 previous similar messages
Aug 20 16:22:05 atlas-oss1c7.ccs.ornl.gov kernel: [3716738.636813] Lustre: atlas1-OST0136: haven't heard from client 27fbf3bd-4fcf-447a-57bb-8c0fe43451bd (at 10.38.146.27@o2ib4) in 1352 seconds. I think it's dead, and I am evicting it. exp ffff880ee423cc00, cur 1408566125 expire 1408565225 last 1408564773
Aug 20 16:22:05 atlas-oss1c7.ccs.ornl.gov kernel: [3716738.703949] Lustre: Skipped 8 previous similar messages
Aug 20 16:40:34 atlas-oss1c7.ccs.ornl.gov kernel: [3717848.820606] LustreError: 7269:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) ### requested timeout 700, more than at_max 600
Aug 20 16:40:34 atlas-oss1c7.ccs.ornl.gov kernel: [3717848.820608]  ns: filter-atlas1-OST0256_UUID lock: ffff880edfbd5d80/0x33255120e82bff0b lrc: 4/0,0 mode: PW/PW res: [0x595bf8:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 4096->8191) flags: 0x20 nid: 10.38.144.36@o2ib4 remote: 0x4b1a2a2527a6acf8 expref: 400 pid: 38602 timeout: 8011487894 lvb_type: 0
Aug 20 16:40:34 atlas-oss1c7.ccs.ornl.gov kernel: [3717848.944049] LustreError: 7269:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) Skipped 5 previous similar messages
Aug 20 16:40:38 atlas-oss1c7.ccs.ornl.gov kernel: [3717852.381659] LustreError: 52002:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) ### requested timeout 713, more than at_max 600
Aug 20 16:40:38 atlas-oss1c7.ccs.ornl.gov kernel: [3717852.381661]  ns: filter-atlas1-OST0256_UUID lock: ffff880edfbd5d80/0x33255120e82bff0b lrc: 4/0,0 mode: PW/PW res: [0x595bf8:0x0:0x0].0 rrc: 1 type: EXT [0->18446744073709551615] (req 4096->8191) flags: 0x20 nid: 10.38.144.36@o2ib4 remote: 0x4b1a2a2527a6acf8 expref: 401 pid: 38602 timeout: 8011813154 lvb_type: 0
Aug 20 16:40:38 atlas-oss1c7.ccs.ornl.gov kernel: [3717852.505599] LustreError: 52002:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) Skipped 1 previous similar message
Aug 20 16:42:10 atlas-oss1c7.ccs.ornl.gov kernel: [3717943.943339] LustreError: 77234:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) ### requested timeout 755, more than at_max 600
Aug 20 16:42:10 atlas-oss1c7.ccs.ornl.gov kernel: [3717943.943341]  ns: filter-atlas1-OST00a6_UUID lock: ffff880993801240/0x33255120e82c079a lrc: 4/0,0 mode: PW/PW res: [0x60eadb:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->4095) flags: 0x20 nid: 1173@gni109 remote: 0xd44bb150abef1991 expref: 13 pid: 38664 timeout: 8011583080 lvb_type: 0
Aug 20 16:42:10 atlas-oss1c7.ccs.ornl.gov kernel: [3717944.061459] LustreError: 38664:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) ### Adding a lock, but the front position is scheduled in 755 seconds
Aug 20 16:42:10 atlas-oss1c7.ccs.ornl.gov kernel: [3717944.061461]  ns: filter-atlas1-OST00a6_UUID lock: ffff880993801240/0x33255120e82c079a lrc: 5/0,0 mode: PW/PW res: [0x60eadb:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->4095) flags: 0x20 nid: 1173@gni109 remote: 0xd44bb150abef1991 expref: 14 pid: 38664 timeout: 8011963204 lvb_type: 0
Aug 20 16:42:10 atlas-oss1c7.ccs.ornl.gov kernel: [3717944.200193] LustreError: 38664:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) Skipped 4 previous similar messages
Aug 20 16:42:20 atlas-oss1c7.ccs.ornl.gov kernel: [3717954.776049] LustreError: 100384:0:(service.c:3216:ptlrpc_svcpt_health_check()) ost_io: unhealthy - request has been waiting 626s
Aug 20 16:42:20 atlas-oss1c7.ccs.ornl.gov kernel: [3717954.814262] LustreError: 100384:0:(service.c:3216:ptlrpc_svcpt_health_check()) ost_io: unhealthy - request has been waiting 626s
Aug 20 16:43:20 atlas-oss1c7.ccs.ornl.gov kernel: [3718014.321957] LustreError: 100775:0:(service.c:3216:ptlrpc_svcpt_health_check()) ost_io: unhealthy - request has been waiting 658s
Aug 20 16:44:21 atlas-oss1c7.ccs.ornl.gov kernel: [3718075.210146] LustreError: 100899:0:(service.c:3216:ptlrpc_svcpt_health_check()) ost_io: unhealthy - request has been waiting 685s
Aug 20 16:44:21 atlas-oss1c7.ccs.ornl.gov kernel: [3718075.241451] LustreError: 100899:0:(service.c:3216:ptlrpc_svcpt_health_check()) Skipped 1 previous similar message
Aug 20 16:45:19 atlas-oss1c7.ccs.ornl.gov kernel: [3718133.944354] LustreError: 43949:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) ### requested timeout 755, more than at_max 600
Aug 20 16:45:19 atlas-oss1c7.ccs.ornl.gov kernel: [3718133.944356]  ns: filter-atlas1-OST0256_UUID lock: ffff88041f85b900/0x33255120e82c0ba6 lrc: 4/0,0 mode: PW/PW res: [0x59656a:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->1048575) flags: 0x20 nid: 1960@gni109 remote: 0xc154e45f8273dbdd expref: 9 pid: 38505 timeout: 8011772751 lvb_type: 0
Aug 20 16:45:20 atlas-oss1c7.ccs.ornl.gov kernel: [3718134.063647] LustreError: 43949:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) Skipped 19 previous similar messages
Aug 20 16:45:20 atlas-oss1c7.ccs.ornl.gov kernel: [3718134.536003] LustreError: 38701:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) ### Adding a lock, but the front position is scheduled in 754 seconds
Aug 20 16:45:20 atlas-oss1c7.ccs.ornl.gov kernel: [3718134.536005]  ns: filter-atlas1-OST0256_UUID lock: ffff88041f85b900/0x33255120e82c0ba6 lrc: 3/0,0 mode: PW/PW res: [0x59656a:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->1048575) flags: 0x20 nid: 1960@gni109 remote: 0xc154e45f8273dbdd expref: 9 pid: 38505 timeout: 8012153175 lvb_type: 0
Aug 20 16:45:20 atlas-oss1c7.ccs.ornl.gov kernel: [3718134.662967] LustreError: 38701:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) Skipped 10 previous similar messages
Aug 20 16:45:22 atlas-oss1c7.ccs.ornl.gov kernel: [3718136.800949] LustreError: 101524:0:(service.c:3216:ptlrpc_svcpt_health_check()) ost_io: unhealthy - request has been waiting 675s
Aug 20 16:45:22 atlas-oss1c7.ccs.ornl.gov kernel: [3718136.833217] LustreError: 101524:0:(service.c:3216:ptlrpc_svcpt_health_check()) Skipped 1 previous similar message
Aug 20 16:46:24 atlas-oss1c7.ccs.ornl.gov kernel: [3718199.046140] LustreError: 101661:0:(service.c:3216:ptlrpc_svcpt_health_check()) ost_io: unhealthy - request has been waiting 649s
Aug 20 16:46:24 atlas-oss1c7.ccs.ornl.gov kernel: [3718199.078248] LustreError: 101661:0:(service.c:3216:ptlrpc_svcpt_health_check()) Skipped 1 previous similar message
Aug 20 16:47:26 atlas-oss1c7.ccs.ornl.gov kernel: [3718260.507118] LustreError: 102520:0:(service.c:3216:ptlrpc_svcpt_health_check()) ost_io: unhealthy - request has been waiting 656s
Aug 20 16:47:26 atlas-oss1c7.ccs.ornl.gov kernel: [3718260.531612] LustreError: 102520:0:(service.c:3216:ptlrpc_svcpt_health_check()) Skipped 1 previous similar message
Aug 20 16:48:27 atlas-oss1c7.ccs.ornl.gov kernel: [3718321.482338] LustreError: 102941:0:(service.c:3216:ptlrpc_svcpt_health_check()) ost_io: unhealthy - request has been waiting 685s
Aug 20 16:48:27 atlas-oss1c7.ccs.ornl.gov kernel: [3718321.514723] LustreError: 102941:0:(service.c:3216:ptlrpc_svcpt_health_check()) Skipped 1 previous similar message
Aug 20 16:48:32 atlas-oss1c7.ccs.ornl.gov kernel: [3718326.724918] Lustre: 77327:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/0), not sending early reply
Aug 20 16:48:32 atlas-oss1c7.ccs.ornl.gov kernel: [3718326.724919]   req@ffff8804d01fc800 x1476723496246813/t0(0) o3->106c827e-52b5-f585-7d3b-b5950093ef06@4693@gni103:0/0 lens 448/0 e 0 to 0 dl 1408567717 ref 2 fl New:/0/ffffffff rc 0/-1
Aug 20 16:48:32 atlas-oss1c7.ccs.ornl.gov kernel: [3718326.816218] Lustre: 77327:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 296 previous similar messages
Aug 20 16:48:49 atlas-oss1c7.ccs.ornl.gov kernel: [3718343.731292] Lustre: 7290:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-3), not sending early reply
Aug 20 16:48:49 atlas-oss1c7.ccs.ornl.gov kernel: [3718343.731294]   req@ffff880cdbbe6800 x1476723492596531/t0(0) o3->95f6baa9-cae1-980a-d6b8-416bd96ccefe@6059@gni102:0/0 lens 448/0 e 0 to 0 dl 1408567734 ref 2 fl New:/0/ffffffff rc 0/-1
Aug 20 16:48:49 atlas-oss1c7.ccs.ornl.gov kernel: [3718343.822691] Lustre: 7290:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 486 previous similar messages
Aug 20 16:49:05 atlas-oss1c7.ccs.ornl.gov kernel: [3718359.015641] LustreError: 43851:0:(service.c:1999:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-6051@gni102: deadline 600:27s ago
Aug 20 16:49:05 atlas-oss1c7.ccs.ornl.gov kernel: [3718359.015643]   req@ffff880ff1ba0000 x1476723492738665/t0(0) o3->6d8c7cb6-e328-4c21-2c36-982a7f86de95@6051@gni102:0/0 lens 448/0 e 0 to 0 dl 1408567717 ref 1 fl Interpret:/0/ffffffff rc 0/-1
Aug 20 16:49:05 atlas-oss1c7.ccs.ornl.gov kernel: [3718359.015697] Lustre: 77333:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600:27s); client may timeout.  req@ffff880b41f71000 x1476723485008770/t0(0) o3->328bf5d8-94e8-d2b1-4fdb-8ad22ab75ce2@1147@gni109:0/0 lens 448/0 e 0 to 0 dl 1408567717 ref 1 fl Interpret:/0/ffffffff rc 0/-1
Aug 20 16:49:05 atlas-oss1c7.ccs.ornl.gov kernel: [3718359.015702] Lustre: 77333:0:(service.c:2031:ptlrpc_server_handle_request()) Skipped 46 previous similar messages
Aug 20 16:49:05 atlas-oss1c7.ccs.ornl.gov kernel: [3718359.017060] LustreError: 77234:0:(service.c:1999:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-6153@gni102: deadline 602:21s ago
Aug 20 16:49:05 atlas-oss1c7.ccs.ornl.gov kernel: [3718359.017062]   req@ffff880c40b0ac00 x1476723481044626/t0(0) o3->1be68b81-4ccf-b770-3f8a-398c2e71ae3d@6153@gni102:0/0 lens 448/0 e 0 to 0 dl 1408567723 ref 1 fl Interpret:/0/ffffffff rc 0/-1
Aug 20 16:49:05 atlas-oss1c7.ccs.ornl.gov kernel: [3718359.017065] LustreError: 77234:0:(service.c:1999:ptlrpc_server_handle_request()) Skipped 130 previous similar messages
Aug 20 16:49:05 atlas-oss1c7.ccs.ornl.gov kernel: [3718359.379244] LustreError: 43851:0:(service.c:1999:ptlrpc_server_handle_request()) Skipped 77 previous similar messages
Aug 20 16:49:13 atlas-oss1c7.ccs.ornl.gov kernel: [3718367.730338] LustreError: 77320:0:(ldlm_lib.c:2702:target_bulk_io()) @@@ timeout on bulk PUT after -2+2s  req@ffff8805687ea400 x1476723488546521/t0(0) o3->08e495c7-e065-af2c-c556-91ff70e82fa1@4700@gni103:0/0 lens 448/432 e 0 to 0 dl 1408567751 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:49:13 atlas-oss1c7.ccs.ornl.gov kernel: [3718367.732399] Lustre: atlas1-OST0256: Bulk IO read error with 08e495c7-e065-af2c-c556-91ff70e82fa1 (at 4700@gni103), client will retry: rc -110
Aug 20 16:49:13 atlas-oss1c7.ccs.ornl.gov kernel: [3718367.732417] LustreError: 43951:0:(service.c:1999:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-3786@gni112: deadline 600:5s ago
Aug 20 16:49:13 atlas-oss1c7.ccs.ornl.gov kernel: [3718367.732419]   req@ffff88094936cc00 x1476723476124786/t0(0) o4->06949173-df6b-ac49-9acf-6eb1657f22c1@3786@gni112:0/0 lens 448/0 e 0 to 0 dl 1408567748 ref 1 fl Interpret:/0/ffffffff rc 0/-1
Aug 20 16:49:13 atlas-oss1c7.ccs.ornl.gov kernel: [3718367.952036] LustreError: 77320:0:(ldlm_lib.c:2702:target_bulk_io()) Skipped 11 previous similar messages
Aug 20 16:49:13 atlas-oss1c7.ccs.ornl.gov kernel: [3718367.982575] Lustre: atlas1-OST0256: Bulk IO read error with 08e495c7-e065-af2c-c556-91ff70e82fa1 (at 4700@gni103), client will retry: rc -110
Aug 20 16:49:13 atlas-oss1c7.ccs.ornl.gov kernel: [3718368.031092] Lustre: Skipped 10 previous similar messages
Aug 20 16:49:22 atlas-oss1c7.ccs.ornl.gov kernel: [3718376.311578] LustreError: 77244:0:(ldlm_lib.c:2702:target_bulk_io()) @@@ timeout on bulk PUT after -11+11s  req@ffff88041867dc00 x1476723493534021/t0(0) o3->84a23340-9e66-aad4-12b9-35a20fd2a3ec@3815@gni103:0/0 lens 448/432 e 0 to 0 dl 1408567751 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:49:22 atlas-oss1c7.ccs.ornl.gov kernel: [3718376.319987] Lustre: atlas1-OST0256: Bulk IO read error with 85caff3e-664b-453f-605b-fcb076639dc5 (at 8391@gni111), client will retry: rc -110
Aug 20 16:49:22 atlas-oss1c7.ccs.ornl.gov kernel: [3718376.434093] LustreError: 77244:0:(ldlm_lib.c:2702:target_bulk_io()) Skipped 1 previous similar message
Aug 20 16:49:24 atlas-oss1c7.ccs.ornl.gov kernel: [3718378.744472] Lustre: 43839:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-8), not sending early reply
Aug 20 16:49:24 atlas-oss1c7.ccs.ornl.gov kernel: [3718378.744474]   req@ffff88102d42d000 x1476723494644305/t0(0) o3->3250fb13-d3bb-e4b3-60b3-d0bbd6bcef41@8411@gni102:0/0 lens 448/0 e 0 to 0 dl 1408567769 ref 2 fl New:/0/ffffffff rc 0/-1
Aug 20 16:49:24 atlas-oss1c7.ccs.ornl.gov kernel: [3718378.835748] Lustre: 43839:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 556 previous similar messages
Aug 20 16:49:27 atlas-oss1c7.ccs.ornl.gov kernel: [3718381.776170] LustreError: 103091:0:(service.c:3216:ptlrpc_svcpt_health_check()) ost_io: unhealthy - request has been waiting 702s
Aug 20 16:49:27 atlas-oss1c7.ccs.ornl.gov kernel: [3718381.807739] LustreError: 103091:0:(service.c:3216:ptlrpc_svcpt_health_check()) Skipped 1 previous similar message
Aug 20 16:49:30 atlas-oss1c7.ccs.ornl.gov kernel: [3718384.974814] LustreError: 43851:0:(ldlm_lib.c:2702:target_bulk_io()) @@@ timeout on bulk PUT after -19+19s  req@ffff8801fbf98800 x1476723474775604/t0(0) o3->a64cda94-e9f0-d5d5-88e2-e2e017c7c3f2@4598@gni103:0/0 lens 448/432 e 0 to 0 dl 1408567751 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:49:30 atlas-oss1c7.ccs.ornl.gov kernel: [3718384.975353] Lustre: atlas1-OST0256: Bulk IO read error with a64cda94-e9f0-d5d5-88e2-e2e017c7c3f2 (at 4598@gni103), client will retry: rc -110
Aug 20 16:49:30 atlas-oss1c7.ccs.ornl.gov kernel: [3718384.975355] Lustre: Skipped 1 previous similar message
Aug 20 16:49:30 atlas-oss1c7.ccs.ornl.gov kernel: [3718385.108261] LustreError: 43851:0:(ldlm_lib.c:2702:target_bulk_io()) Skipped 21 previous similar messages
Aug 20 16:49:44 atlas-oss1c7.ccs.ornl.gov kernel: [3718398.035761] LustreError: 77392:0:(ldlm_lib.c:2702:target_bulk_io()) @@@ timeout on bulk PUT after -25+25s  req@ffff880a08fd2400 x1476723488296450/t0(0) o3->5cdf49d1-dd80-2602-d05a-395e024011de@2430@gni112:0/0 lens 448/432 e 0 to 0 dl 1408567758 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:49:44 atlas-oss1c7.ccs.ornl.gov kernel: [3718398.042379] Lustre: atlas1-OST0256: Bulk IO read error with 5cdf49d1-dd80-2602-d05a-395e024011de (at 2430@gni112), client will retry: rc -110
Aug 20 16:49:44 atlas-oss1c7.ccs.ornl.gov kernel: [3718398.042382] Lustre: Skipped 21 previous similar messages
Aug 20 16:49:44 atlas-oss1c7.ccs.ornl.gov kernel: [3718398.042390] Lustre: 43844:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (617:28s); client may timeout.  req@ffff880e3c945c00 x1476723488296449/t0(0) o3->5cdf49d1-dd80-2602-d05a-395e024011de@2430@gni112:0/0 lens 448/432 e 0 to 0 dl 1408567755 ref 1 fl Complete:/0/ffffffff rc 0/-1
Aug 20 16:49:44 atlas-oss1c7.ccs.ornl.gov kernel: [3718398.042393] Lustre: 43844:0:(service.c:2031:ptlrpc_server_handle_request()) Skipped 249 previous similar messages
Aug 20 16:49:44 atlas-oss1c7.ccs.ornl.gov kernel: [3718398.303856] LustreError: 77392:0:(ldlm_lib.c:2702:target_bulk_io()) Skipped 10 previous similar messages
Aug 20 16:49:48 atlas-oss1c7.ccs.ornl.gov kernel: [3718402.318377] LustreError: 77307:0:(ldlm_lib.c:2702:target_bulk_io()) @@@ timeout on bulk PUT after -32+32s  req@ffff8809f45ecc00 x1476723481387822/t0(0) o3->402b084b-13ff-5e6d-bb7b-e7ee7d63e11d@1045@gni109:0/0 lens 448/432 e 0 to 0 dl 1408567756 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:49:48 atlas-oss1c7.ccs.ornl.gov kernel: [3718402.318463] Lustre: atlas1-OST0256: Bulk IO read error with 402b084b-13ff-5e6d-bb7b-e7ee7d63e11d (at 1045@gni109), client will retry: rc -110
Aug 20 16:49:48 atlas-oss1c7.ccs.ornl.gov kernel: [3718402.318465] Lustre: Skipped 10 previous similar messages
Aug 20 16:49:48 atlas-oss1c7.ccs.ornl.gov kernel: [3718402.454325] LustreError: 77307:0:(ldlm_lib.c:2702:target_bulk_io()) Skipped 13 previous similar messages
Aug 20 16:49:56 atlas-oss1c7.ccs.ornl.gov kernel: [3718410.954639] LustreError: 77361:0:(ldlm_lib.c:2702:target_bulk_io()) @@@ timeout on bulk PUT after -2+2s  req@ffff880f3af68c00 x1476723485084257/t0(0) o3->85caff3e-664b-453f-605b-fcb076639dc5@8391@gni111:0/0 lens 448/432 e 0 to 0 dl 1408567794 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:49:56 atlas-oss1c7.ccs.ornl.gov kernel: [3718410.955127] Lustre: atlas1-OST0256: Bulk IO read error with 85caff3e-664b-453f-605b-fcb076639dc5 (at 8391@gni111), client will retry: rc -110
Aug 20 16:49:56 atlas-oss1c7.ccs.ornl.gov kernel: [3718410.955130] Lustre: Skipped 13 previous similar messages
Aug 20 16:49:56 atlas-oss1c7.ccs.ornl.gov kernel: [3718411.088480] LustreError: 77361:0:(ldlm_lib.c:2702:target_bulk_io()) Skipped 7 previous similar messages
Aug 20 16:50:01 atlas-oss1c7.ccs.ornl.gov kernel: [3718415.282568] LustreError: 77250:0:(service.c:1999:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-6137@gni102: deadline 620:3s ago
Aug 20 16:50:01 atlas-oss1c7.ccs.ornl.gov kernel: [3718415.282570]   req@ffff880acd67e400 x1476723484382550/t0(0) o3->afc91358-d9a9-2d58-065a-6aa8299c1ade@6137@gni102:0/0 lens 448/0 e 0 to 0 dl 1408567798 ref 1 fl Interpret:/0/ffffffff rc 0/-1
Aug 20 16:50:01 atlas-oss1c7.ccs.ornl.gov kernel: [3718415.379426] LustreError: 77250:0:(service.c:1999:ptlrpc_server_handle_request()) Skipped 11 previous similar messages
Aug 20 16:50:05 atlas-oss1c7.ccs.ornl.gov kernel: [3718419.615897] LustreError: 77339:0:(ldlm_lib.c:2702:target_bulk_io()) @@@ timeout on bulk PUT after -5+5s  req@ffff880d076f0800 x1476723485540544/t0(0) o3->4b402640-fd30-4e8b-c3a1-49c0ee2825dd@6234@gni102:0/0 lens 448/432 e 0 to 0 dl 1408567800 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:50:05 atlas-oss1c7.ccs.ornl.gov kernel: [3718419.690508] LustreError: 77339:0:(ldlm_lib.c:2702:target_bulk_io()) Skipped 1 previous similar message
Aug 20 16:50:14 atlas-oss1c7.ccs.ornl.gov kernel: [3718428.286206] Lustre: atlas1-OST0256: Bulk IO read error with c86eb9ad-6361-edb2-bfa0-7140d29fc732 (at 4608@gni103), client will retry: rc -110
Aug 20 16:50:14 atlas-oss1c7.ccs.ornl.gov kernel: [3718428.324130] Lustre: Skipped 19 previous similar messages
Aug 20 16:50:22 atlas-oss1c7.ccs.ornl.gov kernel: [3718436.941425] LustreError: 9539:0:(ldlm_lib.c:2702:target_bulk_io()) @@@ timeout on bulk PUT after -11+11s  req@ffff880667076800 x1476723484362255/t0(0) o3->c86eb9ad-6361-edb2-bfa0-7140d29fc732@4608@gni103:0/0 lens 448/432 e 0 to 0 dl 1408567811 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:50:22 atlas-oss1c7.ccs.ornl.gov kernel: [3718437.017397] LustreError: 9539:0:(ldlm_lib.c:2702:target_bulk_io()) Skipped 13 previous similar messages
Aug 20 16:50:30 atlas-oss1c7.ccs.ornl.gov kernel: [3718444.769355] Lustre: 77271:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-5), not sending early reply
Aug 20 16:50:30 atlas-oss1c7.ccs.ornl.gov kernel: [3718444.769357]   req@ffff88072f156400 x1476723476106848/t0(0) o3->4ba40e32-4892-7658-9d22-e3b5ecd07226@4620@gni103:0/0 lens 448/0 e 0 to 0 dl 1408567835 ref 2 fl New:/0/ffffffff rc 0/-1
Aug 20 16:50:30 atlas-oss1c7.ccs.ornl.gov kernel: [3718444.860780] Lustre: 77271:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 453 previous similar messages
Aug 20 16:50:40 atlas-oss1c7.ccs.ornl.gov kernel: [3718454.269913] LustreError: 77282:0:(service.c:1999:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-8500@gni111: deadline 605:123s ago
Aug 20 16:50:40 atlas-oss1c7.ccs.ornl.gov kernel: [3718454.269915]   req@ffff880d3901c800 x1476723476742596/t0(0) o3->0b902e2d-5202-d963-03ea-e5c74d9c3c64@8500@gni111:0/0 lens 448/0 e 0 to 0 dl 1408567717 ref 1 fl Interpret:/0/ffffffff rc 0/-1
Aug 20 16:50:40 atlas-oss1c7.ccs.ornl.gov kernel: [3718454.364231] LustreError: 77282:0:(service.c:1999:ptlrpc_server_handle_request()) Skipped 417 previous similar messages
Aug 20 16:50:44 atlas-oss1c7.ccs.ornl.gov kernel: [3718458.592668] LustreError: 77342:0:(service.c:1999:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-8393@gni111: deadline 600:127s ago
Aug 20 16:50:44 atlas-oss1c7.ccs.ornl.gov kernel: [3718458.592670]   req@ffff880a8c1a5c00 x1476723482080096/t0(0) o3->83dbfc21-a372-755e-7940-a06b01091be7@8393@gni111:0/0 lens 448/0 e 0 to 0 dl 1408567717 ref 1 fl Interpret:/0/ffffffff rc 0/-1
Aug 20 16:50:44 atlas-oss1c7.ccs.ornl.gov kernel: [3718458.686216] LustreError: 77342:0:(service.c:1999:ptlrpc_server_handle_request()) Skipped 618 previous similar messages
Aug 20 16:50:48 atlas-oss1c7.ccs.ornl.gov kernel: [3718462.914886] Lustre: 80215:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600:104s); client may timeout.  req@ffff880ae3368800 x1476723477868379/t0(0) o3->86cc564e-afaf-9398-394b-f6345cc8f94b@8388@gni111:0/0 lens 448/0 e 0 to 0 dl 1408567744 ref 1 fl Interpret:/0/ffffffff rc 0/-1
Aug 20 16:50:48 atlas-oss1c7.ccs.ornl.gov kernel: [3718463.007639] Lustre: 80215:0:(service.c:2031:ptlrpc_server_handle_request()) Skipped 1590 previous similar messages
Aug 20 16:50:53 atlas-oss1c7.ccs.ornl.gov kernel: [3718467.243956] LustreError: 7303:0:(service.c:1999:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-4695@gni103: deadline 627:61s ago
Aug 20 16:50:53 atlas-oss1c7.ccs.ornl.gov kernel: [3718467.243958]   req@ffff8803f2135800 x1476723490678009/t0(0) o3->a86fc038-c049-9bc4-80b6-db9288380d7e@4695@gni103:0/0 lens 448/0 e 0 to 0 dl 1408567792 ref 1 fl Interpret:/0/ffffffff rc 0/-1
Aug 20 16:50:53 atlas-oss1c7.ccs.ornl.gov kernel: [3718467.339109] LustreError: 7303:0:(service.c:1999:ptlrpc_server_handle_request()) Skipped 678 previous similar messages
Aug 20 16:50:57 atlas-oss1c7.ccs.ornl.gov kernel: [3718471.578479] LustreError: 77286:0:(ldlm_lib.c:2702:target_bulk_io()) @@@ timeout on bulk PUT after -9+9s  req@ffff880c56d3a000 x1476723499159142/t0(0) o3->dad86f06-8ab3-ad5d-85ec-2672ca42ad11@9936@gni102:0/0 lens 448/432 e 0 to 0 dl 1408567848 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:50:57 atlas-oss1c7.ccs.ornl.gov kernel: [3718471.583186] Lustre: atlas1-OST0256: Bulk IO read error with dad86f06-8ab3-ad5d-85ec-2672ca42ad11 (at 9936@gni102), client will retry: rc -110
Aug 20 16:50:57 atlas-oss1c7.ccs.ornl.gov kernel: [3718471.583189] Lustre: Skipped 13 previous similar messages
Aug 20 16:50:57 atlas-oss1c7.ccs.ornl.gov kernel: [3718471.720909] LustreError: 77286:0:(ldlm_lib.c:2702:target_bulk_io()) Skipped 30 previous similar messages
Aug 20 16:51:10 atlas-oss1c7.ccs.ornl.gov kernel: [3718484.567790] LustreError: 80223:0:(service.c:1999:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-8383@gni111: deadline 673:7s ago
Aug 20 16:51:10 atlas-oss1c7.ccs.ornl.gov kernel: [3718484.567791]   req@ffff880c0d868800 x1476723485404040/t0(0) o3->51227e4c-3951-5a4c-6a84-2f061788aade@8383@gni111:0/0 lens 448/0 e 0 to 0 dl 1408567863 ref 1 fl Interpret:/0/ffffffff rc 0/-1
Aug 20 16:51:10 atlas-oss1c7.ccs.ornl.gov kernel: [3718484.665932] LustreError: 80223:0:(service.c:1999:ptlrpc_server_handle_request()) Skipped 490 previous similar messages
Aug 20 16:51:30 atlas-oss1c7.ccs.ornl.gov kernel: [3718504.404621] LustreError: 103807:0:(service.c:3216:ptlrpc_svcpt_health_check()) ost_io: unhealthy - request has been waiting 673s
Aug 20 16:51:30 atlas-oss1c7.ccs.ornl.gov kernel: [3718504.433936] LustreError: 103807:0:(service.c:3216:ptlrpc_svcpt_health_check()) Skipped 3 previous similar messages
Aug 20 16:51:58 atlas-oss1c7.ccs.ornl.gov kernel: [3718532.206909] LustreError: 43864:0:(service.c:1999:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-4620@gni103: deadline 610:82s ago
Aug 20 16:51:58 atlas-oss1c7.ccs.ornl.gov kernel: [3718532.206911]   req@ffff8806a9f68800 x1476723476106838/t0(0) o3->4ba40e32-4892-7658-9d22-e3b5ecd07226@4620@gni103:0/0 lens 448/0 e 0 to 0 dl 1408567835 ref 1 fl Interpret:/0/ffffffff rc 0/-1
Aug 20 16:51:58 atlas-oss1c7.ccs.ornl.gov kernel: [3718532.304006] LustreError: 43864:0:(service.c:1999:ptlrpc_server_handle_request()) Skipped 283 previous similar messages
Aug 20 16:52:02 atlas-oss1c7.ccs.ornl.gov kernel: [3718536.528890] LustreError: 77283:0:(ldlm_lib.c:2702:target_bulk_io()) @@@ timeout on bulk PUT after -32+32s  req@ffff8809f0f41000 x1476723477203040/t0(0) o3->19c411d9-275c-168c-dddf-88a85d1bc6f3@2232@gni112:0/0 lens 448/432 e 0 to 0 dl 1408567890 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:52:02 atlas-oss1c7.ccs.ornl.gov kernel: [3718536.529117] Lustre: atlas1-OST0256: Bulk IO read error with 19c411d9-275c-168c-dddf-88a85d1bc6f3 (at 2232@gni112), client will retry: rc -110
Aug 20 16:52:02 atlas-oss1c7.ccs.ornl.gov kernel: [3718536.529119] Lustre: Skipped 346 previous similar messages
Aug 20 16:52:02 atlas-oss1c7.ccs.ornl.gov kernel: [3718536.665781] LustreError: 77283:0:(ldlm_lib.c:2702:target_bulk_io()) Skipped 460 previous similar messages
Aug 20 16:52:38 atlas-oss1c7.ccs.ornl.gov kernel: [3718572.819581] Lustre: 80215:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-111), not sending early reply
Aug 20 16:52:38 atlas-oss1c7.ccs.ornl.gov kernel: [3718572.819582]   req@ffff88052d99a400 x1476723492675088/t0(0) o3->e1a2ea49-810b-c485-af03-c43e147ab9fa@4681@gni103:0/0 lens 448/432 e 0 to 0 dl 1408567963 ref 2 fl Interpret:/0/0 rc 0/0
Aug 20 16:52:38 atlas-oss1c7.ccs.ornl.gov kernel: [3718572.909142] Lustre: 80215:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 2672 previous similar messages
Aug 20 16:52:41 atlas-oss1c7.ccs.ornl.gov kernel: [3718575.516573] LustreError: 43841:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) ### requested timeout 755, more than at_max 600
Aug 20 16:52:41 atlas-oss1c7.ccs.ornl.gov kernel: [3718575.516575]  ns: filter-atlas1-OST0256_UUID lock: ffff880251b23000/0x33255120e82c06e4 lrc: 3/0,0 mode: PR/PR res: [0x595bf8:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x20 nid: 10.38.144.11@o2ib4 remote: 0xe48198adef1c9b98 expref: 70 pid: 92623 timeout: 8012195074 lvb_type: 1
Aug 20 16:52:41 atlas-oss1c7.ccs.ornl.gov kernel: [3718575.640621] LustreError: 43841:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) Skipped 5 previous similar messages
Aug 20 16:52:47 atlas-oss1c7.ccs.ornl.gov kernel: [3718581.303851] Lustre: atlas1-OST0136: Client baf173c2-83ca-1554-c9dc-0ba620b743f1 (at 2487@gni112) reconnecting
Aug 20 16:52:47 atlas-oss1c7.ccs.ornl.gov kernel: [3718581.332361] Lustre: Skipped 5 previous similar messages
Aug 20 16:52:47 atlas-oss1c7.ccs.ornl.gov kernel: [3718581.375800] Lustre: atlas1-OST0136: Client 4159e193-2333-4edb-c3e4-b9445beecfd1 (at 3792@gni103) refused reconnection, still busy with 3 active RPCs
Aug 20 16:52:47 atlas-oss1c7.ccs.ornl.gov kernel: [3718581.412727] Lustre: Skipped 9 previous similar messages
Aug 20 16:52:47 atlas-oss1c7.ccs.ornl.gov kernel: [3718581.469211] Lustre: atlas1-OST0136: Client a74969e9-d7e3-429b-527f-cbc255927d0f (at 9913@gni111) refused reconnection, still busy with 1 active RPCs
Aug 20 16:52:47 atlas-oss1c7.ccs.ornl.gov kernel: [3718581.502763] Lustre: Skipped 1 previous similar message
Aug 20 16:52:48 atlas-oss1c7.ccs.ornl.gov kernel: [3718582.273312] Lustre: atlas1-OST0136: Client c1487b46-288d-ae6e-77a8-963c979e3811 (at 6228@gni102) refused reconnection, still busy with 7 active RPCs
Aug 20 16:52:48 atlas-oss1c7.ccs.ornl.gov kernel: [3718582.312925] Lustre: Skipped 3 previous similar messages
Aug 20 16:52:53 atlas-oss1c7.ccs.ornl.gov kernel: [3718587.377693] Lustre: atlas1-OST0256: Client b41a3193-dc3a-46b9-f678-040fc7a94fa2 (at 2375@gni112) reconnecting
Aug 20 16:52:53 atlas-oss1c7.ccs.ornl.gov kernel: [3718587.404466] Lustre: Skipped 90 previous similar messages
Aug 20 16:52:56 atlas-oss1c7.ccs.ornl.gov kernel: [3718590.347228] Lustre: atlas1-OST0136: Client 3266db1a-2d15-eeaa-c088-5be50a095d19 (at 815@gni103) refused reconnection, still busy with 2 active RPCs
Aug 20 16:52:57 atlas-oss1c7.ccs.ornl.gov kernel: [3718591.439607] Lustre: atlas1-OST0136: Client db730e38-add0-54cc-5337-fbca1a1508c8 (at 3769@gni112) reconnecting
Aug 20 16:52:57 atlas-oss1c7.ccs.ornl.gov kernel: [3718591.466212] Lustre: Skipped 39 previous similar messages
Aug 20 16:52:58 atlas-oss1c7.ccs.ornl.gov kernel: [3718592.820028] Lustre: 43879:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (726:10s); client may timeout.  req@ffff880edfee4c00 x1476723490521820/t0(0) o3->e2461e63-173f-97b2-c963-d9e76909669d@6065@gni102:0/0 lens 448/400 e 0 to 0 dl 1408567968 ref 1 fl Complete:/0/0 rc 4096/4096
Aug 20 16:52:58 atlas-oss1c7.ccs.ornl.gov kernel: [3718592.820911] LustreError: 77294:0:(ldlm_lib.c:2730:target_bulk_io()) @@@ bulk PUT failed: rc -107  req@ffff8809889cfc00 x1476723488869602/t0(0) o3->c1487b46-288d-ae6e-77a8-963c979e3811@6228@gni102:0/0 lens 448/432 e 0 to 0 dl 1408567967 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:52:58 atlas-oss1c7.ccs.ornl.gov kernel: [3718592.829231] LustreError: 77266:0:(ldlm_lib.c:2730:target_bulk_io()) @@@ bulk PUT failed: rc -107  req@ffff880eec21ec00 x1476723488869608/t0(0) o3->c1487b46-288d-ae6e-77a8-963c979e3811@6228@gni102:0/0 lens 448/432 e 0 to 0 dl 1408567974 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:52:58 atlas-oss1c7.ccs.ornl.gov kernel: [3718593.066629] Lustre: 43879:0:(service.c:2031:ptlrpc_server_handle_request()) Skipped 2452 previous similar messages
Aug 20 16:53:02 atlas-oss1c7.ccs.ornl.gov kernel: [3718596.403075] Lustre: atlas1-OST0256: Client 764fcf66-086f-8819-c494-65aba76a3d23 (at 8417@gni102) refused reconnection, still busy with 1 active RPCs
Aug 20 16:53:02 atlas-oss1c7.ccs.ornl.gov kernel: [3718596.438453] Lustre: Skipped 1 previous similar message
Aug 20 16:53:05 atlas-oss1c7.ccs.ornl.gov kernel: [3718600.203455] Lustre: atlas1-OST0136: Client d9a78a73-8d2a-4cec-6323-40c19b4401bd (at 2326@gni103) reconnecting
Aug 20 16:53:05 atlas-oss1c7.ccs.ornl.gov kernel: [3718600.229514] Lustre: Skipped 53 previous similar messages
Aug 20 16:53:07 atlas-oss1c7.ccs.ornl.gov kernel: [3718601.479042] LustreError: 11285:0:(ldlm_lib.c:2730:target_bulk_io()) @@@ bulk PUT failed: rc -107  req@ffff8808a56b4c00 x1476723482107386/t0(0) o3->78cfc0ab-7199-1207-b28d-1617420afce3@6161@gni102:0/0 lens 448/432 e 0 to 0 dl 1408567972 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:53:07 atlas-oss1c7.ccs.ornl.gov kernel: [3718601.560694] LustreError: 11285:0:(ldlm_lib.c:2730:target_bulk_io()) Skipped 30 previous similar messages
Aug 20 16:53:08 atlas-oss1c7.ccs.ornl.gov kernel: [3718602.482318] Lustre: atlas1-OST0256: Client f5965081-4acc-18c8-b543-ef37c2399368 (at 2324@gni103) refused reconnection, still busy with 2 active RPCs
Aug 20 16:53:08 atlas-oss1c7.ccs.ornl.gov kernel: [3718602.520727] Lustre: Skipped 1 previous similar message
Aug 20 16:53:11 atlas-oss1c7.ccs.ornl.gov kernel: [3718605.821802] LustreError: 7304:0:(ldlm_lib.c:2730:target_bulk_io()) @@@ bulk PUT failed: rc -107  req@ffff880836193400 x1476723499950598/t0(0) o3->9c4d87f7-8f01-ff37-6ce9-b2a411563188@4703@gni103:0/0 lens 448/432 e 0 to 0 dl 1408568036 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:53:11 atlas-oss1c7.ccs.ornl.gov kernel: [3718605.828386] LustreError: 77334:0:(service.c:1999:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-10.38.144.24@o2ib4: deadline 673:56s ago
Aug 20 16:53:11 atlas-oss1c7.ccs.ornl.gov kernel: [3718605.828388]   req@ffff880c2fe41800 x1475624092504712/t0(0) o3->dea7addd-49cf-5cb1-ed49-982f5194a447@10.38.144.24@o2ib4:0/0 lens 488/0 e 0 to 0 dl 1408567935 ref 1 fl Interpret:/0/ffffffff rc 0/-1
Aug 20 16:53:11 atlas-oss1c7.ccs.ornl.gov kernel: [3718605.828391] LustreError: 77334:0:(service.c:1999:ptlrpc_server_handle_request()) Skipped 241 previous similar messages
Aug 20 16:53:11 atlas-oss1c7.ccs.ornl.gov kernel: [3718606.041523] LustreError: 7304:0:(ldlm_lib.c:2730:target_bulk_io()) Skipped 5 previous similar messages
Aug 20 16:53:16 atlas-oss1c7.ccs.ornl.gov kernel: [3718610.539507] Lustre: atlas1-OST0256: Client 1cb8ae78-4ac4-db8c-2473-b83d0dacdf5b (at 836@gni112) refused reconnection, still busy with 9 active RPCs
Aug 20 16:53:16 atlas-oss1c7.ccs.ornl.gov kernel: [3718610.573622] Lustre: Skipped 1 previous similar message
Aug 20 16:53:21 atlas-oss1c7.ccs.ornl.gov kernel: [3718616.215531] Lustre: atlas1-OST0136: Client aa0eb32b-fd0b-570d-d19f-7fedf68367de (at 3708@gni112) reconnecting
Aug 20 16:53:21 atlas-oss1c7.ccs.ornl.gov kernel: [3718616.245586] Lustre: Skipped 57 previous similar messages
Aug 20 16:53:24 atlas-oss1c7.ccs.ornl.gov kernel: [3718618.797351] LustreError: 92650:0:(ldlm_lib.c:2730:target_bulk_io()) @@@ bulk PUT failed: rc -107  req@ffff880d2820bc00 x1476723488298479/t0(0) o3->5cdf49d1-dd80-2602-d05a-395e024011de@2430@gni112:0/0 lens 448/432 e 0 to 0 dl 1408567983 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:53:24 atlas-oss1c7.ccs.ornl.gov kernel: [3718618.867683] LustreError: 92650:0:(ldlm_lib.c:2730:target_bulk_io()) Skipped 11 previous similar messages
Aug 20 16:53:34 atlas-oss1c7.ccs.ornl.gov kernel: [3718628.570271] Lustre: atlas1-OST0256: Client 1ff17d66-77f6-086f-13a0-c3b72aaa3a08 (at 4531@gni103) refused reconnection, still busy with 4 active RPCs
Aug 20 16:53:34 atlas-oss1c7.ccs.ornl.gov kernel: [3718628.610316] Lustre: Skipped 6 previous similar messages
Aug 20 16:53:40 atlas-oss1c7.ccs.ornl.gov kernel: [3718634.506462] Lustre: atlas1-OST0016: haven't heard from client d1aa85c9-ae10-f652-df8d-9e64a1c21a03 (at 10.36.202.174@o2ib) in 1352 seconds. I think it's dead, and I am evicting it. exp ffff88102e706c00, cur 1408568020 expire 1408567120 last 1408566668
Aug 20 16:53:40 atlas-oss1c7.ccs.ornl.gov kernel: [3718634.572738] Lustre: Skipped 35 previous similar messages
Aug 20 16:53:41 atlas-oss1c7.ccs.ornl.gov kernel: [3718636.121592] LustreError: 43947:0:(ldlm_lib.c:2730:target_bulk_io()) @@@ bulk PUT failed: rc -107  req@ffff880295cb5400 x1476723478008237/t0(0) o3->9d7294c5-b191-5e23-ad63-8be5b5fdc583@4618@gni103:0/0 lens 448/432 e 0 to 0 dl 1408567990 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:53:41 atlas-oss1c7.ccs.ornl.gov kernel: [3718636.202565] LustreError: 43947:0:(ldlm_lib.c:2730:target_bulk_io()) Skipped 15 previous similar messages
Aug 20 16:53:50 atlas-oss1c7.ccs.ornl.gov kernel: [3718644.788093] LustreError: 43858:0:(ldlm_lib.c:2730:target_bulk_io()) @@@ bulk PUT failed: rc -107  req@ffff880bc9a57800 x1476723487586002/t0(0) o3->3e8c2a3f-7c00-ec17-6a4d-38a3879dc8d7@9934@gni111:0/0 lens 448/432 e 0 to 0 dl 1408568049 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:53:54 atlas-oss1c7.ccs.ornl.gov kernel: [3718648.339983] Lustre: atlas1-OST0136: Client 79517112-0404-47aa-0a02-f52718bf9733 (at 9924@gni111) reconnecting
Aug 20 16:53:54 atlas-oss1c7.ccs.ornl.gov kernel: [3718648.367959] Lustre: Skipped 64 previous similar messages
Aug 20 16:54:12 atlas-oss1c7.ccs.ornl.gov kernel: [3718666.430796] LustreError: 7304:0:(ldlm_lib.c:2730:target_bulk_io()) @@@ bulk PUT failed: rc -107  req@ffff880963e50000 x1476723479211817/t0(0) o3->1cb8ae78-4ac4-db8c-2473-b83d0dacdf5b@836@gni112:0/0 lens 448/432 e 0 to 0 dl 1408568054 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:54:12 atlas-oss1c7.ccs.ornl.gov kernel: [3718666.430863] LustreError: 43872:0:(ldlm_lib.c:2702:target_bulk_io()) @@@ timeout on bulk PUT after -16+16s  req@ffff8802d2aa5000 x1476723486315719/t0(0) o3->030ff632-4bb9-2fc9-3c06-3dea57ebe1f4@4518@gni103:0/0 lens 448/432 e 0 to 0 dl 1408568036 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:54:12 atlas-oss1c7.ccs.ornl.gov kernel: [3718666.430867] LustreError: 43872:0:(ldlm_lib.c:2702:target_bulk_io()) Skipped 895 previous similar messages
Aug 20 16:54:12 atlas-oss1c7.ccs.ornl.gov kernel: [3718666.431148] Lustre: atlas1-OST0136: Bulk IO read error with 0eaca7b9-18b7-604a-5541-2505f4fd0daf (at 4519@gni103), client will retry: rc -110
Aug 20 16:54:12 atlas-oss1c7.ccs.ornl.gov kernel: [3718666.431151] Lustre: Skipped 1132 previous similar messages
Aug 20 16:54:12 atlas-oss1c7.ccs.ornl.gov kernel: [3718666.677111] LustreError: 7304:0:(ldlm_lib.c:2730:target_bulk_io()) Skipped 34 previous similar messages
Aug 20 16:54:13 atlas-oss1c7.ccs.ornl.gov kernel: [3718667.880391] Lustre: atlas1-OST0136: Client 7e226e61-90e7-945d-7086-b1372897fb6e (at 8389@gni111) refused reconnection, still busy with 1 active RPCs
Aug 20 16:54:13 atlas-oss1c7.ccs.ornl.gov kernel: [3718667.915342] Lustre: Skipped 4 previous similar messages
Aug 20 16:54:31 atlas-oss1c7.ccs.ornl.gov kernel: [3718685.623078] LustreError: 105284:0:(service.c:3216:ptlrpc_svcpt_health_check()) ost_io: unhealthy - request has been waiting 750s
Aug 20 16:54:31 atlas-oss1c7.ccs.ornl.gov kernel: [3718685.652596] LustreError: 105284:0:(service.c:3216:ptlrpc_svcpt_health_check()) Skipped 7 previous similar messages
Aug 20 16:54:38 atlas-oss1c7.ccs.ornl.gov kernel: [3718692.595890] LustreError: 77350:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) ### requested timeout 755, more than at_max 600
Aug 20 16:54:38 atlas-oss1c7.ccs.ornl.gov kernel: [3718692.595892]  ns: filter-atlas1-OST01c6_UUID lock: ffff8808285886c0/0x33255120e82c07af lrc: 4/0,0 mode: PW/PW res: [0x5b3732:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->1048575) flags: 0x20 nid: 1845@gni106 remote: 0xf93ecc0db5933680 expref: 19 pid: 38446 timeout: 8012331379 lvb_type: 0
Aug 20 16:54:38 atlas-oss1c7.ccs.ornl.gov kernel: [3718692.715535] LustreError: 77350:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) Skipped 6 previous similar messages
Aug 20 16:54:38 atlas-oss1c7.ccs.ornl.gov kernel: [3718692.760459] LustreError: 77382:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) ### Adding a lock, but the front position is scheduled in 755 seconds
Aug 20 16:54:38 atlas-oss1c7.ccs.ornl.gov kernel: [3718692.760461]  ns: filter-atlas1-OST01c6_UUID lock: ffff8808285886c0/0x33255120e82c07af lrc: 3/0,0 mode: PW/PW res: [0x5b3732:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->1048575) flags: 0x20 nid: 1845@gni106 remote: 0xf93ecc0db5933680 expref: 16 pid: 38446 timeout: 8012711621 lvb_type: 0
Aug 20 16:54:51 atlas-oss1c7.ccs.ornl.gov kernel: [3718705.416609] LustreError: 80222:0:(ldlm_lib.c:2730:target_bulk_io()) @@@ bulk PUT failed: rc -107  req@ffff8809a64e6c00 x1476723477021230/t0(0) o3->60bc5f6f-665d-aa68-ccf8-79a439c8d154@8396@gni111:0/0 lens 448/432 e 0 to 0 dl 1408568101 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:54:51 atlas-oss1c7.ccs.ornl.gov kernel: [3718705.489619] LustreError: 80222:0:(ldlm_lib.c:2730:target_bulk_io()) Skipped 18 previous similar messages
Aug 20 16:54:58 atlas-oss1c7.ccs.ornl.gov kernel: [3718712.384683] Lustre: atlas1-OST0256: Client 246e7b72-7693-0670-8ec4-2bbebf3facdb (at 8501@gni111) reconnecting
Aug 20 16:54:58 atlas-oss1c7.ccs.ornl.gov kernel: [3718712.412043] Lustre: Skipped 99 previous similar messages
Aug 20 16:55:18 atlas-oss1c7.ccs.ornl.gov kernel: [3718732.566108] Lustre: atlas1-OST0256: Client 7455c359-d4a5-6bff-74d0-69b195e78d6b (at 9933@gni111) refused reconnection, still busy with 5 active RPCs
Aug 20 16:55:18 atlas-oss1c7.ccs.ornl.gov kernel: [3718732.599921] Lustre: Skipped 17 previous similar messages
Aug 20 16:55:30 atlas-oss1c7.ccs.ornl.gov kernel: [3718744.379326] LustreError: 43950:0:(service.c:1999:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-793@gni103: deadline 755:3s ago
Aug 20 16:55:30 atlas-oss1c7.ccs.ornl.gov kernel: [3718744.379328]   req@ffff88068f20f800 x1476723501930523/t0(0) o3->ffc95758-3008-9c9b-fff3-8a2d0b6564d5@793@gni103:0/0 lens 448/0 e 0 to 0 dl 1408568127 ref 1 fl Interpret:/0/ffffffff rc 0/-1
Aug 20 16:55:30 atlas-oss1c7.ccs.ornl.gov kernel: [3718744.474219] LustreError: 43950:0:(service.c:1999:ptlrpc_server_handle_request()) Skipped 610 previous similar messages
Aug 20 16:56:00 atlas-oss1c7.ccs.ornl.gov kernel: [3718774.685198] LustreError: 77387:0:(ldlm_lib.c:2730:target_bulk_io()) @@@ bulk PUT failed: rc -107  req@ffff8808cb089000 x1476723491245847/t0(0) o3->26ed5c71-0fd3-ad1b-f4fc-99d2c1ad45bc@9945@gni102:0/0 lens 448/432 e 0 to 0 dl 1408568149 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:56:00 atlas-oss1c7.ccs.ornl.gov kernel: [3718774.688703] LustreError: 77353:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) ### requested timeout 755, more than at_max 600
Aug 20 16:56:00 atlas-oss1c7.ccs.ornl.gov kernel: [3718774.688705]  ns: filter-atlas1-OST01c6_UUID lock: ffff880ee2e2f000/0x33255120e82c7f0a lrc: 4/0,0 mode: PW/PW res: [0x5b3740:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->1048575) flags: 0x20 nid: 2058@gni109 remote: 0x673013a30fbc2846 expref: 9 pid: 38566 timeout: 8012411598 lvb_type: 0
Aug 20 16:56:00 atlas-oss1c7.ccs.ornl.gov kernel: [3718774.688709] LustreError: 77353:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) Skipped 13 previous similar messages
Aug 20 16:56:00 atlas-oss1c7.ccs.ornl.gov kernel: [3718774.925398] LustreError: 77387:0:(ldlm_lib.c:2730:target_bulk_io()) Skipped 108 previous similar messages
Aug 20 16:56:29 atlas-oss1c7.ccs.ornl.gov kernel: [3718803.998300] Lustre: Failing over atlas1-OST00a6
Aug 20 16:56:29 atlas-oss1c7.ccs.ornl.gov kernel: [3718804.007133] Lustre: Skipped 3 previous similar messages
Aug 20 16:56:31 atlas-oss1c7.ccs.ornl.gov kernel: [3718806.156941] Lustre: atlas1-OST00a6: Not available for connect from 10.36.226.72@o2ib (stopping)
Aug 20 16:56:32 atlas-oss1c7.ccs.ornl.gov kernel: [3718806.739415] Lustre: atlas1-OST00a6: Not available for connect from 10.38.144.55@o2ib4 (stopping)
Aug 20 16:56:33 atlas-oss1c7.ccs.ornl.gov kernel: [3718807.651582] Lustre: atlas1-OST00a6: Not available for connect from 10.36.247.154@o2ib (stopping)
Aug 20 16:56:33 atlas-oss1c7.ccs.ornl.gov kernel: [3718807.669157] Lustre: Skipped 1 previous similar message
Aug 20 16:56:34 atlas-oss1c7.ccs.ornl.gov kernel: [3718809.225464] Lustre: atlas1-OST00a6: Not available for connect from 132@gni3 (stopping)
Aug 20 16:56:34 atlas-oss1c7.ccs.ornl.gov kernel: [3718809.239328] Lustre: Skipped 8 previous similar messages
Aug 20 16:56:37 atlas-oss1c7.ccs.ornl.gov kernel: [3718811.547705] Lustre: atlas1-OST00a6: Not available for connect from 10.36.202.139@o2ib (stopping)
Aug 20 16:56:37 atlas-oss1c7.ccs.ornl.gov kernel: [3718811.570026] Lustre: Skipped 2 previous similar messages
Aug 20 16:56:41 atlas-oss1c7.ccs.ornl.gov kernel: [3718815.993744] Lustre: atlas1-OST00a6: Not available for connect from 10.38.145.196@o2ib4 (stopping)
Aug 20 16:56:41 atlas-oss1c7.ccs.ornl.gov kernel: [3718816.020837] Lustre: Skipped 9 previous similar messages
Aug 20 16:56:47 atlas-oss1c7.ccs.ornl.gov kernel: [3718821.636333] Lustre: atlas1-OST00a6 is waiting for obd_unlinked_exports more than 8 seconds. The obd refcount = 943. Is it stuck?
Aug 20 16:56:49 atlas-oss1c7.ccs.ornl.gov kernel: [3718824.259270] Lustre: atlas1-OST00a6: Not available for connect from 10.38.145.225@o2ib4 (stopping)
Aug 20 16:56:49 atlas-oss1c7.ccs.ornl.gov kernel: [3718824.283896] Lustre: Skipped 14 previous similar messages
Aug 20 16:56:53 atlas-oss1c7.ccs.ornl.gov kernel: [3718827.556842] LustreError: 18521:0:(ldlm_lockd.c:2380:ldlm_cancel_handler()) ldlm_cancel from 10.36.205.206@o2ib arrived at 1408568213 with bad export cookie 3685441071499592780
Aug 20 16:56:57 atlas-oss1c7.ccs.ornl.gov kernel: [3718831.915245] Lustre: 43856:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply
Aug 20 16:56:57 atlas-oss1c7.ccs.ornl.gov kernel: [3718831.915246]   req@ffff8809b0e1f000 x1476723485326054/t0(0) o3->3fcfe699-1517-0c59-f694-5f6b249f095b@3765@gni112:0/0 lens 448/432 e 0 to 0 dl 1408568222 ref 2 fl Interpret:/0/0 rc 0/0
Aug 20 16:56:57 atlas-oss1c7.ccs.ornl.gov kernel: [3718832.006832] Lustre: 43856:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 3699 previous similar messages
Aug 20 16:57:03 atlas-oss1c7.ccs.ornl.gov kernel: [3718837.679386] Lustre: atlas1-OST00a6 is waiting for obd_unlinked_exports more than 16 seconds. The obd refcount = 941. Is it stuck?
Aug 20 16:57:06 atlas-oss1c7.ccs.ornl.gov kernel: [3718840.681346] Lustre: atlas1-OST00a6: Not available for connect from 10.38.145.48@o2ib4 (stopping)
Aug 20 16:57:06 atlas-oss1c7.ccs.ornl.gov kernel: [3718840.701607] Lustre: Skipped 23 previous similar messages
Aug 20 16:57:08 atlas-oss1c7.ccs.ornl.gov kernel: [3718842.483389] Lustre: atlas1-OST0256: Client 8cba0309-7dc2-a5bc-d7fd-64c715e51836 (at 2379@gni112) reconnecting
Aug 20 16:57:08 atlas-oss1c7.ccs.ornl.gov kernel: [3718842.511558] Lustre: Skipped 274 previous similar messages
Aug 20 16:57:18 atlas-oss1c7.ccs.ornl.gov kernel: [3718852.636522] Lustre: 43830:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:20s); client may timeout.  req@ffff880c466b8c00 x1476723504224375/t0(0) o3->e28cc7e5-4ee2-1944-ec76-e87641cfe2b3@8401@gni102:0/0 lens 448/432 e 0 to 0 dl 1408568218 ref 1 fl Complete:/0/ffffffff rc 0/-1
Aug 20 16:57:18 atlas-oss1c7.ccs.ornl.gov kernel: [3718852.725268] Lustre: 43830:0:(service.c:2031:ptlrpc_server_handle_request()) Skipped 3159 previous similar messages
Aug 20 16:57:32 atlas-oss1c7.ccs.ornl.gov kernel: [3718866.796917] Lustre: atlas1-OST0136: Client 49b2dd82-7637-f129-80a9-9a08357218c6 (at 4515@gni103) refused reconnection, still busy with 3 active RPCs
Aug 20 16:57:32 atlas-oss1c7.ccs.ornl.gov kernel: [3718866.830815] Lustre: Skipped 38 previous similar messages
Aug 20 16:57:35 atlas-oss1c7.ccs.ornl.gov kernel: [3718869.723455] Lustre: atlas1-OST00a6 is waiting for obd_unlinked_exports more than 32 seconds. The obd refcount = 901. Is it stuck?
Aug 20 16:57:38 atlas-oss1c7.ccs.ornl.gov kernel: [3718872.780300] Lustre: atlas1-OST00a6: Not available for connect from 10.36.247.166@o2ib (stopping)
Aug 20 16:57:38 atlas-oss1c7.ccs.ornl.gov kernel: [3718872.802867] Lustre: Skipped 360 previous similar messages
Aug 20 16:58:10 atlas-oss1c7.ccs.ornl.gov kernel: [3718904.589585] LustreError: 7310:0:(ldlm_lib.c:2730:target_bulk_io()) @@@ bulk PUT failed: rc -107  req@ffff880f6ae00800 x1476723480099063/t0(0) o3->30ae64b1-43d1-bb9d-4b88-b4925b22fd14@6135@gni102:0/0 lens 448/432 e 0 to 0 dl 1408568300 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:58:10 atlas-oss1c7.ccs.ornl.gov kernel: [3718904.665721] LustreError: 7310:0:(ldlm_lib.c:2730:target_bulk_io()) Skipped 250 previous similar messages
Aug 20 16:58:36 atlas-oss1c7.ccs.ornl.gov kernel: [3718930.568097] Lustre: atlas1-OST0136: Bulk IO read error with e0a18a62-ae4c-b78d-4e7f-e7f44a5a868b (at 6145@gni102), client will retry: rc -107
Aug 20 16:58:36 atlas-oss1c7.ccs.ornl.gov kernel: [3718930.568390] LustreError: 77383:0:(ldlm_lib.c:2702:target_bulk_io()) @@@ timeout on bulk PUT after -16+16s  req@ffff8808b5144c00 x1476723490501693/t0(0) o3->921a508f-c19f-2612-585c-2b04868e8c76@6057@gni102:0/0 lens 448/432 e 0 to 0 dl 1408568300 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 16:58:36 atlas-oss1c7.ccs.ornl.gov kernel: [3718930.568395] LustreError: 77383:0:(ldlm_lib.c:2702:target_bulk_io()) Skipped 2047 previous similar messages
Aug 20 16:58:36 atlas-oss1c7.ccs.ornl.gov kernel: [3718930.715203] Lustre: Skipped 2499 previous similar messages
Aug 20 16:58:39 atlas-oss1c7.ccs.ornl.gov kernel: [3718933.777606] Lustre: atlas1-OST00a6 is waiting for obd_unlinked_exports more than 64 seconds. The obd refcount = 815. Is it stuck?
Aug 20 16:58:42 atlas-oss1c7.ccs.ornl.gov kernel: [3718936.819847] Lustre: atlas1-OST00a6: Not available for connect from 607@gni3 (stopping)
Aug 20 16:58:42 atlas-oss1c7.ccs.ornl.gov kernel: [3718936.837696] Lustre: Skipped 19142 previous similar messages
Aug 20 16:59:38 atlas-oss1c7.ccs.ornl.gov kernel: [3718993.222494] LustreError: 107336:0:(service.c:3216:ptlrpc_svcpt_health_check()) ost_io: unhealthy - request has been waiting 720s
Aug 20 16:59:38 atlas-oss1c7.ccs.ornl.gov kernel: [3718993.249450] LustreError: 107336:0:(service.c:3216:ptlrpc_svcpt_health_check()) Skipped 9 previous similar messages
Aug 20 17:00:28 atlas-oss1c7.ccs.ornl.gov kernel: [3719043.264220] LustreError: 80219:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) ### requested timeout 755, more than at_max 600
Aug 20 17:00:28 atlas-oss1c7.ccs.ornl.gov kernel: [3719043.264222]  ns: filter-atlas1-OST0016_UUID lock: ffff88047782a000/0x33255120e82c1148 lrc: 3/0,0 mode: PW/PW res: [0x55feea:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->528383) flags: 0x20 nid: 1204@gni106 remote: 0x92b9d9276d6214c0 expref: 7 pid: 38369 timeout: 8012681992 lvb_type: 0
Aug 20 17:00:28 atlas-oss1c7.ccs.ornl.gov kernel: [3719043.378131] LustreError: 80219:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) Skipped 6 previous similar messages
Aug 20 17:00:30 atlas-oss1c7.ccs.ornl.gov kernel: [3719045.044417] LustreError: 94110:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) ### Adding a lock, but the front position is scheduled in 753 seconds
Aug 20 17:00:30 atlas-oss1c7.ccs.ornl.gov kernel: [3719045.044418]  ns: filter-atlas1-OST0016_UUID lock: ffff88047782a000/0x33255120e82c1148 lrc: 3/0,0 mode: PW/PW res: [0x55feea:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->528383) flags: 0x20 nid: 1204@gni106 remote: 0x92b9d9276d6214c0 expref: 9 pid: 38369 timeout: 8013062146 lvb_type: 0
Aug 20 17:00:30 atlas-oss1c7.ccs.ornl.gov kernel: [3719045.168248] LustreError: 94110:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) Skipped 4 previous similar messages
Aug 20 17:00:47 atlas-oss1c7.ccs.ornl.gov kernel: [3719061.855851] Lustre: atlas1-OST00a6 is waiting for obd_unlinked_exports more than 128 seconds. The obd refcount = 491. Is it stuck?
Aug 20 17:00:53 atlas-oss1c7.ccs.ornl.gov kernel: [3719068.125757] Lustre: atlas1-OST00a6: Not available for connect from 10.36.207.211@o2ib (stopping)
Aug 20 17:00:53 atlas-oss1c7.ccs.ornl.gov kernel: [3719068.156368] Lustre: Skipped 470 previous similar messages
Aug 20 17:01:26 atlas-oss1c7.ccs.ornl.gov kernel: [3719101.019180] Lustre: atlas1-OST0256: Client b08b1013-c371-9dba-0099-f11e98301a11 (at 4613@gni103) reconnecting
Aug 20 17:01:26 atlas-oss1c7.ccs.ornl.gov kernel: [3719101.049478] Lustre: Skipped 305 previous similar messages
Aug 20 17:01:58 atlas-oss1c7.ccs.ornl.gov kernel: [3719132.615773] Lustre: atlas1-OST0136: Client 3721d42f-fe2d-b54a-6749-4492e3997cce (at 2276@gni103) refused reconnection, still busy with 6 active RPCs
Aug 20 17:01:58 atlas-oss1c7.ccs.ornl.gov kernel: [3719132.651640] Lustre: Skipped 95 previous similar messages
Aug 20 17:02:34 atlas-oss1c7.ccs.ornl.gov kernel: [3719168.724434] LustreError: 43839:0:(ldlm_lib.c:2730:target_bulk_io()) @@@ bulk PUT failed: rc -107  req@ffff880edfee5c00 x1476723493928535/t0(0) o3->070d0654-73a7-013a-6d12-59132c7ca866@2431@gni112:0/0 lens 448/432 e 0 to 0 dl 1408568712 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 17:02:34 atlas-oss1c7.ccs.ornl.gov kernel: [3719168.796047] LustreError: 43839:0:(ldlm_lib.c:2730:target_bulk_io()) Skipped 1080 previous similar messages
Aug 20 17:03:08 atlas-oss1c7.ccs.ornl.gov kernel: [3719203.367771] LustreError: 77360:0:(service.c:1999:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-3812@gni103: deadline 600:21s ago
Aug 20 17:03:08 atlas-oss1c7.ccs.ornl.gov kernel: [3719203.367773]   req@ffff8804458d6000 x1476723494745496/t0(0) o3->2f3c9880-1269-0dfc-7ed1-86b01d758486@3812@gni103:0/0 lens 448/0 e 0 to 0 dl 1408568567 ref 1 fl Interpret:/2/ffffffff rc 0/-1
Aug 20 17:03:08 atlas-oss1c7.ccs.ornl.gov kernel: [3719203.468504] LustreError: 77360:0:(service.c:1999:ptlrpc_server_handle_request()) Skipped 215 previous similar messages
Aug 20 17:04:30 atlas-oss1c7.ccs.ornl.gov kernel: [3719284.530287] Lustre: atlas1-OST0376: haven't heard from client 1d896f6a-8858-32b9-8715-dca8c78e4355 (at 10.38.146.27@o2ib4) in 1360 seconds. I think it's dead, and I am evicting it. exp ffff880fcefe7000, cur 1408568670 expire 1408567770 last 1408567310
Aug 20 17:04:30 atlas-oss1c7.ccs.ornl.gov kernel: [3719284.599226] Lustre: Skipped 6 previous similar messages
Aug 20 17:04:30 atlas-oss1c7.ccs.ornl.gov kernel: [3719284.618899] Lustre: atlas1-OST0376: haven't heard from client 37960920-8118-563e-38f4-890dba7707ba (at 10.38.146.45@o2ib4) in 1358 seconds. I think it's dead, and I am evicting it. exp ffff8809fccd9400, cur 1408568670 expire 1408567770 last 1408567312
Aug 20 17:05:03 atlas-oss1c7.ccs.ornl.gov kernel: [3719317.982401] Lustre: atlas1-OST00a6 is waiting for obd_unlinked_exports more than 256 seconds. The obd refcount = 205. Is it stuck?
Aug 20 17:05:10 atlas-oss1c7.ccs.ornl.gov kernel: [3719324.708984] Lustre: atlas1-OST00a6: Not available for connect from 10.38.145.15@o2ib4 (stopping)
Aug 20 17:05:10 atlas-oss1c7.ccs.ornl.gov kernel: [3719324.725265] Lustre: Skipped 369 previous similar messages
Aug 20 17:05:44 atlas-oss1c7.ccs.ornl.gov kernel: [3719359.113892] Lustre: 7293:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-56), not sending early reply
Aug 20 17:05:44 atlas-oss1c7.ccs.ornl.gov kernel: [3719359.113893]   req@ffff8802bbbc5c00 x1476723485252546/t0(0) o3->50bba4b3-6c17-d24b-19df-6d4c42a2af01@4520@gni103:0/0 lens 448/0 e 0 to 0 dl 1408568749 ref 2 fl New:/2/ffffffff rc 0/-1
Aug 20 17:05:44 atlas-oss1c7.ccs.ornl.gov kernel: [3719359.197545] Lustre: 7293:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 2963 previous similar messages
Aug 20 17:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3719367.910540] Lustre: 77277:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (648:43s); client may timeout.  req@ffff8809f4200800 x1476723492599564/t0(0) o3->14a5c02c-0750-e545-b8a8-da5e59f748f3@6223@gni102:0/0 lens 448/432 e 0 to 0 dl 1408568710 ref 1 fl Complete:/2/ffffffff rc 0/-1
Aug 20 17:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3719368.000995] Lustre: 77277:0:(service.c:2031:ptlrpc_server_handle_request()) Skipped 7039 previous similar messages
Aug 20 17:06:49 atlas-oss1c7.ccs.ornl.gov kernel: [3719424.214480] LustreError: 7270:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) ### requested timeout 755, more than at_max 600
Aug 20 17:06:49 atlas-oss1c7.ccs.ornl.gov kernel: [3719424.214482]  ns: filter-atlas1-OST0256_UUID lock: ffff880e302af900/0x33255120e82c44b7 lrc: 4/0,0 mode: PW/PW res: [0x595bf8:0x0:0x0].0 rrc: 1 type: EXT [0->18446744073709551615] (req 4096->8191) flags: 0x20 nid: 10.38.144.36@o2ib4 remote: 0x4b1a2a2527a6c865 expref: 402 pid: 92625 timeout: 8013060976 lvb_type: 0
Aug 20 17:06:49 atlas-oss1c7.ccs.ornl.gov kernel: [3719424.341471] LustreError: 7270:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) Skipped 7 previous similar messages
Aug 20 17:06:55 atlas-oss1c7.ccs.ornl.gov kernel: [3719429.660229] Lustre: atlas1-OST0016: haven't heard from client 1d896f6a-8858-32b9-8715-dca8c78e4355 (at 10.38.146.27@o2ib4) in 1355 seconds. I think it's dead, and I am evicting it. exp ffff88090372d000, cur 1408568815 expire 1408567915 last 1408567460
Aug 20 17:06:55 atlas-oss1c7.ccs.ornl.gov kernel: [3719429.724672] Lustre: Skipped 1 previous similar message
Aug 20 17:06:56 atlas-oss1c7.ccs.ornl.gov kernel: [3719431.205576] Lustre: atlas1-OST02e6: haven't heard from client 1d896f6a-8858-32b9-8715-dca8c78e4355 (at 10.38.146.27@o2ib4) in 1356 seconds. I think it's dead, and I am evicting it. exp ffff880f8aebd800, cur 1408568816 expire 1408567916 last 1408567460
Aug 20 17:06:56 atlas-oss1c7.ccs.ornl.gov kernel: [3719431.274522] Lustre: Skipped 3 previous similar messages
Aug 20 17:07:19 atlas-oss1c7.ccs.ornl.gov kernel: [3719454.514670] Lustre: atlas1-OST0136: Bulk IO read error with 72ed1c9c-4f57-6ab9-7fb3-814b6e970af4 (at 6146@gni102), client will retry: rc -107
Aug 20 17:07:19 atlas-oss1c7.ccs.ornl.gov kernel: [3719454.553477] Lustre: Skipped 2106 previous similar messages
Aug 20 17:07:41 atlas-oss1c7.ccs.ornl.gov kernel: [3719476.172998] LustreError: 80230:0:(ldlm_lib.c:2702:target_bulk_io()) @@@ timeout on bulk PUT after -24+24s  req@ffff8809ae417800 x1476723481467985/t0(0) o3->5c156359-e469-102e-7413-20d5502d4a29@3683@gni112:0/0 lens 448/432 e 0 to 0 dl 1408568837 ref 1 fl Interpret:/2/0 rc 0/0
Aug 20 17:07:41 atlas-oss1c7.ccs.ornl.gov kernel: [3719476.251511] LustreError: 80230:0:(ldlm_lib.c:2702:target_bulk_io()) Skipped 890 previous similar messages
Aug 20 17:08:50 atlas-oss1c7.ccs.ornl.gov kernel: [3719545.199122] LustreError: 111391:0:(service.c:3216:ptlrpc_svcpt_health_check()) ost_io: unhealthy - request has been waiting 601s
Aug 20 17:08:50 atlas-oss1c7.ccs.ornl.gov kernel: [3719545.228942] LustreError: 111391:0:(service.c:3216:ptlrpc_svcpt_health_check()) Skipped 15 previous similar messages
Aug 20 17:09:59 atlas-oss1c7.ccs.ornl.gov kernel: [3719613.851292] Lustre: atlas1-OST0136: Client 253d7140-a244-f296-b3f2-b4239f998741 (at 6224@gni102) reconnecting
Aug 20 17:09:59 atlas-oss1c7.ccs.ornl.gov kernel: [3719613.874431] Lustre: Skipped 524 previous similar messages
Aug 20 17:10:40 atlas-oss1c7.ccs.ornl.gov kernel: [3719655.421844] LustreError: 38566:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) ### Adding a lock, but the front position is scheduled in 749 seconds
Aug 20 17:10:40 atlas-oss1c7.ccs.ornl.gov kernel: [3719655.421846]  ns: filter-atlas1-OST0136_UUID lock: ffff88055c9a2900/0x33255120e82c9d33 lrc: 3/0,0 mode: PW/PW res: [0x5ac405:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->1048575) flags: 0x20 nid: 2730@gni109 remote: 0xb424931e70e0bd45 expref: 11 pid: 38586 timeout: 8013667949 lvb_type: 0
Aug 20 17:10:40 atlas-oss1c7.ccs.ornl.gov kernel: [3719655.549691] LustreError: 38566:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) Skipped 1 previous similar message
Aug 20 17:10:52 atlas-oss1c7.ccs.ornl.gov kernel: [3719666.744985] Lustre: atlas1-OST0136: Client 1945aed7-a6c5-7af2-896c-487ca6cfc496 (at 6955@gni102) refused reconnection, still busy with 3 active RPCs
Aug 20 17:10:52 atlas-oss1c7.ccs.ornl.gov kernel: [3719666.784158] Lustre: Skipped 22 previous similar messages
Aug 20 17:11:44 atlas-oss1c7.ccs.ornl.gov kernel: [3719718.652550] LustreError: 77304:0:(ldlm_lib.c:2730:target_bulk_io()) @@@ bulk PUT failed: rc -107  req@ffff880e0bd09800 x1476723491392154/t0(0) o3->1945aed7-a6c5-7af2-896c-487ca6cfc496@6955@gni102:0/0 lens 448/432 e 0 to 0 dl 1408569508 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 17:11:44 atlas-oss1c7.ccs.ornl.gov kernel: [3719718.725992] LustreError: 77304:0:(ldlm_lib.c:2730:target_bulk_io()) Skipped 417 previous similar messages
Aug 20 17:12:05 atlas-oss1c7.ccs.ornl.gov kernel: [3719740.317222] LustreError: 38446:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) ### Adding a lock, but the front position is scheduled in 754 seconds
Aug 20 17:12:05 atlas-oss1c7.ccs.ornl.gov kernel: [3719740.317224]  ns: filter-atlas1-OST0016_UUID lock: ffff8806b6e6cb40/0x33255120e82ca185 lrc: 3/0,0 mode: PW/PW res: [0x55fefd:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->1048575) flags: 0x20 nid: 1068@gni109 remote: 0xfde47dd979df280c expref: 10 pid: 38573 timeout: 8013758780 lvb_type: 0
Aug 20 17:12:29 atlas-oss1c7.ccs.ornl.gov kernel: [3719763.930448] Lustre: atlas1-OST01c6: haven't heard from client 1d896f6a-8858-32b9-8715-dca8c78e4355 (at 10.38.146.27@o2ib4) in 1689 seconds. I think it's dead, and I am evicting it. exp ffff880b70a1c400, cur 1408569149 expire 1408568249 last 1408567460
Aug 20 17:12:29 atlas-oss1c7.ccs.ornl.gov kernel: [3719764.001007] Lustre: Skipped 1 previous similar message
Aug 20 17:13:35 atlas-oss1c7.ccs.ornl.gov kernel: [3719830.213458] Lustre: atlas1-OST00a6 is waiting for obd_unlinked_exports more than 512 seconds. The obd refcount = 3. Is it stuck?
Aug 20 17:13:36 atlas-oss1c7.ccs.ornl.gov kernel: [3719831.053984] LustreError: 137-5: atlas1-OST00a6_UUID: not available for connect from 10.38.144.41@o2ib4 (no target)
Aug 20 17:13:37 atlas-oss1c7.ccs.ornl.gov kernel: [3719831.826280] Lustre: server umount atlas1-OST00a6 complete
Aug 20 17:13:37 atlas-oss1c7.ccs.ornl.gov kernel: [3719831.837465] Lustre: Skipped 1 previous similar message
Aug 20 17:13:37 atlas-oss1c7.ccs.ornl.gov kernel: [3719831.862115] Lustre: Failing over atlas1-OST0016
Aug 20 17:13:38 atlas-oss1c7.ccs.ornl.gov kernel: [3719833.528238] LustreError: 137-5: atlas1-OST00a6_UUID: not available for connect from 10.36.202.176@o2ib (no target)
Aug 20 17:13:46 atlas-oss1c7.ccs.ornl.gov kernel: [3719840.709089] LustreError: 137-5: atlas1-OST00a6_UUID: not available for connect from 10.36.202.170@o2ib (no target)
Aug 20 17:13:51 atlas-oss1c7.ccs.ornl.gov kernel: [3719846.475031] LustreError: 137-5: atlas1-OST00a6_UUID: not available for connect from 10.36.202.175@o2ib (no target)
Aug 20 17:13:59 atlas-oss1c7.ccs.ornl.gov kernel: [3719854.441939] LustreError: 137-5: atlas1-OST00a6_UUID: not available for connect from 10.36.207.224@o2ib (no target)
Aug 20 17:14:02 atlas-oss1c7.ccs.ornl.gov kernel: [3719857.478899] Lustre: atlas1-OST0016: Not available for connect from 10.38.146.28@o2ib4 (stopping)
Aug 20 17:14:02 atlas-oss1c7.ccs.ornl.gov kernel: [3719857.497240] Lustre: Skipped 2469 previous similar messages
Aug 20 17:14:03 atlas-oss1c7.ccs.ornl.gov kernel: [3719858.660110] LustreError: 137-5: atlas1-OST00a6_UUID: not available for connect from 10.36.247.142@o2ib (no target)
Aug 20 17:14:03 atlas-oss1c7.ccs.ornl.gov kernel: [3719858.686954] LustreError: Skipped 10 previous similar messages
Aug 20 17:14:12 atlas-oss1c7.ccs.ornl.gov kernel: [3719866.770158] LustreError: 137-5: atlas1-OST00a6_UUID: not available for connect from 10.36.207.196@o2ib (no target)
Aug 20 17:14:12 atlas-oss1c7.ccs.ornl.gov kernel: [3719866.800044] LustreError: Skipped 16 previous similar messages
Aug 20 17:14:30 atlas-oss1c7.ccs.ornl.gov kernel: [3719884.751374] LustreError: 137-5: atlas1-OST00a6_UUID: not available for connect from 10.38.145.187@o2ib4 (no target)
Aug 20 17:14:30 atlas-oss1c7.ccs.ornl.gov kernel: [3719884.787187] LustreError: Skipped 11 previous similar messages
Aug 20 17:15:02 atlas-oss1c7.ccs.ornl.gov kernel: [3719916.838889] LustreError: 137-5: atlas1-OST00a6_UUID: not available for connect from 10.38.145.50@o2ib4 (no target)
Aug 20 17:15:02 atlas-oss1c7.ccs.ornl.gov kernel: [3719916.869017] LustreError: Skipped 236 previous similar messages
Aug 20 17:15:32 atlas-oss1c7.ccs.ornl.gov kernel: [3719946.783180] Lustre: atlas1-OST0136: haven't heard from client 1d896f6a-8858-32b9-8715-dca8c78e4355 (at 10.38.146.27@o2ib4) in 1872 seconds. I think it's dead, and I am evicting it. exp ffff88102c9efc00, cur 1408569332 expire 1408568432 last 1408567460
Aug 20 17:15:32 atlas-oss1c7.ccs.ornl.gov kernel: [3719946.850355] Lustre: Skipped 7 previous similar messages
Aug 20 17:15:43 atlas-oss1c7.ccs.ornl.gov kernel: [3719957.764600] Lustre: atlas1-OST0376: haven't heard from client d4db0f74-f7c4-dcb1-e2ff-a13ef1f6f2d1 (at 10.38.146.30@o2ib4) in 1581 seconds. I think it's dead, and I am evicting it. exp ffff880229025000, cur 1408569343 expire 1408568443 last 1408567762
Aug 20 17:15:43 atlas-oss1c7.ccs.ornl.gov kernel: [3719957.834527] Lustre: Skipped 3 previous similar messages
Aug 20 17:16:06 atlas-oss1c7.ccs.ornl.gov kernel: [3719980.985151] LustreError: 137-5: atlas1-OST00a6_UUID: not available for connect from 553@gni3 (no target)
Aug 20 17:16:06 atlas-oss1c7.ccs.ornl.gov kernel: [3719981.012754] LustreError: Skipped 482 previous similar messages
Aug 20 17:16:08 atlas-oss1c7.ccs.ornl.gov kernel: [3719982.788955] Lustre: atlas1-OST0256: haven't heard from client c6567098-35af-cd6e-0e85-fc5604482972 (at 10.38.146.48@o2ib4) in 1607 seconds. I think it's dead, and I am evicting it. exp ffff880e32235000, cur 1408569368 expire 1408568468 last 1408567761
Aug 20 17:16:08 atlas-oss1c7.ccs.ornl.gov kernel: [3719982.853834] Lustre: Skipped 5 previous similar messages
Aug 20 17:17:08 atlas-oss1c7.ccs.ornl.gov kernel: [3720043.409953] Lustre: 77333:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600:3s); client may timeout.  req@ffff88014a7adc00 x1476723483157559/t0(0) o3->763e9d17-1748-d96f-36f1-73171afbfdab@4611@gni103:0/0 lens 448/432 e 0 to 0 dl 1408569425 ref 1 fl Complete:/2/ffffffff rc 0/-1
Aug 20 17:17:08 atlas-oss1c7.ccs.ornl.gov kernel: [3720043.496825] Lustre: 77333:0:(service.c:2031:ptlrpc_server_handle_request()) Skipped 4786 previous similar messages
Aug 20 17:17:08 atlas-oss1c7.ccs.ornl.gov kernel: [3720043.527728] LustreError: 77333:0:(service.c:1999:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-4611@gni103: deadline 600:3s ago
Aug 20 17:17:08 atlas-oss1c7.ccs.ornl.gov kernel: [3720043.527729]   req@ffff8802e32e1400 x1476723483157570/t0(0) o3->763e9d17-1748-d96f-36f1-73171afbfdab@4611@gni103:0/0 lens 448/0 e 0 to 0 dl 1408569425 ref 1 fl Interpret:/2/ffffffff rc 0/-1
Aug 20 17:17:08 atlas-oss1c7.ccs.ornl.gov kernel: [3720043.636878] LustreError: 77333:0:(service.c:1999:ptlrpc_server_handle_request()) Skipped 8855 previous similar messages
Aug 20 17:17:21 atlas-oss1c7.ccs.ornl.gov kernel: [3720056.409849] Lustre: atlas1-OST0256: Bulk IO read error with 4bb95e31-ce70-363d-3bfd-9747d764fc64 (at 838@gni112), client will retry: rc -110
Aug 20 17:17:21 atlas-oss1c7.ccs.ornl.gov kernel: [3720056.451155] Lustre: Skipped 1120 previous similar messages
Aug 20 17:17:38 atlas-oss1c7.ccs.ornl.gov kernel: [3720072.862605] Lustre: atlas1-OST02e6: haven't heard from client c6567098-35af-cd6e-0e85-fc5604482972 (at 10.38.146.48@o2ib4) in 1847 seconds. I think it's dead, and I am evicting it. exp ffff88102ecccc00, cur 1408569458 expire 1408568558 last 1408567611
Aug 20 17:17:38 atlas-oss1c7.ccs.ornl.gov kernel: [3720072.928018] Lustre: Skipped 2 previous similar messages
Aug 20 17:17:43 atlas-oss1c7.ccs.ornl.gov kernel: [3720078.061889] LustreError: 43837:0:(ldlm_lib.c:2702:target_bulk_io()) @@@ timeout on bulk PUT after -24+24s  req@ffff880b1124cc00 x1476723491565908/t0(0) o3->63543711-3c79-0688-b42e-dc0294189e55@9935@gni111:0/0 lens 448/432 e 0 to 0 dl 1408569439 ref 1 fl Interpret:/2/0 rc 0/0
Aug 20 17:17:43 atlas-oss1c7.ccs.ornl.gov kernel: [3720078.139593] LustreError: 43837:0:(ldlm_lib.c:2702:target_bulk_io()) Skipped 1086 previous similar messages
Aug 20 17:18:14 atlas-oss1c7.ccs.ornl.gov kernel: [3720109.337484] LustreError: 137-5: atlas1-OST00a6_UUID: not available for connect from 30@gni2 (no target)
Aug 20 17:18:14 atlas-oss1c7.ccs.ornl.gov kernel: [3720109.361371] LustreError: Skipped 19355 previous similar messages
Aug 20 17:18:59 atlas-oss1c7.ccs.ornl.gov kernel: [3720154.783357] LustreError: 115530:0:(service.c:3216:ptlrpc_svcpt_health_check()) ost_io: unhealthy - request has been waiting 607s
Aug 20 17:18:59 atlas-oss1c7.ccs.ornl.gov kernel: [3720154.809767] LustreError: 115530:0:(service.c:3216:ptlrpc_svcpt_health_check()) Skipped 5 previous similar messages
Aug 20 17:19:13 atlas-oss1c7.ccs.ornl.gov kernel: [3720168.421013] Lustre: 77281:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-32), not sending early reply
Aug 20 17:19:13 atlas-oss1c7.ccs.ornl.gov kernel: [3720168.421015]   req@ffff880d68a2fc00 x1476723479689057/t0(0) o3->56c3d435-6203-b20a-479a-f13e8836969e@6154@gni102:0/0 lens 448/432 e 0 to 0 dl 1408569558 ref 2 fl Interpret:/2/0 rc 0/0
Aug 20 17:19:13 atlas-oss1c7.ccs.ornl.gov kernel: [3720168.504389] Lustre: 77281:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 285 previous similar messages
Aug 20 17:20:03 atlas-oss1c7.ccs.ornl.gov kernel: [3720217.951246] Lustre: atlas1-OST0136: Client 1945aed7-a6c5-7af2-896c-487ca6cfc496 (at 6955@gni102) reconnecting
Aug 20 17:20:03 atlas-oss1c7.ccs.ornl.gov kernel: [3720217.982423] Lustre: Skipped 391 previous similar messages
Aug 20 17:20:37 atlas-oss1c7.ccs.ornl.gov kernel: [3720252.354578] Lustre: atlas1-OST0016 is waiting for obd_unlinked_exports more than 8 seconds. The obd refcount = 19. Is it stuck?
Aug 20 17:21:25 atlas-oss1c7.ccs.ornl.gov kernel: [3720300.404695] Lustre: atlas1-OST0016 is waiting for obd_unlinked_exports more than 32 seconds. The obd refcount = 13. Is it stuck?
Aug 20 17:21:25 atlas-oss1c7.ccs.ornl.gov kernel: [3720300.435302] Lustre: Skipped 1 previous similar message
Aug 20 17:22:16 atlas-oss1c7.ccs.ornl.gov kernel: [3720351.460049] Lustre: atlas1-OST0256: Client f11effcf-929b-c270-61fa-6fc074354e11 (at 4612@gni103) refused reconnection, still busy with 15 active RPCs
Aug 20 17:22:29 atlas-oss1c7.ccs.ornl.gov kernel: [3720364.478850] Lustre: atlas1-OST0016 is waiting for obd_unlinked_exports more than 64 seconds. The obd refcount = 13. Is it stuck?
Aug 20 17:22:34 atlas-oss1c7.ccs.ornl.gov kernel: [3720369.326949] LustreError: 137-5: atlas1-OST00a6_UUID: not available for connect from 10.38.144.36@o2ib4 (no target)
Aug 20 17:22:34 atlas-oss1c7.ccs.ornl.gov kernel: [3720369.350613] LustreError: Skipped 1127 previous similar messages
Aug 20 17:22:46 atlas-oss1c7.ccs.ornl.gov kernel: [3720381.165690] LustreError: 92644:0:(ldlm_lib.c:2730:target_bulk_io()) @@@ bulk PUT failed: rc -107  req@ffff880844725000 x1476723478098214/t0(0) o3->f11effcf-929b-c270-61fa-6fc074354e11@4612@gni103:0/0 lens 448/432 e 0 to 0 dl 1408569802 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 17:22:46 atlas-oss1c7.ccs.ornl.gov kernel: [3720381.235671] LustreError: 92644:0:(ldlm_lib.c:2730:target_bulk_io()) Skipped 29 previous similar messages
Aug 20 17:23:09 atlas-oss1c7.ccs.ornl.gov kernel: [3720403.933614] Lustre: atlas1-OST0136: haven't heard from client 19bcb643-c59e-7506-59db-9c9493a91fcb (at 10.38.146.46@o2ib4) in 1943 seconds. I think it's dead, and I am evicting it. exp ffff88042f023800, cur 1408569789 expire 1408568889 last 1408567846
Aug 20 17:23:09 atlas-oss1c7.ccs.ornl.gov kernel: [3720404.003767] Lustre: Skipped 7 previous similar messages
Aug 20 17:23:59 atlas-oss1c7.ccs.ornl.gov kernel: [3720454.251446] LustreError: 7298:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) ### requested timeout 755, more than at_max 600
Aug 20 17:23:59 atlas-oss1c7.ccs.ornl.gov kernel: [3720454.251448]  ns: filter-atlas1-OST01c6_UUID lock: ffff8809f1fa6480/0x33255120e82cdc56 lrc: 4/0,0 mode: PW/PW res: [0x5b3747:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->1048575) flags: 0x20 nid: 1907@gni109 remote: 0xa371a4e42f1d72ed expref: 7 pid: 38511 timeout: 8014090770 lvb_type: 0
Aug 20 17:23:59 atlas-oss1c7.ccs.ornl.gov kernel: [3720454.373360] LustreError: 7298:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) Skipped 59 previous similar messages
Aug 20 17:24:06 atlas-oss1c7.ccs.ornl.gov kernel: [3720461.286446] Lustre: atlas1-OST0016: Not available for connect from 90@gni103 (stopping)
Aug 20 17:24:06 atlas-oss1c7.ccs.ornl.gov kernel: [3720461.305619] Lustre: Skipped 20180 previous similar messages
Aug 20 17:24:37 atlas-oss1c7.ccs.ornl.gov kernel: [3720492.557162] Lustre: atlas1-OST0016 is waiting for obd_unlinked_exports more than 128 seconds. The obd refcount = 11. Is it stuck?
Aug 20 17:27:27 atlas-oss1c7.ccs.ornl.gov kernel: [3720662.622608] LustreError: 92651:0:(service.c:1999:ptlrpc_server_handle_request()) @@@ Dropping timed-out request from 12345-8399@gni111: deadline 600:95s ago
Aug 20 17:27:27 atlas-oss1c7.ccs.ornl.gov kernel: [3720662.622610]   req@ffff880b7c560c00 x1476723492808288/t0(0) o3->337db527-a474-56d1-3c76-39636221b63e@8399@gni111:0/0 lens 448/0 e 0 to 0 dl 1408569952 ref 1 fl Interpret:/2/ffffffff rc 0/-1
Aug 20 17:27:27 atlas-oss1c7.ccs.ornl.gov kernel: [3720662.622643] Lustre: 77308:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600:95s); client may timeout.  req@ffff880a47086c00 x1476723492808290/t0(0) o3->337db527-a474-56d1-3c76-39636221b63e@8399@gni111:0/0 lens 448/0 e 0 to 0 dl 1408569952 ref 1 fl Interpret:/2/ffffffff rc 0/-1
Aug 20 17:27:27 atlas-oss1c7.ccs.ornl.gov kernel: [3720662.622648] Lustre: 77308:0:(service.c:2031:ptlrpc_server_handle_request()) Skipped 10273 previous similar messages
Aug 20 17:27:27 atlas-oss1c7.ccs.ornl.gov kernel: [3720662.851922] LustreError: 92651:0:(service.c:1999:ptlrpc_server_handle_request()) Skipped 9062 previous similar messages
Aug 20 17:28:06 atlas-oss1c7.ccs.ornl.gov kernel: [3720701.044863] Lustre: atlas1-OST01c6: haven't heard from client 3813a180-af83-56c6-42a8-c3a035b77491 (at 10.38.146.47@o2ib4) in 1945 seconds. I think it's dead, and I am evicting it. exp ffff8809a868a800, cur 1408570086 expire 1408569186 last 1408568141
Aug 20 17:28:06 atlas-oss1c7.ccs.ornl.gov kernel: [3720701.116204] Lustre: Skipped 11 previous similar messages
Aug 20 17:28:41 atlas-oss1c7.ccs.ornl.gov kernel: [3720736.231000] LustreError: 77349:0:(ldlm_lib.c:2702:target_bulk_io()) @@@ timeout on bulk PUT after -7+7s  req@ffff880ebd939800 x1476723486393360/t0(0) o3->7244fa69-8aff-8185-90b4-5bfa648d93af@6052@gni102:0/0 lens 448/432 e 0 to 0 dl 1408570114 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 17:28:41 atlas-oss1c7.ccs.ornl.gov kernel: [3720736.310135] Lustre: atlas1-OST0256: Bulk IO read error with f8fc9c7d-bec2-e977-d954-ccf1890fd16c (at 3711@gni112), client will retry: rc -110
Aug 20 17:28:41 atlas-oss1c7.ccs.ornl.gov kernel: [3720736.310381] LustreError: 77349:0:(ldlm_lib.c:2702:target_bulk_io()) Skipped 1248 previous similar messages
Aug 20 17:28:41 atlas-oss1c7.ccs.ornl.gov kernel: [3720736.380473] Lustre: Skipped 1512 previous similar messages
Aug 20 17:28:53 atlas-oss1c7.ccs.ornl.gov kernel: [3720748.692722] Lustre: atlas1-OST0016 is waiting for obd_unlinked_exports more than 256 seconds. The obd refcount = 3. Is it stuck?
Aug 20 17:29:02 atlas-oss1c7.ccs.ornl.gov kernel: [3720757.505219] LustreError: 119902:0:(service.c:3216:ptlrpc_svcpt_health_check()) ost_io: unhealthy - request has been waiting 726s
Aug 20 17:29:02 atlas-oss1c7.ccs.ornl.gov kernel: [3720757.537414] LustreError: 119902:0:(service.c:3216:ptlrpc_svcpt_health_check()) Skipped 19 previous similar messages
Aug 20 17:29:25 atlas-oss1c7.ccs.ornl.gov kernel: [3720780.649750] Lustre: 43846:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply
Aug 20 17:29:25 atlas-oss1c7.ccs.ornl.gov kernel: [3720780.649752]   req@ffff8806704f1400 x1476723481129030/t0(0) o3->c55ce73b-7771-473d-e8f9-c6a80d2a8b61@4616@gni103:0/0 lens 448/432 e 0 to 0 dl 1408570170 ref 2 fl Interpret:/0/0 rc 0/0
Aug 20 17:29:25 atlas-oss1c7.ccs.ornl.gov kernel: [3720780.649756] Lustre: 52010:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply
Aug 20 17:29:25 atlas-oss1c7.ccs.ornl.gov kernel: [3720780.649758]   req@ffff8807159ffc00 x1476723485851377/t0(0) o3->4159e193-2333-4edb-c3e4-b9445beecfd1@3792@gni103:0/0 lens 448/432 e 0 to 0 dl 1408570170 ref 2 fl Interpret:/0/0 rc 0/0
Aug 20 17:29:25 atlas-oss1c7.ccs.ornl.gov kernel: [3720780.649761] Lustre: 52010:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 2091 previous similar messages
Aug 20 17:29:25 atlas-oss1c7.ccs.ornl.gov kernel: [3720780.866790] Lustre: 43846:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 88 previous similar messages
Aug 20 17:30:04 atlas-oss1c7.ccs.ornl.gov kernel: [3720819.369910] Lustre: atlas1-OST0256: Client 0139bae8-e50a-ec3d-1f2d-a4c05f905c5f (at 2279@gni103) reconnecting
Aug 20 17:30:04 atlas-oss1c7.ccs.ornl.gov kernel: [3720819.401034] Lustre: Skipped 518 previous similar messages
Aug 20 17:30:08 atlas-oss1c7.ccs.ornl.gov kernel: [3720823.157143] Lustre: server umount atlas1-OST0016 complete
Aug 20 17:30:08 atlas-oss1c7.ccs.ornl.gov kernel: [3720823.167009] Lustre: Failing over atlas1-OST0136
Aug 20 17:31:06 atlas-oss1c7.ccs.ornl.gov kernel: [3720881.971390] LustreError: 137-5: atlas1-OST00a6_UUID: not available for connect from 10.36.202.173@o2ib (no target)
Aug 20 17:31:06 atlas-oss1c7.ccs.ornl.gov kernel: [3720881.994943] LustreError: Skipped 2276 previous similar messages
Aug 20 17:33:27 atlas-oss1c7.ccs.ornl.gov kernel: [3721022.373415] LustreError: 38573:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) ### Adding a lock, but the front position is scheduled in 754 seconds
Aug 20 17:33:27 atlas-oss1c7.ccs.ornl.gov kernel: [3721022.373417]  ns: filter-atlas1-OST0256_UUID lock: ffff880710e9cb40/0x33255120e82d0013 lrc: 3/0,0 mode: PW/PW res: [0x596584:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->4095) flags: 0x20 nid: 2712@gni109 remote: 0xe719b7a6e111d09c expref: 11 pid: 38713 timeout: 8015040161 lvb_type: 0
Aug 20 17:33:27 atlas-oss1c7.ccs.ornl.gov kernel: [3721022.498261] LustreError: 38573:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) Skipped 1 previous similar message
Aug 20 17:34:00 atlas-oss1c7.ccs.ornl.gov kernel: [3721055.952667] LustreError: 43869:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) ### requested timeout 755, more than at_max 600
Aug 20 17:34:00 atlas-oss1c7.ccs.ornl.gov kernel: [3721055.952669]  ns: filter-atlas1-OST0256_UUID lock: ffff880596ca3000/0x33255120e82cde39 lrc: 4/0,0 mode: PW/PW res: [0x596580:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->184319) flags: 0x20 nid: 1021@gni109 remote: 0xe1c0c0c9b2725be expref: 33 pid: 38389 timeout: 8014660356 lvb_type: 0
Aug 20 17:34:00 atlas-oss1c7.ccs.ornl.gov kernel: [3721056.071324] LustreError: 43869:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) Skipped 13 previous similar messages
Aug 20 17:34:12 atlas-oss1c7.ccs.ornl.gov kernel: [3721067.252706] Lustre: atlas1-OST0136: Not available for connect from 10.36.207.196@o2ib (stopping)
Aug 20 17:34:12 atlas-oss1c7.ccs.ornl.gov kernel: [3721067.275401] Lustre: Skipped 20963 previous similar messages
Aug 20 17:34:30 atlas-oss1c7.ccs.ornl.gov kernel: [3721085.242638] LustreError: 38566:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) ### Adding a lock, but the front position is scheduled in 735 seconds
Aug 20 17:34:30 atlas-oss1c7.ccs.ornl.gov kernel: [3721085.242640]  ns: filter-atlas1-OST0256_UUID lock: ffff880596ca3000/0x33255120e82cde39 lrc: 3/0,0 mode: PW/PW res: [0x596580:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->184319) flags: 0x20 nid: 1021@gni109 remote: 0xe1c0c0c9b2725be expref: 8 pid: 38389 timeout: 8015083299 lvb_type: 0
Aug 20 17:38:25 atlas-oss1c7.ccs.ornl.gov kernel: [3721320.323559] Lustre: atlas1-OST0256: Client 85caff3e-664b-453f-605b-fcb076639dc5 (at 8391@gni111) refused reconnection, still busy with 11 active RPCs
Aug 20 17:38:25 atlas-oss1c7.ccs.ornl.gov kernel: [3721320.361364] Lustre: Skipped 1 previous similar message
Aug 20 17:38:25 atlas-oss1c7.ccs.ornl.gov kernel: [3721320.874393] LustreError: 9582:0:(ldlm_lib.c:2730:target_bulk_io()) @@@ bulk PUT failed: rc -107  req@ffff880f6a2cd000 x1476723485126982/t0(0) o3->85caff3e-664b-453f-605b-fcb076639dc5@8391@gni111:0/0 lens 448/432 e 0 to 0 dl 1408571053 ref 1 fl Interpret:/0/0 rc 0/0
Aug 20 17:38:25 atlas-oss1c7.ccs.ornl.gov kernel: [3721320.942314] LustreError: 9582:0:(ldlm_lib.c:2730:target_bulk_io()) Skipped 24 previous similar messages
Aug 20 17:40:17 atlas-oss1c7.ccs.ornl.gov kernel: [3721432.370714] Lustre: atlas1-OST0376: haven't heard from client 9b9cc00c-1962-1247-1e15-95645389a19b (at 10@gni2) in 1352 seconds. I think it's dead, and I am evicting it. exp ffff8802f3f4d800, cur 1408570817 expire 1408569917 last 1408569465
Aug 20 17:40:17 atlas-oss1c7.ccs.ornl.gov kernel: [3721432.433861] Lustre: Skipped 1 previous similar message
Aug 20 17:41:07 atlas-oss1c7.ccs.ornl.gov kernel: [3721482.427594] LustreError: 137-5: atlas1-OST00a6_UUID: not available for connect from 631@gni3 (no target)
Aug 20 17:41:07 atlas-oss1c7.ccs.ornl.gov kernel: [3721482.452539] LustreError: Skipped 42993 previous similar messages
Aug 20 17:41:09 atlas-oss1c7.ccs.ornl.gov kernel: [3721484.601101] Lustre: atlas1-OST0136 is waiting for obd_unlinked_exports more than 256 seconds. The obd refcount = 3. Is it stuck?
Aug 20 17:41:09 atlas-oss1c7.ccs.ornl.gov kernel: [3721484.633529] Lustre: Skipped 5 previous similar messages
Aug 20 17:41:13 atlas-oss1c7.ccs.ornl.gov kernel: [3721489.132338] Lustre: server umount atlas1-OST0136 complete
Aug 20 17:41:13 atlas-oss1c7.ccs.ornl.gov kernel: [3721489.142174] Lustre: Failing over atlas1-OST01c6
Aug 20 17:41:18 atlas-oss1c7.ccs.ornl.gov kernel: [3721494.091625] Lustre: server umount atlas1-OST01c6 complete
Aug 20 17:41:18 atlas-oss1c7.ccs.ornl.gov kernel: [3721494.100781] Lustre: Failing over atlas1-OST0256
Aug 20 17:41:28 atlas-oss1c7.ccs.ornl.gov kernel: [3721503.473508] Lustre: server umount atlas1-OST0256 complete
Aug 20 17:41:28 atlas-oss1c7.ccs.ornl.gov kernel: [3721503.484523] Lustre: Failing over atlas1-OST02e6
Aug 20 17:41:32 atlas-oss1c7.ccs.ornl.gov kernel: [3721507.586741] Lustre: server umount atlas1-OST02e6 complete
Aug 20 17:41:32 atlas-oss1c7.ccs.ornl.gov kernel: [3721507.596266] Lustre: Failing over atlas1-OST0376
Aug 20 17:41:38 atlas-oss1c7.ccs.ornl.gov kernel: [3721514.230103] Lustre: server umount atlas1-OST0376 complete
Aug 20 17:58:57 atlas-oss1c7.ccs.ornl.gov kernel: [3722553.183536] LNet: 12095:0:(router.c:1215:lnet_prune_rc_data()) Waiting for rc buffers to unlink
Aug 20 17:58:58 atlas-oss1c7.ccs.ornl.gov kernel: [3722554.198919] LNet: 12095:0:(router.c:1215:lnet_prune_rc_data()) Waiting for rc buffers to unlink
Aug 20 17:59:00 atlas-oss1c7.ccs.ornl.gov kernel: [3722556.219685] LNet: 12095:0:(router.c:1215:lnet_prune_rc_data()) Waiting for rc buffers to unlink
Aug 20 17:59:04 atlas-oss1c7.ccs.ornl.gov kernel: [3722560.241209] LNet: 12095:0:(router.c:1215:lnet_prune_rc_data()) Waiting for rc buffers to unlink
Aug 20 17:59:12 atlas-oss1c7.ccs.ornl.gov kernel: [3722568.265250] LNet: 12095:0:(router.c:1215:lnet_prune_rc_data()) Waiting for rc buffers to unlink
Aug 20 17:59:28 atlas-oss1c7.ccs.ornl.gov kernel: [3722584.299324] LNet: 12095:0:(router.c:1215:lnet_prune_rc_data()) Waiting for rc buffers to unlink
Aug 20 17:59:41 atlas-oss1c7.ccs.ornl.gov kernel: [3722597.150417] LNetError: 12088:0:(lib-move.c:1949:lnet_parse()) 10.36.145.12@o2ib, src 10.38.145.12@o2ib4: Dropping PUT (error -108 looking up sender)
Aug 20 17:59:41 atlas-oss1c7.ccs.ornl.gov kernel: [3722597.152879] LNetError: 12087:0:(lib-move.c:1949:lnet_parse()) 10.36.145.17@o2ib, src 10.38.145.12@o2ib4: Dropping PUT (error -108 looking up sender)
Aug 20 17:59:41 atlas-oss1c7.ccs.ornl.gov kernel: [3722597.234704] LNetError: 12088:0:(lib-move.c:1949:lnet_parse()) Skipped 1 previous similar message
Aug 20 17:59:41 atlas-oss1c7.ccs.ornl.gov kernel: [3722597.656263] LNetError: 12087:0:(lib-move.c:1949:lnet_parse()) 10.36.207.252@o2ib, src 10.36.207.252@o2ib: Dropping PUT (error -108 looking up sender)
Aug 20 17:59:41 atlas-oss1c7.ccs.ornl.gov kernel: [3722597.695462] LNetError: 12087:0:(lib-move.c:1949:lnet_parse()) Skipped 21 previous similar messages
Aug 20 17:59:42 atlas-oss1c7.ccs.ornl.gov kernel: [3722598.407792] LNet: Removed LNI 10.36.225.52@o2ib204
Aug 20 17:59:42 atlas-oss1c7.ccs.ornl.gov kernel: [3722598.527707] LNetError: 3039:0:(o2iblnd_cb.c:2267:kiblnd_passive_connect()) Can't accept 10.36.145.13@o2ib on NA (ib2:1:10.36.225.52): bad dst nid 10.36.225.52@o2ib
Aug 20 17:59:42 atlas-oss1c7.ccs.ornl.gov kernel: [3722598.596708] LNetError: 3039:0:(o2iblnd_cb.c:2267:kiblnd_passive_connect()) Can't accept 10.36.145.14@o2ib on NA (ib2:1:10.36.225.52): bad dst nid 10.36.225.52@o2ib
Aug 20 17:59:43 atlas-oss1c7.ccs.ornl.gov kernel: [3722599.251945] LNetError: 3039:0:(o2iblnd_cb.c:2267:kiblnd_passive_connect()) Can't accept 10.36.145.12@o2ib on NA (ib2:1:10.36.225.52): bad dst nid 10.36.225.52@o2ib
Aug 20 17:59:43 atlas-oss1c7.ccs.ornl.gov kernel: [3722599.287049] LNetError: 3039:0:(o2iblnd_cb.c:2267:kiblnd_passive_connect()) Skipped 4 previous similar messages
Aug 20 17:59:44 atlas-oss1c7.ccs.ornl.gov kernel: [3722600.451548] LNet: Removed LNI 10.36.225.52@o2ib
Aug 20 18:00:15 atlas-oss1c7.ccs.ornl.gov kernel: [3722631.693086] LNet: HW CPU cores: 16, npartitions: 4
Aug 20 18:00:15 atlas-oss1c7.ccs.ornl.gov kernel: [3722631.703082] alg: No test for crc32 (crc32-table)
Aug 20 18:00:15 atlas-oss1c7.ccs.ornl.gov kernel: [3722631.718226] alg: No test for adler32 (adler32-zlib)
Aug 20 18:00:15 atlas-oss1c7.ccs.ornl.gov kernel: [3722631.737838] alg: No test for crc32 (crc32-pclmul)
Aug 20 18:00:24 atlas-oss1c7.ccs.ornl.gov kernel: [3722639.881665] LNetError: 3039:0:(o2iblnd_cb.c:2267:kiblnd_passive_connect()) Can't accept 10.36.229.11@o2ib204 on NA (ib2:0:10.36.225.52): bad dst nid 10.36.225.52@o2ib204
Aug 20 18:00:24 atlas-oss1c7.ccs.ornl.gov kernel: [3722640.024260] LNetError: 3039:0:(o2iblnd_cb.c:2267:kiblnd_passive_connect()) Can't accept 10.36.225.5@o2ib on NA (ib2:0:10.36.225.52): bad dst nid 10.36.225.52@o2ib
Aug 20 18:00:24 atlas-oss1c7.ccs.ornl.gov kernel: [3722640.153508] LNet: Added LNI 10.36.225.52@o2ib [63/2560/0/180]
Aug 20 18:00:25 atlas-oss1c7.ccs.ornl.gov kernel: [3722640.984029] LNet: Added LNI 10.36.225.52@o2ib204 [63/2560/0/180]
Aug 20 18:00:25 atlas-oss1c7.ccs.ornl.gov kernel: [3722641.290707] Lustre: Lustre: Build Version: 2.4.3-gd00e4d6-CHANGED-2.6.32-358.23.2.el6.atlas.x86_64
Aug 20 18:00:27 atlas-oss1c7.ccs.ornl.gov kernel: [3722643.570145] LDISKFS-fs (dm-5): mounted filesystem with ordered data mode. quota=on. Opts: 
Aug 20 18:00:27 atlas-oss1c7.ccs.ornl.gov kernel: [3722643.696906] LDISKFS-fs (dm-21): mounted filesystem with ordered data mode. quota=on. Opts: 
Aug 20 18:00:28 atlas-oss1c7.ccs.ornl.gov kernel: [3722644.168687] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts: 
Aug 20 18:00:28 atlas-oss1c7.ccs.ornl.gov kernel: [3722644.303707] LDISKFS-fs (dm-6): mounted filesystem with ordered data mode. quota=on. Opts: 
Aug 20 18:00:28 atlas-oss1c7.ccs.ornl.gov kernel: [3722644.380093] Lustre: atlas1-OST0016: Not available for connect from 728@gni3 (not set up)
Aug 20 18:00:28 atlas-oss1c7.ccs.ornl.gov kernel: [3722644.382293] LustreError: 137-5: atlas1-OST01c6_UUID: not available for connect from 728@gni3 (no target)
Aug 20 18:00:28 atlas-oss1c7.ccs.ornl.gov kernel: [3722644.387956] LustreError: 137-5: atlas1-OST0376_UUID: not available for connect from 728@gni3 (no target)
Aug 20 18:00:28 atlas-oss1c7.ccs.ornl.gov kernel: [3722644.453876] Lustre: atlas1-OST0016: Not available for connect from 10.38.145.115@o2ib4 (not set up)
Aug 20 18:00:29 atlas-oss1c7.ccs.ornl.gov kernel: [3722644.818594] LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. quota=on. Opts: 
Aug 20 18:00:29 atlas-oss1c7.ccs.ornl.gov kernel: [3722644.835007] LDISKFS-fs (dm-12): mounted filesystem with ordered data mode. quota=on. Opts: 
Aug 20 18:00:29 atlas-oss1c7.ccs.ornl.gov kernel: [3722644.893427] LustreError: 137-5: atlas1-OST02e6_UUID: not available for connect from 677@gni3 (no target)
Aug 20 18:00:29 atlas-oss1c7.ccs.ornl.gov kernel: [3722644.922920] LustreError: Skipped 20 previous similar messages
Aug 20 18:00:29 atlas-oss1c7.ccs.ornl.gov kernel: [3722645.102858] Lustre: atlas1-OST0016: Not available for connect from 10.36.247.163@o2ib (not set up)
Aug 20 18:00:29 atlas-oss1c7.ccs.ornl.gov kernel: [3722645.133065] Lustre: Skipped 4 previous similar messages
Aug 20 18:00:29 atlas-oss1c7.ccs.ornl.gov kernel: [3722645.433167] LDISKFS-fs (dm-7): mounted filesystem with ordered data mode. quota=on. Opts: 
Aug 20 18:00:31 atlas-oss1c7.ccs.ornl.gov kernel: [3722646.746614] Lustre: atlas1-OST0016: Will be in recovery for at least 30:00, or until 20148 clients reconnect
Aug 20 18:00:31 atlas-oss1c7.ccs.ornl.gov kernel: [3722646.748392] LustreError: 137-5: atlas1-OST0256_UUID: not available for connect from 681@gni3 (no target)
Aug 20 18:00:31 atlas-oss1c7.ccs.ornl.gov kernel: [3722646.748394] LustreError: Skipped 4 previous similar messages
Aug 20 18:00:31 atlas-oss1c7.ccs.ornl.gov kernel: [3722646.923448] Lustre: atlas1-OST00a6: Not available for connect from 555@gni3 (not set up)
Aug 20 18:00:32 atlas-oss1c7.ccs.ornl.gov kernel: [3722647.979493] Lustre: atlas1-OST00a6: Will be in recovery for at least 30:00, or until 20062 clients reconnect
Aug 20 18:00:33 atlas-oss1c7.ccs.ornl.gov kernel: [3722649.135414] LustreError: 137-5: atlas1-OST01c6_UUID: not available for connect from 553@gni3 (no target)
Aug 20 18:00:33 atlas-oss1c7.ccs.ornl.gov kernel: [3722649.164550] LustreError: Skipped 25 previous similar messages
Aug 20 18:00:33 atlas-oss1c7.ccs.ornl.gov kernel: [3722649.310013] Lustre: atlas1-OST0136: Not available for connect from 10.38.144.30@o2ib4 (not set up)
Aug 20 18:00:33 atlas-oss1c7.ccs.ornl.gov kernel: [3722649.334457] Lustre: Skipped 3 previous similar messages
Aug 20 18:00:35 atlas-oss1c7.ccs.ornl.gov kernel: [3722650.816808] Lustre: atlas1-OST0016: Denying connection for new client 899bb21e-1cdb-3bf7-309c-78beba69c130 (at 92@gni2), waiting for all 20148 known clients (12 recovered, 4 in progress, and 0 evicted) to recover in 29:55
Aug 20 18:00:35 atlas-oss1c7.ccs.ornl.gov kernel: [3722650.862547] Lustre: atlas1-OST0016: Denying connection for new client 17ea155d-b687-3787-8883-f9930ea50864 (at 73@gni2), waiting for all 20148 known clients (18 recovered, 5 in progress, and 0 evicted) to recover in 29:55
Aug 20 18:00:35 atlas-oss1c7.ccs.ornl.gov kernel: [3722650.862550] Lustre: Skipped 1 previous similar message
Aug 20 18:00:35 atlas-oss1c7.ccs.ornl.gov kernel: [3722650.965441] Lustre: Skipped 3 previous similar messages
Aug 20 18:00:35 atlas-oss1c7.ccs.ornl.gov kernel: [3722651.441484] Lustre: atlas1-OST0136: Will be in recovery for at least 30:00, or until 20221 clients reconnect
Aug 20 18:00:35 atlas-oss1c7.ccs.ornl.gov kernel: [3722651.585793] Lustre: atlas1-OST0016: Denying connection for new client 2c9772ac-eff3-9c57-623c-4d225feccb26 (at 36@gni2), waiting for all 20148 known clients (126 recovered, 8 in progress, and 0 evicted) to recover in 29:55
Aug 20 18:00:35 atlas-oss1c7.ccs.ornl.gov kernel: [3722651.646244] Lustre: Skipped 29 previous similar messages
Aug 20 18:00:36 atlas-oss1c7.ccs.ornl.gov kernel: [3722652.594342] Lustre: atlas1-OST0016: Denying connection for new client 58f1aefb-573f-117f-76d7-a5867625ce55 (at 65@gni2), waiting for all 20148 known clients (269 recovered, 14 in progress, and 0 evicted) to recover in 29:54
Aug 20 18:00:36 atlas-oss1c7.ccs.ornl.gov kernel: [3722652.656451] Lustre: Skipped 28 previous similar messages
Aug 20 18:00:37 atlas-oss1c7.ccs.ornl.gov kernel: [3722652.952807] Lustre: atlas1-OST01c6: Will be in recovery for at least 30:00, or until 20221 clients reconnect
Aug 20 18:00:37 atlas-oss1c7.ccs.ornl.gov kernel: [3722653.266721] LustreError: 137-5: atlas1-OST02e6_UUID: not available for connect from 10.38.146.39@o2ib4 (no target)
Aug 20 18:00:37 atlas-oss1c7.ccs.ornl.gov kernel: [3722653.296701] LustreError: Skipped 42 previous similar messages
Aug 20 18:00:37 atlas-oss1c7.ccs.ornl.gov kernel: [3722653.761149] Lustre: atlas1-OST0256: Not available for connect from 10.38.145.8@o2ib4 (not set up)
Aug 20 18:00:37 atlas-oss1c7.ccs.ornl.gov kernel: [3722653.777688] Lustre: Skipped 11 previous similar messages
Aug 20 18:00:38 atlas-oss1c7.ccs.ornl.gov kernel: [3722654.685659] Lustre: atlas1-OST0016: Denying connection for new client d4cf6adc-a67e-8015-ddff-ec399f0b6f1b (at 42@gni2), waiting for all 20148 known clients (542 recovered, 15 in progress, and 0 evicted) to recover in 29:52
Aug 20 18:00:38 atlas-oss1c7.ccs.ornl.gov kernel: [3722654.747399] Lustre: Skipped 62 previous similar messages
Aug 20 18:00:40 atlas-oss1c7.ccs.ornl.gov kernel: [3722656.169778] Lustre: atlas1-OST02e6: Will be in recovery for at least 30:00, or until 20221 clients reconnect
Aug 20 18:00:40 atlas-oss1c7.ccs.ornl.gov kernel: [3722656.197613] Lustre: Skipped 1 previous similar message
Aug 20 18:03:09 atlas-oss1c7.ccs.ornl.gov kernel: [3722805.734089] Lustre: atlas1-OST00a6: Denying connection for new client 3acfe0b6-fc7c-e2d7-81fd-f880317b2357 (at 10.38.146.28@o2ib4), waiting for all 20062 known clients (1140 recovered, 165 in progress, and 0 evicted) to recover in 27:22
Aug 20 18:03:09 atlas-oss1c7.ccs.ornl.gov kernel: [3722805.803983] Lustre: Skipped 34 previous similar messages
Aug 20 18:03:18 atlas-oss1c7.ccs.ornl.gov kernel: [3722814.284278] Lustre: atlas1-OST00a6: Denying connection for new client dda2d5ca-0ae1-7c9c-120d-2cb2514b4aa8 (at 10.38.146.29@o2ib4), waiting for all 20062 known clients (1144 recovered, 165 in progress, and 0 evicted) to recover in 27:13
Aug 20 18:03:18 atlas-oss1c7.ccs.ornl.gov kernel: [3722814.347913] Lustre: Skipped 2 previous similar messages
Aug 20 18:03:38 atlas-oss1c7.ccs.ornl.gov kernel: [3722834.724828] Lustre: atlas1-OST00a6: Denying connection for new client 17f56069-1f4c-2fa7-cb8a-1971da8fc5a1 (at 10.38.146.27@o2ib4), waiting for all 20062 known clients (1147 recovered, 165 in progress, and 0 evicted) to recover in 26:53
Aug 20 18:03:38 atlas-oss1c7.ccs.ornl.gov kernel: [3722834.794953] Lustre: Skipped 159 previous similar messages
Aug 20 18:04:31 atlas-oss1c7.ccs.ornl.gov kernel: [3722887.081328] Lustre: atlas1-OST0016: Denying connection for new client 1942a1b8-14c2-1c85-f1cc-f5a627755ef9 (at 10.38.145.2@o2ib4), waiting for all 20148 known clients (940 recovered, 101 in progress, and 0 evicted) to recover in 25:59
Aug 20 18:05:39 atlas-oss1c7.ccs.ornl.gov kernel: [3722955.796371] Lustre: atlas1-OST00a6: Denying connection for new client 3acfe0b6-fc7c-e2d7-81fd-f880317b2357 (at 10.38.146.28@o2ib4), waiting for all 20062 known clients (1210 recovered, 169 in progress, and 0 evicted) to recover in 24:52
Aug 20 18:05:39 atlas-oss1c7.ccs.ornl.gov kernel: [3722955.870681] Lustre: Skipped 6 previous similar messages
Aug 20 18:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3722966.949279] Lustre: atlas1-OST00a6: Client 271926ed-0a7d-29c8-d161-eba4df8dd7be (at 10.36.207.248@o2ib) reconnecting, waiting for 20062 clients in recovery for 24:41
Aug 20 18:05:51 atlas-oss1c7.ccs.ornl.gov kernel: [3722966.986518] Lustre: atlas1-OST00a6: Client 271926ed-0a7d-29c8-d161-eba4df8dd7be (at 10.36.207.248@o2ib) refused reconnection, still busy with 1 active RPCs
Aug 20 18:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3722969.207172] Lustre: atlas1-OST0016: Client 430e121f-17ca-29e2-8cf0-27a38cd1ec74 (at 10.36.207.217@o2ib) reconnecting, waiting for 20148 clients in recovery for 24:37
Aug 20 18:05:53 atlas-oss1c7.ccs.ornl.gov kernel: [3722969.247172] Lustre: atlas1-OST0016: Client 430e121f-17ca-29e2-8cf0-27a38cd1ec74 (at 10.36.207.217@o2ib) refused reconnection, still busy with 1 active RPCs
Aug 20 18:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3722970.072045] Lustre: atlas1-OST0136: Client 08f96cbf-e4e7-e945-68e0-c9d85efe31b1 (at 10.36.207.204@o2ib) reconnecting, waiting for 20221 clients in recovery for 24:41
Aug 20 18:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3722970.072049] Lustre: atlas1-OST00a6: Client 08f96cbf-e4e7-e945-68e0-c9d85efe31b1 (at 10.36.207.204@o2ib) refused reconnection, still busy with 1 active RPCs
Aug 20 18:05:54 atlas-oss1c7.ccs.ornl.gov kernel: [3722970.157037] Lustre: Skipped 1 previous similar message
Aug 20 18:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3722971.212577] Lustre: atlas1-OST00a6: Client 791275a1-8261-a196-b311-1c3b3540f274 (at 10.36.207.247@o2ib) reconnecting, waiting for 20062 clients in recovery for 24:36
Aug 20 18:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3722971.248092] Lustre: Skipped 1 previous similar message
Aug 20 18:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3722971.267967] Lustre: atlas1-OST00a6: Client 791275a1-8261-a196-b311-1c3b3540f274 (at 10.36.207.247@o2ib) refused reconnection, still busy with 1 active RPCs
Aug 20 18:05:55 atlas-oss1c7.ccs.ornl.gov kernel: [3722971.317274] Lustre: Skipped 2 previous similar messages
Aug 20 18:05:57 atlas-oss1c7.ccs.ornl.gov kernel: [3722973.223301] Lustre: atlas1-OST00a6: Client 42407212-49f4-56f5-8e29-5206c02520e7 (at 10.36.207.251@o2ib) reconnecting, waiting for 20062 clients in recovery for 24:34
Aug 20 18:05:57 atlas-oss1c7.ccs.ornl.gov kernel: [3722973.267279] Lustre: Skipped 11 previous similar messages
Aug 20 18:05:57 atlas-oss1c7.ccs.ornl.gov kernel: [3722973.278793] Lustre: atlas1-OST00a6: Client 42407212-49f4-56f5-8e29-5206c02520e7 (at 10.36.207.251@o2ib) refused reconnection, still busy with 1 active RPCs
Aug 20 18:05:57 atlas-oss1c7.ccs.ornl.gov kernel: [3722973.328206] Lustre: Skipped 11 previous similar messages
Aug 20 18:06:02 atlas-oss1c7.ccs.ornl.gov kernel: [3722977.860745] Lustre: atlas1-OST01c6: Client 48524258-e1df-2f01-185e-78a66aa9ab78 (at 10.36.207.245@o2ib) reconnecting, waiting for 20221 clients in recovery for 24:35
Aug 20 18:06:02 atlas-oss1c7.ccs.ornl.gov kernel: [3722977.909024] Lustre: Skipped 11 previous similar messages
Aug 20 18:06:02 atlas-oss1c7.ccs.ornl.gov kernel: [3722977.920601] Lustre: atlas1-OST01c6: Client 48524258-e1df-2f01-185e-78a66aa9ab78 (at 10.36.207.245@o2ib) refused reconnection, still busy with 1 active RPCs
Aug 20 18:06:02 atlas-oss1c7.ccs.ornl.gov kernel: [3722977.969794] Lustre: Skipped 11 previous similar messages
Aug 20 18:06:11 atlas-oss1c7.ccs.ornl.gov kernel: [3722986.887139] Lustre: atlas1-OST02e6: Client 48524258-e1df-2f01-185e-78a66aa9ab78 (at 10.36.207.245@o2ib) reconnecting, waiting for 20221 clients in recovery for 24:29
Aug 20 18:06:11 atlas-oss1c7.ccs.ornl.gov kernel: [3722986.887146] Lustre: atlas1-OST0256: Client 48524258-e1df-2f01-185e-78a66aa9ab78 (at 10.36.207.245@o2ib) refused reconnection, still busy with 1 active RPCs
Aug 20 18:06:11 atlas-oss1c7.ccs.ornl.gov kernel: [3722986.887149] Lustre: Skipped 48 previous similar messages
Aug 20 18:06:11 atlas-oss1c7.ccs.ornl.gov kernel: [3722986.993683] Lustre: Skipped 50 previous similar messages
Aug 20 18:06:27 atlas-oss1c7.ccs.ornl.gov kernel: [3723003.788220] Lustre: atlas1-OST00a6: Client 06b7b1ff-4b62-b87a-dd8e-c7539607434b (at 10.36.207.211@o2ib) reconnecting, waiting for 20062 clients in recovery for 24:04
Aug 20 18:06:27 atlas-oss1c7.ccs.ornl.gov kernel: [3723003.830370] Lustre: Skipped 34 previous similar messages
Aug 20 18:06:27 atlas-oss1c7.ccs.ornl.gov kernel: [3723003.850195] Lustre: atlas1-OST00a6: Client 06b7b1ff-4b62-b87a-dd8e-c7539607434b (at 10.36.207.211@o2ib) refused reconnection, still busy with 1 active RPCs
Aug 20 18:06:28 atlas-oss1c7.ccs.ornl.gov kernel: [3723003.899847] Lustre: Skipped 35 previous similar messages
Aug 20 18:07:02 atlas-oss1c7.ccs.ornl.gov kernel: [3723037.938279] Lustre: atlas1-OST0136: Client 3635d171-7c3c-f835-885d-e1fbbb639851 (at 22@gni2) reconnecting, waiting for 20221 clients in recovery for 23:33
Aug 20 18:07:02 atlas-oss1c7.ccs.ornl.gov kernel: [3723037.973074] Lustre: Skipped 16 previous similar messages
Aug 20 18:07:02 atlas-oss1c7.ccs.ornl.gov kernel: [3723037.992567] Lustre: atlas1-OST0136: Client 3635d171-7c3c-f835-885d-e1fbbb639851 (at 22@gni2) refused reconnection, still busy with 1 active RPCs
Aug 20 18:07:02 atlas-oss1c7.ccs.ornl.gov kernel: [3723038.033112] Lustre: Skipped 16 previous similar messages
Aug 20 18:07:49 atlas-oss1c7.ccs.ornl.gov kernel: [3723085.174719] Lustre: atlas1-OST00a6: Denying connection for new client 0601a537-dbff-8201-9a0c-01ff9d2809a3 (at 10.38.146.45@o2ib4), waiting for all 20062 known clients (1210 recovered, 169 in progress, and 0 evicted) to recover in 22:42
Aug 20 18:07:49 atlas-oss1c7.ccs.ornl.gov kernel: [3723085.231466] Lustre: Skipped 170 previous similar messages
Aug 20 18:08:36 atlas-oss1c7.ccs.ornl.gov kernel: [3723132.828571] Lustre: atlas1-OST00a6: Client 6fdda7f5-5c50-83a8-b1be-51091bf27a94 (at 10.36.207.210@o2ib) reconnecting, waiting for 20062 clients in recovery for 21:55
Aug 20 18:08:36 atlas-oss1c7.ccs.ornl.gov kernel: [3723132.830157] Lustre: atlas1-OST01c6: Client 6fdda7f5-5c50-83a8-b1be-51091bf27a94 (at 10.36.207.210@o2ib) refused reconnection, still busy with 1 active RPCs
Aug 20 18:08:36 atlas-oss1c7.ccs.ornl.gov kernel: [3723132.830160] Lustre: Skipped 399 previous similar messages
Aug 20 18:08:36 atlas-oss1c7.ccs.ornl.gov kernel: [3723132.939078] Lustre: Skipped 401 previous similar messages
Aug 20 18:10:48 atlas-oss1c7.ccs.ornl.gov kernel: [3723264.455086] Lustre: atlas1-OST0136: Client c84e9dc8-8013-4d66-16d6-ef6754d4bc07 (at 74@gni2) reconnecting, waiting for 20221 clients in recovery for 19:47
Aug 20 18:10:48 atlas-oss1c7.ccs.ornl.gov kernel: [3723264.456793] Lustre: atlas1-OST01c6: Client c84e9dc8-8013-4d66-16d6-ef6754d4bc07 (at 74@gni2) refused reconnection, still busy with 1 active RPCs
Aug 20 18:10:48 atlas-oss1c7.ccs.ornl.gov kernel: [3723264.456796] Lustre: Skipped 563 previous similar messages
Aug 20 18:10:48 atlas-oss1c7.ccs.ornl.gov kernel: [3723264.549200] Lustre: Skipped 565 previous similar messages
Aug 20 18:12:32 atlas-oss1c7.ccs.ornl.gov kernel: [3723368.267347] Lustre: atlas1-OST00a6: Denying connection for new client fcf37d07-c39a-da0f-e1a9-3123637bccdb (at 10.38.146.47@o2ib4), waiting for all 20062 known clients (19880 recovered, 169 in progress, and 0 evicted) to recover in 17:59
Aug 20 18:12:32 atlas-oss1c7.ccs.ornl.gov kernel: [3723368.337071] Lustre: Skipped 335 previous similar messages
Aug 20 18:15:05 atlas-oss1c7.ccs.ornl.gov kernel: [3723521.503410] Lustre: atlas1-OST0016: Client f997a435-f741-2177-64dd-5600697ced5d (at 10.36.202.172@o2ib) reconnecting, waiting for 20148 clients in recovery for 15:25
Aug 20 18:15:05 atlas-oss1c7.ccs.ornl.gov kernel: [3723521.503426] Lustre: atlas1-OST01c6: Client f997a435-f741-2177-64dd-5600697ced5d (at 10.36.202.172@o2ib) refused reconnection, still busy with 1 active RPCs
Aug 20 18:15:05 atlas-oss1c7.ccs.ornl.gov kernel: [3723521.503428] Lustre: Skipped 1491 previous similar messages
Aug 20 18:15:05 atlas-oss1c7.ccs.ornl.gov kernel: [3723521.615376] Lustre: Skipped 1487 previous similar messages
Aug 20 18:21:08 atlas-oss1c7.ccs.ornl.gov kernel: [3723885.129467] Lustre: atlas1-OST00a6: Denying connection for new client 17f56069-1f4c-2fa7-cb8a-1971da8fc5a1 (at 10.38.146.27@o2ib4), waiting for all 20062 known clients (19880 recovered, 169 in progress, and 0 evicted) to recover in 9:23
Aug 20 18:21:08 atlas-oss1c7.ccs.ornl.gov kernel: [3723885.193326] Lustre: Skipped 681 previous similar messages
Aug 20 18:23:41 atlas-oss1c7.ccs.ornl.gov kernel: [3724037.615656] Lustre: atlas1-OST0016: Client 7c582689-5c88-37ea-9012-406c7fbb86c4 (at 10.36.207.219@o2ib) reconnecting, waiting for 20148 clients in recovery for 6:49
Aug 20 18:23:41 atlas-oss1c7.ccs.ornl.gov kernel: [3724037.616626] Lustre: atlas1-OST00a6: Client 7c582689-5c88-37ea-9012-406c7fbb86c4 (at 10.36.207.219@o2ib) refused reconnection, still busy with 1 active RPCs
Aug 20 18:23:41 atlas-oss1c7.ccs.ornl.gov kernel: [3724037.616629] Lustre: Skipped 3238 previous similar messages
Aug 20 18:23:41 atlas-oss1c7.ccs.ornl.gov kernel: [3724037.721673] Lustre: Skipped 3239 previous similar messages
Aug 20 18:26:49 atlas-oss1c7.ccs.ornl.gov kernel: [3724225.769002] LustreError: 137-5: atlas1-OST0132_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 18:26:49 atlas-oss1c7.ccs.ornl.gov kernel: [3724225.793006] LustreError: Skipped 346 previous similar messages
Aug 20 18:30:30 atlas-oss1c7.ccs.ornl.gov kernel: [3724447.427313] Lustre: atlas1-OST0016: recovery is timed out, evict stale exports
Aug 20 18:30:31 atlas-oss1c7.ccs.ornl.gov kernel: [3724447.449024] Lustre: atlas1-OST0016: disconnecting 88 stale clients
Aug 20 18:30:32 atlas-oss1c7.ccs.ornl.gov kernel: [3724448.660750] Lustre: atlas1-OST00a6: recovery is timed out, evict stale exports
Aug 20 18:30:32 atlas-oss1c7.ccs.ornl.gov kernel: [3724448.680012] Lustre: atlas1-OST00a6: disconnecting 13 stale clients
Aug 20 18:30:35 atlas-oss1c7.ccs.ornl.gov kernel: [3724452.122154] Lustre: atlas1-OST0136: recovery is timed out, evict stale exports
Aug 20 18:30:35 atlas-oss1c7.ccs.ornl.gov kernel: [3724452.141508] Lustre: atlas1-OST0136: disconnecting 80 stale clients
Aug 20 18:30:37 atlas-oss1c7.ccs.ornl.gov kernel: [3724453.633632] Lustre: atlas1-OST01c6: recovery is timed out, evict stale exports
Aug 20 18:30:37 atlas-oss1c7.ccs.ornl.gov kernel: [3724453.652085] Lustre: atlas1-OST01c6: disconnecting 80 stale clients
Aug 20 18:30:39 atlas-oss1c7.ccs.ornl.gov kernel: [3724455.613588] Lustre: atlas1-OST0256: deleting orphan objects from 0x0:5858697 to 0x0:5858785
Aug 20 18:30:39 atlas-oss1c7.ccs.ornl.gov kernel: [3724455.894465] Lustre: atlas1-OST0256: Recovery over after 30:01, of 20141 clients 20140 recovered and 1 was evicted.
Aug 20 18:30:40 atlas-oss1c7.ccs.ornl.gov kernel: [3724456.565268] Lustre: atlas1-OST0256: Client 2d9bc563-a9a0-ebd5-c53c-e4fde6141a67 (at 10.36.247.141@o2ib) reconnecting
Aug 20 18:30:40 atlas-oss1c7.ccs.ornl.gov kernel: [3724456.850832] Lustre: atlas1-OST02e6: recovery is timed out, evict stale exports
Aug 20 18:30:40 atlas-oss1c7.ccs.ornl.gov kernel: [3724456.870462] Lustre: Skipped 1 previous similar message
Aug 20 18:30:40 atlas-oss1c7.ccs.ornl.gov kernel: [3724456.892253] Lustre: atlas1-OST02e6: disconnecting 80 stale clients
Aug 20 18:30:40 atlas-oss1c7.ccs.ornl.gov kernel: [3724456.910481] Lustre: Skipped 1 previous similar message
Aug 20 18:30:41 atlas-oss1c7.ccs.ornl.gov kernel: [3724457.995079] Lustre: atlas1-OST0256: Client 70ec26d6-748d-b3a9-160c-223d1253d455 (at 10.36.247.161@o2ib) reconnecting
Aug 20 18:30:42 atlas-oss1c7.ccs.ornl.gov kernel: [3724458.533820] Lustre: atlas1-OST0256: Client 8740c462-c2a0-e4fd-9ed9-65454c096c02 (at 10.36.202.170@o2ib) reconnecting
Aug 20 18:30:45 atlas-oss1c7.ccs.ornl.gov kernel: [3724462.371225] Lustre: atlas1-OST0376: deleting orphan objects from 0x0:6029152 to 0x0:6029249
Aug 20 18:30:46 atlas-oss1c7.ccs.ornl.gov kernel: [3724462.646208] Lustre: atlas1-OST0376: Recovery over after 30:02, of 20141 clients 20140 recovered and 1 was evicted.
Aug 20 18:30:48 atlas-oss1c7.ccs.ornl.gov kernel: [3724464.891994] Lustre: atlas1-OST0256: Client c84e9dc8-8013-4d66-16d6-ef6754d4bc07 (at 74@gni2) reconnecting
Aug 20 18:30:48 atlas-oss1c7.ccs.ornl.gov kernel: [3724464.923371] Lustre: Skipped 1 previous similar message
Aug 20 18:30:50 atlas-oss1c7.ccs.ornl.gov kernel: [3724466.920169] Lustre: atlas1-OST0256: Client c3f43b46-e37e-a161-6de6-25e4e9c02d21 (at 59@gni2) reconnecting
Aug 20 18:30:50 atlas-oss1c7.ccs.ornl.gov kernel: [3724466.943909] Lustre: Skipped 61 previous similar messages
Aug 20 18:30:54 atlas-oss1c7.ccs.ornl.gov kernel: [3724470.952647] Lustre: atlas1-OST0256: Client a0c02c7f-2f8b-b64f-08a2-a95cab971755 (at 45@gni2) reconnecting
Aug 20 18:30:54 atlas-oss1c7.ccs.ornl.gov kernel: [3724470.975803] Lustre: Skipped 92 previous similar messages
Aug 20 18:31:06 atlas-oss1c7.ccs.ornl.gov kernel: [3724483.352734] Lustre: atlas1-OST0256: Client 6fdda7f5-5c50-83a8-b1be-51091bf27a94 (at 10.36.207.210@o2ib) reconnecting
Aug 20 18:31:06 atlas-oss1c7.ccs.ornl.gov kernel: [3724483.380846] Lustre: Skipped 7 previous similar messages
Aug 20 18:31:28 atlas-oss1c7.ccs.ornl.gov kernel: [3724505.067301] Lustre: atlas1-OST0016: Denying connection for new client 1942a1b8-14c2-1c85-f1cc-f5a627755ef9 (at 10.38.145.2@o2ib4), waiting for all 20148 known clients (20058 recovered, 2 in progress, and 88 evicted) to recover in 4:20
Aug 20 18:31:28 atlas-oss1c7.ccs.ornl.gov kernel: [3724505.129466] Lustre: Skipped 684 previous similar messages
Aug 20 18:31:34 atlas-oss1c7.ccs.ornl.gov kernel: [3724511.058030] Lustre: atlas1-OST0256: Client ab4f6757-0a23-d306-47b6-4b76d0f7c455 (at 10.36.247.131@o2ib) reconnecting
Aug 20 18:31:34 atlas-oss1c7.ccs.ornl.gov kernel: [3724511.081573] Lustre: Skipped 5 previous similar messages
Aug 20 18:31:57 atlas-oss1c7.ccs.ornl.gov kernel: [3724533.729773] Lustre: atlas1-OST00a6: recovery is timed out, evict stale exports
Aug 20 18:31:57 atlas-oss1c7.ccs.ornl.gov kernel: [3724533.749814] Lustre: Skipped 1 previous similar message
Aug 20 18:31:57 atlas-oss1c7.ccs.ornl.gov kernel: [3724533.772105] Lustre: atlas1-OST00a6: disconnecting 1 stale clients
Aug 20 18:31:57 atlas-oss1c7.ccs.ornl.gov kernel: [3724533.789891] Lustre: Skipped 1 previous similar message
Aug 20 18:32:02 atlas-oss1c7.ccs.ornl.gov kernel: [3724538.925337] Lustre: atlas1-OST0136: deleting orphan objects from 0x0:5948436 to 0x0:5948609
Aug 20 18:32:02 atlas-oss1c7.ccs.ornl.gov kernel: [3724539.227522] Lustre: atlas1-OST0136: Recovery over after 31:27, of 20221 clients 20140 recovered and 81 were evicted.
Aug 20 18:32:02 atlas-oss1c7.ccs.ornl.gov kernel: [3724539.250769] LustreError: 14216:0:(ldlm_resource.c:1165:ldlm_resource_get()) atlas1-OST0136: lvbo_init failed for resource 0x5ac40d:0x0: rc = -2
Aug 20 18:32:06 atlas-oss1c7.ccs.ornl.gov kernel: [3724543.406996] Lustre: atlas1-OST0136: Client fb17880b-8acf-aa1a-35fb-70a1e0885903 (at 10.36.207.225@o2ib) reconnecting
Aug 20 18:32:06 atlas-oss1c7.ccs.ornl.gov kernel: [3724543.433510] Lustre: Skipped 26 previous similar messages
Aug 20 18:33:10 atlas-oss1c7.ccs.ornl.gov kernel: [3724607.446732] Lustre: atlas1-OST0256: Client 12fb1411-e187-2143-d0db-868ac4d6e9c6 (at 10.36.247.142@o2ib) reconnecting
Aug 20 18:33:10 atlas-oss1c7.ccs.ornl.gov kernel: [3724607.477943] Lustre: Skipped 138 previous similar messages
Aug 20 18:33:41 atlas-oss1c7.ccs.ornl.gov kernel: [3724637.858394] Lustre: atlas1-OST00a6: Client 7c582689-5c88-37ea-9012-406c7fbb86c4 (at 10.36.207.219@o2ib) reconnecting, waiting for 20062 clients in recovery for 3:34
Aug 20 18:33:41 atlas-oss1c7.ccs.ornl.gov kernel: [3724637.858399] Lustre: atlas1-OST02e6: Client 7c582689-5c88-37ea-9012-406c7fbb86c4 (at 10.36.207.219@o2ib) refused reconnection, still busy with 1 active RPCs
Aug 20 18:33:41 atlas-oss1c7.ccs.ornl.gov kernel: [3724637.858403] Lustre: Skipped 2845 previous similar messages
Aug 20 18:33:41 atlas-oss1c7.ccs.ornl.gov kernel: [3724637.969573] Lustre: Skipped 2845 previous similar messages
Aug 20 18:35:19 atlas-oss1c7.ccs.ornl.gov kernel: [3724735.818188] Lustre: atlas1-OST0136: Client 0601a537-dbff-8201-9a0c-01ff9d2809a3 (at 10.38.146.45@o2ib4) reconnecting
Aug 20 18:35:19 atlas-oss1c7.ccs.ornl.gov kernel: [3724735.856597] Lustre: Skipped 189 previous similar messages
Aug 20 18:35:41 atlas-oss1c7.ccs.ornl.gov kernel: [3724758.216968] Lustre: atlas1-OST00a6: deleting orphan objects from 0x0:6351625 to 0x0:6351649
Aug 20 18:35:42 atlas-oss1c7.ccs.ornl.gov kernel: [3724758.585563] Lustre: atlas1-OST00a6: Recovery over after 35:10, of 20062 clients 20048 recovered and 14 were evicted.
Aug 20 18:35:42 atlas-oss1c7.ccs.ornl.gov kernel: [3724758.605857] LustreError: 14234:0:(ldlm_resource.c:1165:ldlm_resource_get()) atlas1-OST00a6: lvbo_init failed for resource 0x60eb03:0x0: rc = -2
Aug 20 18:35:58 atlas-oss1c7.ccs.ornl.gov kernel: [3724774.586422] Lustre: atlas1-OST02e6: recovery is timed out, evict stale exports
Aug 20 18:35:58 atlas-oss1c7.ccs.ornl.gov kernel: [3724774.601444] Lustre: Skipped 1 previous similar message
Aug 20 18:35:58 atlas-oss1c7.ccs.ornl.gov kernel: [3724774.623499] Lustre: atlas1-OST02e6: disconnecting 1 stale clients
Aug 20 18:35:58 atlas-oss1c7.ccs.ornl.gov kernel: [3724774.641160] Lustre: Skipped 1 previous similar message
Aug 20 18:35:58 atlas-oss1c7.ccs.ornl.gov kernel: [3724774.826092] Lustre: atlas1-OST02e6: deleting orphan objects from 0x0:5599514 to 0x0:5599553
Aug 20 18:35:58 atlas-oss1c7.ccs.ornl.gov kernel: [3724775.106054] Lustre: atlas1-OST02e6: Recovery over after 35:18, of 20221 clients 20140 recovered and 81 were evicted.
Aug 20 18:36:40 atlas-oss1c7.ccs.ornl.gov kernel: [3724817.126570] LustreError: 137-5: atlas1-OST0132_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 18:36:40 atlas-oss1c7.ccs.ornl.gov kernel: [3724817.157292] LustreError: Skipped 1 previous similar message
Aug 20 18:36:54 atlas-oss1c7.ccs.ornl.gov kernel: [3724830.689524] Lustre: atlas1-OST01c6: recovery is timed out, evict stale exports
Aug 20 18:36:54 atlas-oss1c7.ccs.ornl.gov kernel: [3724830.705395] Lustre: atlas1-OST01c6: disconnecting 1 stale clients
Aug 20 18:36:54 atlas-oss1c7.ccs.ornl.gov kernel: [3724830.885437] Lustre: atlas1-OST01c6: deleting orphan objects from 0x0:5977938 to 0x0:5977953
Aug 20 18:36:54 atlas-oss1c7.ccs.ornl.gov kernel: [3724831.371913] Lustre: atlas1-OST01c6: Recovery over after 36:17, of 20221 clients 20140 recovered and 81 were evicted.
Aug 20 18:38:14 atlas-oss1c7.ccs.ornl.gov kernel: [3724911.223830] Lustre: atlas1-OST0016: recovery is timed out, evict stale exports
Aug 20 18:38:14 atlas-oss1c7.ccs.ornl.gov kernel: [3724911.245024] Lustre: atlas1-OST0016: disconnecting 1 stale clients
Aug 20 18:38:14 atlas-oss1c7.ccs.ornl.gov kernel: [3724911.531492] Lustre: atlas1-OST0016: deleting orphan objects from 0x0:5635849 to 0x0:5635873
Aug 20 18:38:15 atlas-oss1c7.ccs.ornl.gov kernel: [3724911.794947] Lustre: atlas1-OST0016: Recovery over after 37:45, of 20148 clients 20059 recovered and 89 were evicted.
Aug 20 18:39:33 atlas-oss1c7.ccs.ornl.gov kernel: [3724989.720858] Lustre: atlas1-OST0136: haven't heard from client 1942a1b8-14c2-1c85-f1cc-f5a627755ef9 (at 10.38.145.2@o2ib4) in 1085 seconds. I think it's dead, and I am evicting it. exp ffff88100be73800, cur 1408574373 expire 1408573473 last 1408573288
Aug 20 18:39:35 atlas-oss1c7.ccs.ornl.gov kernel: [3724992.131113] Lustre: atlas1-OST0016: Client a99a4776-e133-c381-3625-347d03180630 (at 10.36.247.166@o2ib) reconnecting
Aug 20 18:39:35 atlas-oss1c7.ccs.ornl.gov kernel: [3724992.162875] Lustre: Skipped 494 previous similar messages
Aug 20 18:48:25 atlas-oss1c7.ccs.ornl.gov kernel: [3725522.624771] Lustre: atlas1-OST0136: Client 1942a1b8-14c2-1c85-f1cc-f5a627755ef9 (at 10.38.145.2@o2ib4) reconnecting
Aug 20 18:48:25 atlas-oss1c7.ccs.ornl.gov kernel: [3725522.654817] Lustre: Skipped 137 previous similar messages
Aug 20 18:49:46 atlas-oss1c7.ccs.ornl.gov kernel: [3725603.096213] Lustre: 33357:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-49), not sending early reply
Aug 20 18:49:46 atlas-oss1c7.ccs.ornl.gov kernel: [3725603.096215]   req@ffff880ed55cec00 x1476723490305250/t0(0) o4->72ca8e75-7239-a3a5-bda0-5c06ac977704@3443@gni109:0/0 lens 448/448 e 1 to 0 dl 1408574991 ref 2 fl Interpret:/0/0 rc 0/0
Aug 20 18:49:46 atlas-oss1c7.ccs.ornl.gov kernel: [3725603.185337] Lustre: 33357:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-49), not sending early reply
Aug 20 18:49:46 atlas-oss1c7.ccs.ornl.gov kernel: [3725603.185339]   req@ffff880d32e0ec00 x1476723490305254/t0(0) o4->72ca8e75-7239-a3a5-bda0-5c06ac977704@3443@gni109:0/0 lens 448/448 e 1 to 0 dl 1408574991 ref 2 fl Interpret:/0/0 rc 0/0
Aug 20 18:49:51 atlas-oss1c7.ccs.ornl.gov kernel: [3725608.403172] Lustre: 33070:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-49), not sending early reply
Aug 20 18:49:51 atlas-oss1c7.ccs.ornl.gov kernel: [3725608.403173]   req@ffff880fbcc85c00 x1476723492797916/t0(0) o4->f96811f0-031b-f9c1-d0e0-08300562f42d@2195@gni112:0/0 lens 448/448 e 1 to 0 dl 1408574996 ref 2 fl Interpret:/0/0 rc 0/0
Aug 20 18:49:55 atlas-oss1c7.ccs.ornl.gov kernel: [3725612.099614] Lustre: 33357:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-49), not sending early reply
Aug 20 18:49:55 atlas-oss1c7.ccs.ornl.gov kernel: [3725612.099616]   req@ffff88059cb39400 x1476723479476607/t0(0) o4->137b0810-5707-e776-88bf-bc4844f3f16b@2739@gni106:0/0 lens 448/448 e 1 to 0 dl 1408575000 ref 2 fl Interpret:/0/0 rc 0/0
Aug 20 18:50:08 atlas-oss1c7.ccs.ornl.gov kernel: [3725625.419636] Lustre: 33158:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-49), not sending early reply
Aug 20 18:50:08 atlas-oss1c7.ccs.ornl.gov kernel: [3725625.419638]   req@ffff880f78327800 x1476723497664039/t0(0) o4->b86ad8f3-7b0f-8d49-ea7c-3ebf302a856d@1905@gni109:0/0 lens 448/448 e 1 to 0 dl 1408575013 ref 2 fl Interpret:H/0/0 rc 0/0
Aug 20 18:50:15 atlas-oss1c7.ccs.ornl.gov kernel: [3725632.711487] Lustre: 14479:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-49), not sending early reply
Aug 20 18:50:15 atlas-oss1c7.ccs.ornl.gov kernel: [3725632.711488]   req@ffff8806b6365800 x1470545220923918/t0(0) o2->e7003d96-79dc-6ac6-5da0-8080eb1fecf0@10.36.205.200@o2ib:0/0 lens 408/432 e 1 to 0 dl 1408575020 ref 2 fl Interpret:/0/0 rc 0/0
Aug 20 18:50:15 atlas-oss1c7.ccs.ornl.gov kernel: [3725632.796818] Lustre: 14479:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 3 previous similar messages
Aug 20 18:50:43 atlas-oss1c7.ccs.ornl.gov kernel: [3725660.422768] Lustre: 33067:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-49), not sending early reply
Aug 20 18:50:43 atlas-oss1c7.ccs.ornl.gov kernel: [3725660.422770]   req@ffff880a91a15000 x1475535419443572/t0(0) o4->a8d18045-7eae-ed4a-5a51-f96f28e23342@10.36.202.130@o2ib:0/0 lens 488/448 e 1 to 0 dl 1408575048 ref 2 fl Interpret:H/0/0 rc 0/0
Aug 20 18:50:43 atlas-oss1c7.ccs.ornl.gov kernel: [3725660.507416] Lustre: 33067:0:(service.c:1339:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
Aug 20 18:51:16 atlas-oss1c7.ccs.ornl.gov kernel: [3725693.783664] Lustre: atlas1-OST0256: Client a8d18045-7eae-ed4a-5a51-f96f28e23342 (at 10.36.202.130@o2ib) refused reconnection, still busy with 1 active RPCs
Aug 20 18:51:16 atlas-oss1c7.ccs.ornl.gov kernel: [3725693.820084] Lustre: Skipped 521 previous similar messages
Aug 20 18:52:28 atlas-oss1c7.ccs.ornl.gov kernel: [3725765.226132] Lustre: lock timed out (enqueued at 1408574398, 750s ago)
Aug 20 18:52:28 atlas-oss1c7.ccs.ornl.gov kernel: [3725765.246432] LustreError: dumping log to /tmp/lustre-log.1408575148.15462
Aug 20 18:59:03 atlas-oss1c7.ccs.ornl.gov kernel: [3726160.965625] Lustre: atlas1-OST0256: Client atlas1-MDT0000-mdtlov_UUID (at 10.36.226.72@o2ib) reconnecting
Aug 20 18:59:03 atlas-oss1c7.ccs.ornl.gov kernel: [3726160.996530] Lustre: Skipped 10 previous similar messages
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.406819] LNet: Service thread pid 33246 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.452928] Pid: 33246, comm: ll_ost_io01_078
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.472020] 
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.472021] Call Trace:
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.482938]  [<ffffffffa023b3fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.503407]  [<ffffffff811b6580>] ? sync_buffer+0x0/0x50
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.523134]  [<ffffffff8150cda3>] io_schedule+0x73/0xc0
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.542760]  [<ffffffff811b65c0>] sync_buffer+0x40/0x50
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.562392]  [<ffffffff8150d75f>] __wait_on_bit+0x5f/0x90
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.582383]  [<ffffffff811b6580>] ? sync_buffer+0x0/0x50
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.602016]  [<ffffffff8150d808>] out_of_line_wait_on_bit+0x78/0x90
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.622242]  [<ffffffff81096fd0>] ? wake_bit_function+0x0/0x50
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.642220]  [<ffffffff811b6576>] __wait_on_buffer+0x26/0x30
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.662228]  [<ffffffffa10188c4>] ldiskfs_mb_init_cache+0x254/0xa20 [ldiskfs]
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.682812]  [<ffffffffa10191ae>] ldiskfs_mb_init_group+0x11e/0x210 [ldiskfs]
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.703567]  [<ffffffffa101936d>] ldiskfs_mb_good_group+0xcd/0x110 [ldiskfs]
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.732470]  [<ffffffffa101ab1b>] ldiskfs_mb_regular_allocator+0x19b/0x410 [ldiskfs]
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.753360]  [<ffffffffa101cbed>] ldiskfs_mb_new_blocks+0x46d/0x620 [ldiskfs]
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.782713]  [<ffffffff8113b769>] ? zone_statistics+0x99/0xc0
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.802681]  [<ffffffffa04d4839>] ldiskfs_ext_new_extent_cb+0x559/0x67c [fsfilt_ldiskfs]
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.832110]  [<ffffffffa10025ef>] ldiskfs_ext_walk_space+0x14f/0x340 [ldiskfs]
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.852879]  [<ffffffffa04d42e0>] ? ldiskfs_ext_new_extent_cb+0x0/0x67c [fsfilt_ldiskfs]
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.882538]  [<ffffffffa0fad0af>] ? qsd_op_begin+0x5f/0xb40 [lquota]
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.902914]  [<ffffffffa04d402c>] fsfilt_map_nblocks+0xcc/0xf0 [fsfilt_ldiskfs]
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.923815]  [<ffffffffa04d4150>] fsfilt_ldiskfs_map_ext_inode_pages+0x100/0x200 [fsfilt_ldiskfs]
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.962181]  [<ffffffffa04d42d5>] fsfilt_ldiskfs_map_inode_pages+0x85/0x90 [fsfilt_ldiskfs]
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726176.983660]  [<ffffffffa1035983>] ? ldiskfs_dquot_initialize+0x73/0xc0 [ldiskfs]
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.013011]  [<ffffffffa109f082>] osd_write_commit+0x302/0x610 [osd_ldiskfs]
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.033578]  [<ffffffffa1164ec4>] ofd_commitrw_write+0x684/0x11b0 [ofd]
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.062192]  [<ffffffffa1167c2d>] ofd_commitrw+0x5cd/0xbb0 [ofd]
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.082270]  [<ffffffffa044f7e5>] ? lprocfs_counter_add+0x125/0x182 [lvfs]
Aug 20 18:59:19 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.102804]  [<ffffffffa111c1d8>] obd_commitrw+0x128/0x3d0 [ost]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.122965]  [<ffffffffa11261d1>] ost_brw_write+0xea1/0x15d0 [ost]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.143080]  [<ffffffff8127d78c>] ? put_dec+0x10c/0x110
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.162884]  [<ffffffffa0ae71b0>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.183354]  [<ffffffffa112c42b>] ost_handle+0x3ecb/0x48e0 [ost]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.203465]  [<ffffffffa0b2ed8b>] ? ptlrpc_update_export_timer+0x4b/0x560 [ptlrpc]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.232870]  [<ffffffffa0b37568>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.253739]  [<ffffffffa08905de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.282277]  [<ffffffffa08a1d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.302915]  [<ffffffffa0b2e8c9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.323424]  [<ffffffff81055cc3>] ? __wake_up+0x53/0x70
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.342901]  [<ffffffffa0b388fe>] ptlrpc_main+0xace/0x1700 [ptlrpc]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.363277]  [<ffffffffa0b37e30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.383603]  [<ffffffff8100c0ca>] child_rip+0xa/0x20
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.403156]  [<ffffffffa0b37e30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.423437]  [<ffffffffa0b37e30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.443622]  [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.463124] 
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.472531] LustreError: dumping log to /tmp/lustre-log.1408575560.33246
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.540774] LNet: Service thread pid 33116 was inactive for 1201.13s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.593156] Pid: 33116, comm: ll_ost_io01_035
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.612413] 
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.612414] Call Trace:
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.623160]  [<ffffffffa02baf6c>] ? mlx4_ib_post_send+0x4fc/0x1280 [mlx4_ib]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.643871]  [<ffffffff8150eb75>] rwsem_down_failed_common+0x95/0x1d0
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.672430]  [<ffffffff8150ed06>] rwsem_down_read_failed+0x26/0x30
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.692634]  [<ffffffffa091a6c7>] ? kiblnd_post_tx_locked+0x487/0x930 [ko2iblnd]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.713569]  [<ffffffff81281ce4>] call_rwsem_down_read_failed+0x14/0x30
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.733891]  [<ffffffffa04d42e0>] ? ldiskfs_ext_new_extent_cb+0x0/0x67c [fsfilt_ldiskfs]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.763358]  [<ffffffff8150e204>] ? down_read+0x24/0x30
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.783088]  [<ffffffffa1002513>] ldiskfs_ext_walk_space+0x73/0x340 [ldiskfs]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.803830]  [<ffffffffa04d42e0>] ? ldiskfs_ext_new_extent_cb+0x0/0x67c [fsfilt_ldiskfs]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.833432]  [<ffffffffa0fad0af>] ? qsd_op_begin+0x5f/0xb40 [lquota]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.853785]  [<ffffffffa04d402c>] fsfilt_map_nblocks+0xcc/0xf0 [fsfilt_ldiskfs]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.883293]  [<ffffffffa04d4150>] fsfilt_ldiskfs_map_ext_inode_pages+0x100/0x200 [fsfilt_ldiskfs]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.913309]  [<ffffffffa04d42d5>] fsfilt_ldiskfs_map_inode_pages+0x85/0x90 [fsfilt_ldiskfs]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.943156]  [<ffffffffa1035983>] ? ldiskfs_dquot_initialize+0x73/0xc0 [ldiskfs]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.963888]  [<ffffffffa109f082>] osd_write_commit+0x302/0x610 [osd_ldiskfs]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726177.992802]  [<ffffffffa1164ec4>] ofd_commitrw_write+0x684/0x11b0 [ofd]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.013357]  [<ffffffffa1167c2d>] ofd_commitrw+0x5cd/0xbb0 [ofd]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.033486]  [<ffffffffa044f7e5>] ? lprocfs_counter_add+0x125/0x182 [lvfs]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.054094]  [<ffffffffa111c1d8>] obd_commitrw+0x128/0x3d0 [ost]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.074062]  [<ffffffffa11261d1>] ost_brw_write+0xea1/0x15d0 [ost]
Aug 20 18:59:20 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.102819]  [<ffffffffa11230ef>] ? ost_brw_read+0x68f/0x1340 [ost]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.123025]  [<ffffffffa0ae71b0>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.143521]  [<ffffffffa112c42b>] ost_handle+0x3ecb/0x48e0 [ost]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.163633]  [<ffffffffa1120aeb>] ? ost_rw_hpreq_check+0x25b/0x500 [ost]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.184195]  [<ffffffffa0b2ed8b>] ? ptlrpc_update_export_timer+0x4b/0x560 [ptlrpc]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.213650]  [<ffffffffa0b37568>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.242866]  [<ffffffffa08905de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.263247]  [<ffffffffa08a1d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.283782]  [<ffffffffa0b2e8c9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.304080]  [<ffffffff81055cc3>] ? __wake_up+0x53/0x70
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.324002]  [<ffffffffa0b388fe>] ptlrpc_main+0xace/0x1700 [ptlrpc]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.352702]  [<ffffffffa0b37e30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.372701]  [<ffffffff8100c0ca>] child_rip+0xa/0x20
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.384027]  [<ffffffffa0b37e30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.404302]  [<ffffffffa0b37e30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.432909]  [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.444151] 
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.453301] LNet: Service thread pid 14435 was inactive for 1200.45s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.512927] Pid: 14435, comm: ll_ost01_034
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.523654] 
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.523654] Call Trace:
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.542755]  [<ffffffff81130321>] ? mark_page_accessed+0x41/0x50
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.562933]  [<ffffffff811b5bf7>] ? __find_get_block+0x97/0xe0
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.582948]  [<ffffffffa061718a>] start_this_handle+0x27a/0x4a0 [jbd2]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.603388]  [<ffffffff81096f90>] ? autoremove_wake_function+0x0/0x40
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.623736]  [<ffffffffa06175b0>] jbd2_journal_start+0xd0/0x110 [jbd2]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.644120]  [<ffffffffa10ad742>] ? osd_declare_inode_qid+0x1a2/0x270 [osd_ldiskfs]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.673641]  [<ffffffffa1035546>] ldiskfs_journal_start_sb+0x56/0xe0 [ldiskfs]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.694410]  [<ffffffffa1080f8f>] osd_trans_start+0x1df/0x680 [osd_ldiskfs]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.723620]  [<ffffffffa116045d>] ofd_trans_start+0x22d/0x3f0 [ofd]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.743791]  [<ffffffffa116423c>] ofd_attr_set+0x38c/0x6c0 [ofd]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.764068]  [<ffffffffa1155de8>] ofd_setattr+0x678/0xc10 [ofd]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.784081]  [<ffffffffa1126c1c>] ost_setattr+0x31c/0x990 [ost]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.804430]  [<ffffffffa112a746>] ost_handle+0x21e6/0x48e0 [ost]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.824359]  [<ffffffffa0b2ed8b>] ? ptlrpc_update_export_timer+0x4b/0x560 [ptlrpc]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.853506]  [<ffffffffa0b37568>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.882924]  [<ffffffffa08905de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.903287]  [<ffffffffa08a1d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.923864]  [<ffffffffa0b2e8c9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.944187]  [<ffffffff81055cc3>] ? __wake_up+0x53/0x70
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.964053]  [<ffffffffa0b388fe>] ptlrpc_main+0xace/0x1700 [ptlrpc]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726178.984434]  [<ffffffffa0b37e30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.012972]  [<ffffffff8100c0ca>] child_rip+0xa/0x20
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.024104]  [<ffffffffa0b37e30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.044511]  [<ffffffffa0b37e30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.072965]  [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.084214] 
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.093530] Pid: 33057, comm: ll_ost_io01_017
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.104404] 
Aug 20 18:59:21 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.104405] Call Trace:
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.123449]  [<ffffffffa02baf6c>] ? mlx4_ib_post_send+0x4fc/0x1280 [mlx4_ib]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.144150]  [<ffffffff8150eb75>] rwsem_down_failed_common+0x95/0x1d0
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.164513]  [<ffffffff8150ed06>] rwsem_down_read_failed+0x26/0x30
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.193029]  [<ffffffffa091a6c7>] ? kiblnd_post_tx_locked+0x487/0x930 [ko2iblnd]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.214011]  [<ffffffff81281ce4>] call_rwsem_down_read_failed+0x14/0x30
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.234632]  [<ffffffffa04d42e0>] ? ldiskfs_ext_new_extent_cb+0x0/0x67c [fsfilt_ldiskfs]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.264119]  [<ffffffff8150e204>] ? down_read+0x24/0x30
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.283951]  [<ffffffffa1002513>] ldiskfs_ext_walk_space+0x73/0x340 [ldiskfs]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.313064]  [<ffffffffa04d42e0>] ? ldiskfs_ext_new_extent_cb+0x0/0x67c [fsfilt_ldiskfs]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.334339]  [<ffffffffa0fad0af>] ? qsd_op_begin+0x5f/0xb40 [lquota]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.354659]  [<ffffffffa04d402c>] fsfilt_map_nblocks+0xcc/0xf0 [fsfilt_ldiskfs]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.383983]  [<ffffffffa04d4150>] fsfilt_ldiskfs_map_ext_inode_pages+0x100/0x200 [fsfilt_ldiskfs]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.414036]  [<ffffffffa04d42d5>] fsfilt_ldiskfs_map_inode_pages+0x85/0x90 [fsfilt_ldiskfs]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.443658]  [<ffffffffa1035983>] ? ldiskfs_dquot_initialize+0x73/0xc0 [ldiskfs]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.464594]  [<ffffffffa109f082>] osd_write_commit+0x302/0x610 [osd_ldiskfs]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.493860]  [<ffffffffa1164ec4>] ofd_commitrw_write+0x684/0x11b0 [ofd]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.514233]  [<ffffffffa1167c2d>] ofd_commitrw+0x5cd/0xbb0 [ofd]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.534409]  [<ffffffffa044f7e5>] ? lprocfs_counter_add+0x125/0x182 [lvfs]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.563145]  [<ffffffffa111c1d8>] obd_commitrw+0x128/0x3d0 [ost]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.583135]  [<ffffffffa11261d1>] ost_brw_write+0xea1/0x15d0 [ost]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.603336]  [<ffffffffa11230ef>] ? ost_brw_read+0x68f/0x1340 [ost]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.623804]  [<ffffffffa0ae71b0>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.644093]  [<ffffffffa112c42b>] ost_handle+0x3ecb/0x48e0 [ost]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.664257]  [<ffffffffa1120aeb>] ? ost_rw_hpreq_check+0x25b/0x500 [ost]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.684750]  [<ffffffffa0b2ed8b>] ? ptlrpc_update_export_timer+0x4b/0x560 [ptlrpc]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.713991]  [<ffffffffa0b37568>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.743399]  [<ffffffffa08905de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.763953]  [<ffffffffa08a1d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.784408]  [<ffffffffa0b2e8c9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.804704]  [<ffffffffa0b388fe>] ptlrpc_main+0xace/0x1700 [ptlrpc]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.833438]  [<ffffffffa0b37e30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.853843]  [<ffffffff8100c0ca>] child_rip+0xa/0x20
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.873291]  [<ffffffffa0b37e30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.893513]  [<ffffffffa0b37e30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.913847]  [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Aug 20 18:59:22 atlas-oss1c7.ccs.ornl.gov kernel: [3726179.933453] 
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.324526] LNet: Service thread pid 14482 was inactive for 1200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.375905] LNet: Skipped 1 previous similar message
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.395213] Pid: 14482, comm: ll_ost01_081
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.406102] 
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.406103] Call Trace:
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.425016]  [<ffffffff81130321>] ? mark_page_accessed+0x41/0x50
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.445393]  [<ffffffff811b5bf7>] ? __find_get_block+0x97/0xe0
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.465366]  [<ffffffffa061718a>] start_this_handle+0x27a/0x4a0 [jbd2]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.485581]  [<ffffffff81096f90>] ? autoremove_wake_function+0x0/0x40
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.514649]  [<ffffffffa06175b0>] jbd2_journal_start+0xd0/0x110 [jbd2]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.534848]  [<ffffffffa10ad742>] ? osd_declare_inode_qid+0x1a2/0x270 [osd_ldiskfs]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.556080]  [<ffffffffa1035546>] ldiskfs_journal_start_sb+0x56/0xe0 [ldiskfs]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.585256]  [<ffffffffa1080f8f>] osd_trans_start+0x1df/0x680 [osd_ldiskfs]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.605808]  [<ffffffffa116045d>] ofd_trans_start+0x22d/0x3f0 [ofd]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.626126]  [<ffffffffa116423c>] ofd_attr_set+0x38c/0x6c0 [ofd]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.646290]  [<ffffffffa1155de8>] ofd_setattr+0x678/0xc10 [ofd]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.666184]  [<ffffffffa1126c1c>] ost_setattr+0x31c/0x990 [ost]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.694697]  [<ffffffffa112a746>] ost_handle+0x21e6/0x48e0 [ost]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.714775]  [<ffffffffa0b2ed8b>] ? ptlrpc_update_export_timer+0x4b/0x560 [ptlrpc]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.735688]  [<ffffffffa0b37568>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.765316]  [<ffffffffa08905de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.785725]  [<ffffffffa08a1d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.806046]  [<ffffffffa0b2e8c9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.834953]  [<ffffffff81055cc3>] ? __wake_up+0x53/0x70
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.846381]  [<ffffffffa0b388fe>] ptlrpc_main+0xace/0x1700 [ptlrpc]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.875009]  [<ffffffffa0b37e30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.895186]  [<ffffffff8100c0ca>] child_rip+0xa/0x20
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.914824]  [<ffffffffa0b37e30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.934845]  [<ffffffffa0b37e30>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.955003]  [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.966310] 
Aug 20 18:59:26 atlas-oss1c7.ccs.ornl.gov kernel: [3726183.975619] LustreError: dumping log to /tmp/lustre-log.1408575566.14482
Aug 20 18:59:36 atlas-oss1c7.ccs.ornl.gov kernel: [3726193.214152] LNet: Service thread pid 33166 was inactive for 1200.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Aug 20 18:59:36 atlas-oss1c7.ccs.ornl.gov kernel: [3726193.249111] LustreError: dumping log to /tmp/lustre-log.1408575576.33166
Aug 20 18:59:36 atlas-oss1c7.ccs.ornl.gov kernel: [3726193.300187] LNet: Service thread pid 33313 was inactive for 1200.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Aug 20 18:59:36 atlas-oss1c7.ccs.ornl.gov kernel: [3726193.339003] LustreError: dumping log to /tmp/lustre-log.1408575576.33313
Aug 20 18:59:36 atlas-oss1c7.ccs.ornl.gov kernel: [3726193.446213] LustreError: dumping log to /tmp/lustre-log.1408575576.33111
Aug 20 18:59:53 atlas-oss1c7.ccs.ornl.gov kernel: [3726210.636674] LNet: Service thread pid 33188 was inactive for 1200.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Aug 20 18:59:53 atlas-oss1c7.ccs.ornl.gov kernel: [3726210.675490] LNet: Skipped 2 previous similar messages
Aug 20 18:59:53 atlas-oss1c7.ccs.ornl.gov kernel: [3726210.695150] LustreError: dumping log to /tmp/lustre-log.1408575593.33188
Aug 20 18:59:58 atlas-oss1c7.ccs.ornl.gov kernel: [3726215.394456] LNet: Service thread pid 15462 was inactive for 1200.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Aug 20 18:59:58 atlas-oss1c7.ccs.ornl.gov kernel: [3726215.427729] LustreError: dumping log to /tmp/lustre-log.1408575598.15462
Aug 20 18:59:58 atlas-oss1c7.ccs.ornl.gov kernel: [3726215.530520] LustreError: dumping log to /tmp/lustre-log.1408575598.33153
Aug 20 19:01:16 atlas-oss1c7.ccs.ornl.gov kernel: [3726294.014670] Lustre: atlas1-OST0256: Client a8d18045-7eae-ed4a-5a51-f96f28e23342 (at 10.36.202.130@o2ib) refused reconnection, still busy with 1 active RPCs
Aug 20 19:01:16 atlas-oss1c7.ccs.ornl.gov kernel: [3726294.056613] Lustre: Skipped 12 previous similar messages
Aug 20 19:06:00 atlas-oss1c7.ccs.ornl.gov kernel: [3726577.569806] Lustre: 14482:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (654:940s); client may timeout.  req@ffff8806b6365800 x1470545220923918/t111669149961(0) o2->e7003d96-79dc-6ac6-5da0-8080eb1fecf0@10.36.205.200@o2ib:0/0 lens 408/400 e 1 to 0 dl 1408575020 ref 1 fl Complete:/0/0 rc 0/0
Aug 20 19:06:00 atlas-oss1c7.ccs.ornl.gov kernel: [3726577.665461] LNet: Service thread pid 14482 completed after 1594.19s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Aug 20 19:06:05 atlas-oss1c7.ccs.ornl.gov kernel: [3726582.312172] Lustre: 33246:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (654:952s); client may timeout.  req@ffff880ef4f1b400 x1476723497664047/t111669149952(0) o4->b86ad8f3-7b0f-8d49-ea7c-3ebf302a856d@1905@gni109:0/0 lens 448/416 e 1 to 0 dl 1408575013 ref 1 fl Complete:H/0/0 rc 0/0
Aug 20 19:06:05 atlas-oss1c7.ccs.ornl.gov kernel: [3726582.397318] LustreError: 33057:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) ### requested timeout 755, more than at_max 600
Aug 20 19:06:05 atlas-oss1c7.ccs.ornl.gov kernel: [3726582.397320]  ns: filter-atlas1-OST0256_UUID lock: ffff880329e40240/0xecd0a12120b98381 lrc: 4/0,0 mode: PW/PW res: [0x5965ef:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->208895) flags: 0x20 nid: 1905@gni109 remote: 0x200ba227782b5270 expref: 15 pid: 14426 timeout: 8020116000 lvb_type: 0
Aug 20 19:06:05 atlas-oss1c7.ccs.ornl.gov kernel: [3726582.398170] Lustre: 33246:0:(service.c:2031:ptlrpc_server_handle_request()) Skipped 1 previous similar message
Aug 20 19:06:05 atlas-oss1c7.ccs.ornl.gov kernel: [3726582.557550] LustreError: 33246:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) ### requested timeout 755, more than at_max 600
Aug 20 19:06:05 atlas-oss1c7.ccs.ornl.gov kernel: [3726582.557551]  ns: filter-atlas1-OST0256_UUID lock: ffff880329e40240/0xecd0a12120b98381 lrc: 5/0,0 mode: PW/PW res: [0x5965ef:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->208895) flags: 0x20 nid: 1905@gni109 remote: 0x200ba227782b5270 expref: 15 pid: 14426 timeout: 8020598442 lvb_type: 0
Aug 20 19:06:05 atlas-oss1c7.ccs.ornl.gov kernel: [3726582.557555] LNet: Service thread pid 33057 completed after 1605.99s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Aug 20 19:06:05 atlas-oss1c7.ccs.ornl.gov kernel: [3726583.032379] Lustre: 33166:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (654:938s); client may timeout.  req@ffff880e841fe000 x1476723497664056/t111669149964(0) o4->b86ad8f3-7b0f-8d49-ea7c-3ebf302a856d@1905@gni109:0/0 lens 448/416 e 1 to 0 dl 1408575027 ref 1 fl Complete:H/0/0 rc 0/0
Aug 20 19:06:05 atlas-oss1c7.ccs.ornl.gov kernel: [3726583.118211] Lustre: 33166:0:(service.c:2031:ptlrpc_server_handle_request()) Skipped 2 previous similar messages
Aug 20 19:06:06 atlas-oss1c7.ccs.ornl.gov kernel: [3726583.156660] LustreError: 33166:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) ### requested timeout 755, more than at_max 600
Aug 20 19:06:06 atlas-oss1c7.ccs.ornl.gov kernel: [3726583.156661]  ns: filter-atlas1-OST0256_UUID lock: ffff880329e40240/0xecd0a12120b98381 lrc: 4/0,0 mode: PW/PW res: [0x5965ef:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->208895) flags: 0x20 nid: 1905@gni109 remote: 0x200ba227782b5270 expref: 6 pid: 14426 timeout: 8020598632 lvb_type: 0
Aug 20 19:06:06 atlas-oss1c7.ccs.ornl.gov kernel: [3726583.277963] LustreError: 33166:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) Skipped 1 previous similar message
Aug 20 19:06:06 atlas-oss1c7.ccs.ornl.gov kernel: [3726583.318936] LNet: Service thread pid 33166 completed after 1589.95s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Aug 20 19:06:06 atlas-oss1c7.ccs.ornl.gov kernel: [3726583.376259] LNet: Skipped 3 previous similar messages
Aug 20 19:06:18 atlas-oss1c7.ccs.ornl.gov kernel: [3726595.522101] Lustre: 33302:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (654:987s); client may timeout.  req@ffff880d32e0ec00 x1476723490305254/t111669149976(0) o4->72ca8e75-7239-a3a5-bda0-5c06ac977704@3443@gni109:0/0 lens 448/416 e 1 to 0 dl 1408574991 ref 1 fl Complete:/0/0 rc 0/0
Aug 20 19:06:18 atlas-oss1c7.ccs.ornl.gov kernel: [3726595.522912] LNet: Service thread pid 33313 completed after 1602.07s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Aug 20 19:06:18 atlas-oss1c7.ccs.ornl.gov kernel: [3726595.522915] LNet: Skipped 2 previous similar messages
Aug 20 19:06:18 atlas-oss1c7.ccs.ornl.gov kernel: [3726595.682167] Lustre: 33302:0:(service.c:2031:ptlrpc_server_handle_request()) Skipped 3 previous similar messages
Aug 20 19:06:30 atlas-oss1c7.ccs.ornl.gov kernel: [3726607.889289] Lustre: 33153:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (654:942s); client may timeout.  req@ffff880a91a15000 x1475535419443572/t111669149978(0) o4->a8d18045-7eae-ed4a-5a51-f96f28e23342@10.36.202.130@o2ib:0/0 lens 488/416 e 1 to 0 dl 1408575048 ref 1 fl Complete:H/0/0 rc 0/0
Aug 20 19:06:30 atlas-oss1c7.ccs.ornl.gov kernel: [3726607.985601] LustreError: 33153:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) ### requested timeout 755, more than at_max 600
Aug 20 19:06:30 atlas-oss1c7.ccs.ornl.gov kernel: [3726607.985603]  ns: filter-atlas1-OST0256_UUID lock: ffff880e8736b6c0/0xecd0a12120b992ed lrc: 4/0,0 mode: PW/PW res: [0x5965f0:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->1048575) flags: 0x10020 nid: 10.36.202.130@o2ib remote: 0x3d25f9172d41d49f expref: 8 pid: 17872 timeout: 8020155000 lvb_type: 0
Aug 20 19:06:30 atlas-oss1c7.ccs.ornl.gov kernel: [3726608.116059] LustreError: 33153:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) ### Adding a lock, but the front position is scheduled in 730 seconds
Aug 20 19:06:30 atlas-oss1c7.ccs.ornl.gov kernel: [3726608.116061]  ns: filter-atlas1-OST0256_UUID lock: ffff880329e40240/0xecd0a12120b98381 lrc: 3/0,0 mode: PW/PW res: [0x5965ef:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->208895) flags: 0x20 nid: 1905@gni109 remote: 0x200ba227782b5270 expref: 5 pid: 14426 timeout: 8020599202 lvb_type: 0
Aug 20 19:06:31 atlas-oss1c7.ccs.ornl.gov kernel: [3726608.246843] LNet: Service thread pid 33153 completed after 1592.56s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Aug 20 19:06:31 atlas-oss1c7.ccs.ornl.gov kernel: [3726608.305953] LNet: Skipped 1 previous similar message
Aug 20 19:07:53 atlas-oss1c7.ccs.ornl.gov kernel: [3726690.348155] LustreError: 137-5: atlas1-OST01c2_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 19:07:53 atlas-oss1c7.ccs.ornl.gov kernel: [3726690.377664] LustreError: Skipped 1 previous similar message
Aug 20 19:08:46 atlas-oss1c7.ccs.ornl.gov kernel: [3726744.184116] LustreError: 33301:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) ### requested timeout 755, more than at_max 600
Aug 20 19:08:46 atlas-oss1c7.ccs.ornl.gov kernel: [3726744.184118]  ns: filter-atlas1-OST0256_UUID lock: ffff880e8736b6c0/0xecd0a12120b992ed lrc: 4/0,0 mode: PW/PW res: [0x5965f0:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->1048575) flags: 0x10020 nid: 10.36.202.130@o2ib remote: 0x3d25f9172d41d49f expref: 8 pid: 17872 timeout: 8020623991 lvb_type: 0
Aug 20 19:08:53 atlas-oss1c7.ccs.ornl.gov kernel: [3726751.029280] LustreError: 33100:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) ### requested timeout 755, more than at_max 600
Aug 20 19:08:53 atlas-oss1c7.ccs.ornl.gov kernel: [3726751.029282]  ns: filter-atlas1-OST0256_UUID lock: ffff880e8736b6c0/0xecd0a12120b992ed lrc: 4/0,0 mode: PW/PW res: [0x5965f0:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->1048575) flags: 0x10020 nid: 10.36.202.130@o2ib remote: 0x3d25f9172d41d49f expref: 10 pid: 17872 timeout: 8020760132 lvb_type: 0
Aug 20 19:09:02 atlas-oss1c7.ccs.ornl.gov kernel: [3726759.445829] Lustre: 15462:0:(service.c:2031:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (654:1090s); client may timeout.  req@ffff880d779dc800 x1476423544406988/t111669149980(0) o6->atlas1-MDT0000-mdtlov_UUID@10.36.226.72@o2ib:0/0 lens 664/400 e 1 to 0 dl 1408575052 ref 1 fl Complete:/0/0 rc 0/0
Aug 20 19:09:02 atlas-oss1c7.ccs.ornl.gov kernel: [3726759.534270] LNet: Service thread pid 15462 completed after 1743.93s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Aug 20 19:09:03 atlas-oss1c7.ccs.ornl.gov kernel: [3726761.192076] Lustre: atlas1-OST0256: deleting orphan objects from 0x0:5858836 to 0x0:5858849
Aug 20 19:09:57 atlas-oss1c7.ccs.ornl.gov kernel: [3726815.293878] Lustre: atlas1-OST0256: Client 924415ef-1788-c67c-ad02-d24453551566 (at 9228@gni102) reconnecting
Aug 20 19:09:57 atlas-oss1c7.ccs.ornl.gov kernel: [3726815.324839] Lustre: Skipped 27 previous similar messages
Aug 20 19:14:15 atlas-oss1c7.ccs.ornl.gov kernel: [3727073.178489] LustreError: 33055:0:(ldlm_lockd.c:460:__ldlm_add_waiting_lock()) ### requested timeout 755, more than at_max 600
Aug 20 19:14:15 atlas-oss1c7.ccs.ornl.gov kernel: [3727073.178491]  ns: filter-atlas1-OST0256_UUID lock: ffff880329e40240/0xecd0a12120b98381 lrc: 5/0,0 mode: PW/PW res: [0x5965ef:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->208895) flags: 0x20 nid: 1905@gni109 remote: 0x200ba227782b5270 expref: 6 pid: 14426 timeout: 8020599202 lvb_type: 0
Aug 20 19:14:16 atlas-oss1c7.ccs.ornl.gov kernel: [3727074.013205] LustreError: 15441:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) ### Adding a lock, but the front position is scheduled in 754 seconds
Aug 20 19:14:16 atlas-oss1c7.ccs.ornl.gov kernel: [3727074.013207]  ns: filter-atlas1-OST0256_UUID lock: ffff880329e40240/0xecd0a12120b98381 lrc: 3/0,0 mode: PW/PW res: [0x5965ef:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->208895) flags: 0x20 nid: 1905@gni109 remote: 0x200ba227782b5270 expref: 9 pid: 14426 timeout: 8021088994 lvb_type: 0
Aug 20 19:14:31 atlas-oss1c7.ccs.ornl.gov kernel: [3727089.189201] LustreError: 15475:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) ### Adding a lock, but the front position is scheduled in 739 seconds
Aug 20 19:14:31 atlas-oss1c7.ccs.ornl.gov kernel: [3727089.189203]  ns: filter-atlas1-OST0256_UUID lock: ffff880329e40240/0xecd0a12120b98381 lrc: 3/0,0 mode: PW/PW res: [0x5965ef:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->208895) flags: 0x20 nid: 1905@gni109 remote: 0x200ba227782b5270 expref: 12 pid: 14426 timeout: 8021088994 lvb_type: 0
Aug 20 19:14:31 atlas-oss1c7.ccs.ornl.gov kernel: [3727089.318722] LustreError: 15475:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) Skipped 9 previous similar messages
Aug 20 19:14:36 atlas-oss1c7.ccs.ornl.gov kernel: [3727094.197598] LustreError: 15458:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) ### Adding a lock, but the front position is scheduled in 734 seconds
Aug 20 19:14:36 atlas-oss1c7.ccs.ornl.gov kernel: [3727094.197600]  ns: filter-atlas1-OST0256_UUID lock: ffff880329e40240/0xecd0a12120b98381 lrc: 3/0,0 mode: PW/PW res: [0x5965ef:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->208895) flags: 0x20 nid: 1905@gni109 remote: 0x200ba227782b5270 expref: 12 pid: 14426 timeout: 8021088994 lvb_type: 0
Aug 20 19:14:36 atlas-oss1c7.ccs.ornl.gov kernel: [3727094.321072] LustreError: 15458:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) Skipped 7 previous similar messages
Aug 20 19:14:41 atlas-oss1c7.ccs.ornl.gov kernel: [3727099.211049] LustreError: 15441:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) ### Adding a lock, but the front position is scheduled in 729 seconds
Aug 20 19:14:41 atlas-oss1c7.ccs.ornl.gov kernel: [3727099.211051]  ns: filter-atlas1-OST0256_UUID lock: ffff880329e40240/0xecd0a12120b98381 lrc: 3/0,0 mode: PW/PW res: [0x5965ef:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->208895) flags: 0x20 nid: 1905@gni109 remote: 0x200ba227782b5270 expref: 12 pid: 14426 timeout: 8021088994 lvb_type: 0
Aug 20 19:14:49 atlas-oss1c7.ccs.ornl.gov kernel: [3727106.933576] LustreError: 15433:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) ### Adding a lock, but the front position is scheduled in 721 seconds
Aug 20 19:14:49 atlas-oss1c7.ccs.ornl.gov kernel: [3727106.933578]  ns: filter-atlas1-OST0256_UUID lock: ffff880329e40240/0xecd0a12120b98381 lrc: 3/0,0 mode: PW/PW res: [0x5965ef:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->208895) flags: 0x20 nid: 1905@gni109 remote: 0x200ba227782b5270 expref: 12 pid: 14426 timeout: 8021088994 lvb_type: 0
Aug 20 19:14:49 atlas-oss1c7.ccs.ornl.gov kernel: [3727107.065460] LustreError: 15433:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) Skipped 2 previous similar messages
Aug 20 19:15:10 atlas-oss1c7.ccs.ornl.gov kernel: [3727127.512638] LustreError: 15457:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) ### Adding a lock, but the front position is scheduled in 700 seconds
Aug 20 19:15:10 atlas-oss1c7.ccs.ornl.gov kernel: [3727127.512640]  ns: filter-atlas1-OST0256_UUID lock: ffff880329e40240/0xecd0a12120b98381 lrc: 3/0,0 mode: PW/PW res: [0x5965ef:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->208895) flags: 0x20 nid: 1905@gni109 remote: 0x200ba227782b5270 expref: 12 pid: 14426 timeout: 8021088994 lvb_type: 0
Aug 20 19:15:10 atlas-oss1c7.ccs.ornl.gov kernel: [3727127.643276] LustreError: 15457:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) Skipped 6 previous similar messages
Aug 20 19:15:30 atlas-oss1c7.ccs.ornl.gov kernel: [3727147.916792] LustreError: 15427:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) ### Adding a lock, but the front position is scheduled in 680 seconds
Aug 20 19:15:30 atlas-oss1c7.ccs.ornl.gov kernel: [3727147.916794]  ns: filter-atlas1-OST0256_UUID lock: ffff880329e40240/0xecd0a12120b98381 lrc: 3/0,0 mode: PW/PW res: [0x5965ef:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->208895) flags: 0x20 nid: 1905@gni109 remote: 0x200ba227782b5270 expref: 12 pid: 14426 timeout: 8021088994 lvb_type: 0
Aug 20 19:15:30 atlas-oss1c7.ccs.ornl.gov kernel: [3727148.041174] LustreError: 15427:0:(ldlm_lockd.c:484:__ldlm_add_waiting_lock()) Skipped 14 previous similar messages
Aug 20 19:17:20 atlas-oss1c7.ccs.ornl.gov kernel: [3727257.672232] LustreError: 137-5: atlas1-OST0132_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 19:17:20 atlas-oss1c7.ccs.ornl.gov kernel: [3727257.702954] LustreError: Skipped 1 previous similar message
Aug 20 19:17:23 atlas-oss1c7.ccs.ornl.gov kernel: [3727260.613012] LustreError: 137-5: atlas1-OST01c2_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 19:17:23 atlas-oss1c7.ccs.ornl.gov kernel: [3727260.643896] LustreError: Skipped 1 previous similar message
Aug 20 19:19:52 atlas-oss1c7.ccs.ornl.gov kernel: [3727409.735037] LustreError: 137-5: atlas1-OST00a2_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 19:21:39 atlas-oss1c7.ccs.ornl.gov kernel: [3727516.953016] Lustre: atlas1-OST00a6: Client 1942a1b8-14c2-1c85-f1cc-f5a627755ef9 (at 10.38.145.2@o2ib4) reconnecting
Aug 20 19:21:39 atlas-oss1c7.ccs.ornl.gov kernel: [3727516.981159] Lustre: Skipped 13 previous similar messages
Aug 20 19:26:47 atlas-oss1c7.ccs.ornl.gov kernel: [3727824.950619] LustreError: 137-5: atlas1-OST0372_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 19:26:47 atlas-oss1c7.ccs.ornl.gov kernel: [3727824.977901] LustreError: Skipped 1 previous similar message
Aug 20 19:29:19 atlas-oss1c7.ccs.ornl.gov kernel: [3727976.963730] LustreError: 137-5: atlas1-OST00a2_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 19:29:19 atlas-oss1c7.ccs.ornl.gov kernel: [3727976.995156] LustreError: Skipped 1 previous similar message
Aug 20 19:31:42 atlas-oss1c7.ccs.ornl.gov kernel: [3728119.847914] Lustre: atlas1-OST00a6: Client 5d5389e1-62ad-c671-5318-48ff669e4a6e (at 10.38.145.2@o2ib4) reconnecting
Aug 20 19:31:42 atlas-oss1c7.ccs.ornl.gov kernel: [3728119.889921] Lustre: Skipped 7 previous similar messages
Aug 20 19:43:40 atlas-oss1c7.ccs.ornl.gov kernel: [3728838.166084] Lustre: atlas1-OST0016: Client 5d5389e1-62ad-c671-5318-48ff669e4a6e (at 10.38.145.2@o2ib4) reconnecting
Aug 20 19:43:40 atlas-oss1c7.ccs.ornl.gov kernel: [3728838.192258] Lustre: Skipped 11 previous similar messages
Aug 20 19:50:48 atlas-oss1c7.ccs.ornl.gov kernel: [3729266.941716] LustreError: 137-5: atlas1-OST0132_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 19:55:36 atlas-oss1c7.ccs.ornl.gov kernel: [3729555.076778] Lustre: atlas1-OST0256: Client 5d5389e1-62ad-c671-5318-48ff669e4a6e (at 10.38.145.2@o2ib4) reconnecting
Aug 20 19:55:36 atlas-oss1c7.ccs.ornl.gov kernel: [3729555.103966] Lustre: Skipped 11 previous similar messages
Aug 20 19:55:55 atlas-oss1c7.ccs.ornl.gov kernel: [3729573.408711] LustreError: 137-5: atlas1-OST0252_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 19:55:55 atlas-oss1c7.ccs.ornl.gov kernel: [3729573.440359] LustreError: Skipped 2 previous similar messages
Aug 20 20:02:46 atlas-oss1c7.ccs.ornl.gov kernel: [3729985.104307] LustreError: 137-5: atlas1-OST0132_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 20:02:46 atlas-oss1c7.ccs.ornl.gov kernel: [3729985.127755] LustreError: Skipped 1 previous similar message
Aug 20 20:07:43 atlas-oss1c7.ccs.ornl.gov kernel: [3730281.964168] Lustre: atlas1-OST00a6: Client 5d5389e1-62ad-c671-5318-48ff669e4a6e (at 10.38.145.2@o2ib4) reconnecting
Aug 20 20:07:43 atlas-oss1c7.ccs.ornl.gov kernel: [3730281.989824] Lustre: Skipped 13 previous similar messages
Aug 20 20:07:55 atlas-oss1c7.ccs.ornl.gov kernel: [3730293.687913] LustreError: 137-5: atlas1-OST00a2_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 20:07:55 atlas-oss1c7.ccs.ornl.gov kernel: [3730293.714058] LustreError: Skipped 2 previous similar messages
Aug 20 20:10:13 atlas-oss1c7.ccs.ornl.gov kernel: [3730432.072771] LustreError: 137-5: atlas1-OST0012_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 20:10:13 atlas-oss1c7.ccs.ornl.gov kernel: [3730432.096592] LustreError: Skipped 3 previous similar messages
Aug 20 20:17:21 atlas-oss1c7.ccs.ornl.gov kernel: [3730859.961413] LustreError: 137-5: atlas1-OST00a2_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 20:19:37 atlas-oss1c7.ccs.ornl.gov kernel: [3730996.341499] LustreError: 137-5: atlas1-OST02e2_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 20:19:40 atlas-oss1c7.ccs.ornl.gov kernel: [3730999.374178] Lustre: atlas1-OST00a6: Client 5d5389e1-62ad-c671-5318-48ff669e4a6e (at 10.38.145.2@o2ib4) reconnecting
Aug 20 20:19:40 atlas-oss1c7.ccs.ornl.gov kernel: [3730999.401685] Lustre: Skipped 12 previous similar messages
Aug 20 20:22:28 atlas-oss1c7.ccs.ornl.gov kernel: [3731167.142688] LustreError: 137-5: atlas1-OST02e2_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 20:22:28 atlas-oss1c7.ccs.ornl.gov kernel: [3731167.165867] LustreError: Skipped 4 previous similar messages
Aug 20 20:26:46 atlas-oss1c7.ccs.ornl.gov kernel: [3731425.172788] LustreError: 137-5: atlas1-OST00a2_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 20:31:54 atlas-oss1c7.ccs.ornl.gov kernel: [3731733.553272] LustreError: 137-5: atlas1-OST01c2_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 20:31:54 atlas-oss1c7.ccs.ornl.gov kernel: [3731733.553275] LustreError: 137-5: atlas1-OST0252_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 20:31:54 atlas-oss1c7.ccs.ornl.gov kernel: [3731733.553278] LustreError: Skipped 1 previous similar message
Aug 20 20:34:25 atlas-oss1c7.ccs.ornl.gov kernel: [3731884.646353] Lustre: atlas1-OST01c6: Client 1942a1b8-14c2-1c85-f1cc-f5a627755ef9 (at 10.38.145.2@o2ib4) reconnecting
Aug 20 20:34:25 atlas-oss1c7.ccs.ornl.gov kernel: [3731884.677461] Lustre: Skipped 10 previous similar messages
Aug 20 20:43:44 atlas-oss1c7.ccs.ornl.gov kernel: [3732444.165764] LustreError: 137-5: atlas1-OST0372_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 20:43:44 atlas-oss1c7.ccs.ornl.gov kernel: [3732444.189576] LustreError: Skipped 7 previous similar messages
Aug 20 20:46:22 atlas-oss1c7.ccs.ornl.gov kernel: [3732602.189560] Lustre: atlas1-OST01c6: Client 1942a1b8-14c2-1c85-f1cc-f5a627755ef9 (at 10.38.145.2@o2ib4) reconnecting
Aug 20 20:46:22 atlas-oss1c7.ccs.ornl.gov kernel: [3732602.219745] Lustre: Skipped 13 previous similar messages
Aug 20 20:55:48 atlas-oss1c7.ccs.ornl.gov kernel: [3733168.305283] LustreError: 137-5: atlas1-OST01c2_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 20:55:48 atlas-oss1c7.ccs.ornl.gov kernel: [3733168.334068] LustreError: Skipped 10 previous similar messages
Aug 20 20:57:38 atlas-oss1c7.ccs.ornl.gov kernel: [3733278.535486] Lustre: atlas1-OST0136: Client 5d5389e1-62ad-c671-5318-48ff669e4a6e (at 10.38.145.2@o2ib4) reconnecting
Aug 20 20:57:38 atlas-oss1c7.ccs.ornl.gov kernel: [3733278.566004] Lustre: Skipped 17 previous similar messages
Aug 20 21:07:40 atlas-oss1c7.ccs.ornl.gov kernel: [3733880.809629] LustreError: 137-5: atlas1-OST0252_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 21:07:40 atlas-oss1c7.ccs.ornl.gov kernel: [3733880.834494] LustreError: Skipped 9 previous similar messages
Aug 20 21:09:35 atlas-oss1c7.ccs.ornl.gov kernel: [3733995.910521] Lustre: atlas1-OST01c6: Client 5d5389e1-62ad-c671-5318-48ff669e4a6e (at 10.38.145.2@o2ib4) reconnecting
Aug 20 21:09:35 atlas-oss1c7.ccs.ornl.gov kernel: [3733995.938234] Lustre: Skipped 13 previous similar messages
Aug 20 21:21:32 atlas-oss1c7.ccs.ornl.gov kernel: [3734712.311909] Lustre: atlas1-OST0136: Client 5d5389e1-62ad-c671-5318-48ff669e4a6e (at 10.38.145.2@o2ib4) reconnecting
Aug 20 21:21:32 atlas-oss1c7.ccs.ornl.gov kernel: [3734712.339904] Lustre: Skipped 7 previous similar messages
Aug 20 21:24:03 atlas-oss1c7.ccs.ornl.gov kernel: [3734863.534034] LustreError: 137-5: atlas1-OST00a2_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 21:24:03 atlas-oss1c7.ccs.ornl.gov kernel: [3734863.557411] LustreError: Skipped 5 previous similar messages
Aug 20 21:33:29 atlas-oss1c7.ccs.ornl.gov kernel: [3735429.842605] Lustre: atlas1-OST00a6: Client 5d5389e1-62ad-c671-5318-48ff669e4a6e (at 10.38.145.2@o2ib4) reconnecting
Aug 20 21:33:29 atlas-oss1c7.ccs.ornl.gov kernel: [3735429.871875] Lustre: Skipped 10 previous similar messages
Aug 20 21:36:00 atlas-oss1c7.ccs.ornl.gov kernel: [3735581.025605] LustreError: 137-5: atlas1-OST0012_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 21:36:00 atlas-oss1c7.ccs.ornl.gov kernel: [3735581.048798] LustreError: Skipped 2 previous similar messages
Aug 20 21:46:04 atlas-oss1c7.ccs.ornl.gov kernel: [3736184.848820] LustreError: 137-5: atlas1-OST0372_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 21:48:36 atlas-oss1c7.ccs.ornl.gov kernel: [3736336.921934] Lustre: atlas1-OST01c6: Client 1942a1b8-14c2-1c85-f1cc-f5a627755ef9 (at 10.38.145.2@o2ib4) reconnecting
Aug 20 21:48:36 atlas-oss1c7.ccs.ornl.gov kernel: [3736336.945834] Lustre: Skipped 9 previous similar messages
Aug 20 21:58:00 atlas-oss1c7.ccs.ornl.gov kernel: [3736901.728403] LustreError: 137-5: atlas1-OST0252_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 21:58:00 atlas-oss1c7.ccs.ornl.gov kernel: [3736901.760043] LustreError: Skipped 3 previous similar messages
Aug 20 21:59:57 atlas-oss1c7.ccs.ornl.gov kernel: [3737018.790532] Lustre: atlas1-OST02e6: Client 5d5389e1-62ad-c671-5318-48ff669e4a6e (at 10.38.145.2@o2ib4) reconnecting
Aug 20 21:59:57 atlas-oss1c7.ccs.ornl.gov kernel: [3737018.814400] Lustre: Skipped 12 previous similar messages
Aug 20 22:10:00 atlas-oss1c7.ccs.ornl.gov kernel: [3737621.434508] LustreError: 137-5: atlas1-OST0012_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 22:10:00 atlas-oss1c7.ccs.ornl.gov kernel: [3737621.463135] LustreError: Skipped 5 previous similar messages
Aug 20 22:10:00 atlas-oss1c7.ccs.ornl.gov kernel: [3737622.026192] Lustre: atlas1-OST01c6: Client 5d5389e1-62ad-c671-5318-48ff669e4a6e (at 10.38.145.2@o2ib4) reconnecting
Aug 20 22:10:00 atlas-oss1c7.ccs.ornl.gov kernel: [3737622.053195] Lustre: Skipped 11 previous similar messages
Aug 20 22:21:57 atlas-oss1c7.ccs.ornl.gov kernel: [3738338.728557] LustreError: 137-5: atlas1-OST00a2_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 22:21:57 atlas-oss1c7.ccs.ornl.gov kernel: [3738338.754981] LustreError: Skipped 11 previous similar messages
Aug 20 22:23:51 atlas-oss1c7.ccs.ornl.gov kernel: [3738453.406585] Lustre: atlas1-OST01c6: Client 5d5389e1-62ad-c671-5318-48ff669e4a6e (at 10.38.145.2@o2ib4) reconnecting
Aug 20 22:23:51 atlas-oss1c7.ccs.ornl.gov kernel: [3738453.438502] Lustre: Skipped 10 previous similar messages
Aug 20 22:35:50 atlas-oss1c7.ccs.ornl.gov kernel: [3739172.295333] Lustre: atlas1-OST0376: Client 1942a1b8-14c2-1c85-f1cc-f5a627755ef9 (at 10.38.145.2@o2ib4) reconnecting
Aug 20 22:35:50 atlas-oss1c7.ccs.ornl.gov kernel: [3739172.313826] LustreError: 137-5: atlas1-OST0372_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 22:35:50 atlas-oss1c7.ccs.ornl.gov kernel: [3739172.313829] LustreError: Skipped 7 previous similar messages
Aug 20 22:35:50 atlas-oss1c7.ccs.ornl.gov kernel: [3739172.380774] Lustre: Skipped 12 previous similar messages
Aug 20 22:47:47 atlas-oss1c7.ccs.ornl.gov kernel: [3739889.490503] LustreError: 137-5: atlas1-OST00a2_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 22:47:47 atlas-oss1c7.ccs.ornl.gov kernel: [3739889.493977] Lustre: atlas1-OST01c6: Client 1942a1b8-14c2-1c85-f1cc-f5a627755ef9 (at 10.38.145.2@o2ib4) reconnecting
Aug 20 22:47:47 atlas-oss1c7.ccs.ornl.gov kernel: [3739889.493979] Lustre: Skipped 7 previous similar messages
Aug 20 22:47:47 atlas-oss1c7.ccs.ornl.gov kernel: [3739889.572908] LustreError: Skipped 6 previous similar messages
Aug 20 22:59:44 atlas-oss1c7.ccs.ornl.gov kernel: [3740606.835338] LustreError: 137-5: atlas1-OST0372_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 22:59:44 atlas-oss1c7.ccs.ornl.gov kernel: [3740606.865116] LustreError: Skipped 8 previous similar messages
Aug 20 22:59:48 atlas-oss1c7.ccs.ornl.gov kernel: [3740611.500093] Lustre: atlas1-OST00a6: Client 5d5389e1-62ad-c671-5318-48ff669e4a6e (at 10.38.145.2@o2ib4) reconnecting
Aug 20 22:59:48 atlas-oss1c7.ccs.ornl.gov kernel: [3740611.527048] Lustre: Skipped 13 previous similar messages
Aug 20 23:11:41 atlas-oss1c7.ccs.ornl.gov kernel: [3741324.328132] LustreError: 137-5: atlas1-OST0372_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 23:11:41 atlas-oss1c7.ccs.ornl.gov kernel: [3741324.357033] LustreError: Skipped 8 previous similar messages
Aug 20 23:11:48 atlas-oss1c7.ccs.ornl.gov kernel: [3741330.869320] Lustre: atlas1-OST00a6: Client 5d5389e1-62ad-c671-5318-48ff669e4a6e (at 10.38.145.2@o2ib4) reconnecting
Aug 20 23:11:48 atlas-oss1c7.ccs.ornl.gov kernel: [3741330.899692] Lustre: Skipped 11 previous similar messages
Aug 20 23:23:38 atlas-oss1c7.ccs.ornl.gov kernel: [3742041.711006] Lustre: atlas1-OST01c6: Client 1942a1b8-14c2-1c85-f1cc-f5a627755ef9 (at 10.38.145.2@o2ib4) reconnecting
Aug 20 23:23:38 atlas-oss1c7.ccs.ornl.gov kernel: [3742041.739763] Lustre: Skipped 12 previous similar messages
Aug 20 23:23:45 atlas-oss1c7.ccs.ornl.gov kernel: [3742048.286560] LustreError: 137-5: atlas1-OST0372_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 23:23:45 atlas-oss1c7.ccs.ornl.gov kernel: [3742048.311746] LustreError: Skipped 7 previous similar messages
Aug 20 23:35:42 atlas-oss1c7.ccs.ornl.gov kernel: [3742765.716570] Lustre: atlas1-OST01c6: Client 5d5389e1-62ad-c671-5318-48ff669e4a6e (at 10.38.145.2@o2ib4) reconnecting
Aug 20 23:35:42 atlas-oss1c7.ccs.ornl.gov kernel: [3742765.724975] LustreError: 137-5: atlas1-OST0372_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 23:35:42 atlas-oss1c7.ccs.ornl.gov kernel: [3742765.724977] LustreError: Skipped 6 previous similar messages
Aug 20 23:35:42 atlas-oss1c7.ccs.ornl.gov kernel: [3742765.794504] Lustre: Skipped 9 previous similar messages
Aug 20 23:47:39 atlas-oss1c7.ccs.ornl.gov kernel: [3743483.036685] Lustre: atlas1-OST00a6: Client 5d5389e1-62ad-c671-5318-48ff669e4a6e (at 10.38.145.2@o2ib4) reconnecting
Aug 20 23:47:39 atlas-oss1c7.ccs.ornl.gov kernel: [3743483.048083] LustreError: 137-5: atlas1-OST0372_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 23:47:39 atlas-oss1c7.ccs.ornl.gov kernel: [3743483.048086] LustreError: Skipped 10 previous similar messages
Aug 20 23:47:39 atlas-oss1c7.ccs.ornl.gov kernel: [3743483.116736] Lustre: Skipped 12 previous similar messages
Aug 20 23:59:17 atlas-oss1c7.ccs.ornl.gov kernel: [3744181.162211] Lustre: atlas1-OST00a6: Client 1942a1b8-14c2-1c85-f1cc-f5a627755ef9 (at 10.38.145.2@o2ib4) reconnecting
Aug 20 23:59:17 atlas-oss1c7.ccs.ornl.gov kernel: [3744181.190631] Lustre: Skipped 11 previous similar messages
Aug 20 23:59:36 atlas-oss1c7.ccs.ornl.gov kernel: [3744200.460129] LustreError: 137-5: atlas1-OST01c2_UUID: not available for connect from 10.38.145.2@o2ib4 (no target)
Aug 20 23:59:36 atlas-oss1c7.ccs.ornl.gov kernel: [3744200.487895] LustreError: Skipped 11 previous similar messages
