[LU-6015] mds evicting clients - lock timed out? Created: 10/Dec/14 Updated: 07/Jan/15 Resolved: 07/Jan/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Daire Byrne (Inactive) | Assignee: | Lai Siyao |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Environment: | EL6 |
| Attachments: | |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 16763 |
| Description |
|
We have seen this issue twice in the last week as the load on one of our Lustre filesystems has increased. Long story short: we have 4 Lustre clients exporting the filesystem over NFS (round-robin). The MDS seems to get snagged on a lock, and then the clients take it in turn to drop off, which sends IO errors to the NFS clients.

Dec 10 16:00:02 cmds1 kernel: LNet: Service thread pid 22181 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Dec 10 16:00:02 cmds1 kernel: Pid: 22181, comm: mdt00_026
Dec 10 16:00:02 cmds1 kernel:
Dec 10 16:00:02 cmds1 kernel: Call Trace:
Dec 10 16:00:02 cmds1 kernel: [<ffffffff8150f362>] schedule_timeout+0x192/0x2e0
Dec 10 16:00:02 cmds1 kernel: [<ffffffff810811e0>] ? process_timeout+0x0/0x10
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa03fe6d1>] cfs_waitq_timedwait+0x11/0x20 [libcfs]
Dec 10 16:00:02 cmds1 kernel: Lustre: lock timed out (enqueued at 1418227002, 200s ago)
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa06a601d>] ldlm_completion_ast+0x4ed/0x960 [ptlrpc]
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa06a1790>] ? ldlm_expired_completion_wait+0x0/0x390 [ptlrpc]
Dec 10 16:00:02 cmds1 kernel: [<ffffffff81063410>] ? default_wake_function+0x0/0x20
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa06a5758>] ldlm_cli_enqueue_local+0x1f8/0x5d0 [ptlrpc]
Dec 10 16:00:02 cmds1 kernel: LustreError: dumping log to /tmp/lustre-log.1418227202.22181
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa06a5b30>] ? ldlm_completion_ast+0x0/0x960 [ptlrpc]
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa0d93a90>] ? mdt_blocking_ast+0x0/0x2a0 [mdt]
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa0d99c7b>] mdt_object_lock0+0x33b/0xaf0 [mdt]
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa0d93a90>] ? mdt_blocking_ast+0x0/0x2a0 [mdt]
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa06a5b30>] ? ldlm_completion_ast+0x0/0x960 [ptlrpc]
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa0d9a4f4>] mdt_object_lock+0x14/0x20 [mdt]
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa0dc24f8>] mdt_object_open_lock+0x1c8/0x510 [mdt]
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa0da6bfa>] ? mdt_attr_get_complex+0x38a/0x770 [mdt]
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa0dc76b3>] mdt_open_by_fid_lock+0x443/0x7d0 [mdt]
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa0dc829b>] mdt_reint_open+0x56b/0x20c0 [mdt]
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa041a82e>] ? upcall_cache_get_entry+0x28e/0x860 [libcfs]
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa06cedcc>] ? lustre_msg_add_version+0x6c/0xc0 [ptlrpc]
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa0565f50>] ? lu_ucred+0x20/0x30 [obdclass]
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa0d93015>] ? mdt_ucred+0x15/0x20 [mdt]
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa0daf15c>] ? mdt_root_squash+0x2c/0x410 [mdt]
Dec 10 16:00:02 cmds1 kernel: [<ffffffffa06f6646>] ? __req_capsule_get+0x166/0x700 [ptlrpc]
..
..
Dec 10 16:07:58 cmds1 kernel:
Dec 10 16:07:58 cmds1 kernel: LustreError: dumping log to /tmp/lustre-log.1418227678.7488
Dec 10 16:11:13 cmds1 kernel: Lustre: 22178:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-271), not sending early reply
Dec 10 16:11:13 cmds1 kernel: req@ffff882fd6dfd000 x1486501717804232/t0(0) o101->4540220b-aff6-ec4a-3862-59a2488b6775@10.21.22.27@tcp:0/0 lens 576/3448 e 5 to 0 dl 1418227878 ref 2 fl Interpret:/0/0 rc 0/0
Dec 10 16:11:44 cmds1 kernel: Lustre: 9532:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-271), not sending early reply
Dec 10 16:11:44 cmds1 kernel: req@ffff88427dd19400 x1486495779658676/t0(0) o101->dce9c151-cc13-2ac2-7911-8ded672c09d6@10.21.22.29@tcp:0/0 lens 608/3448 e 5 to 0 dl 1418227909 ref 2 fl Interpret:/0/0 rc 0/0
Dec 10 16:12:02 cmds1 kernel: Lustre: 7487:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-260), not sending early reply
Dec 10 16:12:02 cmds1 kernel: req@ffff88026471d800 x1486501720068016/t0(0) o101->4540220b-aff6-ec4a-3862-59a2488b6775@10.21.22.27@tcp:0/0 lens 576/3448 e 3 to 0 dl 1418227927 ref 2 fl Interpret:/0/0 rc 0/0
Dec 10 16:12:34 cmds1 kernel: Lustre: 6826:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-134), not sending early reply
Dec 10 16:12:34 cmds1 kernel: req@ffff8822e2c0d000 x1486501725457272/t0(0) o101->4540220b-aff6-ec4a-3862-59a2488b6775@10.21.22.27@tcp:0/0 lens 608/3448 e 1 to 0 dl 1418227959 ref 2 fl Interpret:/0/0 rc 0/0
Dec 10 16:13:17 cmds1 kernel: Lustre: charlie-MDT0000: Client 4540220b-aff6-ec4a-3862-59a2488b6775 (at 10.21.22.27@tcp) reconnecting
Dec 10 16:13:17 cmds1 kernel: Lustre: charlie-MDT0000: Client 4540220b-aff6-ec4a-3862-59a2488b6775 (at 10.21.22.27@tcp) refused reconnection, still busy with 6 active RPCs
Dec 10 16:13:42 cmds1 kernel: Lustre: charlie-MDT0000: Client 4540220b-aff6-ec4a-3862-59a2488b6775 (at 10.21.22.27@tcp) reconnecting
Dec 10 16:13:42 cmds1 kernel: Lustre: charlie-MDT0000: Client 4540220b-aff6-ec4a-3862-59a2488b6775 (at 10.21.22.27@tcp) refused reconnection, still busy with 3 active RPCs
Dec 10 16:13:48 cmds1 kernel: Lustre: charlie-MDT0000: Client dce9c151-cc13-2ac2-7911-8ded672c09d6 (at 10.21.22.29@tcp) reconnecting
Dec 10 16:13:48 cmds1 kernel: Lustre: charlie-MDT0000: Client dce9c151-cc13-2ac2-7911-8ded672c09d6 (at 10.21.22.29@tcp) refused reconnection, still busy with 2 active RPCs
Dec 10 16:13:53 cmds1 kernel: Lustre: 22166:0:(service.c:1339:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-431), not sending early reply
Dec 10 16:13:53 cmds1 kernel: req@ffff882fe3e6a000 x1486492815968152/t0(0) o101->b9fd8649-3e4d-fb11-ba42-c2526b8f651f@10.21.22.28@tcp:0/0 lens 576/1152 e 4 to 0 dl 1418228038 ref 2 fl Interpret:/0/0 rc 0/0
Dec 10 16:14:07 cmds1 kernel: Lustre: charlie-MDT0000: Client 4540220b-aff6-ec4a-3862-59a2488b6775 (at 10.21.22.27@tcp) reconnecting
Dec 10 16:14:07 cmds1 kernel: Lustre: charlie-MDT0000: Client 4540220b-aff6-ec4a-3862-59a2488b6775 (at 10.21.22.27@tcp) refused reconnection, still busy with 3 active RPCs
Dec 10 16:14:13 cmds1 kernel: Lustre: charlie-MDT0000: Client dce9c151-cc13-2ac2-7911-8ded672c09d6 (at 10.21.22.29@tcp) reconnecting
Dec 10 16:14:13 cmds1 kernel: Lustre: charlie-MDT0000: Client dce9c151-cc13-2ac2-7911-8ded672c09d6 (at 10.21.22.29@tcp) refused reconnection, still busy with 2 active RPCs
Dec 10 16:14:32 cmds1 kernel: Lustre: charlie-MDT0000: Client 4540220b-aff6-ec4a-3862-59a2488b6775 (at 10.21.22.27@tcp) reconnecting
Dec 10 16:14:32 cmds1 kernel: Lustre: charlie-MDT0000: Client 4540220b-aff6-ec4a-3862-59a2488b6775 (at 10.21.22.27@tcp) refused reconnection, still busy with 3 active RPCs
Dec 10 16:14:57 cmds1 kernel: Lustre: charlie-MDT0000: Client 4540220b-aff6-ec4a-3862-59a2488b6775 (at 10.21.22.27@tcp) reconnecting
Dec 10 16:14:57 cmds1 kernel: Lustre: Skipped 1 previous similar message
Dec 10 16:14:57 cmds1 kernel: Lustre: charlie-MDT0000: Client 4540220b-aff6-ec4a-3862-59a2488b6775 (at 10.21.22.27@tcp) refused reconnection, still busy with 3 active RPCs
Dec 10 16:14:57 cmds1 kernel: Lustre: Skipped 1 previous similar message

Both times, the only way to get back to a stable state was to reboot the MDS and abort recovery; a sketch of that step is below. Logs attached.
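For reference, the recovery-abort step looks roughly like this (a sketch only; the MDT device and mount point below are placeholders, not our real paths):

# On the restarted MDS, mount the MDT with recovery aborted so that clients
# reconnect immediately instead of waiting for request replay.
mount -t lustre -o abort_recov /dev/mdt-device /mnt/mdt

# Or, to abort recovery on an MDT that is already mounted and stuck in recovery:
lctl --device charlie-MDT0000 abort_recovery

Either way, the target stops waiting for old clients to replay their requests and starts serving new ones. |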
| Comments |
| Comment by Peter Jones [ 10/Dec/14 ] |
|
Lai, could you please advise on this issue? Thanks, Peter |
| Comment by Lai Siyao [ 11/Dec/14 ] |
|
It looks like the MDS enqueued a local lock but the enqueue timed out; in this case it will sleep waiting for the lock to become available. Normally the conflicting lock should be released later (if other clients don't cooperate, the MDS will evict them to achieve this), so something must have gone wrong and caused a deadlock. To work out the cause of this deadlock I need to know who holds the lock. Could you run `echo t > /proc/sysrq-trigger` on the MDS to collect all process backtraces the next time you see this error?
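A minimal capture sequence might look like this (a sketch, assuming a stock EL6 setup where kernel messages also land in /var/log/messages):

# enable sysrq in case it is disabled, then dump a backtrace for every task
echo 1 > /proc/sys/kernel/sysrq
echo t > /proc/sysrq-trigger
# the traces go to the kernel ring buffer and syslog; save copies of both
dmesg > /tmp/sysrq-backtraces.$(date +%s).txt
cp /var/log/messages /tmp/messages.after-sysrq

The kernel ring buffer can be too small to hold every trace, so if the dmesg copy looks truncated, the /var/log/messages copy should still be complete. |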
| Comment by Daire Byrne (Inactive) [ 12/Dec/14 ] |
|
Okay, we will do that. I was hoping that the Lustre dumps to /tmp might have contained enough debug info to figure out what was going on, but apparently not.
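(For reference, those binary dumps can be converted to readable text with lctl; a sketch, using one of the dump file names from the console log above:)

# convert a binary Lustre debug dump to text ("lctl df" is the short alias)
lctl debug_file /tmp/lustre-log.1418227202.22181 /tmp/lustre-log.1418227202.22181.txt
less /tmp/lustre-log.1418227202.22181.txt

The result is the plain-text Lustre debug log for that dump. |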
| Comment by Daire Byrne (Inactive) [ 12/Dec/14 ] |
|
This just happened again (so only two days between occurrences now). I'm attaching the messages file after doing a sysrq. I then had to reboot the server to get things stable. |
| Comment by Daire Byrne (Inactive) [ 16/Dec/14 ] |
|
Are there any updates on this? Was the sysrq info we provided useful? Can we provide any more information? It may or may not be relevant, but when we first saw this issue 2 weeks ago it was right after changing the network card hardware in 2 of the 4 clients/NFS exporters; the process of bringing the clients back online caused the MDS to fall over. All 4 clients and the MDS have been rebooted a few times since then, yet we now seem to hit this frequently under high load. And because the clients are not recovering, long-running simulations are getting IO errors, resulting in lots of lost render time. |
| Comment by Niu Yawei (Inactive) [ 16/Dec/14 ] |
|
From the full trace this looks like a deadlock; it should be a dup of

Dec 12 18:55:07 cmds1 kernel: mdt05_032 S 000000000000000b 0 15079 2 0x00000000
Dec 12 18:55:07 cmds1 kernel: ffff88500a907770 0000000000000046 0000000000000000 ffff88500a9077c0
Dec 12 18:55:07 cmds1 kernel: 0000000000000001 ffff885d74c78040 ffff88500a907740 ffffffffa03b92d1
Dec 12 18:55:07 cmds1 kernel: ffff885d74c785f8 ffff88500a907fd8 000000000000fb88 ffff885d74c785f8
Dec 12 18:55:07 cmds1 kernel: Call Trace:
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa03b92d1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa03a96fe>] cfs_waitq_wait+0xe/0x10 [libcfs]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa06810aa>] ldlm_completion_ast+0x57a/0x960 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa067c790>] ? ldlm_expired_completion_wait+0x0/0x390 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffff81063410>] ? default_wake_function+0x0/0x20
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0680758>] ldlm_cli_enqueue_local+0x1f8/0x5d0 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0680b30>] ? ldlm_completion_ast+0x0/0x960 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d84a90>] ? mdt_blocking_ast+0x0/0x2a0 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d8ac7b>] mdt_object_lock0+0x33b/0xaf0 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d84a90>] ? mdt_blocking_ast+0x0/0x2a0 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0680b30>] ? ldlm_completion_ast+0x0/0x960 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d8b4f4>] mdt_object_lock+0x14/0x20 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0db34f8>] mdt_object_open_lock+0x1c8/0x510 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d97bfa>] ? mdt_attr_get_complex+0x38a/0x770 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0db86b3>] mdt_open_by_fid_lock+0x443/0x7d0 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0db929b>] mdt_reint_open+0x56b/0x20c0 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa03c582e>] ? upcall_cache_get_entry+0x28e/0x860 [libcfs]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa06a9dcc>] ? lustre_msg_add_version+0x6c/0xc0 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0540f50>] ? lu_ucred+0x20/0x30 [obdclass]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d84015>] ? mdt_ucred+0x15/0x20 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0da015c>] ? mdt_root_squash+0x2c/0x410 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa06d1646>] ? __req_capsule_get+0x166/0x700 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0540f50>] ? lu_ucred+0x20/0x30 [obdclass]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0da4911>] mdt_reint_rec+0x41/0xe0 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d89ae3>] mdt_reint_internal+0x4c3/0x780 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d8a06d>] mdt_intent_reint+0x1ed/0x520 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d87f1e>] mdt_intent_policy+0x39e/0x720 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0661831>] ldlm_lock_enqueue+0x361/0x8d0 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa06881ef>] ldlm_handle_enqueue0+0x4ef/0x10b0 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d883a6>] mdt_enqueue+0x46/0xe0 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d8ea97>] mdt_handle_common+0x647/0x16d0 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa06aabac>] ? lustre_msg_get_transno+0x8c/0x100 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0dc83f5>] mds_regular_handle+0x15/0x20 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa06ba3c8>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: mdt03_012 S 000000000000002c 0 7511 2 0x00000000
Dec 12 18:55:07 cmds1 kernel: ffff882f6da2f8f0 0000000000000046 00000000548b3476 ffff882f6da2f940
Dec 12 18:55:07 cmds1 kernel: 0000000000000001 ffff882f8e4f8aa0 ffff882f6da2f8c0 ffffffffa03b92d1
Dec 12 18:55:07 cmds1 kernel: ffff882f8e4f9058 ffff882f6da2ffd8 000000000000fb88 ffff882f8e4f9058
Dec 12 18:55:07 cmds1 kernel: Call Trace:
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa03b92d1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa03a96fe>] cfs_waitq_wait+0xe/0x10 [libcfs]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa06810aa>] ldlm_completion_ast+0x57a/0x960 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa067c790>] ? ldlm_expired_completion_wait+0x0/0x390 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffff81063410>] ? default_wake_function+0x0/0x20
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0680758>] ldlm_cli_enqueue_local+0x1f8/0x5d0 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0680b30>] ? ldlm_completion_ast+0x0/0x960 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d84a90>] ? mdt_blocking_ast+0x0/0x2a0 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d84a90>] ? mdt_blocking_ast+0x0/0x2a0 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d8ac7b>] mdt_object_lock0+0x33b/0xaf0 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d84a90>] ? mdt_blocking_ast+0x0/0x2a0 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0680b30>] ? ldlm_completion_ast+0x0/0x960 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d8b4f4>] mdt_object_lock+0x14/0x20 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d9a5a9>] mdt_getattr_name_lock+0xe19/0x1980 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa06a9135>] ? lustre_msg_buf+0x55/0x60 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa06d1646>] ? __req_capsule_get+0x166/0x700 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa06ab3c4>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d9b3ad>] mdt_intent_getattr+0x29d/0x490 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d87f1e>] mdt_intent_policy+0x39e/0x720 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0661831>] ldlm_lock_enqueue+0x361/0x8d0 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa06881ef>] ldlm_handle_enqueue0+0x4ef/0x10b0 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d883a6>] mdt_enqueue+0x46/0xe0 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0d8ea97>] mdt_handle_common+0x647/0x16d0 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa06aabac>] ? lustre_msg_get_transno+0x8c/0x100 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa0dc83f5>] mds_regular_handle+0x15/0x20 [mdt]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa06ba3c8>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa03a95de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa03bad9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa06b1729>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffff81055ad3>] ? __wake_up+0x53/0x70
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa06bb75e>] ptlrpc_main+0xace/0x1700 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa06bac90>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffff8100c0ca>] child_rip+0xa/0x20
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa06bac90>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffffa06bac90>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
Dec 12 18:55:07 cmds1 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 |
| Comment by Lai Siyao [ 16/Dec/14 ] |
|
This looks to be a duplicate of |
| Comment by Daire Byrne (Inactive) [ 16/Dec/14 ] |
|
Okay, great - we'll organise some downtime to upgrade the code. Just to be clear - this only needs to be applied to the MDS? |
| Comment by Lai Siyao [ 17/Dec/14 ] |
|
Yes, it only needs to be applied to the MDS. |
| Comment by Daire Byrne (Inactive) [ 07/Jan/15 ] |
|
Just to update - we have applied the patch and have not seen any further instances of this issue in 3 weeks now. You can close this if you wish and we'll re-open should we hit it again. Cheers. |
| Comment by Peter Jones [ 07/Jan/15 ] |
|
Good news - thanks Daire! |