[LU-1166] recovery never finished Created: 02/Mar/12 Updated: 18/Nov/16 Resolved: 29/Mar/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.0 |
| Fix Version/s: | Lustre 2.3.0, Lustre 2.1.2 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Alexey Lyashkov | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Environment: | 2.1.0 + with minimal back porting from 2.2 |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 4669 |
| Description |
|
While testing we hit a situation where recovery never finished and the recovery timer exceeded the hard recovery timer limit.

00010000:00080000:20.0:1330709620.108824:0:19858:0:(ldlm_lib.c:1361:reset_recovery_timer()) snxs4-MDT0000: recovery timer will expire in 70 seconds
...

After analyzing the logs, the hang appears to be the wait in target_recovery_overseer() with check_for_clients() as its condition. In the MDT case there is a second issue in that area, in the reset_recovery_timer() function:

00010000:00080000:1.0:1330709942.007696:0:19858:0:(ldlm_lib.c:1361:reset_recovery_timer()) snxs4-MDT0000: recovery timer will expire in 4294967294 seconds |
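The huge value in the last log line is itself a hint: 4294967294 is (u32)-2, i.e. the remaining time has gone negative and is being shown through an unsigned 32-bit value. A minimal standalone illustration (not Lustre code; the variable names and timestamps are made up for the example):

#include <stdio.h>

int main(void)
{
        long deadline = 1330709940;   /* hypothetical recovery deadline */
        long now      = 1330709942;   /* current time, 2s past the deadline */

        /* remaining time is -2; viewed as a 32-bit unsigned value it
         * wraps to 2^32 - 2 = 4294967294 */
        unsigned int remaining = (unsigned int)(deadline - now);

        printf("recovery timer will expire in %u seconds\n", remaining);
        return 0;
}

This matches the 4294967294-second value reported by reset_recovery_timer() above. |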
| Comments |
| Comment by Mikhail Pershin [ 03/Mar/12 ] |
|
obd_no_conn is set in mdt_obd_notify -> mdt_allow_cli; moreover, it would be useless to set it after recovery, since that would mean no client could ever participate in recovery. I don't think this is related to the issue. About reset_recovery_timer(), can you show in detail how it can become negative? Also note that there were changes after 2.1.0 related to the recovery timer: 2012-02-16 Jinshan Xiong. Unfortunately the first one broke the recovery timer and only with the last one was it restored. Please check that these patches weren't ported to your 2.1.0 from 2.2 separately. Maybe it makes sense to port them all together, because I see that this function now differs from 2.1.0, so the issue is probably already fixed. |
| Comment by Alexey Lyashkov [ 03/Mar/12 ] |
|
Hm.. you are right. Relevant fragments of the crash dump:

obd_unlinked_exports = { next = 0xffff88042b7fd4a0, prev = 0xffff88042b7fd4a0 },
obd_recovery_start = 1330601299,
obd_lock_replay_clients = { counter = 24 },
|
| Comment by Mikhail Pershin [ 03/Mar/12 ] |
|
After discussion with Alex it is clear that the endless recovery cycle is caused by a wrong count of stale and connected clients, so the check below is never true:

static int check_for_clients(struct obd_device *obd)
{
        unsigned int clnts = cfs_atomic_read(&obd->obd_connected_clients);

        if (obd->obd_abort_recovery || obd->obd_recovery_expired)
                return 1;
        LASSERT(clnts <= obd->obd_max_recoverable_clients);
--->    return (clnts + obd->obd_stale_clients ==
                obd->obd_max_recoverable_clients);
}

I suspect this is caused by class_disconnect_stale_exports(), which removes stale exports from the obd_exports list but leaves them in the hash, so a connecting client can still find such an export and connect to it. Therefore that export will be 'connected' and 'stale' at the same time. A solution could be to remove the export from the hash along with removing it from the obd_exports list, but closer investigation is needed to check that there are no other races. |
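A rough sketch of the idea above (an assumption about the shape of a possible fix, not the landed patch): when class_disconnect_stale_exports() takes an export off the obd_exports list, also drop it from obd_nid_hash so a reconnecting client can no longer look it up. The cfs_hash_del() call mirrors the one already present in class_disconnect(); stale_export_unhash() is a hypothetical helper name.

static void stale_export_unhash(struct obd_export *exp)
{
        /* remove the stale export from the nid hash so that
         * target_handle_connect() cannot find and reuse it */
        if (!cfs_hlist_unhashed(&exp->exp_nid_hash))
                cfs_hash_del(exp->exp_obd->obd_nid_hash,
                             &exp->exp_connection->c_peer.nid,
                             &exp->exp_nid_hash);
}

Whether this is safe against other users of the hash is exactly the open question mentioned above. |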
| Comment by Alexey Lyashkov [ 03/Mar/12 ] |
|
Per additional discussion with Mike, we have a verdict: this is a race between class_disconnect() and target_handle_connect().

int class_disconnect(struct obd_export *export)
{
        int already_disconnected;
        ENTRY;

        if (export == NULL) {
                fixme();
                CDEBUG(D_IOCTL, "attempting to free NULL export %p\n", export);
                RETURN(-EINVAL);
        }

        cfs_spin_lock(&export->exp_lock);
        already_disconnected = export->exp_disconnected;
        export->exp_disconnected = 1;
        cfs_spin_unlock(&export->exp_lock);

        /* class_cleanup(), abort_recovery(), and class_fail_export()
         * all end up in here, and if any of them race we shouldn't
         * call extra class_export_puts(). */
        if (already_disconnected) {
                LASSERT(cfs_hlist_unhashed(&export->exp_nid_hash));
                GOTO(no_disconn, already_disconnected);
        }

        CDEBUG(D_IOCTL, "disconnect: cookie "LPX64"\n",
               export->exp_handle.h_cookie);

        if (!cfs_hlist_unhashed(&export->exp_nid_hash))
                cfs_hash_del(export->exp_obd->obd_nid_hash,
                             &export->exp_connection->c_peer.nid,
                             &export->exp_nid_hash);

        class_export_recovery_cleanup(export);   <<< wait where >>>
        class_unlink_export(export);
no_disconn:
        class_export_put(export);
        RETURN(0);
}

If target_handle_connect() races with class_export_recovery_cleanup() while waiting on:

obd_req_replay_clients = {
        counter = 24
},
obd_lock_replay_clients = {
        counter = 24
},
|
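For reference, the target_handle_connect() side of the fix that was later pushed to b2_1 (quoted in the diff near the end of this ticket) simply refuses to pull an already-disconnected export back into recovery. The sketch below is based on that diff; the surrounding unlock and brace structure is assumed, not quoted:

        cfs_spin_lock(&target->obd_recovery_task_lock);
        if (target->obd_recovering && !export->exp_in_recovery &&
            !export->exp_disconnected) {
                /* only mark the export for req replay if class_disconnect()
                 * has not already flagged it as disconnected */
                cfs_spin_lock(&export->exp_lock);
                export->exp_in_recovery = 1;
                export->exp_req_replay_needed = 1;
                /* ... further recovery bookkeeping elided ... */
                cfs_spin_unlock(&export->exp_lock);
        }
        cfs_spin_unlock(&target->obd_recovery_task_lock);

With the extra exp_disconnected check, the replay counters above can no longer be inflated by exports that class_disconnect() is concurrently tearing down. |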
| Comment by Alexey Lyashkov [ 05/Mar/12 ] |
|
remote: New Changes: |
| Comment by Build Master (Inactive) [ 29/Mar/12 ] |
|
Integrated in Result = FAILURE
|
| Comment by Build Master (Inactive) [ 29/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 29/Mar/12 ] |
|
Integrated in Result = FAILURE
|
| Comment by Build Master (Inactive) [ 29/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 29/Mar/12 ] |
|
Integrated in Result = FAILURE
|
| Comment by Build Master (Inactive) [ 29/Mar/12 ] |
|
Integrated in Result = FAILURE
|
| Comment by Build Master (Inactive) [ 29/Mar/12 ] |
|
Integrated in Result = FAILURE
|
| Comment by Build Master (Inactive) [ 29/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 29/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 29/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 29/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 29/Mar/12 ] |
|
Integrated in Result = FAILURE
|
| Comment by Build Master (Inactive) [ 29/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 29/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 29/Mar/12 ] |
|
Integrated in Result = FAILURE
|
| Comment by Build Master (Inactive) [ 29/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 29/Mar/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 29/Mar/12 ] |
|
Integrated in Result = FAILURE
|
| Comment by Peter Jones [ 29/Mar/12 ] |
|
Landed for 2.3 |
| Comment by Build Master (Inactive) [ 02/May/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 02/May/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 02/May/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 02/May/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 02/May/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 02/May/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Build Master (Inactive) [ 02/May/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Bob Glossman (Inactive) [ 07/May/12 ] |
|
http://review.whamcloud.com/#change,2665 |
| Comment by Bob Glossman (Inactive) [ 22/May/12 ] |
|
http://review.whamcloud.com/#change,2874 |
| Comment by Gregoire Pichon [ 12/Jul/12 ] |
|
Hello Bob, I don't fully understand the portions of code impacted by this ticket, but could you explain why the additional change http://review.whamcloud.com/#change,2874 you submitted in b2_1 is not present in the master release?

diff --git a/lustre/ldlm/ldlm_lib.c b/lustre/ldlm/ldlm_lib.c
index a8433fc..0798ba7 100644
--- a/lustre/ldlm/ldlm_lib.c
+++ b/lustre/ldlm/ldlm_lib.c
@@ -1069,7 +1069,8 @@ dont_check_exports:
class_disconnect->class_export_recovery_cleanup() race
*/
cfs_spin_lock(&target->obd_recovery_task_lock);
- if (target->obd_recovering && !export->exp_in_recovery) {
+ if (target->obd_recovering && !export->exp_in_recovery &&
+ !export->exp_disconnected) {
cfs_spin_lock(&export->exp_lock);
export->exp_in_recovery = 1;
export->exp_req_replay_needed = 1;
thanks. |
| Comment by Nathan Rutman [ 21/Nov/12 ] |
|
Xyratex-bug-id: MRP-451 |