[LU-9492] MDT reports passing recovery deadline prematurely Created: 11/May/17 Updated: 23/Feb/19 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Ned Bass | Assignee: | Emoly Liu |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | None |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
During MDT recovery, multiple console messages appear containing the phrase "Recovery already passed deadline MM:SS". The MM:SS displays the minutes and seconds _remaining_ until the recovery deadline expires. This is confusing to system administrators.

There are two issues to address here.

1. The wording of the message seems to be incorrect. The clarity of log messages pertaining to recovery is critically important, as that is a time when system administrators tend to watch the logs closely and they need to understand what is happening.

May 10 09:17:07 zinc1 kernel: Lustre: lsh-MDT0000: Will be in recovery for at least 5:00, or until 2827 clients reconnect
May 10 09:18:37 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:30. If you do not want to wait more, please abort the recovery by force.
May 10 09:18:37 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:29. If you do not want to wait more, please abort the recovery by force.
May 10 09:18:38 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:28. If you do not want to wait more, please abort the recovery by force.
May 10 09:18:40 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:26. If you do not want to wait more, please abort the recovery by force.
May 10 09:18:45 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:22. If you do not want to wait more, please abort the recovery by force.
May 10 09:18:53 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:14. If you do not want to wait more, please abort the recovery by force.
May 10 09:19:09 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 2:58. If you do not want to wait more, please abort the recovery by force.
May 10 09:19:41 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 2:26. If you do not want to wait more, please abort the recovery by force.
May 10 09:20:45 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 1:22. If you do not want to wait more, please abort the recovery by force.
May 10 09:22:07 zinc1 kernel: Lustre: lsh-MDT0000: Recovery over after 5:01, of 2827 clients 2651 recovered and 0 were evicted. |
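The claim that MM:SS is time remaining can be checked directly against the log above. The following is a minimal sketch, not part of Lustre: it adds the printed MM:SS to each message's timestamp and shows that the results all fall on 09:22:06 or 09:22:07, which matches both the start time plus the 5:00 window and the moment recovery ended. The function name `implied_deadlines`, the regular expression, and the year 2017 (the syslog lines carry no year, so the ticket's creation year is assumed) are illustrative choices, and only a subset of the log lines is included.

```python
import re
from datetime import datetime, timedelta

# A subset of the console lines quoted in this ticket, copied verbatim.
LOG = """\
May 10 09:18:37 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:30. If you do not want to wait more, please abort the recovery by force.
May 10 09:18:38 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:28. If you do not want to wait more, please abort the recovery by force.
May 10 09:18:53 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 3:14. If you do not want to wait more, please abort the recovery by force.
May 10 09:19:41 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 2:26. If you do not want to wait more, please abort the recovery by force.
May 10 09:20:45 zinc1 kernel: Lustre: lsh-MDT0000: Recovery already passed deadline 1:22. If you do not want to wait more, please abort the recovery by force.
"""

# Hypothetical pattern for this one message; not taken from any Lustre tool.
PATTERN = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) .*Recovery already passed deadline (\d+):(\d\d)\."
)

ASSUMED_YEAR = 2017  # assumption: syslog timestamps omit the year


def implied_deadlines(log):
    """Return (timestamp, printed MM:SS as timedelta, timestamp + MM:SS) per message."""
    rows = []
    for line in log.splitlines():
        match = PATTERN.match(line)
        if match is None:
            continue
        stamp, minutes, seconds = match.groups()
        when = datetime.strptime(f"{ASSUMED_YEAR} {stamp}", "%Y %b %d %H:%M:%S")
        printed = timedelta(minutes=int(minutes), seconds=int(seconds))
        rows.append((when, printed, when + printed))
    return rows


if __name__ == "__main__":
    for when, printed, deadline in implied_deadlines(LOG):
        print(f"{when:%H:%M:%S} + {printed} -> {deadline:%H:%M:%S}")
```

If the value were time elapsed past the deadline, the sums would drift apart instead of converging on a single instant, and every message would have to be logged after 09:22:07 rather than before it.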
| Comments |
| Comment by Peter Jones [ 11/May/17 ] |
|
Emoly,

Could you please assist with this one?

Thanks,
Peter |
| Comment by Gerrit Updater [ 18/May/17 ] |
|
Emoly Liu (emoly.liu@intel.com) uploaded a new patch: https://review.whamcloud.com/27178 |