[LU-11471] IO Errors during failover with very few number of OSTs Created: 24/Dec/15 Updated: 16/Jan/22 Resolved: 16/Jan/22 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Rajeshwaran Ganesan | Assignee: | Mikhail Pershin |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Lustre 2.5.X |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
One of our customers noticed I/O errors during failover on a filesystem with only a few OSTs. We would like to add a note to the Best Practices section or the OST planning section. |
| Comments |
| Comment by Rajeshwaran Ganesan [ 04/Jan/16 ] |
|
We would like to add: if the MDS has no currently active OSTs, create requests fail with an I/O error, and the I/O errors are seen at the clients. |
| Comment by Joseph Gmitter (Inactive) [ 21/Sep/16 ] |
|
Hi Rajeshwaran, Are you familiar with how to push such an update to the manual? For details on how to submit changes to the manual, please see: |
| Comment by Andreas Dilger [ 11/Mar/17 ] |
|
I wonder whether this should rather be considered a bug in the code, and the MDS should block file creations if all of the OSTs become unavailable after startup? |
| Comment by Rajeshwaran Ganesan [ 06/Sep/17 ] |
|
Please close this case. |
| Comment by Andreas Dilger [ 04/Oct/18 ] |
|
Reopening this issue. With the advent of Data-on-MDT we will at some point want to allow filesystems with only MDTs to be created. At that point, this check has to be removed. As a starting point, we could add a tunable that allows the admin to select the behavior: either return an error if no OSTs are available, or cause the client to block and wait for an OST to become available. I think in the case where OSTs were previously available but are temporarily offline due to failover, the client should block. If the file being created has a DoM component at the start, then it should not block. |
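
The proposed behavior can be sketched as a small decision function. This is only an illustration of the policy described in this comment, not Lustre code: the names `NoOstPolicy`, `CreateContext`, and `decide_create` are hypothetical, and the sketch assumes that "should not block" for a DoM-first file means the create proceeds, and that the tunable only matters when no OST has ever been available since startup.

```python
from dataclasses import dataclass
from enum import Enum


class NoOstPolicy(Enum):
    """Hypothetical admin tunable: behavior when no OST is available."""
    ERROR = "error"  # fail the create with an I/O error
    BLOCK = "block"  # make the client wait for an OST to appear


class Action(Enum):
    PROCEED = "proceed"    # create can go ahead
    BLOCK = "block"        # client waits for an OST
    FAIL_EIO = "fail_eio"  # create fails with an I/O error


@dataclass
class CreateContext:
    active_osts: int             # OSTs the MDS can currently use
    osts_seen_since_start: bool  # was any OST available after startup?
    starts_with_dom: bool        # layout begins with a DoM component


def decide_create(ctx: CreateContext, policy: NoOstPolicy) -> Action:
    """Pick an action for a file create request (illustrative only)."""
    if ctx.active_osts > 0:
        return Action.PROCEED
    if ctx.starts_with_dom:
        # Assumption: a DoM-first file does not need an OST right away.
        return Action.PROCEED
    if ctx.osts_seen_since_start:
        # OSTs existed earlier, so they are likely in failover: wait.
        return Action.BLOCK
    # No OST has ever been seen: defer to the admin's choice.
    if policy is NoOstPolicy.BLOCK:
        return Action.BLOCK
    return Action.FAIL_EIO
```

For example, `decide_create(CreateContext(0, True, False), NoOstPolicy.ERROR)` yields `Action.BLOCK` under these assumptions, because the OSTs were available earlier and are presumed to be failing over.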
| Comment by Mikhail Pershin [ 16/Jan/22 ] |
|
Main ticket for remaining work is LU-10995 |