[LU-2568] MDT unable to start with corrupted llog files. Created: 03/Jan/13 Updated: 09/Jan/20 Resolved: 09/Jan/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Alexander Zarochentsev | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 6000 |
| Description |
|
Here is a log from a failed MDT start: Feb 12 22:04:14 tstmds0a01 kernel: LustreError: 5302:0:(llog_lvfs.c:616:llog_lvfs_create()) error looking up logfile 0x7a4801b:0x2790e5b9: rc -116 |
| Comments |
| Comment by Alexander Zarochentsev [ 04/Jan/13 ] |
|
Xyratex has a fix for this issue; I will upload it later. |
| Comment by Mikhail Pershin [ 07/Jan/13 ] |
|
Zam, this doesn't look like a master bug. Is it from a Lustre version older than 2.3? |
| Comment by Alexander Zarochentsev [ 07/Jan/13 ] |
|
Yes, it is an older bug, but it looks like it is still in master. The issue was that llog files were missing and their inode numbers had been re-used for other objects. The key fix was:
diff --git a/lustre/obdclass/llog_lvfs.c b/lustre/obdclass/llog_lvfs.c
index 0987020..60bad4c 100644
--- a/lustre/obdclass/llog_lvfs.c
+++ b/lustre/obdclass/llog_lvfs.c
@@ -615,6 +615,10 @@ static int llog_lvfs_create(struct llog_ctxt *ctxt, struct llog_handle **res,
                         rc = PTR_ERR(dchild);
                         CERROR("error looking up logfile "LPX64":0x%x: rc %d\n",
                                logid->lgl_oid, logid->lgl_ogen, rc);
+                        if (rc == -ESTALE)
+                                /* handle reused inode same way as
+                                   non-existing one */
+                                GOTO(out, rc = -ENOENT);
                         GOTO(out, rc);
                 }
I still think it is relevant to the master branch, but I haven't tried to reproduce it on master. |
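For readers following the thread: on Linux, the `rc -116` in the log is `-ESTALE`, and the two hex values in the message are the llog id's `lgl_oid` and `lgl_ogen`. The sketch below is a toy model of the situation described above, not Lustre code. It assumes that the lookup matches on an inode number plus a generation, and that a generation mismatch (a reused inode) yields `-ESTALE`. The names `toy_inode`, `toy_logid`, `toy_lookup` and `toy_open_log` are hypothetical and exist only for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an inode slot: number plus generation. */
struct toy_inode {
    uint64_t ino;
    uint32_t generation;
};

/* Hypothetical stand-in for the (lgl_oid, lgl_ogen) pair in the log line. */
struct toy_logid {
    uint64_t oid;
    uint32_t ogen;
};

/*
 * Look up a log by inode number and generation.
 * Returns 0 on a match, -ENOENT if no inode has that number, and
 * -ESTALE if the number exists but the generation differs, which is
 * the assumed signature of an inode reused by another object.
 */
static int toy_lookup(const struct toy_inode *table, size_t n,
                      const struct toy_logid *id)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].ino != id->oid)
            continue;
        return table[i].generation == id->ogen ? 0 : -ESTALE;
    }
    return -ENOENT;
}

/*
 * Same idea as the patch: report a reused inode exactly like a
 * missing log, so the caller takes its "log does not exist" path.
 */
static int toy_open_log(const struct toy_inode *table, size_t n,
                        const struct toy_logid *id)
{
    int rc = toy_lookup(table, n, id);

    if (rc == -ESTALE)
        rc = -ENOENT;
    return rc;
}
```

With a table holding inode `0x7a4801b` under a different generation than `0x2790e5b9`, `toy_lookup` returns `-ESTALE` while `toy_open_log` returns `-ENOENT`. Whether the real caller then recovers cleanly depends on how it handles `-ENOENT`, which this sketch does not model.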
| Comment by Mikhail Pershin [ 08/Jan/13 ] |
|
llog_lvfs_create is no longer used in master; llogs are OSD-based now. Do you have a reproducer for this? I suppose it shouldn't be a problem now if the cause was inode re-use, because llog objects are now FID-based, but we need to check that. |
| Comment by Andreas Dilger [ 09/Jan/20 ] |
|
Close old ticket. |