[LU-3474] MDS LBUG on unlink? Created: 14/Jun/13 Updated: 05/Jun/14 Resolved: 19/Jul/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0, Lustre 2.5.0 |
| Fix Version/s: | Lustre 2.4.1, Lustre 2.5.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Daire Byrne (Inactive) | Assignee: | Bruno Faccini (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | mn4 |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 8703 |
| Description |
|
Hi, we have been testing v2.4 and have hit this LBUG, which we never experienced in v1.8.x under similar workloads. It looks like it is related to doing an rm/unlink on certain files. I had to abort recovery and stop the ongoing file deletion in order to keep the MDS from repeatedly crashing with the same LBUG. We can supply more debug info should you need it. Cheers, Daire

<0>LustreError: 6274:0:(linkea.c:169:linkea_links_find()) ASSERTION( ldata->ld_leh != ((void *)0) ) failed:
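To illustrate what the assertion at linkea.c:169 checks, here is a minimal standalone C model of the failure mode. The struct layouts and function bodies are simplified stand-ins, not the real Lustre code; only the assertion itself mirrors the source:

#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-ins for the real Lustre structures. */
struct link_ea_header { int leh_reccount; };
struct linkea_data    { struct link_ea_header *ld_leh; };

/* Models linkea_init(): it fails (here with -61, i.e. -ENODATA) when
 * the object has no valid link xattr, leaving ld_leh NULL. */
static int model_linkea_init(struct linkea_data *ldata, int have_xattr)
{
        static struct link_ea_header leh = { .leh_reccount = 1 };

        if (!have_xattr)
                return -61;
        ldata->ld_leh = &leh;
        return 0;
}

/* Models linkea_links_find(): the LASSERT at linkea.c:169. */
static void model_linkea_links_find(struct linkea_data *ldata)
{
        assert(ldata->ld_leh != NULL);  /* the LBUG fires here */
        printf("%d link entries\n", ldata->ld_leh->leh_reccount);
}

int main(void)
{
        struct linkea_data ldata = { .ld_leh = NULL };

        model_linkea_init(&ldata, 0);    /* return code ignored */
        model_linkea_links_find(&ldata); /* assertion failure */
        return 0;
} |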
| Comments |
| Comment by Peter Jones [ 14/Jun/13 ] |
|
Hi Daire, good to see you still in the Lustre world! Bruno, could you please look into this one? Thanks, Peter |
| Comment by Bruno Faccini (Inactive) [ 14/Jun/13 ] |
|
Hello Daire, |
| Comment by Daire Byrne (Inactive) [ 14/Jun/13 ] |
|
Bruno, I have a kdump vmcore - it's too big to attach, so I've put it in my dropbox account: https://dl.dropboxusercontent.com/u/24821368/vmcore.tgz The filesystem was formatted using Lustre v2.3. Peter, yes, it's been a while! We have actually been using Lustre for a couple of years now at my present company. We are currently building a new filesystem and it seems like about the right time to try out v2. We just need to help you debug it some more, like we did at Framestore with v1.8. |
| Comment by Bruno Faccini (Inactive) [ 14/Jun/13 ] |
|
Thanks for the crash-dump, Daire! But in order to be able to analyze it I also need the Lustre-modules RPM and the kernel-debuginfo RPM. Can you also provide them? |
| Comment by Daire Byrne (Inactive) [ 14/Jun/13 ] |
|
Bruno, we are using the Whamcloud binary RPMs, so you can find them on the download site. |
| Comment by Prakash Surya (Inactive) [ 14/Jun/13 ] |
|
We've also seen this during our recent 2.4-ldiskfs testing at LLNL:

LustreError: 7128:0:(linkea.c:169:linkea_links_find()) ASSERTION( ldata->ld_leh != ((void *)0) ) failed:
LustreError: 7128:0:(linkea.c:169:linkea_links_find()) LBUG

crash> bt
PID: 7128 TASK: ffff8805cec96ae0 CPU: 10 COMMAND: "mdt02_034"
 #0 [ffff880605c6b878] machine_kexec at ffffffff81035bfb
 #1 [ffff880605c6b8d8] crash_kexec at ffffffff810c0932
 #2 [ffff880605c6b9a8] panic at ffffffff8150d943
 #3 [ffff880605c6ba28] lbug_with_loc at ffffffffa0646f4b [libcfs]
 #4 [ffff880605c6ba48] linkea_links_find at ffffffffa0838986 [obdclass]
 #5 [ffff880605c6bab8] mdd_linkea_prepare at ffffffffa0d9b645 [mdd]
 #6 [ffff880605c6bb08] mdd_links_rename at ffffffffa0d9bb01 [mdd]
 #7 [ffff880605c6bb88] mdd_unlink at ffffffffa0d9fae6 [mdd]
 #8 [ffff880605c6bc48] mdo_unlink at ffffffffa1009b98 [mdt]
 #9 [ffff880605c6bc58] mdt_reint_unlink at ffffffffa100cf40 [mdt]
#10 [ffff880605c6bcd8] mdt_reint_rec at ffffffffa1009891 [mdt]
#11 [ffff880605c6bcf8] mdt_reint_internal at ffffffffa0feeb03 [mdt]
#12 [ffff880605c6bd38] mdt_reint at ffffffffa0feee04 [mdt]
#13 [ffff880605c6bd58] mdt_handle_common at ffffffffa0ff3ab8 [mdt]
#14 [ffff880605c6bda8] mds_regular_handle at ffffffffa102d155 [mdt]
#15 [ffff880605c6bdb8] ptlrpc_server_handle_request at ffffffffa09b26d8 [ptlrpc]
#16 [ffff880605c6beb8] ptlrpc_main at ffffffffa09b3a6e [ptlrpc]
#17 [ffff880605c6bf48] kernel_thread at ffffffff8100c0ca |
| Comment by Di Wang [ 17/Jun/13 ] |
|
Hmm, apparently mdd_links_read did not check the return value of linkea_init, so the following linkea_links_find deals with the wrong linkea, which causes the panic. Probably this patch should fix it:

diff --git a/lustre/mdd/mdd_dir.c b/lustre/mdd/mdd_dir.c
/** Read the link EA into a temp buffer.
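The body of the diff is truncated here; based on Di's description, and Ann Koehler's later note about returning the rc from linkea_init(), the change is presumably of this shape. A sketch with simplified stand-in types, not the actual patch (which is http://review.whamcloud.com/6676):

/* Minimal stand-ins so this sketch compiles on its own. */
struct linkea_data { void *ld_leh; };
static int linkea_init(struct linkea_data *ldata)
{
        return ldata->ld_leh != NULL ? 0 : -61; /* -ENODATA */
}

/* Before (sketch): linkea_init()'s return code is dropped, so a failed
 * parse leaves ld_leh NULL and a later linkea_links_find() hits the
 * LASSERT. */
static int mdd_links_read_before(struct linkea_data *ldata)
{
        linkea_init(ldata);
        return 0;
}

/* After (sketch): the error is propagated so callers can bail out. */
static int mdd_links_read_after(struct linkea_data *ldata)
{
        return linkea_init(ldata);
} |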
| Comment by Bruno Faccini (Inactive) [ 17/Jun/13 ] |
|
Thanks for the direction, Di! |
| Comment by Daire Byrne (Inactive) [ 18/Jun/13 ] |
|
Bruno, I can verify that the patch stops the LBUG, but the syslog gets spammed instead with the likes of:

Jun 18 11:42:51 bmds1 kernel: LustreError: 24036:0:(mdd_dir.c:995:mdd_links_rename()) link_ea del 'from_repo_revision' failed -61 [0x200000c68:0x795d:0x0] |
| Comment by Bruno Faccini (Inactive) [ 18/Jun/13 ] |
|
Hmm, it seems you are triggering cases where the link_ea_header has a NULL leh_reccount! I will need to have a look into the earlier crash-dump to see how this can happen. You say the syslog gets spammed, but what does this mean in terms of number of messages per time window? Also, the names in the messages you attached are from the Yum database and are known to be hard links to the same inodes, so can you find how many hard links concern these FIDs, if they are still present? We may only be triggering cases where the reverse link entry was not added to the link_ea because it would have exceeded its maximum, and thus the messages could be avoided.
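For reference, the link EA header being discussed looks approximately like this in the 2.4-era lustre_idl.h (field names from memory; check against the source). A zero leh_reccount means the EA exists but records no link entries, so deleting an entry fails with -ENODATA, i.e. the -61 in the messages above:

struct link_ea_header {
        __u32 leh_magic;        /* LINK_EA_MAGIC */
        __u32 leh_reccount;     /* number of stored link entries */
        __u64 leh_len;          /* total size of the link EA */
        __u32 padding1;
        __u32 padding2;
}; |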
| Comment by Di Wang [ 19/Jun/13 ] |
|
Hmm, it seems to me the -ENODATA and -ENOENT check in mdd_linkea_prepare is wrong, which causes it to wrongly return -ENODATA. Probably this patch should help:

diff --git a/lustre/mdd/mdd_dir.c b/lustre/mdd/mdd_dir.c
index f7947c8..4770c32 100644
--- a/lustre/mdd/mdd_dir.c
+++ b/lustre/mdd/mdd_dir.c
@@ -932,11 +932,10 @@ static int mdd_linkea_prepare(const struct lu_env *env,
if (oldpfid != NULL) {
rc = __mdd_links_del(env, mdd_obj, ldata, oldlname, oldpfid);
if (rc) {
- if ((check == 0) ||
- (rc != -ENODATA && rc != -ENOENT))
+ if ((check == 0 && (rc == -ENOENT || rc == -ENODATA)))
+ rc = 0;
+ else
RETURN(rc);
- /* No changes done. */
- rc = 0;
}
}
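To spell out the behaviour change (under my reading of the two versions): the old condition swallowed -ENODATA/-ENOENT only when check was non-zero, while the new one swallows them only when check == 0, i.e. when the caller does not require the old entry to exist. A standalone model of just the condition:

#include <errno.h>
#include <stdio.h>

/* Old: error swallowed iff check != 0 and rc is "not found". */
static int old_swallows(int check, int rc)
{
        return !(check == 0 || (rc != -ENODATA && rc != -ENOENT));
}

/* New: error swallowed iff check == 0 and rc is "not found". */
static int new_swallows(int check, int rc)
{
        return check == 0 && (rc == -ENOENT || rc == -ENODATA);
}

int main(void)
{
        const int rcs[] = { -ENODATA, -ENOENT, -EIO };
        int check, i;

        for (check = 0; check <= 1; check++)
                for (i = 0; i < 3; i++)
                        printf("check=%d rc=%d old=%d new=%d\n",
                               check, rcs[i],
                               old_swallows(check, rcs[i]),
                               new_swallows(check, rcs[i]));
        return 0;
}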
|
| Comment by Daire Byrne (Inactive) [ 19/Jun/13 ] |
|
So, just to give you an idea of what we are doing here: we are essentially using "rsync --link-dest" to do backups of servers (hence the Yum DB files). If a file is unchanged, it is just hard-linked to the previous backup copy. So to trigger these messages we are simply deleting old backups, which in many cases just reduces the hard link count by one. This workload has been found to be a good test of metadata and IO. We never saw this issue in Lustre v1.8 in the 2 years this workload has been running on it. In terms of the messages, maybe 20/s? It isn't constant, so I guess not all files trigger it. And there can be half an hour between influxes of these messages. |
| Comment by Cory Spitz [ 19/Jun/13 ] |
|
Cray has been seeing this bug a lot. We'll try out the patch and report back. Speak up if you need any of our debug info. |
| Comment by Ann Koehler (Inactive) [ 20/Jun/13 ] |
|
re: http://review.whamcloud.com/6676 Should mdt_links_read() as well as mdd_links_read() be changed to return the rc from linkea_init()? I'm just noting the symmetry in the code. |
| Comment by Prakash Surya (Inactive) [ 20/Jun/13 ] |
|
At first glance it looks like it should. And if it shouldn't, a comment explaining the difference is needed. |
| Comment by Bruno Faccini (Inactive) [ 24/Jun/13 ] |
|
Di, Daire, Prakash, Ann, |
| Comment by Bruno Faccini (Inactive) [ 25/Jun/13 ] |
|
Just for my understanding, and about the reproducer scenario: is it possible that the hard links being removed/unlinked and causing the LBUGs/messages were created when running some early 2.x version (i.e., with more limits in place when populating the link_ea)? Patch to be out soon. |
| Comment by Daire Byrne (Inactive) [ 25/Jun/13 ] |
|
The filesystem was formatted using the latest v2.3 release so many of the hardlinks would have been created under that version. |
| Comment by Ann Koehler (Inactive) [ 25/Jun/13 ] |
|
Cray sees the bug on a file system formatted with 2.4. |
| Comment by Prakash Surya (Inactive) [ 25/Jun/13 ] |
|
And we have a 2.1 formatted FS upgraded to Lustre 2.4 RPMs. |
| Comment by Bruno Faccini (Inactive) [ 25/Jun/13 ] |
|
Just pushed new version/patch-set #2 of change http://review.whamcloud.com/6676. It adds to the original fix a few ENODATA error-handling fixes, to avoid unnecessary messages and also to prevent an early return. And http://review.whamcloud.com/6772 is a cosmetic patch for similar linkea_init() error handling in the mdt layer. |
| Comment by Cory Spitz [ 26/Jun/13 ] |
|
Cray testing on changes #6676 ps1 and #6772 shows that the changes resolve our problems with the LBUG. |
| Comment by Bruno Faccini (Inactive) [ 26/Jun/13 ] |
|
Thanks for the feedback, Cory. #6676 patch-set #2 should fix the LBUG AND the annoying (and erroneous!) messages. |
| Comment by Cory Spitz [ 26/Jun/13 ] |
|
I was mistaken: Cray has not yet tested with 6772 applied. However, 6676 ps1 did test successfully. |
| Comment by Daniel Basabe [ 27/Jun/13 ] |
|
Hello. At our site, we had this problem with version 2.4.50. It was triggered by moving a directory with lots of files to another destination. I can confirm that patch-set #2 has fixed the problem. |
| Comment by Bruno Faccini (Inactive) [ 27/Jun/13 ] |
|
Hello Daniel, |
| Comment by Daire Byrne (Inactive) [ 03/Jul/13 ] |
|
I finally got around to testing the two patches - the LBUG has returned. I patched v2.4.0:
Jul 3 12:36:50 bmds1 kernel: LustreError: 13174:0:(linkea.c:169:linkea_links_find()) ASSERTION( ldata->ld_leh != ((void *)0) ) failed: |
| Comment by Bruno Faccini (Inactive) [ 04/Jul/13 ] |
|
Wow, I am sorry Daire. I don't know how this happened, but patch-set #3 of http://review.whamcloud.com/6676 contained a regression from patch-sets #1/#2 (in fact it did not contain the main change from patch-set #1 that must be in place to prevent the LBUG!). Can you give patch-set #4 a try? It should be the definitive one. |
| Comment by Bruno Faccini (Inactive) [ 12/Jul/13 ] |
|
Daire, have you finally been able to test patch-set #4 of http://review.whamcloud.com/6676? |
| Comment by Daire Byrne (Inactive) [ 16/Jul/13 ] |
|
Bruno, I have patched it in and haven't seen the issue again yet. However, I have not yet had the opportunity to run the same workload (large unlinks), but that should happen between now and next week. I will update if we have any further issues. Thanks for the help. |
| Comment by Prakash Surya (Inactive) [ 16/Jul/13 ] |
|
FWIW, I've applied #6772 and #6676 and have not hit the issue with our test workload (we did hit it repeatedly without #6676). |
| Comment by Jodi Levi (Inactive) [ 19/Jul/13 ] |
|
Patch landed to Master. Closing ticket. Please let me know if more work is needed and I will reopen. |
| Comment by Andreas Dilger [ 28/May/14 ] |
|
It seems that http://review.whamcloud.com/6676 was landed to b2_4 for 2.4.1, but http://review.whamcloud.com/6772 (which was not mentioned anywhere in this bug, but attributed to it) was not. For example:

lfs fid2path /myth [0x10b466:0xfce641b5:0x0]
/myth//

There is never a linkEA for upgraded 1.x files until LFSCK 1.5 is run on a 2.5+ MDS. |
| Comment by Andreas Dilger [ 28/May/14 ] |
|
Cherry-pick http://review.whamcloud.com/6772 to b2_4: http://review.whamcloud.com/10464 |
| Comment by Bruno Faccini (Inactive) [ 28/May/14 ] |
|
I think I had mentioned #6772 in this ticket, but anyway I should have merged both patches, as Di suggested at the time, to avoid such an oversight! |