[LU-5145] kernel slab data memory coruption on MDS due to a file with truncated link extended attribute Created: 05/Jun/14 Updated: 06/May/20 Resolved: 06/May/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Sebastien Buisson (Inactive) | Assignee: | Bruno Faccini (Inactive) |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 14196 | ||||||||
| Description |
|
Hi, CEA had 3 consecutive crashes of MDS server due to a corruption of kernel slab, and these crashes were concomitant to Robinhood startup. In fact in the Lustre Changelogs we can see a line like: 4281746500 01CREAT 17:47:33.690285184 2014.03.25 0x0 t=[0x22cb19e89:0x1f9c9:0x0] p=[0x22cb19e89:0x1f9c8:0x0] GRANDEURS_CENTREES And this is because Robinhood tried to process this Changelog entry that we crashed the MDS. Indeed, looking at the MDT with debugfs, we found out that this file was under the /OBJECTS directory (it used to be a regular file, that was moved here by Lustre for an unkown reason), and that its link EA was containing zero fid (link = "df f1 ea 11 00 00 00 00 18 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 " (24)). This issue can be reproduced by doing the following on a Lustre filesystem originally formatted with Lustre 2.1, and upgraded to Lustre 2.4:
The crash can be avoided with the patch from So at first we would need this patch to be landed to b2_4, or at least to know if this patch is suitable for use in production with Lustre 2.4.3. Secondly, given that a file system at CEA has more than 5,000 files in the OBJECTS directory, we are wondering: TIA, |
| Comments |
| Comment by Bruno Faccini (Inactive) [ 05/Jun/14 ] |
|
Hello Sebastien, Last, since you identified that patch from |
| Comment by Bruno Faccini (Inactive) [ 05/Jun/14 ] |
|
Sebastien, you speak about a corruption of Kernel slabs, can you better detail this ? Do you have at least a panic stack ? Fan Yong, just in case, do you have any idea why/how files can be "moved" to OBJECTS directory ? |
| Comment by Sebastien Buisson (Inactive) [ 05/Jun/14 ] |
|
Hey Bruno! a) lfsck was run on the file system, but many files under /OBJECTS directory have a creation date later that the time when lfsck was run. So lfsck is not responsible for this. Concerning the ticket's priority, I would need to know if the patch at http://review.whamcloud.com/10464 can be used in production before lowering priority. Sebastien. |
| Comment by Sebastien Buisson (Inactive) [ 05/Jun/14 ] |
|
Here is the information from the crash dump, get by Antoine. |
| Comment by Antoine Percher [ 05/Jun/14 ] |
|
Hi Bruno |
| Comment by Andreas Dilger [ 06/Jun/14 ] |
|
I verified that the 10464 patch is matching the fix that was landed for 2.5.0 (in 2.4.52, just after b2_4 was branched off master). I believe the patch should be safe for your production use in 2.4 (and we are planning to land it for the next 2.4.x release). As for files in OBJECTS, that is confusing since this directory should only be used for nameless objects such as the quota admin files, and such. Regular files should not end up there. It appears almost as if the OBJECTS directory was being confused with the PENDING directory, for files that are open but unlinked. In that case, it would make sense that the "link" xattr has no links anymore, since there are no names for this file anymore. If possible, could you please check the FIDs on both the /OBJECTS and /PENDING to see if they conflict? You can do this while the MDT is mounted using: debugfs -c -f /dev/stdin /dev/\{mdsdev\} <<-EOF
stat /OBJECTS
stat /PENDING
EOF
|
| Comment by Antoine Percher [ 06/Jun/14 ] |
|
I will check, I just can add that all files name in OBJECTS has |
| Comment by Bruno Faccini (Inactive) [ 06/Jun/14 ] |
|
Antoine, thanks for your help+work on this !! |
| Comment by Antoine Percher [ 06/Jun/14 ] |
|
Bruno, Here is a badly link example that could reproduce the corrupt with the patch 10464 setfattr -n trusted.link -v 0xdff1ea11000000002f00000000000000000000000000000000000000000200002b10000000010000000066696c6531 <mdt ldiskfs>/OBJECTS/file as you know, never trust data on disk |
| Comment by nasf (Inactive) [ 06/Jun/14 ] |
|
From the file name format, it seems more like that those files have been "created" under b2_1, but not been "moved" to /OBJECTS. Have you ever downgraded to b2_1 after upgraded to b2_4? Generally, the files under /OBJECTS should not be visible to client, and they should not be found in the Changelog, then whether it has valid linkEA will not affect the Robinhood. So would you please to verify whether those files under the /OBJECTS are users' normal files or not? |
| Comment by Antoine Percher [ 12/Jun/14 ] |
|
Sorry for the delay add debugfs trace for OBJECTS and PENDING stat information in debugfs.pdf answer about lustre downgrade : No, we never downgrade own lustre distribution The crtime of all user file in /OBJECTS have a date between Mars 19th |
| Comment by nasf (Inactive) [ 13/Jun/14 ] |
|
OI scrub will only update the OI mapping files, it may also move the orphans from /lost+found to /O/xxxx/ for OST device. It will not change other directory structure. |
| Comment by Sebastien Buisson (Inactive) [ 27/Jun/14 ] |
|
Hi Bruno, Did you have the opportunity to look at the enhanced fix proposal from Antoine? Cheers, |
| Comment by Bruno Faccini (Inactive) [ 04/Jul/14 ] |
|
Hello Antoine and Sebastien, I spent some time to review the enhanced patch version (provided at the end of trace_debug_lascaux110_corrupt_slab_mdt_get_info.txt attachment for this ticket) from Antoine vs the current one for Basically and as far as I understand, Antoine's version has 2 additional hunks : @@ -5688,7 +5688,7 @@ static int mdt_path_current(struct mdt_t
mdt_obj = mdt_object_find(info->mti_env, mdt,
&pli->pli_fids[pli->pli_fidcount]);
if (IS_ERR(mdt_obj))
- GOTO(out, rc = PTR_ERR(mdt_obj));
+ GOTO(out, rc = -ENOENT);
if (mdt_object_remote(mdt_obj)) {
mdt_object_put(info->mti_env, mdt_obj);
GOTO(remote_out, rc = -EREMOTE);
and @@ -5726,6 +5726,8 @@ static int mdt_path_current(struct mdt_t ptr -= tmpname->ln_namelen; if (ptr - 1 <= pli->pli_path) GOTO(out, rc = -EOVERFLOW); + if (tmpname->ln_namelen <= 0) + GOTO(out, rc = -ENOENT); strncpy(ptr, tmpname->ln_name, tmpname->ln_namelen); *(--ptr) = '/'; both in mdt_path_current()/mdt_handler.c routine/file. My opinion, and you 2 may want to discuss it further, is that with the common patch part for our 2 changes, which now enables correct error from mdt_links_read()/linkea_init() return+handling, the second additional hunk is not needed since due to leh_reccount being 0 in a link-EA with no parent-fid, linkea_init() will return ENODATA that will now be correctly forwarded by mdt_links_read() that will be handled in mdt_path_current() to not continue further+wrongly in trying to decode non-existing Link-EA entry, which has leaded to the unexpected behavior you have encountered (memory corruption due to a negative length passed to strcpy() when trying to extract parent-FID from un-exiting Link-EA entry). Concerning the 1st additional hunk, and we may already have discussed about it off-line ..., but I don't understand the reason why you want to unconditionally return -ENOENT upon mdt_object_find() error return. |
| Comment by Antoine Percher [ 06/Jul/14 ] |
|
I do this -ENOENT return because in this case, I want to be sure that "lustre/mdt/mdt_handler.c" :
5814 obj = mdt_object_find(info->mti_env, mdt, &fp->gf_fid);
5815 if (obj == NULL || IS_ERR(obj)) {
5816 CDEBUG(D_IOCTL, "no object "DFID": %ld\n", PFID(&fp->gf_fid),
5817 PTR_ERR(obj));
5818 RETURN(-EINVAL);
5819 }
You can remove this part of my patch ... Did you test a fid2path on a file modified by my proposal command the 06/Jun/14 9:18 ? You miss this part of my patch that also could useful @@ -5701,7 +5701,7 @@ static int mdt_path_current(struct mdt_t
rc = mdt_links_read(info, mdt_obj, &ldata);
mdt_object_put(info->mti_env, mdt_obj);
if (rc != 0)
- GOTO(out, rc = PTR_ERR(buf));
+ GOTO(out, rc);
|
| Comment by Andreas Dilger [ 06/May/20 ] |
|
Fixed with patch https://review.whamcloud.com/6772 in 2.5.0 and 2.4.1. |