[LU-2490] mdd_links_rename error seen on Grove test MDS Created: 13/Dec/12 Updated: 22/Dec/12 Resolved: 22/Dec/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.4.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Prakash Surya (Inactive) | Assignee: | Alex Zhuravlev |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | MB, shh, topsequoia | ||
| Severity: | 3 |
| Rank (Obsolete): | 5844 |
| Description |
|
I just updated our Grove Test MDS to 2.3.57-1chaos and noticed the following errors, which I have not seen before. Is this something to be alarmed about?

2012-12-13 11:06:39 LustreError: 33478:0:(mdd_dir.c:2750:mdd_links_rename()) link_ea add 'simul_link.109' failed -75 [0x20001c38b:0xaa88:0x0]
2012-12-13 11:06:39 LustreError: 33348:0:(mdd_dir.c:2750:mdd_links_rename()) link_ea add 'simul_link.357' failed -75 [0x20001c38b:0xaa88:0x0]
2012-12-13 11:06:39 LustreError: 33226:0:(mdd_dir.c:2754:mdd_links_rename()) link_ea del 'simul_link.1' failed -2 [0x20001c38b:0xaa88:0x0]
2012-12-13 11:06:39 LustreError: 33226:0:(mdd_dir.c:2754:mdd_links_rename()) link_ea del 'simul_link.2' failed -2 [0x20001c38b:0xaa88:0x0] |
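For context, the messages come from the rename path on a file that already carries many hard links, which is what the simul test exercises. A minimal user-space sketch of that pattern follows; the file names and the 200-link count are illustrative assumptions, not taken from simul itself:

```c
/* Hypothetical reproducer sketch: create more hard links than the MDS
 * records in the "link" xattr (128) and then rename them, so that
 * mdd_links_rename() has to add/delete linkEA entries on every rename.
 * Run from a directory on a Lustre client. */
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

int main(void)
{
        char oldname[64], newname[64];
        int i, fd;

        fd = open("simul_link.base", O_CREAT | O_WRONLY, 0644);
        if (fd < 0) {
                perror("open");
                return 1;
        }
        close(fd);

        /* Exceed the 128-entry linkEA limit. */
        for (i = 0; i < 200; i++) {
                snprintf(oldname, sizeof(oldname), "simul_link.%d", i);
                if (link("simul_link.base", oldname) < 0)
                        perror("link");
        }

        /* Renames force the MDS to update the linkEA entries. */
        for (i = 0; i < 200; i++) {
                snprintf(oldname, sizeof(oldname), "simul_link.%d", i);
                snprintf(newname, sizeof(newname), "simul_link.renamed.%d", i);
                if (rename(oldname, newname) < 0)
                        perror("rename");
        }
        return 0;
}
```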
| Comments |
| Comment by Prakash Surya (Inactive) [ 13/Dec/12 ] |
|
I'm bumping the priority to "Blocker" as I've been seeing this constantly since upgrading the MDS. I haven't rebooted the OSTs yet; I'm not sure if that's significant. |
| Comment by Peter Jones [ 13/Dec/12 ] |
|
Alex, could you please comment? Thanks, Peter |
| Comment by Andreas Dilger [ 14/Dec/12 ] |
|
The "-79" error is "-EOVERFLOW", which is returned if the number of hard links to a file exceed LINKEA_MAX_COUNT (hard-coded at 128 currently). That is the upper limit of linkEA entries on a file that are recorded in the inode, to avoid consuming all of the xattr space and potentially making the update of hard-linked files slow. Definitely the CERROR() messages here should be turned to CDEBUG(), and ideally also improved to be in the standard format "device: message: rc = %d\n". As part of the LFSCK 1.5 project, a larger number of linkEA entries will be allowed (depending on whether the the "large_xattr" feature is enabled to allow a larger max xattr size). |
| Comment by Andreas Dilger [ 14/Dec/12 ] |
|
PS - except for "simul", are there any real-world cases where files have more than 128 hard links? In all of the data at http://www.pdsi-scidac.org/fsstats/ (and similar data sent to me privately) this basically never happened. |
| Comment by Alex Zhuravlev [ 17/Dec/12 ] |
| Comment by Prakash Surya (Inactive) [ 17/Dec/12 ] |
|
Thanks for the explanation and patch. As far as "real" usage goes, I have no idea if we have files with more than 128 hard links. That seems like a very obscure use case, but I have no evidence as to whether users are doing this or not. |
| Comment by Prakash Surya (Inactive) [ 17/Dec/12 ] |
|
Chris, do you know if we have any users with files with more than 128 hard links? |
| Comment by Christopher Morrone [ 17/Dec/12 ] |
|
I hope not. Lustre didn't use to have this limit, though; that is new. If that isn't going to change, we'll need to change either simul or our use of simul (exclude that test, or make the test failure just a warning). I don't suppose there is a way for a user-space application to query this limit? |
| Comment by Andreas Dilger [ 17/Dec/12 ] |
|
There isn't a new limit on the number of hard links to a single file - it remains 65000 for ldiskfs and 2^32-1 for ZFS. The limit is only on the number of reverse name entries for a single file, stored in the "link" xattr. These are used by lfs fid2path to generate pathnames from a FID for error reporting and for lustre_rsync. Beyond the 128 hard link count, the reverse links are dropped, though not as silently as they should be. The limit exists for efficiency - since the xattr is completely searched and rewritten for each link added or removed, updates degrade to O(n^2) behaviour if there are too many reverse links. Keeping 128 entries was judged to cover acceptable real-world usage, and for cases like simul the extra hard links themselves still work; they just aren't all recorded in the xattr.

I think we're increasing the entry limit in the "link" xattr as part of the LFSCK changes, because we now allow large xattrs, but I don't see it ever being unlimited due to the O(n^2) slowdown when updating the xattr.

While it isn't directly relevant given the above comments, it is possible to query the hard link limit for a filesystem via pathconf(3). Unfortunately this is emulated in glibc by calling statfs(), checking the filesystem magic, and returning a hard-coded value. It does not get the value from the kernel, so it isn't possible to get the real limit (i.e. it isn't a pathconf(2) syscall like it is on MacOS). Until very recently, pathconf(3) on Lustre returned the "I don't recognize this filesystem magic, so let's guess 128 hard links" minimum for POSIX. I finally figured out that this was a glibc problem and sent a patch to the maintainer to increase the limit to 65000 for Lustre, but it will never be able to return 2^32-1 for a ZFS-backed MDT. The fix might even be in RHEL 6.latest, but is definitely in RHEL 7. |
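For reference, a minimal sketch of how a user-space application would query the limit via pathconf(3), with the caveat above that glibc emulates the answer from the filesystem magic rather than asking the kernel:

```c
/* Print the hard link limit for the filesystem containing a path.
 * Caveat (see above): glibc derives _PC_LINK_MAX from statfs() magic,
 * so the reported value may not match the backend's real limit. */
#include <stdio.h>
#include <errno.h>
#include <unistd.h>

int main(int argc, char **argv)
{
        const char *path = argc > 1 ? argv[1] : ".";
        long link_max;

        errno = 0;
        link_max = pathconf(path, _PC_LINK_MAX);
        if (link_max < 0) {
                if (errno == 0)
                        printf("%s: LINK_MAX is unspecified\n", path);
                else
                        perror("pathconf");
                return errno ? 1 : 0;
        }
        printf("%s: LINK_MAX = %ld\n", path, link_max);
        return 0;
}
```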
| Comment by Christopher Morrone [ 20/Dec/12 ] |
|
Thank you for the explanation! That optimization sounds reasonable. I doubt there will be many (any?) legitimate uses of more than 128 hard links to a single file. So if we get the console messages silenced, it sounds like we'll be fine. |
| Comment by Peter Jones [ 22/Dec/12 ] |
|
Landed for 2.4 |