[LU-3550] Stale file handle on mount when mounting Lustre 2.4 via NFS Created: 02/Jul/13 Updated: 20/Nov/13 Resolved: 18/Nov/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.5.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Patrick Farrell (Inactive) | Assignee: | nasf (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 | ||||||||||||
| Rank (Obsolete): | 8928 | ||||||||||||
| Description |
|
When attempting to mount NFS exported Lustre, the mount operation reports 'stale file handle' and fails to complete. This happens with 2.4 servers and a 2.4 client. It does NOT happen with a 2.4 client and 2.2 servers.

Investigation of the NFS traffic between the NFS client and NFS server (Lustre client) shows the NFS client requesting the file handle for the mount, then receiving a file handle back from the server. There is a bit more chatter, then the client sends back the same file handle as part of an info request. Then the server responds with a stale file handle error.

This is happening on both CentOS 6.4 and SLES11SP2 clients.

I'm attaching a series of logs of this issue. For analyzing the tcpdump (if you need it - I suspect the NFS debug logs will make it irrelevant), the IP addresses:

The /var/log/messages logs are not trimmed, sorry. Look for the last debug markers from Lustre in those files and you can line them up with the rest of the logs. |
| Comments |
| Comment by Patrick Farrell (Inactive) [ 08/Jul/13 ] |
|
The underlying issue is that NFS on Linux does not currently support 64 bit root inodes. This means that Lustre, with a 2.4 MDS, cannot be exported over NFS. After working this out by reading the NFS code, I discovered Andreas has a patch going upstream to the kernel for this: Is this documented somewhere as a known regression? |
| Comment by Patrick Farrell (Inactive) [ 08/Jul/13 ] |
|
Excuse me - A closer look at the patch from Andreas suggests it's for a related issue but not exactly the one we're facing. The issue I'm looking at comes up in mk_fsid, called from fh_compose, which is called from exp_rootfh:

static inline void mk_fsid(int vers, u32 *fsidv, dev_t dev, ino_t ino,
                           u32 fsid, unsigned char *uuid)
{
        u32 *up;
        switch(vers) {
        case FSID_DEV:
                fsidv[0] = htonl((MAJOR(dev)<<16) |
                                 MINOR(dev));
                fsidv[1] = ino_t_to_u32(ino);
                break;
        }

Where we see the inode being coerced to 32 bits. This is what goes out on the wire to the client, even though Lustre has 64 bit inodes. I will have to look more closely at Andreas's patch and the issue it's resolving, as well as the code I noted above, to understand fully what's going on. |
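Editor's note: the narrowing described above can be illustrated with a minimal standalone sketch. It is not kernel code. The function names ino_to_u32 and ino_fits_in_fsid_dev are hypothetical, and the inode value used in the demo is an arbitrary number above 32 bits chosen for illustration, not a value taken from the attached logs.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 32 bits of an inode number, which is the effect
 * the comment above attributes to ino_t_to_u32() in mk_fsid(). */
static inline uint32_t ino_to_u32(uint64_t ino)
{
    return (uint32_t)ino;
}

/* Returns 1 if the inode number survives the trip through 32 bits,
 * 0 if the upper half would be lost. */
static inline int ino_fits_in_fsid_dev(uint64_t ino)
{
    return (uint64_t)ino_to_u32(ino) == ino;
}

/* Usage example with a hypothetical inode number wider than 32 bits. */
static inline void demo_truncation(void)
{
    uint64_t big_ino = 0x200000007ULL;

    assert(ino_to_u32(big_ino) == 0x7u);       /* upper bits are gone */
    assert(!ino_fits_in_fsid_dev(big_ino));    /* no longer identifies the root */
    assert(ino_fits_in_fsid_dev(2));           /* small inode numbers are unaffected */
}
```

Any root inode number that does not fit in 32 bits would be altered by this kind of narrowing, so a value derived from it can no longer be matched against the real inode.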
| Comment by Patrick Farrell (Inactive) [ 09/Jul/13 ] |
|
Andreas's patch is for an issue with parsing 64 bit inode numbers in NFS-utils, and so isn't involved here. The problem is this: Linux does include an fsid type that carries a 64 bit inode:

        case FSID_UUID16_INUM:
                /* 8 byte inode and 16 byte fsid */
                *(u64*)fsidv = (u64)ino;
                memcpy(fsidv+2, uuid, 16);
                break;

but the selection code only picks it when the export has a UUID (ex_uuid) and is not a root export:

        } else if (exp->ex_flags & NFSEXP_FSID) {
                fsid_type = FSID_NUM;
        } else if (exp->ex_uuid) {
                if (fhp->fh_maxsize >= 64) {
                        if (is_root_export(exp))
                                fsid_type = FSID_UUID16;
                        else
                                fsid_type = FSID_UUID16_INUM;
                } else {
                        if (is_root_export(exp))
                                fsid_type = FSID_UUID8;
                        else
                                fsid_type = FSID_UUID4_INUM;
                }
        } else if (!old_valid_dev(exp_sb(exp)->s_dev))
                /* for newer device numbers, we must use a newer fsid format */
                fsid_type = FSID_ENCODE_DEV;
        else
                fsid_type = FSID_DEV;

The export option in question (ex_uuid) is one I can't quite figure out how to set in the export options. On the other hand, when I do -o fsid= on the export I can specify an integer or a UUID. Presumably this is hitting this case in set_version_and_fsid:

        } else if (exp->ex_flags & NFSEXP_FSID) {
                fsid_type = FSID_NUM;

In any case, this appears to be a workaround. Longer term, if we don't wish to have to specify -o fsid=, the NFS code in the kernel would need to change somehow to support 64 bit inodes in FSIDs. |
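Editor's note: the branch order quoted above can be modelled outside the kernel to see which fsid type each combination of export settings selects. This is only a sketch of the quoted branches. Whatever precedes the first "} else if" in the kernel source is not quoted in the comment and is omitted here. struct export_model and choose_fsid_type are hypothetical names, and the enum values are arbitrary rather than the kernel's numeric constants.

```c
#include <assert.h>
#include <stdbool.h>

/* Type names mirror the excerpt; the numeric values are arbitrary. */
enum fsid_type {
    FSID_DEV, FSID_NUM, FSID_ENCODE_DEV,
    FSID_UUID4_INUM, FSID_UUID8, FSID_UUID16, FSID_UUID16_INUM
};

/* Hypothetical stand-in for the state the quoted code inspects. */
struct export_model {
    bool has_fsid_option;  /* models exp->ex_flags & NFSEXP_FSID */
    bool has_uuid;         /* models exp->ex_uuid being set */
    bool is_root_export;   /* models is_root_export(exp) */
    int  fh_maxsize;       /* models fhp->fh_maxsize */
    bool old_valid_dev;    /* models old_valid_dev(exp_sb(exp)->s_dev) */
};

/* Same decision order as the quoted branches. */
static inline enum fsid_type choose_fsid_type(const struct export_model *e)
{
    if (e->has_fsid_option)
        return FSID_NUM;
    if (e->has_uuid) {
        if (e->fh_maxsize >= 64)
            return e->is_root_export ? FSID_UUID16 : FSID_UUID16_INUM;
        return e->is_root_export ? FSID_UUID8 : FSID_UUID4_INUM;
    }
    if (!e->old_valid_dev)
        return FSID_ENCODE_DEV;
    return FSID_DEV;
}

/* Usage example: an explicit fsid= wins over every other setting. */
static inline void demo_fsid_selection(void)
{
    struct export_model e = { true, true, false, 64, true };

    assert(choose_fsid_type(&e) == FSID_NUM);
}
```

In this model an export with an explicit fsid= always takes the FSID_NUM branch, and an export with neither fsid= nor a UUID falls through to FSID_DEV or FSID_ENCODE_DEV, which matches the behaviour described in the comment.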
| Comment by Patrick Farrell (Inactive) [ 11/Jul/13 ] |
|
I've discussed this internally at Cray with someone with NFS expertise. He agrees that this workaround (using the -o fsid= option to exportfs when exporting Lustre over NFS) is the appropriate solution, as the only other option is a fairly invasive patch to the NFS code in the Linux kernel. In light of that, WC may want to update the documentation for exporting Lustre over NFS, but no code changes are necessary. |
| Comment by nasf (Inactive) [ 12/Jul/13 ] |
|
There are two issues for this topic:

1) Originally, Lustre did not return the FSID via statfs() to nfs-utils. This issue has been resolved by the patch http://review.whamcloud.com/6493, which has already landed on master (Lustre-2.5).

2) nfs-utils converts the 64-bit ino# into 32 bits, which loses information, so it cannot locate the right root. It can be resolved by the patch:

diff --git a/utils/mountd/cache.c b/utils/mountd/cache.c
index 517aa62..a7212e7 100644
--- a/utils/mountd/cache.c
+++ b/utils/mountd/cache.c
@@ -388,7 +388,7 @@ struct parsed_fsid {
 	int fsidtype;
 	/* We could use a union for this, but it would be more
 	 * complicated; why bother? */
-	unsigned int inode;
+	uint64_t inode;
 	unsigned int minor;
 	unsigned int major;
 	unsigned int fsidnum;
--
1.7.1

If you have a chance, you can test the above two patches together for verification. Thanks! |
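Editor's note: the effect of widening the inode field can be sketched with two reduced versions of the structure. This is a simplified model, not the mountd code. The struct and function names below are hypothetical, the comparison against the inode reported by stat() is an assumption about how the parsed value is used, and unsigned int is assumed to be 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reductions of struct parsed_fsid before and after the patch. */
struct parsed_fsid_narrow { unsigned int inode; };
struct parsed_fsid_wide   { uint64_t inode; };

/* Store the inode number taken from a file handle, then compare it with
 * the inode number reported for the export root. Returns 1 on a match. */
static inline int root_matches_narrow(uint64_t fh_ino, uint64_t st_ino)
{
    struct parsed_fsid_narrow p;

    p.inode = (unsigned int)fh_ino;   /* upper 32 bits are dropped here */
    return (uint64_t)p.inode == st_ino;
}

static inline int root_matches_wide(uint64_t fh_ino, uint64_t st_ino)
{
    struct parsed_fsid_wide p;

    p.inode = fh_ino;                 /* full 64-bit value is kept */
    return p.inode == st_ino;
}

/* Usage example with a hypothetical inode number wider than 32 bits. */
static inline void demo_root_lookup(void)
{
    uint64_t ino = 0x200000007ULL;

    assert(!root_matches_narrow(ino, ino));  /* root is not found */
    assert(root_matches_wide(ino, ino));     /* root is found */
}
```

With the narrow field, a root whose inode number exceeds 32 bits never compares equal to itself, which is consistent with the "cannot locate the right root" symptom described in the comment.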
| Comment by Patrick Farrell (Inactive) [ 12/Jul/13 ] |
|
nasf, I've been trying to build nfs-utils 1.2.3 [default in CentOS 6.4] (without patches, just to verify I can) and I am stuck in a dependency hell, with it not finding various installed packages. A bit of searching shows that patching has been done to nfs-utils to clean up a lot of unnecessary dependencies, which include the ones I'm dealing with. However, as I understand it, the kernel nfsd /proc interface has changed since CentOS 6.4 and SLES11SP2, so I can't just go grab the latest nfs-utils and expect it to work. Do you have a particular version you recommend building, or any tips on this? I may be able to land that linking patch by itself without problem and will try that next, but I thought I'd ask you as well.
|
| Comment by nasf (Inactive) [ 13/Jul/13 ] |
|
The above patch is for the latest nfs-utils. If you want to use nfs-utils-1.2.3, then use the following one:

343,344c343,344
< uint64_t inode=0;
< uint64_t inode64;
---
> unsigned int inode=0;
> unsigned long long inode64;
|
| Comment by Patrick Farrell (Inactive) [ 14/Jul/13 ] |
|
nasf, Has WC tested the latest nfs-utils with CentOS 6.4? I thought I saw a proc interface change between the CentOS 6.4 kernel and the kernels targeted by 1.2.8, but I could be wrong about that.
|
| Comment by nasf (Inactive) [ 14/Jul/13 ] |
|
Hi Patrick, I downloaded the nfs-utils-1.2.3 source, then patched, compiled, and tested it on RHEL6 (2.6.32-358.6.1.el6). I did not pay attention to the proc changes. |
| Comment by Patrick Farrell (Inactive) [ 24/Oct/13 ] |
|
I'm not sure what the long term plan is regarding this bug. The fundamental limitation isn't in Lustre, and we've got an acceptable workaround with setting FSID manually. Is further work planned on the Intel side, or should this bug be closed? Cray is getting along fine with the workaround. |
| Comment by nasf (Inactive) [ 18/Nov/13 ] |
|
We have submitted a related patch to the kernel maintainer and hope the issue can be resolved at its root. From the Intel side, we cannot do more than wait for the response. If you have things working, we can close this ticket and reopen it in the future if needed. |