Lustre / LU-3550

Stale file handle on mount when mounting Lustre 2.4 via NFS


Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.5.0
    • Lustre 2.4.0
    • None
    • 3
    • 8928

    Description

      When attempting to mount an NFS-exported Lustre file system, the mount operation reports 'stale file handle' and fails to complete. This happens with 2.4 servers and a 2.4 client. It does NOT happen with a 2.4 client and 2.2 servers.

      Investigation of the NFS traffic between the NFS client and the NFS server (the Lustre client) shows the NFS client requesting the file handle for the mount and receiving a file handle back from the server. After a bit more chatter, the client sends the same file handle back as part of an info request, and the server responds with a stale file handle error.
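
      The exchange described above can be modeled as a small trace check. This is only an illustrative sketch: the `NfsEvent` type, the message labels (`MNT_CALL`, `FSINFO_CALL`, and so on), and the handle value `fh-01` are hypothetical and are not taken from the attached logs or from any NFS library. It flags a handle that the server issued and then rejected as stale on the next call that carried it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NfsEvent:
    sender: str             # "client" or "server"
    kind: str               # hypothetical label for the message type
    handle: Optional[str]   # file handle carried by the message, if any
    status: str = "OK"      # "OK" or "STALE"


def find_stale_after_issue(events: List[NfsEvent]) -> List[str]:
    """Return handles the server handed out and later rejected as stale."""
    issued = set()    # handles the server has returned successfully
    pending = None    # handle carried by the most recent client call
    rejected = []
    for ev in events:
        if ev.sender == "server" and ev.status == "OK" and ev.handle:
            issued.add(ev.handle)
        elif ev.sender == "client" and ev.handle:
            pending = ev.handle
        elif ev.sender == "server" and ev.status == "STALE":
            if pending in issued and pending not in rejected:
                rejected.append(pending)
    return rejected


# Hypothetical trace mirroring the sequence in the description.
trace = [
    NfsEvent("client", "MNT_CALL", None),
    NfsEvent("server", "MNT_REPLY", "fh-01"),
    NfsEvent("client", "FSINFO_CALL", "fh-01"),
    NfsEvent("server", "FSINFO_REPLY", None, "STALE"),
]
print(find_stale_after_issue(trace))
```

      A non-empty result corresponds to the symptom reported here: the server does not accept a handle it has just returned.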

      This is happening on both CentOS 6.4 and SLES11SP2 clients.

      I'm attaching a series of logs of this issue.
      Here's a description of what's in those logs:
      Lustre MDS (2.4). (Full DK logs provided)
      Lustre Client (2.4)/NFS Server [the source of the NFS export] (Full DK logs & /var/log/messages with nfsd debug fully enabled (0x7FFF))
      NFS Client (/var/log/messages with nfs debug set to 1, and a tcpdump of all traffic)

      For analyzing the tcpdump (if you need it; I suspect the NFS debug logs will make it unnecessary), the IP addresses are:
      NFS Server: 172.29.53.155
      NFS Client: 172.29.53.160
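
      When reading the capture, it can help to label each packet by direction. The following is a minimal sketch that uses only the two addresses listed above; the `direction` helper is hypothetical and assumes the source and destination addresses have already been extracted from the capture by some other tool.

```python
NFS_SERVER = "172.29.53.155"  # Lustre client exporting the file system
NFS_CLIENT = "172.29.53.160"


def direction(src: str, dst: str) -> str:
    """Label a packet by who sent it, based on its IP addresses."""
    if src == NFS_CLIENT and dst == NFS_SERVER:
        return "client->server"
    if src == NFS_SERVER and dst == NFS_CLIENT:
        return "server->client"
    return "other"  # traffic not between the two hosts of interest
```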

      The /var/log/messages logs are not trimmed, sorry. Look for the last debug markers from Lustre in those files and you can line them up with the rest of the logs.

            People

              yong.fan nasf (Inactive)
              paf Patrick Farrell (Inactive)