[LU-4113] Kerberized clients hang while mounting/accessing due to uncaught error -ETIMEDOUT in gss_svc_upcall Created: 16/Oct/13  Updated: 25/Oct/13  Resolved: 25/Oct/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.1, Lustre 2.5.0
Fix Version/s: Lustre 2.6.0

Type: Bug Priority: Major
Reporter: Thomas Stibor Assignee: Nathaniel Clark
Resolution: Fixed Votes: 0
Labels: gssapi, kerberos, patch
Environment:

Debian Wheezy 3.6.11, Lustre patched


Severity: 3
Rank (Obsolete): 11069

 Description   

Since kernel version 2.6.20 the function cache_check() in net/sunrpc/cache.c can return the error -ETIMEDOUT.

@@ -107,27 +237,14 @@ int cache_check(struct cache_detail *detail,
        }
 
        if (rv == -EAGAIN)
-               cache_defer_req(rqstp, h);
+               if (cache_defer_req(rqstp, h) != 0)
+                       rv = -ETIMEDOUT;
 
-       if (rv && h)
-               detail->cache_put(h, detail);
+       if (rv)
+               cache_put(h, detail);
        return rv;
 }
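
The effect of this change on the return value can be modeled in user space. The sketch below is not kernel code: model_cache_check_tail() and its defer_rc parameter are hypothetical names that stand in for the tail of cache_check() and for the result of cache_defer_req().

```c
#include <errno.h>

/*
 * Hypothetical user-space model of the tail of cache_check() after the
 * change shown in the diff. rv is the status computed earlier in the
 * function; defer_rc stands in for the return value of cache_defer_req().
 * In the kernel, cache_defer_req() is only called when rv == -EAGAIN, so
 * defer_rc is ignored for every other value of rv.
 */
static int model_cache_check_tail(int rv, int defer_rc)
{
	/* A pending lookup turns into -ETIMEDOUT when deferral fails. */
	if (rv == -EAGAIN && defer_rc != 0)
		rv = -ETIMEDOUT;
	return rv;
}
```

A caller that previously only expected -EAGAIN for a pending entry therefore also has to be prepared for -ETIMEDOUT.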

This error is not handled in the function gss_svc_upcall_handle_init() in gss_svc_upcall.c:

...
rc = cache_check(&rsi_cache, &rsip->h, &cache_upcall_chandle);
...
if (rc)
        GOTO(out, rc = SECSVC_DROP);

As a result, the security negotiation is dropped and the client hangs until the flavor is set back to null (lctl conf_param ldomov.srpc.flavor.default=null).
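
One way a caller could tell these cases apart is sketched below. This is an illustration only and is not taken from the patch under review: the enum and classify_cache_check() are hypothetical names, and treating -ETIMEDOUT like -EAGAIN (keep waiting for the upcall instead of dropping the request) is an assumption about the intended handling.

```c
#include <errno.h>

/* Hypothetical outcomes for a caller of cache_check(). */
enum upcall_action {
	UPCALL_READY,	/* cache entry is valid, continue the negotiation */
	UPCALL_WAIT,	/* entry still pending, wait for the upcall to finish */
	UPCALL_DROP	/* any other error, drop the request */
};

/*
 * Map a cache_check() return code to an action.
 * Assumption: -ETIMEDOUT is handled the same way as -EAGAIN, so a failed
 * deferral does not by itself cause the request to be dropped.
 */
static enum upcall_action classify_cache_check(int rc)
{
	switch (rc) {
	case 0:
		return UPCALL_READY;
	case -EAGAIN:
	case -ETIMEDOUT:
		return UPCALL_WAIT;
	default:
		return UPCALL_DROP;
	}
}
```

With a mapping of this kind, only return codes other than 0, -EAGAIN and -ETIMEDOUT would lead to SECSVC_DROP.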

I provide a patch to fix it.



 Comments   
Comment by Andreas Dilger [ 16/Oct/13 ]

Patch is at http://review.whamcloud.com/7960

Comment by Doug Oucharek (Inactive) [ 16/Oct/13 ]

Nathaniel, can you keep an eye on the patch to get it through?

Comment by Peter Jones [ 25/Oct/13 ]

Landed for 2.6
