Details
-
Question/Request
-
Resolution: Fixed
-
Minor
-
None
-
Lustre 2.5.0
-
None
-
lustre ee client-2.5.42.4-2.6.32_642.6.2
Description
We have a curious problem where a user is managing to wedge a few COS6 compute nodes every day to the point they don't even respond to the console. The only thing we see logged before this happens is:
Jan 12 15:59:14 n602 kernel: [3463048.109323] LustreError: 21705:0:(file.c:3256:ll_inode_revalidate_fini()) fouo5: revalidate FID [0x2000068c4:0x16fd0:0x0] error: rc = -4
I have no idea whether this is a symptom of the node breaking down in some way unrelated to Lustre or part of the cause, so I'd like to figure out what this error means. From a quick look at the code, this should be the return code from either md_intent_lock or md_getattr, but I can't find where the error code -4 is defined.
Any tips?
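For reference, kernel code generally returns negated errno values, and Lustre appears to follow that convention, so rc = -4 would most likely correspond to errno 4. On Linux that is EINTR, which is defined in the kernel's generic errno headers rather than in the Lustre tree, and that would explain why searching the Lustre source turns up nothing. A minimal sketch for decoding such codes, assuming a Linux host with Python 3 (describe_rc is a hypothetical helper name, not part of Lustre):

```python
import errno
import os


def describe_rc(rc: int) -> str:
    """Translate a kernel-style return code (negated errno) into its symbolic name."""
    code = abs(rc)  # kernel code returns -errno, so strip the sign
    name = errno.errorcode.get(code, "UNKNOWN")  # symbolic name, e.g. EINTR
    return f"rc = {rc}: {name} ({os.strerror(code)})"


if __name__ == "__main__":
    # The value logged by ll_inode_revalidate_fini() in the message above.
    print(describe_rc(-4))
```

If the code really is EINTR, it suggests the revalidate call was interrupted (for example by a signal delivered to the process) and is not necessarily a fault in itself. Whether it is related to the node hangs cannot be determined from this one log line.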