Details
Question/Request
Resolution: Duplicate
Minor
None
Lustre 2.8.0
None
Dell servers running TOSS (RHEL 7.5), IB-connected to DDN SFA hardware
Description
We are evaluating Starfish for file system usage detail and have successfully scanned numerous Lustre and NFS file systems. However, when we start ingesting data via changelogs, we run into a condition where it hangs, and the only resolution is to power cycle the client we are testing with.
While trying to identify the problem, we found that an attempt to unmount the file system also hung, and that process could not be aborted.
lsof on the file system showed the hung process was stuck on /<file system>/.lustre/fid. Until this point, we did not know that this hidden directory existed, let alone its purpose. From searching Jira, it is involved in Lustre rsync and lfsck operations, but there is not much information about any other roles it plays.
One thing is certain: Starfish uses FIDs in their monitoring tools, and we can see the Starfish process referencing .lustre/fid.
We're hoping we can get some additional information on what's going on with changelogs/.lustre.
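For reference, here is a minimal sketch of how we understand a FID maps to a path under .lustre/fid, which is presumably why a FID-based tool would touch that directory. This is our assumption, not something taken from Starfish: the helper name `fid_path`, the mount point, and the example FID are hypothetical, and the bracketed `[seq:oid:ver]` form is the one we believe the client accepts.

```python
import posixpath
import re

# A Lustre FID is three hex fields, seq:oid:ver, optionally wrapped in brackets.
FID_RE = re.compile(
    r"^\[?(0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+)\]?$"
)


def fid_path(mount_point, fid):
    """Build the .lustre/fid path for a FID (hypothetical helper).

    Only constructs the string; it does not touch the file system,
    so it is safe to run on a client that may hang on access.
    """
    match = FID_RE.match(fid.strip())
    if match is None:
        raise ValueError("not a FID: %r" % (fid,))
    seq, oid, ver = match.groups()
    # Assumption: the bracketed form is accepted under .lustre/fid.
    return posixpath.join(
        mount_point, ".lustre", "fid", "[%s:%s:%s]" % (seq, oid, ver)
    )


if __name__ == "__main__":
    # Hypothetical mount point and FID, for illustration only.
    print(fid_path("/mnt/lustre", "0x200000400:0x1:0x0"))
```

If this understanding is right, any stat or open on such a path goes through the client to the MDS, which would be consistent with the hung process showing up under .lustre/fid in lsof.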