Details
Bug
Resolution: Not a Bug
Minor
None
Lustre 2.12.6
None
EL7
3
9223372036854775807
Description
On our ZFS-backed MDS, we're seeing a frequent issue that hangs the clients and produces the attached stack trace. I'll also attach the Lustre logs covering this morning's instance.
Once this has happened, the only way to reconnect the client seems to be rebooting the MDS.
This may be related to the MDT filling up. We changed the zpool topology to increase the size of the MDT, and all seemed well for a few days before these issues started to occur.
I'm running an lfsck, which has so far repaired a large number of namespaces, but the problem has occurred again while it was running.
Any help is, as always, much appreciated.