Details
- Bug
- Resolution: Won't Fix
- Blocker
- None
- Lustre 2.4.2
- 3
- 15338
Description
Our production MDS systems occasionally hang with many service threads stuck in ldlm_completion_ast(). The details are described in LU-4579, but that issue was closed when a patch landed that fixed how timeouts are reported.
When this happens, client access hangs and the MDS appears completely idle.
Issue Links
- is related to LU-4579 Timeout system horribly broken (Resolved)
We ran Patch Set 9 of change 9488 as part of 2.4.2, but we dropped it in favor of whatever landed on b2_5 now that we are based on 2.5.3.