Details
-
Bug
-
Resolution: Fixed
-
Major
-
None
-
None
-
server (asp4) lustre-2.14.0_21.llnl-5.t4.x86_64
clients (oslic) lustre-2.12.9_6.llnl-2.t4.x86_64, (ruby) lustre-2.12.9_7.llnl-1.t4.x86_64
TOSS 4.6-6
-
3
-
9223372036854775807
Description
mdt-aspls3-MDT0003 is stuck and not responding to clients. It has many (~244) threads stuck in ldlm_completion_ast, and stopping and restarting Lustre does not fix the problem.
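For anyone wanting to reproduce a count like the ~244 above, here is a minimal sketch of one way to gather it. It is not necessarily how the number in this ticket was obtained. It assumes a Linux kernel that exposes per-task kernel stacks under /proc/<pid>/stack (normally readable only by root) and frames printed as `symbol+0xOFF/0xLEN`. The helper names `count_threads_in` and `read_kernel_stacks` are made up for illustration.

```python
import glob
import re


def count_threads_in(symbol, stacks):
    """Count stack dumps (one string per thread) containing a frame for `symbol`.

    Matches `symbol+0x...` exactly, so that a longer name that merely
    starts with `symbol` is not counted.
    """
    frame = re.compile(r"(?<![\w.])" + re.escape(symbol) + r"\+")
    return sum(1 for text in stacks if frame.search(text))


def read_kernel_stacks(pattern="/proc/[0-9]*/stack"):
    """Yield the kernel stack text of each task (assumed /proc layout).

    Tasks that exit mid-scan or that we lack permission to read are skipped.
    """
    for path in glob.glob(pattern):
        try:
            with open(path) as fh:
                yield fh.read()
        except OSError:
            continue


if __name__ == "__main__":
    # Run as root on the MDS; prints the number of tasks currently in the function.
    print(count_threads_in("ldlm_completion_ast", read_kernel_stacks()))
```

Running this a few times, some seconds apart, would show whether the count is steady (threads truly stuck) or fluctuating (threads passing through).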
Yes, we've been trying to move to 2.15 for over a year but have kept running into complex LNet issues with orelic and other routers. After talking internally this morning, we want to try 2.15 again in case these new tunings resolve the problems we were seeing.