Details
-
Bug
-
Resolution: Not a Bug
-
Major
-
None
-
Lustre 2.1.3
-
None
-
3
-
10187
Description
Hi,
Several Bull customers running Lustre 2.1.x have experienced a hang in ldlm_pools_shrink on a Lustre client (login node).
The system was hung and a dump was initiated from the BMC by sending an NMI.
The dump shows there was no more activity on the system: all 12 CPUs are idle (swapper).
A lot of processes are in page_fault(), blocked in ldlm_pools_shrink().
I have attached the output of the "foreach bt" crash command. Let me know if you need the vmcore file.
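To quantify how many tasks are stuck, the attached "foreach bt" output can be summarized with a short script. The following is only a minimal sketch: it assumes the usual crash layout (a `PID: ... TASK: ... CPU: ... COMMAND: "..."` header followed by `#N [addr] symbol at addr` frames), and the helper name `tasks_in_function` and the sample backtrace are made up for illustration, not taken from the actual dump.

```python
import re

# Header line that crash prints before each task's backtrace.
HEADER = re.compile(
    r'^PID:\s*(\d+)\s+TASK:\s*(\S+)\s+CPU:\s*(\d+)\s+COMMAND:\s*"(.*)"'
)
# Stack frame line, e.g. " #1 [ffff8805f0a1b9a0] ldlm_pools_shrink at ffffffffa06d2f3c [ptlrpc]"
FRAME = re.compile(r'^\s*#\d+\s+\[[0-9a-f]+\]\s+(\S+)\s+at\s')


def tasks_in_function(text, func):
    """Return (pid, command) for every task whose backtrace contains func.

    Hypothetical helper: each task is reported at most once, even if
    func appears in several of its frames.
    """
    hits = []
    current = None      # (pid, command) of the task being parsed
    recorded = False    # True once the current task has been added to hits
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = (int(header.group(1)), header.group(4))
            recorded = False
            continue
        frame = FRAME.match(line)
        if frame and current and not recorded and frame.group(1) == func:
            hits.append(current)
            recorded = True
    return hits


# Made-up sample in the crash "foreach bt" layout (addresses are invented).
SAMPLE = '''\
PID: 0      TASK: ffffffff81a8d020  CPU: 0   COMMAND: "swapper"
 #0 [ffff880028207e90] crash_nmi_callback at ffffffff8102fee6
 #1 [ffff880028207ea0] notifier_call_chain at ffffffff8152d515

PID: 4321   TASK: ffff88061c0e2040  CPU: 3   COMMAND: "bash"
 #0 [ffff8805f0a1b8d8] schedule at ffffffff81528bc0
 #1 [ffff8805f0a1b9a0] ldlm_pools_shrink at ffffffffa06d2f3c [ptlrpc]
 #2 [ffff8805f0a1ba10] shrink_slab at ffffffff8113d5ea
 #3 [ffff8805f0a1bb60] page_fault at ffffffff8152ffc5
'''

if __name__ == "__main__":
    blocked = tasks_in_function(SAMPLE, "ldlm_pools_shrink")
    print(len(blocked), "task(s) in ldlm_pools_shrink:", blocked)
```

Running the same function over the real attachment (read from the file instead of `SAMPLE`) with `"page_fault"` and then `"ldlm_pools_shrink"` should show whether the two sets of tasks overlap as described above.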
In each case, the syslog recovered from the dump contains a lot of OOM messages.
This issue looks like LU-2468.
Thanks,
Sebastien.