Details
-
Bug
-
Resolution: Fixed
-
Major
-
None
-
Server running with b2_7_fe
Clients are a mix of IEEL3 (RH7/SCS5), 2.5.3.90 (RH6/AE4), 2.7.3 (CentOS7)
-
3
Description
I was on-site working with Bruno Travouillon (Atos) on one of their crash dumps.
After joint analysis, it looks like a large share of memory is being consumed by "ptlrpc_request_buffer_desc" structures (17KB each because of the embedded req, and allocated from 32KB slabs, which nearly doubles the footprint!).
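As a rough check of the sizing above, the sketch below assumes each descriptor is exactly 17 KiB and that the allocator rounds every request up to a power-of-two size class. Both are simplifying assumptions, and the helper names are illustrative only; they are not Lustre or kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round size up to the next power of two, mimicking power-of-two
 * allocator size classes (an assumption; real slab sizing may differ). */
static size_t round_up_pow2(size_t size)
{
    size_t cls = 1;

    /* Guard against shift overflow for absurdly large requests. */
    assert(size <= SIZE_MAX / 2 + 1);
    while (cls < size)
        cls <<= 1;
    return cls;
}

/* Bytes of padding lost per object to size-class rounding. */
static size_t slab_waste(size_t size)
{
    return round_up_pow2(size) - size;
}

/* Total memory held by `count` objects of `size` bytes after rounding. */
static size_t pool_footprint(size_t count, size_t size)
{
    return count * round_up_pow2(size);
}
```

Under these assumptions, `round_up_pow2(17 * 1024)` yields 32 KiB, so roughly 15 KiB of every descriptor allocation is padding.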
Looking at the relevant source code, it appears that additional "ptlrpc_request_buffer_desc" structures can be allocated on demand by ptlrpc_check_rqbd_pool(), but they are never freed until ptlrpc_service_purge_all() runs at OST umount/stop.
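The grow-only behaviour described above can be illustrated with a toy model. This is not the Lustre code: every name here (`toy_rqbd`, `toy_pool`, `toy_pool_check`, and so on) is hypothetical, and the low-water/batch policy is an assumption chosen only to show how a pool that grows under load and is freed only at shutdown keeps its peak footprint.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a request buffer descriptor (~17 KiB). */
struct toy_rqbd {
    struct toy_rqbd *next;
    char payload[17 * 1024];
};

struct toy_pool {
    struct toy_rqbd *idle;  /* descriptors currently available */
    size_t nidle;           /* number of idle descriptors */
    size_t ntotal;          /* descriptors allocated and not yet freed */
    size_t low_water;       /* grow when nidle drops below this */
};

/* Grow the pool by `batch` descriptors if idle ones run low.
 * Returns 0 on success, -1 if an allocation fails. */
static int toy_pool_check(struct toy_pool *p, size_t batch)
{
    if (p->nidle >= p->low_water)
        return 0;
    for (size_t i = 0; i < batch; i++) {
        struct toy_rqbd *d = malloc(sizeof(*d));
        if (d == NULL)
            return -1;
        d->next = p->idle;
        p->idle = d;
        p->nidle++;
        p->ntotal++;
    }
    return 0;
}

/* Take one idle descriptor (simulates an incoming request). */
static struct toy_rqbd *toy_pool_get(struct toy_pool *p)
{
    struct toy_rqbd *d = p->idle;
    if (d != NULL) {
        p->idle = d->next;
        p->nidle--;
    }
    return d;
}

/* Return a descriptor to the idle list; note that nothing is freed here. */
static void toy_pool_put(struct toy_pool *p, struct toy_rqbd *d)
{
    d->next = p->idle;
    p->idle = d;
    p->nidle++;
}

/* Free every idle descriptor; in this model it is the only place memory
 * is released, and it is meant to run at shutdown. Returns the count freed. */
static size_t toy_pool_purge(struct toy_pool *p)
{
    size_t freed = 0;
    while (p->idle != NULL) {
        struct toy_rqbd *d = p->idle;
        p->idle = d->next;
        free(d);
        freed++;
    }
    p->ntotal -= freed;
    p->nidle = 0;
    return freed;
}
```

In this model, a burst of load pushes `ntotal` up, and once the burst ends the descriptors simply sit on the idle list: `ntotal` stays at its peak until `toy_pool_purge()` is called.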
This problem has caused several OSS failovers to fail due to OOM.