Details
Type: Bug
Resolution: Unresolved
Priority: Minor
Severity: 3
Description
The EXA performance test script hung on a SLES15sp3 NUMA system.
We found that the test passes once the unstable check is disabled:
lctl set_param llite.*.unstable_stats=0 # disable unstable check
We finally found the root cause: we are using NR_UNSTABLE_NFS incorrectly. It is deprecated (DO NOT USE) in the SLES15sp3 kernel:
NR_UNSTABLE_NFS, /* NFS unstable pages - DEPRECATED DO NOT USE */
Moreover, cgroup (memcg) accounting does not work on newer kernels:
NR_UNSTABLE_NFS was removed there, and we wrongly use
NR_ZONE_WRITE_PENDING for memory accounting instead.
According to the kernel patch
"mm/writeback: discard NR_UNSTABLE_NFS, use NR_WRITEBACK instead"
(kernel v5.8-rc1, commit 8d92890bd6b8502d6aee4b37430ae6444ade7a8c),
unstable pages should be accounted in NR_WRITEBACK and WB_WRITEBACK.
We should fix both problems accordingly.
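The intended accounting can be illustrated with a small user-space model. This is only a sketch, not kernel or Lustre code: the sk_* names, the counter struct, and the legacy_unstable_counter flag are all hypothetical. It assumes the choice between the two schemes would be made by a feature check rather than a plain kernel version comparison, since SLES15sp3 already marks NR_UNSTABLE_NFS as deprecated.

```c
/* User-space model only; every sk_* name here is hypothetical. */
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel counters named in this ticket. */
enum sk_stat {
	SK_NR_WRITEBACK,	/* models the per-node NR_WRITEBACK counter */
	SK_WB_WRITEBACK,	/* models the per-wb WB_WRITEBACK counter */
	SK_NR_UNSTABLE_NFS,	/* models the old NR_UNSTABLE_NFS counter */
	SK_NR_STATS
};

struct sk_counters {
	long v[SK_NR_STATS];
};

/*
 * Account 'delta' unstable pages (negative when the pages are committed).
 * legacy_unstable_counter == true models a kernel where NR_UNSTABLE_NFS
 * is still usable; false models a kernel where unstable pages are folded
 * into the writeback counters, as the v5.8-rc1 patch describes.
 */
static void sk_account_unstable(struct sk_counters *c,
				bool legacy_unstable_counter, long delta)
{
	if (legacy_unstable_counter) {
		c->v[SK_NR_UNSTABLE_NFS] += delta;
		assert(c->v[SK_NR_UNSTABLE_NFS] >= 0);
	} else {
		c->v[SK_NR_WRITEBACK] += delta;
		c->v[SK_WB_WRITEBACK] += delta;
		assert(c->v[SK_NR_WRITEBACK] >= 0);
		assert(c->v[SK_WB_WRITEBACK] >= 0);
	}
}
```

The point of the model is that each unstable page is counted exactly once on either kind of kernel, and that on the newer scheme both writeback counters move together. NR_ZONE_WRITE_PENDING is deliberately not touched.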