Details
Question/Request
Resolution: Incomplete
Critical
None
None
None
Lustre 2.10.0
Description
Hi All,
We used IOR to stress-test our Lustre file system, which consists of 4 OSS servers and 1 MGS/MDS server. After the test had been running for a few hours, the LNet warning messages listed below appeared in /var/log/messages on oss1.
Please give us some suggestions on how to debug our Lustre file system.
Thanks a lot!
===================================================================
Sep 7 21:00:59 oss1 kernel: LNet: Service thread pid 25482 completed after 41.14s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Sep 7 21:22:03 oss1 kernel: LNet: Service thread pid 1088 completed after 67.46s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Sep 7 21:36:50 oss1 kernel: LNet: Service thread pid 21711 completed after 52.04s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Sep 7 22:20:42 oss1 kernel: LNet: Service thread pid 28459 completed after 22.70s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Sep 7 22:41:20 oss1 kernel: LNet: Service thread pid 2743 completed after 52.61s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Sep 7 23:51:26 oss1 kernel: LNet: Service thread pid 406 completed after 50.76s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Sep 8 00:30:52 oss1 kernel: LNet: Service thread pid 885 completed after 23.67s. This indicates the system was overloaded (too many service threads, or there were not enough hardware