Details
- Bug
- Resolution: Fixed
- Major
- Lustre 2.12.0, Lustre 2.13.0
- None
- 3
- 9223372036854775807
Description
It is possible for lnet_is_health_check() to return "true" when the
message has not hit the network. In this situation the message is freed
without detaching the MD. As a result, the affected requests never
receive their unlink events and remain stuck forever.
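The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: every name in it (toy_md, toy_msg, toy_detach_md, toy_finalize_buggy, toy_finalize_fixed, toy_run) is hypothetical and is not taken from the LNet source, and the "fixed" variant shows one plausible way to avoid the leak, not necessarily what the actual patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-ins only; these are NOT the real LNet structures. */
struct toy_md {
    int  refcount;               /* messages still attached to this MD */
    bool unlink_pending;         /* owner is waiting for an unlink event */
    bool unlink_event_delivered; /* set once the unlink event fires */
};

struct toy_msg {
    struct toy_md *md;   /* attached MD, or NULL once detached */
    bool sent_on_wire;   /* did the message ever hit the network? */
    bool health_check;   /* assumed result of a health-check predicate */
};

/* Drop the message's reference on its MD; the last detach fires the event. */
static void toy_detach_md(struct toy_msg *msg)
{
    struct toy_md *md = msg->md;

    if (md == NULL)
        return;
    assert(md->refcount > 0);
    msg->md = NULL;
    md->refcount--;
    if (md->refcount == 0 && md->unlink_pending)
        md->unlink_event_delivered = true;
}

/* Buggy path: an unsent message flagged for health handling is freed
 * while still attached, so the MD never sees its last detach. */
static void toy_finalize_buggy(struct toy_msg *msg)
{
    if (msg->health_check && !msg->sent_on_wire) {
        free(msg);
        return;
    }
    toy_detach_md(msg);
    free(msg);
}

/* One possible remedy (an assumption): always detach before freeing. */
static void toy_finalize_fixed(struct toy_msg *msg)
{
    toy_detach_md(msg);
    free(msg);
}

/* Finalize one unsent, health-flagged message and report whether the
 * owner of the MD received its unlink event. */
bool toy_run(bool use_fixed)
{
    struct toy_md md = { .refcount = 1, .unlink_pending = true,
                         .unlink_event_delivered = false };
    struct toy_msg *msg = malloc(sizeof(*msg));

    if (msg == NULL)
        abort();
    msg->md = &md;
    msg->sent_on_wire = false;
    msg->health_check = true;

    if (use_fixed)
        toy_finalize_fixed(msg);
    else
        toy_finalize_buggy(msg);
    return md.unlink_event_delivered;
}
```

In this model, toy_run(false) leaves the MD with a dangling reference and no unlink event, which corresponds to the stuck requests, while toy_run(true) delivers the event.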
This issue was discovered while testing the MR routing feature under LNet router failure conditions.
The bug was introduced by the LNet health feature, commit 70616605dd44be37068f4e1a4745a2f8b90eb1f5 (https://review.whamcloud.com/32764).