Lustre / LU-12402

LNet Health: lnet_finalize() recursion

Details

    Description

      When many messages are being dropped, the health feature introduces a code path that can recurse deeply:

      lnet_finalize()->lnet_health_check()->lnet_msg_decommit_tx()->
      lnet_return_tx_credits_locked()->lnet_post_send_locked()->lnet_finalize()

      lnet_finalize() already deals with this kind of recursion by tracking the finalizer threads in msc_finalizers and returning if all slots are busy.

      The path above has no equivalent mechanism and is therefore susceptible to this problem.
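
      The difference between the two paths can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the LNet code: the struct layout, NFINALIZERS, the integer thread id, the pending list, and the helpers complete() and run_chain() are all hypothetical. It also assumes that a thread which already owns a finalizer slot queues the message and returns, leaving the outer call to drain it. Every completion triggers one more failed send, mimicking the dropped-message chain.

```c
#include <stddef.h>

#define NFINALIZERS 2   /* hypothetical number of finalizer slots */
#define CHAIN_MAX   64  /* hypothetical cap on the simulated chain */

struct msg {
    struct msg *next;       /* link on the pending list */
    struct msg *follow_on;  /* send that fails once this message completes */
};

struct container {
    int finalizers[NFINALIZERS]; /* 0 = free slot, otherwise owner thread id */
    struct msg *head, *tail;     /* messages waiting to be finalized */
    int depth, max_depth;        /* instrumentation: nesting of finalize() */
    int completed;               /* number of messages completed */
};

static void finalize(struct container *c, struct msg *m, int tid, int guarded);

/* Stand-in for the health-check path: completing a message posts the next
 * send, which fails and is finalized from the same call stack. */
static void complete(struct container *c, struct msg *m, int tid, int guarded)
{
    c->completed++;
    if (m->follow_on != NULL)
        finalize(c, m->follow_on, tid, guarded);
}

static void finalize(struct container *c, struct msg *m, int tid, int guarded)
{
    int i, slot = -1;

    if (++c->depth > c->max_depth)
        c->max_depth = c->depth;

    if (!guarded) {
        /* Unguarded path: one extra stack level per dropped message. */
        complete(c, m, tid, guarded);
        c->depth--;
        return;
    }

    /* Guarded path: queue the message first. */
    m->next = NULL;
    if (c->tail != NULL)
        c->tail->next = m;
    else
        c->head = m;
    c->tail = m;

    /* Return if this thread is already finalizing or all slots are busy. */
    for (i = 0; i < NFINALIZERS; i++) {
        if (c->finalizers[i] == tid)
            break;
        if (slot < 0 && c->finalizers[i] == 0)
            slot = i;
    }
    if (i < NFINALIZERS || slot < 0) {
        c->depth--;
        return;
    }

    /* Own a slot and drain the list iteratively instead of recursing. */
    c->finalizers[slot] = tid;
    while ((m = c->head) != NULL) {
        c->head = m->next;
        if (c->head == NULL)
            c->tail = NULL;
        complete(c, m, tid, guarded);
    }
    c->finalizers[slot] = 0;
    c->depth--;
}

/* Finalize a chain of n messages (clamped to 1..CHAIN_MAX) on thread 1.
 * Returns the deepest nesting reached; *completed receives the count. */
static int run_chain(int n, int guarded, int *completed)
{
    struct msg msgs[CHAIN_MAX];
    struct container c = { {0}, NULL, NULL, 0, 0, 0 };
    int i;

    if (n < 1)
        n = 1;
    if (n > CHAIN_MAX)
        n = CHAIN_MAX;
    for (i = 0; i < n; i++) {
        msgs[i].next = NULL;
        msgs[i].follow_on = (i + 1 < n) ? &msgs[i + 1] : NULL;
    }
    finalize(&c, &msgs[0], 1, guarded);
    if (completed != NULL)
        *completed = c.completed;
    return c.max_depth;
}
```

      In this model, calling run_chain(50, 0, &done) nests finalize() once per message, while run_chain(50, 1, &done) completes the same 50 messages with the nesting bounded at two levels, because the inner call only queues the message and returns.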

            People

              ashehata Amir Shehata (Inactive)