  1. Lustre
  2. LU-13534

Landing LU-12678 very likely introduces a random memory corruption bug

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 3

    Description

      With the landing of LU-12678, ptlrpc holds a pointer to an lnet_me object without holding a reference on it (lnet_me has no reference count).
      The scenario is:
      the LNet monitor_thread finds an expired response and starts to unlink the MD. Once the MD has no references, unlinking the MD also unlinks its ME entry, but ptlrpc still holds a pointer to that ME object and tries to unlink the ME itself.

      void
      lnet_md_unlink(struct lnet_libmd *md)
      {
              if ((md->md_flags & LNET_MD_FLAG_ZOMBIE) == 0) {
                      /* first unlink attempt... */
                      struct lnet_me *me = md->md_me;
      
                      md->md_flags |= LNET_MD_FLAG_ZOMBIE;
      
                      /* Disassociate from ME (if any), and unlink it if it was created
                       * with LNET_UNLINK */
                      if (me != NULL) {
                              /* detach MD from portal */
                              lnet_ptl_detach_md(me, md);
                              if (me->me_unlink == LNET_UNLINK)
                                      lnet_me_unlink(me);
                      }
      
                      /* ensure all future handle lookups fail */
                      lnet_res_lh_invalidate(&md->md_lh);
              }
      
              if (md->md_refcount != 0) {
                      CDEBUG(D_NET, "Queueing unlink of md %p\n", md);
                      return;
              }

              /* ... remainder of the function omitted ... */
      }

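      So the lnet_me is not protected by the MD reference: it can be unlinked on the first MD unlink attempt, even while the MD itself is still referenced.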
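
      The hazard described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: the toy_* types and functions are hypothetical stand-ins and are not the real LNet structures or API. A "freed" flag replaces an actual free() so that the second unlink can be detected without invoking undefined behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical stand-ins, not the real LNet types. */
struct toy_me {
	bool freed;       /* set instead of calling free(), keeps the demo defined */
	bool auto_unlink; /* models me_unlink == LNET_UNLINK */
};

struct toy_md {
	struct toy_me *me;
	int refcount;
	bool zombie;      /* models LNET_MD_FLAG_ZOMBIE */
};

/* Models an ME unlink. Returns false if the ME was already released,
 * which in real code would be a use-after-free or double free. */
static bool toy_me_unlink(struct toy_me *me)
{
	if (me->freed)
		return false;
	me->freed = true;
	return true;
}

/* Mirrors the shape of the excerpt: the ME is released on the first
 * unlink attempt, before the MD refcount is even looked at. */
static void toy_md_unlink(struct toy_md *md)
{
	if (!md->zombie) {
		struct toy_me *me = md->me;

		md->zombie = true;
		if (me != NULL) {
			md->me = NULL; /* detach MD from ME */
			if (me->auto_unlink)
				toy_me_unlink(me);
		}
	}

	if (md->refcount != 0)
		return; /* MD teardown is deferred, but the ME is already gone */
}

/* Returns true when the second holder touches an already released ME. */
static bool toy_race_demo(void)
{
	struct toy_me me = { .freed = false, .auto_unlink = true };
	struct toy_md md = { .me = &me, .refcount = 1, .zombie = false };
	struct toy_me *upper_layer_me = &me; /* raw pointer, no reference taken */

	toy_md_unlink(&md);  /* e.g. the monitor thread expiring a response */
	assert(md.zombie);   /* MD is only queued for unlink, refcount is 1 */

	/* The upper layer now unlinks "its" ME through the stale pointer. */
	return !toy_me_unlink(upper_layer_me);
}
```

      In this model the MD still has refcount 1 when the upper layer calls toy_me_unlink(), yet the ME has already been released, which is the sense in which the MD reference does not protect the ME.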

            People

              neilb Neil Brown
              shadow Alexey Lyashkov