  1. Lustre
  2. LU-4081

mdc_enqueue() may return a freed lock in intent


Details

    • Bug
    • Resolution: Duplicate
    • Minor
    • None
    • Lustre 2.5.0
    • 3
    • 10963

    Description

      This is the sister bug to LU-4079. Any failure in mdc_finish_enqueue() after the lock handle has been copied into the intent causes mdc_enqueue() to drop its reference to the lock while still returning the stale handle in the intent. The lock cookie returned through the lockh parameter is zeroed, but intent->it_lock_handle is not, so the caller later tries to drop a reference that has already been released. Several allocations in mdc_finish_enqueue() could fail, but for simplicity I just added a fail check to reproduce the problem.

      mdc_finish_enqueue()
      {
              ...
              intent->it_disposition = (int)lockrep->lock_policy_res1;
              intent->it_status = (int)lockrep->lock_policy_res2;
              intent->it_lock_mode = einfo->ei_mode;
              intent->it_lock_handle = lockh->cookie;
              intent->it_data = req;
      
              ...
      
              DEBUG_REQ(D_RPCTRACE, req, "op: %d disposition: %x, status: %d",
                        it->it_op, intent->it_disposition, intent->it_status);
      
      +       if (OBD_FAIL_CHECK(0x3000))
      +               RETURN(-EPROTO);
      +
              ...
      }
      
      mdc_enqueue()
      {
              ...
              rc = mdc_finish_enqueue(exp, req, einfo, it, lockh, rc);
              if (rc < 0) {
                      if (lustre_handle_is_used(lockh)) {
                              ldlm_lock_decref(lockh, einfo->ei_mode);
                              memset(lockh, 0, sizeof(*lockh));
                      }
                      ptlrpc_req_finished(req);
              }
              RETURN(rc);
      }
      
      # llmount.sh
      # sh ./lustre/tests/racer.sh
      ...
      == racer test 1: racer on clients: t DURATION=300 == 15:04:21 (1381349061)
      racers pids: 28220 28221
      ...
      # lctl set_param fail_loc=0x3000
      
      Lustre: *** cfs_fail_loc=3000, val=0***
      LustreError: 31331:0:(ldlm_lock.c:851:ldlm_lock_decref_internal_nolock()) ASSERTION( lock->l_readers > 0 ) failed:
      LustreError: 31331:0:(ldlm_lock.c:851:ldlm_lock_decref_internal_nolock()) LBUG
      Pid: 31331, comm: mkdir
      
      Call Trace:
       [<ffffffffa0d95895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
       [<ffffffffa0d95e97>] lbug_with_loc+0x47/0xb0 [libcfs]
       [<ffffffffa10167f2>] ldlm_lock_decref_internal_nolock+0xd2/0x180 [ptlrpc]
       [<ffffffffa101aabd>] ldlm_lock_decref_internal+0x4d/0xad0 [ptlrpc]
       [<ffffffffa0eb6ae5>] ? class_handle2object+0x95/0x190 [obdclass]
       [<ffffffffa101bf79>] ldlm_lock_decref+0x39/0x90 [ptlrpc]
       [<ffffffffa073bd2f>] ll_intent_drop_lock+0xaf/0x150 [lustre]
       [<ffffffffa07716cb>] ? ll_finish_md_op_data+0x2cb/0x410 [lustre]
       [<ffffffffa073e5a8>] ll_revalidate_it+0xbe8/0x1b20 [lustre]
       [<ffffffffa0786940>] ? ll_md_blocking_ast+0x0/0x790 [lustre]
       [<ffffffffa0786940>] ? ll_md_blocking_ast+0x0/0x790 [lustre]
       [<ffffffffa073f613>] ll_revalidate_nd+0x133/0x3e0 [lustre]
       [<ffffffff8118fa45>] __lookup_hash+0x85/0x160
       [<ffffffff8119016a>] lookup_hash+0x3a/0x50
       [<ffffffff811901ee>] lookup_create+0x6e/0xd0
       [<ffffffff81193aac>] sys_mkdirat+0x7c/0x130
       [<ffffffff811a36d0>] ? mntput_no_expire+0x30/0x110
       [<ffffffff811a36d0>] ? mntput_no_expire+0x30/0x110
       [<ffffffff81193b78>] sys_mkdir+0x18/0x20
       [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      

          People

            dmiter Dmitry Eremin (Inactive)
            jhammond John Hammond