Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-11952

open+create resend can recreate a file after unlink

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.15.0
    • Lustre 2.10.0, Lustre 2.11.0, Lustre 2.12.0
    • 3
    • 9223372036854775807

    Description

      While testing a LU-11444 with a racer, we found a crash related to the assert in this patch.

      [  365.369900] Kernel panic - not syncing: LBUG
      [  365.369902] CPU: 1 PID: 11263 Comm: mdt00_026 Tainted: G           OE  ------------   3.10.0-327.36.3.x3.0.22.x86_64 #1
      [  365.369903] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  365.369906]  ffffffffa04d7a03 00000000df5aaa35 ffff880099457530 ffffffff81634b81
      [  365.369908]  ffff8800994575b0 ffffffff8162e410 ffffffff00000008 ffff8800994575c0
      [  365.369909]  ffff880099457560 00000000df5aaa35 ffffffffa08ed140 ffffffffa04e2525
      [  365.369910] Call Trace:
      [  365.369914]  [<ffffffff81634b81>] dump_stack+0x19/0x1b
      [  365.369916]  [<ffffffff8162e410>] panic+0xd8/0x1e7
      [  365.369924]  [<ffffffffa04b6013>] lbug_with_loc+0xb3/0xc0 [libcfs]
      [  365.369968]  [<ffffffffa08b69a7>] tgt_add_reply_data+0x217/0x220 [ptlrpc]
      [  365.370003]  [<ffffffffa08b8bc2>] tgt_last_rcvd_update+0x742/0xe40 [ptlrpc]
      [  365.370036]  [<ffffffffa08bda82>] tgt_txn_stop_cb+0x1a2/0x4a0 [ptlrpc]
      [  365.370045]  [<ffffffffa0a88b18>] ? osd_attr_set+0x218/0x960 [osd_ldiskfs]
      [  365.370066]  [<ffffffffa05ebc43>] dt_txn_hook_stop+0x63/0x80 [obdclass]
      [  365.370076]  [<ffffffffa0a8a5fd>] osd_trans_stop+0x12d/0x430 [osd_ldiskfs]
      [  365.370096]  [<ffffffffa060993e>] ? lu_ucred+0x1e/0x30 [obdclass]
      [  365.370104]  [<ffffffffa0d9eace>] ? mdd_changelog_ns_store+0x2e/0x650 [mdd]
      [  365.370111]  [<ffffffffa0d325a2>] lod_trans_stop+0x52/0x890 [lod]
      [  365.370118]  [<ffffffffa0dbc270>] mdd_trans_stop+0x20/0x34 [mdd]
      [  365.370125]  [<ffffffffa0da6225>] mdd_create+0x1075/0x12a0 [mdd]
      [  365.370138]  [<ffffffffa0c79062>] mdt_reint_open+0x1f92/0x2e00 [mdt]
      [  365.370167]  [<ffffffffa0857da0>] ? lustre_msg_buf_v2+0x1b0/0x1b0 [ptlrpc]
      [  365.370179]  [<ffffffffa0c7a48d>] mdt_reconstruct_open+0x5bd/0xb80 [mdt]
      [  365.370192]  [<ffffffffa0c7025c>] mdt_reconstruct+0x4c/0x160 [mdt]
      [  365.370204]  [<ffffffffa0c4cbd8>] mdt_reint_internal+0x998/0xb30 [mdt]
      [  365.370213]  [<ffffffffa0c4ced2>] mdt_intent_reint+0x162/0x420 [mdt]
      [  365.370222]  [<ffffffffa0c508f5>] mdt_intent_opc+0x215/0xb30 [mdt]
      [  365.370258]  [<ffffffffa085c480>] ? lustre_swab_ldlm_policy_data+0x30/0x30 [ptlrpc]
      [  365.370268]  [<ffffffffa0c58668>] mdt_intent_policy+0x138/0x320 [mdt]
      [  365.370298]  [<ffffffffa0807373>] ldlm_lock_enqueue+0x343/0xae0 [ptlrpc]
      [  365.370307]  [<ffffffffa04c9cc0>] ? cfs_hash_lookup+0x90/0xc0 [libcfs]
      [  365.370339]  [<ffffffffa082ef23>] ldlm_handle_enqueue0+0x9c3/0x1840 [ptlrpc]
      [  365.370375]  [<ffffffffa085c500>] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc]
      [  365.370411]  [<ffffffffa08c0cd2>] tgt_enqueue+0x62/0x210 [ptlrpc]
      [  365.370443]  [<ffffffffa08c59e9>] tgt_request_handle+0x949/0x1240 [ptlrpc]
      [  365.370480]  [<ffffffffa08676cb>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
      [  365.370519]  [<ffffffffa0864fa5>] ? ptlrpc_wait_event+0xa5/0x350 [ptlrpc]
      [  365.370549]  [<ffffffffa086b00e>] ptlrpc_main+0xc1e/0x2300 [ptlrpc]
      [  365.370551]  [<ffffffff810c22ee>] ? dequeue_task_fair+0x42e/0x640
      [  365.370553]  [<ffffffff810bb7d5>] ? sched_clock_cpu+0x85/0xc0
      [  365.370588]  [<ffffffffa086a3f0>] ? ptlrpc_register_service+0x1070/0x1070 [ptlrpc]
      [  365.370590]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
      [  365.370593]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
      [  365.370595]  [<ffffffff81644fd8>] ret_from_fork+0x58/0x90
      [  365.370597]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
      

      This assert have checked we have no xid step down for the slot.

      Original idea was regression from a patch, but later we have decided it's bug in open replay code. Bug looks addressed to the valid situation when object may don't 'exist' or object fid is different than client requested.

              if (mdt_get_disposition(ldlm_rep, DISP_OPEN_CREATE)) {
                      struct obd_export *exp = req->rq_export;
                      /*
                       * We failed after creation, but we do not know in which step
                       * we failed. So try to check the child object.
                       */
                      parent = mdt_object_find(env, mdt, rr->rr_fid1);
                      if (IS_ERR(parent)) {
                              rc = PTR_ERR(parent);
                              LCONSOLE_WARN("Parent "DFID" lookup error %d."
                                            " Evicting client %s with export %s.\n",
                                            PFID(rr->rr_fid1), rc,
                                            obd_uuid2str(&exp->exp_client_uuid),
                                            obd_export_nid2str(exp));
                              mdt_export_evict(exp);
                              RETURN_EXIT;
                      }
      
                      child = mdt_object_find(env, mdt, rr->rr_fid2);
                      if (IS_ERR(child)) {
                              rc = PTR_ERR(child);
                              LCONSOLE_WARN("cannot lookup child "DFID": rc = %d; "
                                            "evicting client %s with export %s\n",
                                            PFID(rr->rr_fid2), rc,
                                            obd_uuid2str(&exp->exp_client_uuid),
                                            obd_export_nid2str(exp));
                              mdt_object_put(env, parent);
                              mdt_export_evict(exp);
                              RETURN_EXIT;
                      }
      
                      if (unlikely(mdt_object_remote(child))) {
                              /* the child object was created on remote server */
                              if (!mdt_is_dne_client(exp)) {
                                      /* Return -EIO for old client */
                                      mdt_object_put(env, parent);
                                      mdt_object_put(env, child);
                                      GOTO(out, rc = -EIO);
                              }
                              repbody->mbo_fid1 = *rr->rr_fid2;
                              repbody->mbo_valid |= (OBD_MD_FLID | OBD_MD_MDS);
                              rc = 0;
                      } else {
                              if (mdt_object_exists(child)) {
                                      mdt_prep_ma_buf_from_rep(info, child, ma);
                                      rc = mdt_attr_get_complex(info, child, ma);
                                      if (rc == 0)
                                              rc = mdt_finish_open(info, parent,
                                                                   child, open_flags,
                                                                   1, ldlm_rep);
                              } else {
                                      /* the child does not exist, we should do
                                       * regular open */
                                      mdt_object_put(env, parent);
                                      mdt_object_put(env, child);
                                      GOTO(regular_open, 0); <<<
                              }
                      }
                      mdt_object_put(env, parent);
                      mdt_object_put(env, child);
      

      operation order restored from requests history hash
      1) fileA created from client 1
      2) fileA open(O_CREATE) from client 2
      3) link to the client2 failed before reply send (from client view)
      4) unlink(fileA) from client1, it have blocked until reply arrived to the client2
      5) client2 started to resend a open(O_CREATE)
      6) unlink(fileA) was finished
      – it looks lu_site cache remove this object from cache in this time.
      7) resend handled by MDT and found request->fid2 isn't exist, so full create needs.

      ---- thats all.

      variant of this case.

      a) without unlink - p.2 need to found an object by name and reply with different fid, so resend will don't found a child by fid2 from request (object created without LOHA_EXIST)

      Attachments

        Issue Links

          Activity

            People

              askulysh Andriy Skulysh
              shadow Alexey Lyashkov
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: