Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-3233

tgt_cb_last_committed()) ASSERTION( ccb->llcc_exp->exp_obd == ccb->llcc_tgt->lut_obd ) failed:

Details

    • 3
    • 7934

    Description

      I see this running racer on a single node MDSCOUNT=2 llmount.sh filesystem. (I upgraded the LASSERT() to an LASSERTF().)

      LustreError: 10307:0:(mdt_recovery.c:390:mdt_last_rcvd_update()) Trying to overwrite bigger transno:on-disk: 4295820775, new: 4295609707 replay: 0. see LU-617.
      LustreError: 10277:0:(tgt_lastrcvd.c:412:tgt_cb_last_committed()) ASSERTION( ccb->llcc_exp->exp_obd == ccb->llcc_tgt->lut_obd ) failed: exp_obd ffff88014fb182f8 != lut_obd ffff88016c536178
      
      Breakpoint 2, lbug_with_loc (msgdata=0xffffffffa12c9360) at /root/lustre-release/libcfs/libcfs/linux/linux-debug.c:167
      167	        libcfs_debug_msg(msgdata, "LBUG\n");
      (gdb) bt
      #0  lbug_with_loc (msgdata=0xffffffffa12c9360)
          at /root/lustre-release/libcfs/libcfs/linux/linux-debug.c:167
      #1  0xffffffffa12613e9 in tgt_cb_last_committed (env=<value optimized out>, 
          th=<value optimized out>, cb=0xffff88017377cb40, err=<value optimized out>)
          at /root/lustre-release/lustre/ptlrpc/../../lustre/target/tgt_lastrcvd.c:410
      #2  0xffffffffa0874a24 in osd_trans_commit_cb (sb=<value optimized out>, 
          jcb=<value optimized out>, error=0)
          at /root/lustre-release/lustre/osd-ldiskfs/osd_handler.c:646
      #3  0xffffffffa082610a in ldiskfs_journal_commit_callback (
          journal=<value optimized out>, txn=0xffff880177622a80)
          at /root/lustre-release/ldiskfs/ldiskfs/super.c:329
      #4  0xffffffffa007584f in jbd2_journal_commit_transaction (journal=0xffff880150255800)
          at fs/jbd2/commit.c:1049
      #5  0xffffffffa007b698 in kjournald2 (arg=0xffff880150255800) at fs/jbd2/journal.c:163
      #6  0xffffffff81090626 in kthread (_create=0xffff880160ab15c8) at kernel/kthread.c:78
      #7  0xffffffff8100c0ca in ?? () at arch/x86/kernel/entry_64.S:1211
      #8  0x0000000000000000 in ?? ()
      (gdb) p *((struct obd_device *) 0xffff88014fb182f8)
      $2 = {
        obd_type = 0xffff88019705ca40, 
        obd_magic = 2874988271, 
        obd_name = "lustre-MDT0001", '\000' <repeats 113 times>, 
        obd_uuid = {
          uuid = "lustre-MDT0001_UUID", '\000' <repeats 20 times>
        }, 
      ...
      (gdb) p *((struct obd_device *) 0xffff88016c536178)
      $3 = {
        obd_type = 0xffff88019705ca40, 
        obd_magic = 2874988271, 
        obd_name = "lustre-MDT0000", '\000' <repeats 113 times>, 
        obd_uuid = {
          uuid = "lustre-MDT0000_UUID", '\000' <repeats 20 times>
        }, 
      ...
      

      Attachments

        Issue Links

          Activity

            [LU-3233] tgt_cb_last_committed()) ASSERTION( ccb->llcc_exp->exp_obd == ccb->llcc_tgt->lut_obd ) failed:
            pjones Peter Jones added a comment -

            Landed for 2.4.1 and 2.5.0

            pjones Peter Jones added a comment - Landed for 2.4.1 and 2.5.0
            jhammond John Hammond added a comment -

            See Niu's patches to remove M_CHECK_STALE and make open by FID more sane.

            jhammond John Hammond added a comment - See Niu's patches to remove M_CHECK_STALE and make open by FID more sane.
            jhammond John Hammond added a comment -

            Both patches landed to master.

            Dropping to minor as Jinshan and I had discussed whether the logic of using M_CHECK_STALE and having the server return the FID of the new file (along with an open handle) really made any sense here. It appeared that someone planned an optimization around returning the new FID but it never got implemented. I believe that open might be made more robust by having the server ignore any supplied name if MDS_OPEN_BY_FID is set, and perhaps by having the client not supply a name when it sets MDS_OPEN_BY_FID. More analysis is needed however.

            jhammond John Hammond added a comment - Both patches landed to master. Dropping to minor as Jinshan and I had discussed whether the logic of using M_CHECK_STALE and having the server return the FID of the new file (along with an open handle) really made any sense here. It appeared that someone planned an optimization around returning the new FID but it never got implemented. I believe that open might be made more robust by having the server ignore any supplied name if MDS_OPEN_BY_FID is set, and perhaps by having the client not supply a name when it sets MDS_OPEN_BY_FID. More analysis is needed however.
            jhammond John Hammond added a comment -

            Indeed. In any case I think I may have a reasonable fix for the MDT case without touching too much else.

            Please see http://review.whamcloud.com/6938.

            jhammond John Hammond added a comment - Indeed. In any case I think I may have a reasonable fix for the MDT case without touching too much else. Please see http://review.whamcloud.com/6938 .
            green Oleg Drokin added a comment -

            Right, there's normally an overtrust to those handles and underchecking their validness (mostly in locks), but granted most of the time it does not end up with such drastic crashes on server if you get the handle wrong.

            green Oleg Drokin added a comment - Right, there's normally an overtrust to those handles and underchecking their validness (mostly in locks), but granted most of the time it does not end up with such drastic crashes on server if you get the handle wrong.
            jhammond John Hammond added a comment -

            > So, do I get this right, a data from client crashes server?

            Yes, the och gets the wrong FID, so LMV routes the right handle to the wrong MDT (on the same MDS as the right MDT). The MDT totally trusts the handle (until we hit the assertion above).

            > Let's hold on on the client patch and fix the server first so that it does not crash!

            OK good.

            I'm concerned that once we don't trust handles then class_handle2object() and the synchronization (or lack of) around it needs a total rethink. Can someone suggest a less drastic solution?

            jhammond John Hammond added a comment - > So, do I get this right, a data from client crashes server? Yes, the och gets the wrong FID, so LMV routes the right handle to the wrong MDT (on the same MDS as the right MDT). The MDT totally trusts the handle (until we hit the assertion above). > Let's hold on on the client patch and fix the server first so that it does not crash! OK good. I'm concerned that once we don't trust handles then class_handle2object() and the synchronization (or lack of) around it needs a total rethink. Can someone suggest a less drastic solution?
            green Oleg Drokin added a comment -

            So, do I get this right, a data from client crashes server?
            Let's hold on on the client patch and fix the server first so that it does not crash!

            green Oleg Drokin added a comment - So, do I get this right, a data from client crashes server? Let's hold on on the client patch and fix the server first so that it does not crash!
            jhammond John Hammond added a comment - Please see http://review.whamcloud.com/6695 .
            jhammond John Hammond added a comment -

            I think this is because ll_och_fill() may use the wrong FID under certain circumstances. Consider the following:

            ll_intent_file_open()
                    lmv_intent_lock()
                            mdc_intent_lock()
                                    mdc_finish_intent_lock()
            

            And assume that mdc_finish_intent_lock() returns -ESTALE because the FID in body does not match either of the FIDs in op_data. Then ll_intent_file_open() see -ESTALE and we have:

            ll_intent_file_open(file, ..., it)
                    ll_release_openhandle(dentry, it)
                            OBD_ALLOC(och)
                            ll_och_fill(md_exp, lli, it, och)
                                    req = it->d.lustre.it_data
                                    body = req_capsule_server_get(req...)
                                    och->och_fh = body->handle
                                    och->och_fid = lli->lli_fid
                            ll_close_inode_openhandle(md_exp, inode, och)
            

            But ll_och_fill() should be using the FID from body for this close, not from the lli. Then lmv_set_open_replay_data() uses this bad FID to choose the wrong target, and so on.

            It seems like the fix is to have ll_och_fill() use the FID from body in general. Perhaps mdt_close() should also check that the FID sent matched the FID of the object in mfd. Now it totally trusts the cookie/mfd, and we only notice the mismatch when DNE is enabled and only because of the assertion in tgt_cb_last_committed().

            jhammond John Hammond added a comment - I think this is because ll_och_fill() may use the wrong FID under certain circumstances. Consider the following: ll_intent_file_open() lmv_intent_lock() mdc_intent_lock() mdc_finish_intent_lock() And assume that mdc_finish_intent_lock() returns -ESTALE because the FID in body does not match either of the FIDs in op_data. Then ll_intent_file_open() see -ESTALE and we have: ll_intent_file_open(file, ..., it) ll_release_openhandle(dentry, it) OBD_ALLOC(och) ll_och_fill(md_exp, lli, it, och) req = it->d.lustre.it_data body = req_capsule_server_get(req...) och->och_fh = body->handle och->och_fid = lli->lli_fid ll_close_inode_openhandle(md_exp, inode, och) But ll_och_fill() should be using the FID from body for this close, not from the lli. Then lmv_set_open_replay_data() uses this bad FID to choose the wrong target, and so on. It seems like the fix is to have ll_och_fill() use the FID from body in general. Perhaps mdt_close() should also check that the FID sent matched the FID of the object in mfd. Now it totally trusts the cookie/mfd, and we only notice the mismatch when DNE is enabled and only because of the assertion in tgt_cb_last_committed().
            jhammond John Hammond added a comment -

            In mtd_reint_open() should we be doing lookup if the MDS_OPEN_BY_FID flag is set? We already have what should be the child FID in that case. Moreover if we trust that FID when MDS_OPEN_BY_FID is set then I can no longer reproduce this. But I cannot say I fully understand this code.

            jhammond John Hammond added a comment - In mtd_reint_open() should we be doing lookup if the MDS_OPEN_BY_FID flag is set? We already have what should be the child FID in that case. Moreover if we trust that FID when MDS_OPEN_BY_FID is set then I can no longer reproduce this. But I cannot say I fully understand this code.

            People

              jhammond John Hammond
              jhammond John Hammond
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: