
llog declares a write region that doesn't match the region actually written later for osd_zfs

Details

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Minor

    Description

      The typical stack trace is as follows:

      Call Trace:
       [<ffffffff8156bf13>] ? panic+0xac/0x179
       [<ffffffffa0f4e5cc>] ? zio_wait+0x21c/0x3e0 [zfs]
       [<ffffffffa0e7ef87>] ? dmu_tx_dirty_buf+0x247/0x3d0 [zfs]
       [<ffffffffa0f4e2f3>] ? zio_destroy+0xb3/0x170 [zfs]
       [<ffffffffa0e5e55f>] ? dbuf_dirty+0x5f/0x16d0 [zfs]
       [<ffffffff8157156b>] ? _spin_unlock+0x2b/0x40
       [<ffffffffa0e848ea>] ? dnode_rele+0x5a/0xa0 [zfs]
       [<ffffffffa0e61501>] ? dmu_buf_will_dirty+0x91/0x100 [zfs]
       [<ffffffffa0e6cc70>] ? dmu_write+0xa0/0x230 [zfs]
       [<ffffffffa08444c1>] ? osd_write+0x1d1/0x3a0 [osd_zfs]
       [<ffffffffa06b9bdd>] ? dt_record_write+0x3d/0x130 [obdclass]
       [<ffffffffa067955a>] ? llog_osd_write_rec+0xd6a/0x1b70 [obdclass]
       [<ffffffffa06673f6>] ? llog_write_rec+0xb6/0x270 [obdclass]
       [<ffffffffa066c1b8>] ? llog_write+0x298/0x430 [obdclass]
       [<ffffffffa066c1cf>] ? llog_write+0x2af/0x430 [obdclass]
       [<ffffffffa14780a1>] ? record_marker+0x1c1/0x1e0 [mgs]
       [<ffffffffa14779ea>] ? record_start_log+0x38a/0x4a0 [mgs]
       [<ffffffffa14787cf>] ? mgs_write_log_lov+0x38f/0x6b0 [mgs]
       [<ffffffffa148a5c6>] ? mgs_write_log_mdt+0x326/0x1630 [mgs]
       [<ffffffff810c156d>] ? trace_hardirqs_on+0xd/0x10
       [<ffffffffa148d475>] ? mgs_write_log_target+0xb55/0x1980 [mgs]
       [<ffffffff810c156d>] ? trace_hardirqs_on+0xd/0x10
       [<ffffffffa057cc11>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
       [<ffffffffa1471d89>] ? mgs_target_reg+0xa19/0xe50 [mgs]
       [<ffffffffa0940b3f>] ? tgt_request_handle+0x8cf/0x1300 [ptlrpc]
       [<ffffffffa08eb85a>] ? ptlrpc_main+0xdaa/0x18b0 [ptlrpc]
       [<ffffffffa08eaab0>] ? ptlrpc_main+0x0/0x18b0 [ptlrpc]
       [<ffffffff810a728e>] ? kthread+0x9e/0xc0
       [<ffffffff8100c38a>] ? child_rip+0xa/0x20
       [<ffffffff815714b0>] ? _spin_unlock_irq+0x30/0x40
       [<ffffffff8100bb90>] ? restore_args+0x0/0x30
       [<ffffffff810a71f0>] ? kthread+0x0/0xc0
       [<ffffffff8100c380>] ? child_rip+0x0/0x20
      

      A patch will be submitted shortly.

          Activity

            [LU-7409] llog declares a write region that doesn't match the region actually written later for osd_zfs
            jay Jinshan Xiong (Inactive) added a comment - edited

            This patch works reasonably well on a normal llog because I reserved a large enough buffer as a cushion, but it has problems with the cat log in the unlink case. Right now the cat log is used for unlink, changelog, and HSM, but I think the changes to these logs are predictable at the declare phase?

            bzzz Alex Zhuravlev added a comment

            Declaration is just accounting; the actual serialization happens against specific dbufs at the time of the actual write.

            adilger Andreas Dilger added a comment

            Does declaring/holding a large range of the file cause ZFS to serialize IO to that region, or is this just accounting, with serialization happening elsewhere? I'm recalling the case of file creates, where the TXG is serialized because (IIRC) you cannot modify a dnode in the same TXG in which it is created.

            bzzz Alex Zhuravlev added a comment - edited

            I didn't see the patch, but in theory, yes. It should consist of two pieces:
            1) recognize a special offset (such as -1) and reserve slightly more credits, as if we were going to write at a huge offset resulting in a deep tree
            2) change the debugging code so that such an "undefined" write can be satisfied by the special declaration from (1)

            My original point was that we can't land this patch.
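
The two pieces described in the comment above can be sketched as a toy model. This is only an illustration under stated assumptions, not the patch discussed in the ticket and not the ZFS API: `ToyTx`, `APPEND`, `MAX_TREE_DEPTH`, `CREDITS_PER_LEVEL`, and the `depth_for_offset` parameter are all hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass, field

# All names and constants here are hypothetical, chosen for illustration only.
APPEND = -1            # sentinel offset: "end of object, not known at declare time"
CREDITS_PER_LEVEL = 1  # assumed cost of dirtying one level of the block tree
MAX_TREE_DEPTH = 6     # assumed worst-case depth for a write at a huge offset


@dataclass
class ToyTx:
    """Toy stand-in for a transaction that tracks declarations and credits."""
    holds: list = field(default_factory=list)  # (offset, length) pairs
    credits: int = 0

    def declare_write(self, offset, length, depth_for_offset=2):
        # Piece 1: an APPEND declaration reserves credits as if the write
        # landed at a huge offset, i.e. under the deepest possible tree.
        if offset == APPEND:
            self.credits += MAX_TREE_DEPTH * CREDITS_PER_LEVEL
        else:
            self.credits += depth_for_offset * CREDITS_PER_LEVEL
        self.holds.append((offset, length))

    def check_write(self, offset, length):
        # Piece 2: the debug check accepts a write at any offset when an
        # APPEND declaration of sufficient length exists.
        for h_off, h_len in self.holds:
            if h_off == APPEND:
                if length <= h_len:
                    return True
            elif h_off <= offset and offset + length <= h_off + h_len:
                return True
        return False
```

With this model, a transaction that declares `(APPEND, 128)` passes the check for a 128-byte write at any offset, while a transaction that declares an exact range passes only for writes inside that range. The cost is the extra credits reserved for the worst-case tree depth.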

            adilger Andreas Dilger added a comment

            Alex, I recall that the actual patch to the ZFS declare code is not very complex? How hard would it be to recreate that patch?

            jay Jinshan Xiong (Inactive) added a comment - edited

            It allows me to mount my MDS with debug enabled.

            bzzz Alex Zhuravlev added a comment

            Well, then what's the purpose of the ticket? It looks like a duplicate of LU-2160?

            jay Jinshan Xiong (Inactive) added a comment

            Yes, this is with debug enabled, and it failed at file system mount time. I saw there is a discussion about llog append, so this patch is not a final solution but simply an attempt to pass the check so that I can do some real testing.

            bzzz Alex Zhuravlev added a comment

            Ricardo developed a patch to allow append declaration, but I have no idea where that patch is. Is your ZFS build configured with debug enabled?

            bzzz Alex Zhuravlev added a comment

            LLOG can't declare the exact region because the actual size is chosen after dmu_tx_assign(); it is essentially an append.
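
The mismatch described in the comment above can be reproduced in a small toy model. This is a sketch under stated assumptions, not ZFS code: `ToyTx`, `UndeclaredWrite`, and `append_record` are hypothetical names. The model's check only loosely mirrors the debug-build check that panics in `dmu_tx_dirty_buf` in the stack trace, and the example assumes the end of the log can move between declaration and write.

```python
from dataclasses import dataclass, field


class UndeclaredWrite(Exception):
    """Raised when a write touches bytes that no declaration covers."""


@dataclass
class ToyTx:
    """Toy stand-in for a transaction; names are hypothetical, not the ZFS API."""
    holds: list = field(default_factory=list)  # (offset, length) pairs
    assigned: bool = False

    def hold_write(self, offset, length):
        # Declarations are only accepted before the transaction is assigned.
        if self.assigned:
            raise RuntimeError("cannot declare after assign")
        self.holds.append((offset, length))

    def assign(self):
        self.assigned = True

    def write(self, offset, length):
        # Debug-style check: the written range must sit inside one declared hold.
        if not self.assigned:
            raise RuntimeError("write before assign")
        for h_off, h_len in self.holds:
            if h_off <= offset and offset + length <= h_off + h_len:
                return
        raise UndeclaredWrite(f"[{offset}, {offset + length}) was not declared")


def append_record(size_at_declare, size_at_write, rec_len):
    """Declare at the log size seen early, then write at the size seen late."""
    tx = ToyTx()
    tx.hold_write(size_at_declare, rec_len)
    tx.assign()
    # The end of the log is only known here, after assign; if it moved,
    # the write falls outside the declared range and the check fails.
    tx.write(size_at_write, rec_len)
    return True
```

Calling `append_record(4096, 4096, 128)` passes the check, while `append_record(4096, 8192, 128)` raises `UndeclaredWrite`, which is the toy equivalent of the panic seen with debug enabled.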

            gerrit Gerrit Updater added a comment

            Jinshan Xiong (jinshan.xiong@intel.com) uploaded a new patch: http://review.whamcloud.com/17085
            Subject: LDEV-239 osd-zfs: declare exact object region to write
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: d0ee25c74f33cb99a0a7c2564e6112ca267ff06e

            People

              bzzz Alex Zhuravlev
              jay Jinshan Xiong (Inactive)