
DNE on ZFS create remote directory suffers from long sync.

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • None
    • Affects Version/s: Lustre 2.4.0
    • 2347

    Description

      The ZFS transaction sync operation passively waits for a TXG commit instead of actively requesting that the commit start before waiting. As a result, each synchronous operation can wait up to 5s for the normal TXG flush.
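
      The latency cost can be illustrated with a small back-of-the-envelope model. This is only a sketch: it assumes the periodic flush fires at fixed 5000 ms boundaries (the "up to 5s" above), and the function names and the 50 ms sync cost are hypothetical values chosen for illustration, not measurements.

```python
TXG_TIMEOUT_MS = 5000  # assumed periodic TXG flush interval ("up to 5s")


def passive_wait_ms(arrival_ms, sync_cost_ms, timeout_ms=TXG_TIMEOUT_MS):
    """Latency when the caller only waits for the next periodic TXG commit."""
    # Time left until the next periodic flush boundary after the op arrives.
    until_flush = timeout_ms - (arrival_ms % timeout_ms)
    return until_flush + sync_cost_ms


def active_wait_ms(sync_cost_ms):
    """Latency when the caller requests the TXG commit before waiting."""
    return sync_cost_ms


if __name__ == "__main__":
    # Hypothetical arrival offsets (ms) and a hypothetical 50 ms sync cost.
    for arrival in (0, 1000, 4900):
        print(arrival, passive_wait_ms(arrival, 50), active_wait_ms(50))
```

      In this model the passive path costs anywhere from the bare sync cost to a full flush interval on top of it, depending only on when the operation happens to arrive, while the active path always costs just the sync itself.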

      Until Lustre ZIL support (LU-4009) is implemented, transaction handles that are marked as synchronous should start a TXG commit instead of passively waiting. This might reduce aggregate performance, but DNE operations will be relatively rare, so normal operations should not be noticeably affected. That is especially true for flash-based ZFS devices: updating the überblock copies in the labels at the start and end of each block device does not require multiple mechanical full-platter seeks, so the cost of forcing a commit is relatively low.
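
      The proposed behaviour can be sketched with a toy model. Everything here is hypothetical (the ToyTxgPool class, trans_stop(), and commit() are illustrative names, not the osd-zfs or ZFS API); it only shows that a handle marked synchronous forces the open TXG to commit, while asynchronous handles simply ride along in whichever TXG is open.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTxgPool:
    """Toy stand-in for a pool with one open TXG (hypothetical, not ZFS)."""
    open_txg: int = 1
    synced_txg: int = 0
    commits_forced: int = 0
    pending: list = field(default_factory=list)

    def trans_stop(self, name, sync=False):
        """Finish a transaction; a sync handle starts the commit itself."""
        txg = self.open_txg
        self.pending.append(name)
        if sync:
            # Actively request the commit instead of waiting for the timer.
            self.commits_forced += 1
            self.commit()
        return txg

    def commit(self):
        """Close the open TXG and mark it synced (timer or forced path)."""
        self.synced_txg = self.open_txg
        self.open_txg += 1
        committed, self.pending = self.pending, []
        return committed


if __name__ == "__main__":
    pool = ToyTxgPool()
    pool.trans_stop("ordinary write")             # async: nothing is forced
    pool.trans_stop("remote mkdir", sync=True)    # sync: forces the commit
    print(pool.synced_txg, pool.open_txg, pool.commits_forced)
```

      Note that the forced commit also carries the earlier asynchronous update with it, which is where the possible aggregate performance impact comes from: each forced commit closes the TXG early.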

      As an optimization, there could be a "batch wait and reschedule" for TXG sync, as jbd2 has had since commit v2.6.28-5737-ge07f7183a486 "jbd2: improve jbd2 fsync batching". This would allow multiple active threads to join the same TXG before it is closed for commit, without waiting unnecessarily between commits.
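
      One possible shape for such a batching decision is sketched below. It is loosely modelled on the jbd2 heuristic of sleeping for roughly the average commit time only when the open transaction is still young; the function and parameter names are hypothetical, and this is neither the jbd2 code nor an existing ZFS interface.

```python
def batch_delay_ns(trans_age_ns, avg_commit_ns, min_batch_ns, max_batch_ns,
                   same_writer_as_last_sync=False):
    """Return how long a sync caller should sleep before forcing the commit.

    A non-zero result gives other threads a chance to join the open TXG;
    zero means "commit now, waiting would not help".
    """
    if same_writer_as_last_sync:
        # A single thread syncing repeatedly gains nothing from waiting.
        return 0
    # Clamp the average commit time into the allowed batching window.
    window_ns = min(max(avg_commit_ns, min_batch_ns), max_batch_ns)
    # Only wait if the open transaction is younger than that window.
    return window_ns if trans_age_ns < window_ns else 0


if __name__ == "__main__":
    # Hypothetical numbers: young transaction, then an old one.
    print(batch_delay_ns(100, 500, 0, 1000))
    print(batch_delay_ns(600, 500, 0, 1000))
```

      The age check is what avoids waiting unnecessarily between commits: once the open transaction has already been open for longer than a typical commit takes, other threads have had their chance to join and the caller commits immediately.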

          Activity

            [LU-2716] DNE on ZFS create remote directory suffers from long sync.
            adilger Andreas Dilger made changes -
            Description Original: The ZFS transaction sync operation will passively wait for a txg commit, instead of actively requesting txg commit start before waiting. This causes each synchronous operation to wait up to 5s for the normal txg flush.

            Until such a time that we have implemented Lustre ZIL support, it makes sense that transaction handles that are marked as synchronous start a txg commit instead of passively waiting. This might impact aggregate performance, but DNE operations will be relatively rare, and should not impact normal operations noticeably.

            As an optimization, there could be a "batch wait and reschedule" for txg sync, as there is for jbd, so that it allows multiple active threads to join the same txg before it is closed for commit.
            New: same text as the current Description above.
            adilger Andreas Dilger made changes -
            Assignee Original: Di Wang [ di.wang ] New: WC Triage [ wc-triage ]
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-4009 [ LU-4009 ]
            utopiabound Nathaniel Clark made changes -
            Labels Original: performance zfs New: dne performance zfs
            utopiabound Nathaniel Clark made changes -
            Labels Original: zfs New: performance zfs
            adilger Andreas Dilger made changes -
            Link New: This issue is duplicated by OSF-175 [ OSF-175 ]
            adilger Andreas Dilger made changes -
            Labels New: zfs
            adilger Andreas Dilger made changes -
            Affects Version/s New: Lustre 2.4.0 [ 10154 ]
            adilger Andreas Dilger made changes -
            Priority Original: Minor [ 4 ] New: Major [ 3 ]

            People

              wc-triage WC Triage
              di.wang Di Wang
              Votes: 0
              Watchers: 6
