Terrible i/o performance of a test application doing repeatable writes and truncates

Details

    • Bug
    • Resolution: Duplicate
    • Major

    Description

      A customer application benchmark required for acceptance testing shows terrible I/O performance. A simple test case was created to mimic the customer application's behavior. On Lustre, this test never completes within a 30-minute walltime limit (the job is killed once the limit is exceeded); the same test run against the /tmp filesystem on the client node completes within a few seconds.
      The test completes in a few seconds on Lustre-2.1.

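       program iotest
            ! Reproducer: write one record to unit 22 and REWIND, NA times.
            PARAMETER (NA=6400)
            dimension IA(NA)
            call init (IA,NA)
            call sleep(1)
            open (unit=22, file='time.step')
            do i=1,na
            call my_write(iA,NA,i)
            end do
            STOP
            END
            ! Write element I as a single record, then reposition to the
            ! start of the file; the next WRITE overwrites that record.
            SUBROUTINE my_write(IA,NA,I)
            dimension ia(na)
            kt=ia(i)
            WRITE ( 22, '(1x, i8)' )   kt
            REWIND (22)
            return
            end
            ! Fill IA with 1..NA.
            subroutine init(IA,NA)
            dimension iA(na)
            do i=1,NA
            ia(i)=i
            end do
            return
            end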
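
At the system-call level, the reproducer above boils down to a short record being written at the start of the file and the file then being truncated right behind it, 6400 times over. The following C sketch imitates that pattern directly. It is an approximation, assuming the Fortran runtime turns each WRITE followed by REWIND into one write at offset 0 plus one truncate to the record length. The function name `write_truncate_loop` and any file path passed to it are hypothetical and not part of the original test.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original report): repeat
 * "write one record at offset 0, then truncate the file at the end
 * of that record" `iterations` times.
 * Returns the final file size in bytes, or -1 on any I/O error.
 */
static long write_truncate_loop(const char *path, int iterations)
{
	char rec[32];
	off_t size = 0;
	int fd, i;

	assert(path != NULL && iterations > 0);

	fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);
	if (fd < 0)
		return -1;

	for (i = 1; i <= iterations; i++) {
		/* Same layout as '(1x, i8)': one blank, 8-wide integer, newline. */
		int len = snprintf(rec, sizeof(rec), " %8d\n", i);

		if (pwrite(fd, rec, (size_t)len, 0) != (ssize_t)len ||
		    ftruncate(fd, (off_t)len) != 0) {
			close(fd);
			return -1;
		}
		size = (off_t)len;
	}

	if (close(fd) != 0)
		return -1;
	return (long)size;
}
```

Timing `write_truncate_loop("time.step", 6400)` on a Lustre mount and on /tmp should give a comparison similar to the one reported above, though the exact system calls issued by a given Fortran runtime may differ.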
      

          Activity

            [LU-8786] Terrible i/o performance of a test application doing repeatable writes and truncates

            The optimization is already included in LU-10048 ("osd: async truncate"):

            @@ -1937,49 +1949,51 @@ static int osd_punch(const struct lu_env *env, struct dt_object *dt,
                    oh = container_of(th, struct osd_thandle, ot_super);
                    LASSERT(oh->ot_handle->h_transaction != NULL);
             
            -       osd_trans_exec_op(env, th, OSD_OT_PUNCH);
            +       /* we used to skip truncate to current size to
            +        * optimize truncates on OST. with DoM we can
            +        * get attr_set to set specific size (MDS_REINT)
            +        * and then get truncate RPC which essentially
            +        * would be skipped. this is bad.. so, disable
            +        * this optimization on MDS till the client stop
            +        * to sent MDS_REINT (LU-11033) -bzzz */
            +       if (osd->od_is_ost && i_size_read(inode) == start)
            +               RETURN(0);
             
            -       tid = oh->ot_handle->h_transaction->t_tid;
            +       osd_trans_exec_op(env, th, OSD_OT_PUNCH);
             
                    spin_lock(&inode->i_lock);
            +       if (i_size_read(inode) < start)
            +               grow = true;
                    i_size_write(inode, start);
                    spin_unlock(&inode->i_lock);
                    ll_truncate_pagecache(inode, start);
            
            zam Alexander Zarochentsev added a comment
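
The size check in the patch can be illustrated outside the kernel. The sketch below is a user-space analogue, not Lustre code: `truncate_unless_same_size` and `count_skipped_truncates` are hypothetical names, and `fstat`/`ftruncate` stand in for `i_size_read` and the punch operation. It assumes the same 10-byte write-then-truncate pattern as the reproducer, where every truncate targets the size the file already has.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical analogue of the size check in the patch.
 * Returns 0 if the truncate was skipped because the file already has
 * the requested size, 1 if ftruncate() was called, -1 on error.
 */
static int truncate_unless_same_size(int fd, off_t start)
{
	struct stat st;

	if (fstat(fd, &st) != 0)
		return -1;
	if (st.st_size == start)
		return 0;	/* size unchanged: nothing to truncate */
	return ftruncate(fd, start) == 0 ? 1 : -1;
}

/*
 * Run the write/truncate pattern `iterations` times and count how
 * many truncates the size check skipped. Returns -1 on I/O error.
 */
static int count_skipped_truncates(const char *path, int iterations)
{
	static const char rec[] = "        1\n";	/* 10-byte record */
	const off_t len = (off_t)(sizeof(rec) - 1);
	int fd, i, skipped = 0;

	assert(path != NULL && iterations >= 0);

	fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);
	if (fd < 0)
		return -1;

	for (i = 0; i < iterations; i++) {
		int rc;

		if (pwrite(fd, rec, (size_t)len, 0) != (ssize_t)len) {
			close(fd);
			return -1;
		}
		rc = truncate_unless_same_size(fd, len);
		if (rc < 0) {
			close(fd);
			return -1;
		}
		if (rc == 0)
			skipped++;
	}

	close(fd);
	return skipped;
}
```

In this pattern every truncate is a no-op, which is why skipping it can remove the per-iteration cost. The real patch applies the shortcut only when `osd->od_is_ost` is set, for the DoM reason given in its code comment.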

            The patch restores a truncate optimisation that was lost in the obdfilter->ofd rewrite. This explains why the test runs well on Lustre-2.1.

            zam Alexander Zarochentsev added a comment

            Alexander Zarochentsev (alexander.zarochentsev@seagate.com) uploaded a new patch: http://review.whamcloud.com/23502
            Subject: LU-8786 osd: unnecessary truncate in osd_punch()
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: fcac693a8f722758c45961a97fa17b75556c0b4d

            gerrit Gerrit Updater added a comment

            People

              wc-triage WC Triage
              zam Alexander Zarochentsev