
cl_io_loop improperly assumes all ios are rw-type IOs

Details

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Minor

    Description

      cl_io_loop has this code:

                      pos   = io->u.ci_rw.rw_range.cir_pos;
                      count = io->u.ci_rw.rw_range.cir_count;
      
                      if (io->ci_pio) {
                              /* submit this range for parallel execution */
                              pt = cl_io_submit_pt(io, pos, count);
                              if (IS_ERR(pt)) {
                                      cl_io_iter_fini(env, io);
                                      rc = PTR_ERR(pt);
                                      break;
                              }
      
                              *tail = pt;
                              tail = &pt->cip_next;
                      } else {
                              size_t nob = io->ci_nob;
      
                              CDEBUG(D_VFSTRACE,
                                      "execute type %u range: [%llu, %llu) nob: %zu %s\n",
                                      io->ci_type, pos, pos + count, nob,
                                      io->ci_continue ? "continue" : "stop");
      

      However, io->u.ci_rw is only valid when the IO is of type CIT_READ or CIT_WRITE; for any other type the union is populated differently, so pos and count are read from whatever member is actually in use.
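To make the hazard concrete, here is a minimal sketch of a tagged union read through the wrong member. All type and function names (`struct io`, `io_init_fsync`, `io_count_unchecked`, `demo_misread`) and the two-field fsync layout are hypothetical simplifications for illustration, not the real Lustre definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not the real Lustre definitions. */
enum io_type { IO_READ, IO_WRITE, IO_FSYNC };

struct rw_range   { uint64_t pos;   uint64_t count; };
struct fsync_args { uint64_t start; uint64_t end; };

struct io {
	enum io_type type;                /* tag: says which member is live */
	union {
		struct rw_range   rw;     /* live only for IO_READ / IO_WRITE */
		struct fsync_args fsync;  /* live only for IO_FSYNC */
	} u;
};

/* Set up an fsync IO; only the fsync member is written. */
void io_init_fsync(struct io *io, uint64_t start, uint64_t end)
{
	memset(io, 0, sizeof(*io));
	io->type = IO_FSYNC;
	io->u.fsync.start = start;
	io->u.fsync.end = end;
}

/* Mirrors the unconditional read in the excerpt: the tag is never checked. */
uint64_t io_count_unchecked(const struct io *io)
{
	return io->u.rw.count;
}

/* In this layout "count" aliases the fsync end offset, not a byte count. */
void demo_misread(void)
{
	struct io io;

	io_init_fsync(&io, 100, 4096);
	assert(io.type == IO_FSYNC);
	assert(io_count_unchecked(&io) == 4096);
}
```

In this toy layout the misread happens to land on a plausible-looking number, which is why such a bug can go unnoticed: the value is well-formed but means something else. With a different member layout the bytes read could be anything.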

      cl_io_loop is also called for other IO types, for example:

              if (cl_io_init(env, io, CIT_FSYNC, io->ci_obj) == 0)
                      result = cl_io_loop(env, io);
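Since non-rw callers such as the CIT_FSYNC one above reach the same loop, one possible guard is to consult the IO type before touching the rw member. The sketch below is an assumption about how such a check could look, not the change made in Lustre (this issue was resolved as Won't Fix); `struct io`, `io_is_rw`, `io_range`, and `demo_guard` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real Lustre definitions. */
enum io_type { IO_READ, IO_WRITE, IO_FSYNC };

struct io {
	enum io_type type;
	union {
		struct { uint64_t pos, count; } rw;     /* IO_READ / IO_WRITE */
		struct { uint64_t start, end; } fsync;  /* IO_FSYNC */
	} u;
};

/* True only for the types whose live union member is u.rw. */
bool io_is_rw(const struct io *io)
{
	return io->type == IO_READ || io->type == IO_WRITE;
}

/*
 * Fetch the range for logging or submission. For non-rw IOs report an
 * empty range rather than reinterpreting another member's bytes.
 */
void io_range(const struct io *io, uint64_t *pos, uint64_t *count)
{
	if (io_is_rw(io)) {
		*pos = io->u.rw.pos;
		*count = io->u.rw.count;
	} else {
		*pos = 0;
		*count = 0;
	}
}

/* Exercise both paths: an fsync IO and a write IO. */
void demo_guard(void)
{
	struct io io = { .type = IO_FSYNC, .u.fsync = { .start = 100, .end = 4096 } };
	uint64_t pos, count;

	io_range(&io, &pos, &count);
	assert(pos == 0 && count == 0);

	io.type = IO_WRITE;
	io.u.rw.pos = 512;
	io.u.rw.count = 1024;
	io_range(&io, &pos, &count);
	assert(pos == 512 && count == 1024);
}
```

Reporting an empty range for non-rw types is a design choice made for this sketch; a real change might instead skip the range-based path entirely for those types.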
      

      The problematic code was introduced by the parallel IO work in LU-8964,
      commit db59ecb5d1d0284fb918def6348a11e0966d7767.

          Activity

            [LU-10923] cl_io_loop improperly assumes all ios are rw-type IOs
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-11825 [ LU-11825 ]
            adilger Andreas Dilger made changes -
            Resolution New: Won't Fix [ 2 ]
            Status Original: In Progress [ 3 ] New: Resolved [ 5 ]
            dmiter Dmitry Eremin (Inactive) made changes -
            Status Original: Open [ 1 ] New: In Progress [ 3 ]
            dmiter Dmitry Eremin (Inactive) made changes -
            Priority Original: Major [ 3 ] New: Minor [ 4 ]
            pjones Peter Jones made changes -
            Assignee Original: WC Triage [ wc-triage ] New: Dmitry Eremin [ dmiter ]
            green Oleg Drokin made changes -
            Description: closing markup tag of the second code sample changed from {noformat} to {code}; the text is otherwise unchanged.
            green Oleg Drokin made changes -
            Key Original: LDEV-647 New: LU-10923
            Workflow Original: jira [ 68633 ] New: Sub-task Blocking [ 68634 ]
            Project Original: Lustre Development [ 11016 ] New: Lustre [ 10000 ]
            green Oleg Drokin created issue -

            People

              Assignee: dmiter Dmitry Eremin (Inactive)
              Reporter: green Oleg Drokin
              Votes: 0
              Watchers: 7
