  1. Lustre
  2. LU-8926

Race in job stats code results in untracked I/O

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.10.0

    Description

      There is a race condition between updating the job id in lustre_get_jobid and setting the job id in outbound RPCs (primarily when reading the job id from an environment variable is enabled).

      The function lustre_get_jobid is called near the beginning of every I/O (from vvp_io_init) to set the job id in the Lustre inode info (lli_jobid). The job id is then read back from there when building an RPC: the path is osc_build_rpc, cl_req_attr_set, vvp_req_attr_set, after which it is used in lustre_msg_set_jobid.

      lustre_get_jobid starts by zeroing the jobid with memset, then re-reads it from the source. Because osc_build_rpc runs asynchronously in another thread, it can read the jobid at any time, including while it is still zero.
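
The zero-then-refill pattern can be illustrated with a small single-threaded simulation. This is only a sketch, not the Lustre code: shared_jobid stands in for lli_jobid, slow_lookup is a hypothetical stand-in for the expensive source lookup, JOBID_SIZE is an arbitrary size, and seen_during_lookup records what a concurrent RPC builder would have copied out had it run during the lookup.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define JOBID_SIZE 32	/* hypothetical size, for illustration only */

/* Shared field standing in for lli_jobid; readers copy it without a lock. */
static char shared_jobid[JOBID_SIZE] = "job-41";

/* What a concurrent reader would have copied out during the slow lookup. */
static char seen_during_lookup[JOBID_SIZE];

/* Hypothetical stand-in for the expensive source lookup. */
void slow_lookup(char *out, size_t len, const char *current)
{
	assert(len > 0);
	/* A reader running at this point sees whatever the shared field holds. */
	memcpy(seen_during_lookup, shared_jobid, JOBID_SIZE);
	snprintf(out, len, "%s", current);
}

/* Zero the shared field first, then refill it in place. */
void update_zero_then_fill(const char *current)
{
	memset(shared_jobid, 0, JOBID_SIZE);
	/* The shared field stays empty for the whole duration of the lookup. */
	slow_lookup(shared_jobid, JOBID_SIZE, current);
}
```

Calling update_zero_then_fill("job-42") leaves seen_during_lookup empty: for as long as the lookup runs, a reader gets a null job id even though a valid one was in place both before and after the update.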

      Since cfs_get_environ is a very expensive operation, the jobid stays zeroed for a long time relative to the duration of a small I/O, so this race is hit frequently for small I/O operations.

      In particular, with 4k write operations, we see up to 2/3 of our IOs with a null job id, so they are not tracked.

      Using a lock or other hard synchronization here would be far too expensive, and it's OK if job stats are occasionally inaccurate. So my proposed patch just cuts the window in which the jobid will be invalid from very large to very small. (Also, in practice, the job id should not change much in the cases we really care about, namely when set from a job scheduler.)
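
One lock-free way to shrink the window is to do the expensive lookup into a private buffer and touch the shared field only for a short final copy. The following is a sketch of that idea under stated assumptions, not the actual patch: shared_jobid stands in for lli_jobid, slow_lookup is a hypothetical stand-in for the expensive lookup, JOBID_SIZE is an arbitrary size, and seen_during_lookup records what a concurrent reader would have copied out during the lookup.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define JOBID_SIZE 32	/* hypothetical size, for illustration only */

/* Shared field standing in for lli_jobid; readers copy it without a lock. */
static char shared_jobid[JOBID_SIZE] = "job-41";

/* What a concurrent reader would have copied out during the slow lookup. */
static char seen_during_lookup[JOBID_SIZE];

/* Hypothetical stand-in for the expensive source lookup. */
void slow_lookup(char *out, size_t len, const char *current)
{
	assert(len > 0);
	/* A reader running at this point sees whatever the shared field holds. */
	memcpy(seen_during_lookup, shared_jobid, JOBID_SIZE);
	snprintf(out, len, "%s", current);
}

/* Look up into a private buffer, then publish with one short copy. */
void update_via_temp(const char *current)
{
	char tmp[JOBID_SIZE];

	memset(tmp, 0, sizeof(tmp));
	slow_lookup(tmp, sizeof(tmp), current);

	/* Skip the write entirely when nothing changed (the common case). */
	if (memcmp(shared_jobid, tmp, JOBID_SIZE) != 0)
		memcpy(shared_jobid, tmp, JOBID_SIZE);
}
```

With this shape, a reader that runs during the lookup still sees the previous job id ("job-41" here) rather than an empty one. The final memcpy is not atomic, so a reader can still catch a partially updated value while the id is changing; the window is narrowed, not closed, which matches the trade-off described above.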

          People

            paf Patrick Farrell