Lustre / LU-11827

Race between llog_cat_declare_add_rec and llog_cat_current_log

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Lustre 2.13.0, Lustre 2.12.1
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

      Description

      llog_cat_declare_add_rec() passes &cathandle->u.chd.chd_next_log to llog_cat_prep_log() without anything protecting that pointer:

      int llog_cat_declare_add_rec(const struct lu_env *env,
      ...
              rc = llog_cat_prep_log(env, cathandle,
                                     &cathandle->u.chd.chd_current_log, th);
      ...
              rc = llog_cat_prep_log(env, cathandle,
                                     &cathandle->u.chd.chd_next_log, th);
      

      This races with llog_cat_current_log(), which switches to the next log and resets cathandle->u.chd.chd_next_log to NULL:

      static struct llog_handle *llog_cat_current_log(struct llog_handle *cathandle,
      ...
              down_write_nested(&cathandle->lgh_lock, LLOGH_CAT);
      ...
              CDEBUG(D_INODE, "use next log\n");
       
              loghandle = cathandle->u.chd.chd_next_log;
              cathandle->u.chd.chd_current_log = loghandle;
              cathandle->u.chd.chd_next_log = NULL;
              down_write_nested(&loghandle->lgh_lock, LLOGH_LOG);
      ...
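
      The window between the two accesses can be modeled in a small single-threaded sketch. Everything here is hypothetical and simplified (struct log_handle, struct cat_handle, prep_log_racy(), prep_log_snapshot() and the interleave hook are illustration-only names, not Lustre code): the hook stands in for the other process, and declare_create() stands in for a callee that rejects a NULL handle with -EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the llog structures. */
struct log_handle { int exists; };

struct cat_handle {
    struct log_handle *current_log;
    struct log_handle *next_log;
};

/* Models the switch in the excerpt above: promote next to current. */
static void switch_to_next(struct cat_handle *cat)
{
    assert(cat != NULL);
    cat->current_log = cat->next_log;
    cat->next_log = NULL;
}

/* Models a callee that rejects a NULL handle with -EINVAL. */
static int declare_create(struct log_handle *log)
{
    return log == NULL ? -EINVAL : 0;
}

/*
 * Racy shape: *plog is read more than once. The interleave hook, if set,
 * runs between the reads and plays the part of the other process.
 */
static int prep_log_racy(struct cat_handle *cat, struct log_handle **plog,
                         void (*interleave)(struct cat_handle *))
{
    if (*plog == NULL)
        return 0;                 /* nothing to prepare in this model */
    if ((*plog)->exists)
        return 0;                 /* log already exists */
    if (interleave != NULL)
        interleave(cat);          /* other process switches logs here */
    return declare_create(*plog); /* re-read: may now be NULL */
}

/* Same steps, but the pointer is read once into a local variable. */
static int prep_log_snapshot(struct cat_handle *cat, struct log_handle **plog,
                             void (*interleave)(struct cat_handle *))
{
    struct log_handle *log = *plog; /* single read */

    if (log == NULL)
        return 0;
    if (log->exists)
        return 0;
    if (interleave != NULL)
        interleave(cat);
    return declare_create(log);     /* uses the snapshot, never NULL here */
}
```

      With next_log pointing at a log that does not yet exist and switch_to_next() as the hook, prep_log_racy() returns -EINVAL while prep_log_snapshot() returns 0. The snapshot only removes the NULL re-read in this model; in real kernel code the pointer would also have to be kept stable and the handle kept alive, for example by holding a lock such as cathandle->lgh_lock, which is an assumption here and not a statement about the actual fix.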
      

      The following trace has been observed:
      Process 177713 enters llog_cat_declare_add_rec():

      00000040:00000001:19.0:1545138333.143874:0:177713:0:(llog_cat.c:605:llog_cat_declare_add_rec()) Process entered
      00000040:00000001:19.0:1545138333.143875:0:177713:0:(llog.c:940:llog_exist()) Process leaving (rc=1 : 1 : 1)
      00000040:00000001:19.0:1545138333.143876:0:177713:0:(llog.c:940:llog_exist()) Process leaving (rc=0 : 0 : 0)
      

      Process 99986 jumps in and sets the next-log pointer in the catalog handle to NULL:

      00000040:00000002:21.0:1545138333.143876:0:99986:0:(llog_cat.c:521:llog_cat_current_log()) use next log
      

      Process 177713 continues through llog_cat_prep_log() -> llog_declare_create() -> llog_handle2ops(), finds that *ploghandle is now NULL, and fails in llog_handle2ops() with -22 (-EINVAL):

      00000040:00000001:19.0:1545138333.143877:0:177713:0:(llog.c:954:llog_declare_create()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea)
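
      The three values printed in "Process leaving (rc=...)" are the same number shown as unsigned decimal, signed decimal and hex. A minimal sketch of that decoding follows; rc_from_trace() and check_trace_value() are hypothetical helper names, and the cast assumes a two's-complement platform, as on Linux.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Reinterpret the unsigned 64-bit value printed in the trace as a signed
 * return code. Assumes two's-complement conversion (true on Linux targets).
 */
static long long rc_from_trace(uint64_t raw)
{
    return (long long)(int64_t)raw;
}

/* Usage: the value from the trace decodes to -EINVAL (-22 on Linux). */
static void check_trace_value(void)
{
    assert(rc_from_trace(18446744073709551594ULL) == -EINVAL);
    assert(rc_from_trace(0xffffffffffffffeaULL) == -22);
}
```

      Arithmetically, 18446744073709551594 is 2^64 - 22, which is why it corresponds to -22.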
      

              People

              • Assignee:
                laisiyao Lai Siyao
                Reporter:
                vsaveliev Vladimir Saveliev