Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-9714

Changelog consumer test reports 'Local llog found corrupted'

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.11.0
    • None
    • None
    • 3
    • 9223372036854775807

    Description

      [admin@snx11253n002 ~]$ lctl get_param mdd.snx11253-MDT0000.changelog_users;
      mdd.snx11253-MDT0000.changelog_users=
      current index: 8936329
      ID index
      cl2 7049068
      cl31347 3348651
      cl31349 3348651
      cl31351 3348651
      cl33628 6330946
      cl33632 6335474
      cl2 7049068
      cl33962 7379382
      cl33963 7379382
      cl33964 7379382

      TEST PROCEDURE:
      For each rank:
      (I) CONSUME_LOGS (STARTREC=<arg> or 1)
      [1] Set endrec = startrec + params.chunksize
      [2] Read CL in parallel with N threads with range (startrec, endrec), and extract
      actual last record read (lastrec) .
      a] If no new files are found ...
      i] Retry until allowed (by refresh timeout, refresh retries).
      Otherwise quit.
      b] If new files are found ...
      i] If clearing is enabled, clear logs up to lastrec.
      ii] Increment record ranges
      (startrec = lastrec, endrec = startrec + chunksize)
      d] repeat step [2]
      TEST EXECUTION:
      The test was executed with the follow arguments:
      $ changelog-consumer.py -v --read-clear --clusers cl2 cl2 cl2 cl2 --mdt snx11253-MDT0000 --chunksize 9999 --read-threads 8 --update-retries 6 --update-after 30
      Basically this is reading 9999 Changelogs in 8 threads, then clearing 9999 Changelogs in 1 thread (implied, not show). These actions are being duplicated across 4 separate nodes, and against the "cl2" user.

      The root cause is parallel processing llog with modification.
      The llog has llog_handle which include

      lgh_cur_idx; /* used during llog_process */
      lgh_cur_offset; /* used during llog_process */

      They both are modified by llog_process_thread every time we process record from llog.

      Those fields are used at llog_osd_write_rec for modification/rewrite llog record. The race exist when two or more threads are processing the same llog and at least one of them do modification.
      1) llog_process_thread lgh_cur_idx=10 lgh_cur_offset=2000,
      2) llog_process_thread lgh_cur_idx=5 lgh_cur_offset=1000,

      1) log_process_thread >lpi_cb>mdd_changelog_user_purge_cb->llog_write... llog_osd_write_rec and then

      if (idx != loghandle->lgh_cur_idx) {                         
                                      CERROR("%s: modify index mismatch %d %d\n",          
                                             o->do_lu.lo_dev->ld_obd->obd_name, idx,       
                                             loghandle->lgh_cur_idx);                      
                                     RETURN(-EFAULT);                                     
                }  
      
      
      

      From logs

      May 4 19:55:29 snx11253n002 kernel: LustreError: 29737:0:(llog_osd.c:441:llog_osd_write_rec()) snx11253-MDT0000-osd: modify index mismatch 2 34132

      This was the first case.

      (2) Lets imagine that second thread modify lgh_cur_idx/lgh_cur_offset right after this check

      if (idx != loghandle->lgh_cur_idx)

      then

      lgi->lgi_off = loghandle->lgh_cur_offset;

      lgi->lgi_off became 1000 instead of 2000. And the wrong modification will happen.

      Attachments

        Activity

          People

            aboyko Alexander Boyko
            aboyko Alexander Boyko
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: