  1. Lustre
  2. LU-1573

avoid data corruption for direct io data

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.10.0
    • None
    • None
    • any lustre version
    • 4
    • 3
    • 4001

    Description

      When we call a shutdown (without -f), we set a 'notransno' flag so that all requests are kept in the replay queue.

      case 'A':
              LCONSOLE_WARN("Failing over %s\n",
                            obd->obd_name);
              obd->obd_fail = 1;
              obd->obd_no_transno = 1;
              ^^^^^^^^^^^^^^^^^^^^^^^^
      

      If that races with obd_commitrw_write() processing a DIO request, the reply is sent without a last_committed transno update.
      The ptlrpc client then puts that request in the replay queue, because transno > last_committed, and sends a completion event to the userland application. We now have a request in the replay queue whose brw pages point directly at user data.
      OOPS.

      If the userland application reuses the same buffer for different data after write(2) returns, and ptlrpc later replays that request, it sends invalid data to the OST, so the data on the OST side is corrupted.
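      The sequence above can be modeled in a few lines of C. This is a toy sketch, not Lustre code: `toy_req`, `toy_reply_received()`, `toy_replay()` and `toy_demo_corruption()` are hypothetical names. The only rule taken from the description is that the client keeps a replied request for replay while its transno is greater than the last committed transno it has seen.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: hypothetical names, not Lustre structures. */
struct toy_req {
        unsigned long long transno; /* transaction number from the server */
        const char *pages;          /* DIO: points straight at the user buffer */
        size_t len;
        int in_replay_queue;
};

/*
 * Client-side rule from the description: a replied request stays
 * queued for replay while transno > last_committed.
 */
static void toy_reply_received(struct toy_req *req,
                               unsigned long long last_committed)
{
        req->in_replay_queue = req->transno > last_committed;
}

/*
 * Replay resends whatever the pages point at now.
 * Returns the number of bytes "sent", or 0 if nothing was replayed.
 */
static size_t toy_replay(const struct toy_req *req, char *wire,
                         size_t wire_len)
{
        if (!req->in_replay_queue || req->len > wire_len)
                return 0;
        memcpy(wire, req->pages, req->len);
        return req->len;
}

/* Returns 1 if the replayed payload differs from what write(2) was given. */
int toy_demo_corruption(void)
{
        char user_buf[8] = "AAAAAAA";
        char wire[8] = { 0 };
        struct toy_req req = { .transno = 10, .pages = user_buf,
                               .len = sizeof(user_buf) };

        /* Reply raced with notransno: last_committed was not advanced. */
        toy_reply_received(&req, 9);
        /* write(2) has returned, so the application reuses its buffer. */
        memcpy(user_buf, "BBBBBBB", sizeof(user_buf));
        /* After reconnect the request is replayed from the same pointer. */
        toy_replay(&req, wire, sizeof(wire));
        return memcmp(wire, "AAAAAAA", sizeof(wire)) != 0;
}
```

      In this model the corruption disappears if the reply carries a last_committed value that covers the request, because the request then never enters the replay queue.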

      Reproducing the bug is very easy.
      Use the lctl --device notransno command and then do a direct I/O write. The write does not block and returns normally, but the ptlrpc request still holds a pointer to userspace.
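      The userland half of that reproduction could look roughly like the sketch below. It is an illustration under assumptions, not the test we ran: `dio_write_then_reuse()` is a hypothetical helper, the caller is assumed to pass a buffer whose alignment and length satisfy the direct I/O constraints of the file system, and the lctl step and the later failover are done separately and are not shown. Where `O_DIRECT` is not defined, the sketch falls back to a plain write, which runs the same steps but cannot hit the bug.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for O_DIRECT on Linux */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0              /* fallback: plain write, no direct I/O */
#endif

/*
 * Hypothetical helper: fill buf with 'first', write it to path, then
 * immediately refill the same buffer with 'second'.
 * use_dio != 0 adds O_DIRECT to the open flags.
 * Returns 0 on success, -1 on error.
 */
int dio_write_then_reuse(const char *path, int use_dio, void *buf,
                         size_t len, char first, char second)
{
        int flags = O_WRONLY | O_CREAT | O_TRUNC | (use_dio ? O_DIRECT : 0);
        int rc = -1;
        int fd = open(path, flags, 0644);

        if (fd < 0)
                return -1;

        memset(buf, first, len);
        if (write(fd, buf, len) == (ssize_t)len)
                rc = 0;

        /* write(2) has returned, so the application treats buf as free. */
        memset(buf, second, len);

        if (close(fd) != 0)
                rc = -1;
        return rc;
}
```

      A caller would allocate the buffer with posix_memalign(), keep it alive with the second pattern until failover and replay have finished, and then read the file back. The file should contain only the first pattern; any block holding the second pattern was sent from the reused buffer.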

      We found this bug while testing DIO under failover.
      We called the default replay_barrier / fail functions on the OST side and saw that the file is sometimes corrupted.
      The corruption is fully attributable to the requests replayed after reconnect.
      After we disabled sending a reply from the OST to the client in the sync journal case, the bug no longer appeared.
      It looks like this affects more than just the testing environment, although the race window there is smaller.

          People

            bzzz Alex Zhuravlev
            shadow Alexey Lyashkov