Lustre / LU-15869

In a striped directory, there is a cache inconsistency problem on multiple clients when renaming files


Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.12.4
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

       

      This problem reproduces reliably. The reproduction steps are as follows:

      1. Create a striped directory on client A

      lfs mkdir -c -1 testdir

      2. Create two files on client A

      cd testdir
      echo "a" > file
      echo "b" > tmpfile

      3. Run stat on both files on client A

      stat file
      stat tmpfile

      4. Run stat on both files on client B

      cd testdir
      stat file
      stat tmpfile

      5. Rename file to file.1 on client A

      mv file file.1

      6. Run stat on file from both clients

      # client A
      stat file
      # client B
      stat file

      7. Rename tmpfile to file on client A

      mv tmpfile file

       

      8. Run stat on file from both clients

      # client A
      stat file
      # client B
      stat file

      9. Rename file to file.2 on client A

      mv file file.2

      10. The results of all the steps above appear correct. Now run stat on file from both clients once more. At this point file no longer exists, yet client B still returns a dentry for it, which is wrong.

      # client A
      stat file
      # client B: returns a stale dentry for a file that has already been renamed
      stat file
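
      The ten steps above can be collected into one script. This is only a sketch under assumptions that are not in the report: it uses two mount points of the same filesystem on one node to stand in for client A and client B (this may or may not trigger the problem the way two separate nodes do), and the function names seen and repro_rename are made up for illustration. If lfs is missing or the target is not Lustre, it falls back to a plain mkdir so the sequence can be dry-run elsewhere.

```shell
#!/bin/sh
# Sketch only: replays steps 1-10 through two mount points of one filesystem.
# The two-mount setup and all function names here are assumptions.

# Print "present" if stat succeeds on the path, "absent" otherwise.
seen() {
    if stat "$1" >/dev/null 2>&1; then
        echo present
    else
        echo absent
    fi
}

# $1 = mount point used as client A, $2 = mount point used as client B.
repro_rename() {
    dir_a=$1/testdir
    dir_b=$2/testdir

    # Step 1: striped directory as in the report; plain mkdir as a fallback
    # when lfs is unavailable or the target is not a Lustre filesystem.
    lfs mkdir -c -1 "$dir_a" 2>/dev/null || mkdir -p "$dir_a"

    # Step 2: create the two files on client A.
    echo "a" > "$dir_a/file"
    echo "b" > "$dir_a/tmpfile"

    # Steps 3-4: stat both files on A, then on B, so both clients cache them.
    for d in "$dir_a" "$dir_b"; do
        stat "$d/file" "$d/tmpfile" >/dev/null
    done

    # Steps 5-6: rename on A, then look up the old name on both clients.
    mv "$dir_a/file" "$dir_a/file.1"
    echo "step 6: A=$(seen "$dir_a/file") B=$(seen "$dir_b/file")" >&2

    # Steps 7-8: move tmpfile onto the old name, then look it up again.
    mv "$dir_a/tmpfile" "$dir_a/file"
    echo "step 8: A=$(seen "$dir_a/file") B=$(seen "$dir_b/file")" >&2

    # Step 9: rename the name away again on A.
    mv "$dir_a/file" "$dir_a/file.2"

    # Step 10: file should now be absent on both clients.
    echo "A: $(seen "$dir_a/file")"
    echo "B: $(seen "$dir_b/file")"
}
```

      It could be invoked as repro_rename /mnt/lustre /mnt/lustre2, where both paths are hypothetical. A consistent run prints "absent" for both A and B in step 10; the behavior described in this ticket would show up as "B: present".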

       

       

      Screenshots of all steps are shown below

      Some clues found so far:

      1. On the last stat on client B, the dentry for file is read from the client cache

      2. Judging from the client-side processing flow, the cached dentry is never released

      3. systemtap trace

      4. It looks like some locks are not cancelled. During its second-to-last stat of file, client B was granted a lock with bits=27. When client B later receives the ldlm_callback triggered by the rename, no callback arrives for that bits=27 lock
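
      To make the bits=27 value in the last clue easier to read, the helper below decodes an inodebits mask into lock bit names. It is a sketch, not part of the original report: decode_ibits is a made-up name, and the bit values (LOOKUP 0x01, UPDATE 0x02, OPEN 0x04, LAYOUT 0x08, PERM 0x10, XATTR 0x20, DOM 0x40) are the MDS_INODELOCK_* values as I recall them from lustre_idl.h, so they should be checked against the source tree for the Lustre version in use. The report does not say whether 27 is decimal or hexadecimal, so the helper accepts either form.

```shell
#!/bin/sh
# Sketch only: decode an MDS inodebits mask into bit names.
# The bit table is an assumption to verify against lustre_idl.h.

# $1 = mask, decimal (27) or hexadecimal (0x27).
decode_ibits() {
    bits=$(( $1 ))
    out=""
    for pair in 1:LOOKUP 2:UPDATE 4:OPEN 8:LAYOUT 16:PERM 32:XATTR 64:DOM; do
        mask=${pair%%:*}    # numeric value before the colon
        name=${pair#*:}     # bit name after the colon
        if [ $(( bits & mask )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
    done
    echo "$out"
}
```

      Under that table, decode_ibits 27 gives LOOKUP UPDATE LAYOUT PERM, and decode_ibits 0x27 gives LOOKUP UPDATE OPEN XATTR. Both readings include LOOKUP and UPDATE.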

       


          People

            Assignee: WC Triage (wc-triage)
            Reporter: Hui Chen (chenhui)