
Error replicating xattr for /tmp/target/d8.lustre-rsync-test/d07/d073/b4: 2

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.12.0
    • Lustre 2.12.0
    • None
    • 3

    Description

      The lustre-rsync-test test_8() is spewing thousands of messages:

      Registered 1 changelog users: 'cl14'
      lustre-MDT0000: Registered changelog user cl14
      Error replicating  xattr for /tmp/target/d8.lustre-rsync-test/d01/d011/c0: 2
      Error replicating  xattr for /tmp/target/d8.lustre-rsync-test/d01/d011/c0: 2
      Error replicating  xattr for /tmp/target/d8.lustre-rsync-test/d01/d011/c0: 2
      :
      Error replicating  xattr for /tmp/target/d8.lustre-rsync-test/d07/d073/b4: 2
      :
      Source: /mnt/lustre
      Target: /tmp/target
      Statuslog: /tmp/lustre_rsync.log
      Changelog registration: cl14
      Starting changelog record: 0
      Clear changelog after use: no
      Errors: 8100
      lustre_rsync took 324 seconds
      Changelog records consumed: 5121
      

      Each of the identical messages is printed 5x before the next file is listed, which makes it seem like it isn't working correctly, even though the test is not marked as failing.
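
      To see how many distinct files are affected and how often each message repeats, the output can be collapsed with a short script. This is only a sketch: the regular expression is modeled on the log lines quoted above, the function name summarize_xattr_errors is made up for illustration, and the trailing number is assumed (not confirmed here) to be an errno-style return code.

```python
import re
from collections import Counter

# Pattern modeled on the log excerpt in this ticket. The trailing number is
# assumed to be an errno-style return code; that is not confirmed by the log.
ERROR_RE = re.compile(
    r"^\s*Error replicating\s+xattr for (?P<path>.+): (?P<rc>-?\d+)\s*$"
)


def summarize_xattr_errors(lines):
    """Return a Counter mapping (path, rc) to how many times it was logged."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.match(line)
        if match:
            counts[(match.group("path"), int(match.group("rc")))] += 1
    return counts


# Lines copied from the description; the real log has thousands of them.
sample = [
    "Registered 1 changelog users: 'cl14'",
    "Error replicating  xattr for /tmp/target/d8.lustre-rsync-test/d01/d011/c0: 2",
    "Error replicating  xattr for /tmp/target/d8.lustre-rsync-test/d01/d011/c0: 2",
    "Error replicating  xattr for /tmp/target/d8.lustre-rsync-test/d01/d011/c0: 2",
    "Error replicating  xattr for /tmp/target/d8.lustre-rsync-test/d07/d073/b4: 2",
]

counts = summarize_xattr_errors(sample)
for (path, rc), n in counts.items():
    print(f"{n}x rc={rc} {path}")
```

      Running it over the full test log instead of the sample would show whether every file is reported the same number of times and whether the return code is always the same.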

      This looks like it started failing around 2018-07-31, but tracking it back exactly is slow because it involves looking at each passing test individually. I bisected the results to narrow it down to this date (+/- 1 day or so).

      A good run looks like:

      Starting changelog record: 0
      Clear changelog after use: no
      Errors: 0
      lustre_rsync took 191 seconds
      Changelog records consumed: 3501
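
      Since the test is not marked as failing even when errors are reported, a check on the summary block is one way to tell a good run from a bad one. The following is a hedged sketch, not part of the test suite: parse_summary and run_is_clean are hypothetical helpers, and the field names are taken only from the two summaries quoted in this description.

```python
import re


def parse_summary(lines):
    """Extract the numeric fields from a lustre_rsync summary block."""
    summary = {}
    for raw in lines:
        line = raw.strip()
        took = re.match(r"lustre_rsync took (\d+) seconds$", line)
        if took:
            summary["seconds"] = int(took.group(1))
            continue
        key, sep, value = line.partition(": ")
        # Only the two counters seen in the ticket are treated as numeric.
        if sep and key in ("Errors", "Changelog records consumed"):
            summary[key] = int(value)
    return summary


def run_is_clean(summary):
    """A run counts as clean only if it explicitly reports zero errors."""
    return summary.get("Errors") == 0


bad_run = [
    "Starting changelog record: 0",
    "Clear changelog after use: no",
    "Errors: 8100",
    "lustre_rsync took 324 seconds",
    "Changelog records consumed: 5121",
]
good_run = [
    "Starting changelog record: 0",
    "Clear changelog after use: no",
    "Errors: 0",
    "lustre_rsync took 191 seconds",
    "Changelog records consumed: 3501",
]

for name, run in (("bad", bad_run), ("good", good_run)):
    summary = parse_summary(run)
    print(name, summary, "clean" if run_is_clean(summary) else "has errors")
```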
      

          Activity

            [LU-11479] Error replicating xattr for /tmp/target/d8.lustre-rsync-test/d07/d073/b4: 2
            pjones Peter Jones added a comment -

            Landed for 2.12


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33373/
            Subject: LU-11479 rsync: replicate attributes of file in .lustrerepl
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 337f230565ea033d126653e8da01315211470665

            jhammond John Hammond added a comment -

            Please note that, even though LU-11450 makes the messages go away, there is still a bug in lustre_rsync which is fixed by https://review.whamcloud.com/33373. So let's leave this open until that change is landed.


            gerrit Gerrit Updater added a comment -

            John L. Hammond (jhammond@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33373
            Subject: LU-11479 rsync: replicate attributes of file in .lustrerepl
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 7ec4297fd6ab1f0e8ae4199c7646960f6e047c46

            qian_wc Qian Yingjin added a comment -

            After applying the LU-11450 patch (trusted.som xattr is logged in changelog), the error messages were gone.

            adilger Andreas Dilger added a comment -

            The LSOM patch was landed on 2018-07-30, so it is likely to be the cause of this problem. It will hopefully go away when the patch for LU-11466 lands.

            However, it isn't clear whether we should return an error from trying to set the trusted.som xattr or not. Some tools like "cp" and "tar" will try to copy all of the xattrs, and since trusted.som is listed, it will generate an error and lots of noise. Similarly, we silently eat any attempt to set trusted.lov directly on an existing file, so that tools don't complain.

            Separately, it would be good to get a better error message in lustre_rsync, as we discussed.
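
            To make the two points above concrete (tools that copy every listed xattr, and the wish for a clearer error message), here is a rough sketch of a copy loop that skips xattrs it is not expected to set and reports failures with the xattr name and a decoded errno. It is not how lustre_rsync is implemented: SKIP_XATTRS, copy_xattrs and describe_xattr_error are hypothetical names, the skip policy is an assumption, and the demo uses in-memory stand-ins for os.listxattr, os.getxattr and os.setxattr (which are Linux-only in Python) so it runs without a Lustre mount.

```python
import errno
import os

# Hypothetical skip list. The names come from the discussion above; the
# policy of skipping them on copy is an assumption for illustration only.
SKIP_XATTRS = frozenset({"trusted.som", "trusted.lov"})


def describe_xattr_error(name, path, err):
    """Build a message naming the xattr and decoding the errno value."""
    symbol = errno.errorcode.get(err, "unknown")
    return (f"Error replicating xattr {name} for {path}: "
            f"{os.strerror(err)} ({symbol}, errno {err})")


def copy_xattrs(src, dst, skip=SKIP_XATTRS, listx=None, getx=None, setx=None):
    """Copy xattrs from src to dst, skipping names in `skip`.

    Returns a list of error messages instead of printing, so the caller
    decides how noisy to be. The three callables default to the os module
    functions and can be replaced for testing.
    """
    listx = listx or os.listxattr
    getx = getx or os.getxattr
    setx = setx or os.setxattr
    errors = []
    for name in listx(src):
        if name in skip:
            continue  # do not even attempt xattrs we expect to be refused
        try:
            setx(dst, name, getx(src, name))
        except OSError as exc:
            errors.append(describe_xattr_error(name, dst, exc.errno))
    return errors


# In-memory stand-ins so the sketch runs anywhere.
source_xattrs = {"user.a": b"1", "trusted.som": b"x", "user.b": b"2"}
written = {}


def fake_set(path, name, value):
    if name == "user.b":
        # Simulate the target path not existing yet.
        raise OSError(errno.ENOENT, os.strerror(errno.ENOENT))
    written[name] = value


errors = copy_xattrs(
    "/mnt/lustre/f", "/tmp/target/f",
    listx=lambda path: list(source_xattrs),
    getx=lambda path, name: source_xattrs[name],
    setx=fake_set,
)
for message in errors:
    print(message)
```

            If the trailing "2" in the messages quoted in the description is an errno value, it would decode to ENOENT in a message of this shape, which already says more than the bare number.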

            People

              jhammond John Hammond
              adilger Andreas Dilger