
LustreError (osd_index.c:1201:osd_dir_delete()) lquake-MDT0000: failed to destroy agent object (0) for the entry data049: rc = -22

Details

    • Bug
    • Resolution: Not a Bug
    • Minor
    • None
    • None
    • Lustre 2.10.8 and 2.12.5
      RHEL 7.8
    • 3

    Description

      The jet1 console log reported the following while an I/O SWL (and a few other miscellaneous jobs) was running on Opal:

      Jul 15 09:51:54 jet1 kernel: LustreError: 1391:0:(osd_index.c:1201:osd_dir_delete()) lquake-MDT0000: failed to destroy agent object (0) for the entry data049: rc = -22
      Jul 15 09:51:54 jet1 kernel: LustreError: 1391:0:(osd_index.c:1201:osd_dir_delete()) Skipped 1 previous similar message
      Jul 15 09:51:54 jet1 kernel: LustreError: 17478:0:(osd_index.c:1201:osd_dir_delete()) lquake-MDT0000: failed to destroy agent object (0) for the entry data026: rc = -22
      Jul 15 09:51:54 jet1 kernel: LustreError: 17478:0:(osd_index.c:1201:osd_dir_delete()) Skipped 75 previous similar messages
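      When triaging messages like these, it can help to tally them per entry and to decode the return code. The following is a minimal sketch, not part of the original report: the regular expressions are assumptions fitted to the four lines quoted above, and the function name tally is hypothetical. It also assumes that each "Skipped N previous similar messages" line counts suppressed (rate-limited) occurrences of the same message, and that rc is a negated Linux errno value.

```python
import errno
import re
from collections import Counter

# Assumed pattern for the osd_dir_delete() message, fitted to the quoted log lines.
AGENT_RE = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:\(osd_index\.c:\d+:osd_dir_delete\(\)\) "
    r"(?P<target>\S+): failed to destroy agent object \(\d+\) "
    r"for the entry (?P<entry>\S+): rc = (?P<rc>-?\d+)"
)
# Assumed pattern for the rate-limiting summary line from the same call site.
SKIPPED_RE = re.compile(
    r"\(osd_index\.c:\d+:osd_dir_delete\(\)\) "
    r"Skipped (?P<n>\d+) previous similar messages?"
)


def tally(lines):
    """Return (Counter of (target, entry, rc), total count including skipped)."""
    entries = Counter()
    total = 0
    for line in lines:
        m = AGENT_RE.search(line)
        if m:
            entries[(m["target"], m["entry"], int(m["rc"]))] += 1
            total += 1
            continue
        s = SKIPPED_RE.search(line)
        if s:
            # Suppressed occurrences are counted but their entries are unknown.
            total += int(s["n"])
    return entries, total


SAMPLE = [
    "Jul 15 09:51:54 jet1 kernel: LustreError: 1391:0:(osd_index.c:1201:osd_dir_delete()) lquake-MDT0000: failed to destroy agent object (0) for the entry data049: rc = -22",
    "Jul 15 09:51:54 jet1 kernel: LustreError: 1391:0:(osd_index.c:1201:osd_dir_delete()) Skipped 1 previous similar message",
    "Jul 15 09:51:54 jet1 kernel: LustreError: 17478:0:(osd_index.c:1201:osd_dir_delete()) lquake-MDT0000: failed to destroy agent object (0) for the entry data026: rc = -22",
    "Jul 15 09:51:54 jet1 kernel: LustreError: 17478:0:(osd_index.c:1201:osd_dir_delete()) Skipped 75 previous similar messages",
]

entries, total = tally(SAMPLE)
for (target, entry, rc), count in entries.items():
    # errno.errorcode maps a positive errno number to its symbolic name.
    name = errno.errorcode.get(-rc, "unknown")
    print(f"{target} {entry}: rc={rc} ({name}), printed {count} time(s)")
print(f"total occurrences, including skipped: {total}")
```

      On Linux, errno 22 is EINVAL, so rc = -22 decodes to -EINVAL. Only the printed lines name an entry; the skipped occurrences contribute to the total but cannot be attributed to a specific directory entry.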
      

      No console log messages on Opal appeared to correspond: there were no messages at 09:51, and no unusual messages at all on Opal.

      Testing under both Lustre 2.10 and Lustre 2.12 included creating striped directories via lfs mkdir -i3 -c4 <target>. Some of these directories were likely created under 2.10 and deleted under 2.12.

      Before this occurred, the jet servers had been upgraded from Lustre 2.10 to Lustre 2.12, then downgraded to 2.10, and then upgraded to 2.12 again. Significant I/O was performed between each Lustre version, and changelog users were deregistered and logs cleared.

        Activity

          [LU-13804] LustreError (osd_index.c:1201:osd_dir_delete()) lquake-MDT0000: failed to destroy agent object (0) for the entry data049: rc = -22
          ofaaland Olaf Faaland added a comment -

          Thank you

          laisiyao Lai Siyao added a comment -

          Yes, it's exactly what you described.

          pjones Peter Jones added a comment -

          Lai

          Could you please advise?

          Thanks

          Peter

          ofaaland Olaf Faaland added a comment -

          Note this was performed on our staging file system, not a production one; but we are about to put Lustre 2.12.5 on all our file systems, so I'm working through error messages to minimize risk.

          ofaaland Olaf Faaland added a comment - - edited

          The osd_dir_delete() CERROR was added after 2.10 branched. The commit is

          • c0a455e LU-10190 osd-zfs: create agent object for remote object

          Based on the commit message, I believe that creating remote objects under Lustre 2.10 and then deleting them under 2.12 could trigger this error message.

          Reaching this error message does not result in osd_dir_delete() returning early or returning an error.

          So this seems harmless (i.e., it does not indicate damage to the file system), and it also appears to be explained by my downgrade-then-upgrade sequence.

          Please confirm, thanks.


          People

            laisiyao Lai Siyao
            ofaaland Olaf Faaland