Lustre / LU-17170

Likely at unlink: many LustreError: mdt_open.c:1217:mdt_cross_open() fsname-MDTxxxx: [FID] doesn't exist!: rc = -14

Details

    • Bug
    • Resolution: Not a Bug
    • Minor
    • None
    • Lustre 2.15.3
    • None
    • CentOS 7.9 kernel 3.10.0-1160.90.1.el7_lustre.pl1.x86_64
    • 3

    Description

      With 2.15.3 on Sherlock's scratch filesystem (Fir), we are seeing a LOT of the following messages on all four MDTs when files are being purged by Robinhood:

      # clush -w @mds -L "journalctl -n 10 -k | grep LustreError"
      fir-md1-s1: Oct 05 15:30:56 fir-md1-s1 kernel: LustreError: 32843:0:(mdt_open.c:1570:mdt_reint_open()) fir-MDT0000: name '[0x20005b5ae:0x14198:0x0]' present, but FID [0x20005b5ae:0x14198:0x0] is invalid
      fir-md1-s1: Oct 05 15:31:45 fir-md1-s1 kernel: LustreError: 51313:0:(mdt_open.c:1570:mdt_reint_open()) fir-MDT0000: name '[0x20005b5b5:0x1dd34:0x0]' present, but FID [0x20005b5b5:0x1dd34:0x0] is invalid
      fir-md1-s1: Oct 05 15:33:22 fir-md1-s1 kernel: LustreError: 32959:0:(mdt_open.c:1570:mdt_reint_open()) fir-MDT0000: name '[0x20005b5cf:0xff79:0x0]' present, but FID [0x20005b5cf:0xff79:0x0] is invalid
      fir-md1-s2: Oct 05 15:35:57 fir-md1-s2 kernel: LustreError: 125135:0:(mdt_open.c:1217:mdt_cross_open()) fir-MDT0001: [0x24007e440:0x83be:0x0] doesn't exist!: rc = -14
      fir-md1-s2: Oct 05 15:35:57 fir-md1-s2 kernel: LustreError: 125135:0:(mdt_open.c:1217:mdt_cross_open()) Skipped 605 previous similar messages
      fir-md1-s2: Oct 05 15:36:06 fir-md1-s2 kernel: LustreError: 125409:0:(mdt_open.c:1217:mdt_cross_open()) fir-MDT0001: [0x24007e440:0x88ad:0x0] doesn't exist!: rc = -14
      fir-md1-s2: Oct 05 15:36:06 fir-md1-s2 kernel: LustreError: 125409:0:(mdt_open.c:1217:mdt_cross_open()) Skipped 1256 previous similar messages
      fir-md1-s2: Oct 05 15:36:25 fir-md1-s2 kernel: LustreError: 125341:0:(mdt_open.c:1217:mdt_cross_open()) fir-MDT0001: [0x24007e440:0x92bd:0x0] doesn't exist!: rc = -14
      fir-md1-s2: Oct 05 15:36:25 fir-md1-s2 kernel: LustreError: 125341:0:(mdt_open.c:1217:mdt_cross_open()) Skipped 3743 previous similar messages
      fir-md1-s2: Oct 05 15:37:03 fir-md1-s2 kernel: LustreError: 125341:0:(mdt_open.c:1217:mdt_cross_open()) fir-MDT0001: [0x24007e50d:0x15e22:0x0] doesn't exist!: rc = -14
      fir-md1-s2: Oct 05 15:37:03 fir-md1-s2 kernel: LustreError: 125341:0:(mdt_open.c:1217:mdt_cross_open()) Skipped 8438 previous similar messages
      fir-md1-s2: Oct 05 15:38:18 fir-md1-s2 kernel: LustreError: 125341:0:(mdt_open.c:1217:mdt_cross_open()) fir-MDT0001: [0x24007e50e:0x13804:0x0] doesn't exist!: rc = -14
      fir-md1-s2: Oct 05 15:38:18 fir-md1-s2 kernel: LustreError: 125341:0:(mdt_open.c:1217:mdt_cross_open()) Skipped 16783 previous similar messages
      fir-md1-s3: Oct 05 15:01:52 fir-md1-s3 kernel: LustreError: 14993:0:(mdt_open.c:1217:mdt_cross_open()) fir-MDT0002: [0x2c006c67d:0x2a0b:0x0] doesn't exist!: rc = -14
      fir-md1-s3: Oct 05 15:01:52 fir-md1-s3 kernel: LustreError: 14993:0:(mdt_open.c:1217:mdt_cross_open()) Skipped 18907 previous similar messages
      fir-md1-s3: Oct 05 15:17:31 fir-md1-s3 kernel: LustreError: 12198:0:(mdt_open.c:1217:mdt_cross_open()) fir-MDT0002: [0x2c006c67d:0x2950:0x0] doesn't exist!: rc = -14
      fir-md1-s3: Oct 05 15:17:31 fir-md1-s3 kernel: LustreError: 12198:0:(mdt_open.c:1217:mdt_cross_open()) Skipped 19208 previous similar messages
      fir-md1-s3: Oct 05 15:46:14 fir-md1-s3 kernel: LustreError: 65665:0:(mdt_open.c:1217:mdt_cross_open()) fir-MDT0002: [0x2c006c606:0x524d:0x0] doesn't exist!: rc = -14
      fir-md1-s3: Oct 05 15:46:14 fir-md1-s3 kernel: LustreError: 65665:0:(mdt_open.c:1217:mdt_cross_open()) Skipped 49094 previous similar messages
      fir-md1-s3: Oct 05 15:47:29 fir-md1-s3 kernel: LustreError: 12352:0:(mdt_open.c:1217:mdt_cross_open()) fir-MDT0002: [0x2c006c65c:0x145df:0x0] doesn't exist!: rc = -14
      fir-md1-s3: Oct 05 15:47:29 fir-md1-s3 kernel: LustreError: 12352:0:(mdt_open.c:1217:mdt_cross_open()) Skipped 12772 previous similar messages
      fir-md1-s3: Oct 05 15:49:59 fir-md1-s3 kernel: LustreError: 14987:0:(mdt_open.c:1217:mdt_cross_open()) fir-MDT0002: [0x2c006c710:0x15304:0x0] doesn't exist!: rc = -14
      fir-md1-s3: Oct 05 15:49:59 fir-md1-s3 kernel: LustreError: 14987:0:(mdt_open.c:1217:mdt_cross_open()) Skipped 32807 previous similar messages
      fir-md1-s4: Oct 05 15:39:54 fir-md1-s4 kernel: LustreError: 23103:0:(mdt_open.c:1217:mdt_cross_open()) fir-MDT0003: [0x280067e5f:0x1c1ab:0x0] doesn't exist!: rc = -14
      fir-md1-s4: Oct 05 15:39:54 fir-md1-s4 kernel: LustreError: 23103:0:(mdt_open.c:1217:mdt_cross_open()) Skipped 19686 previous similar messages
      fir-md1-s4: Oct 05 15:40:10 fir-md1-s4 kernel: LustreError: 23395:0:(mdt_open.c:1217:mdt_cross_open()) fir-MDT0003: [0x28006d889:0x18767:0x0] doesn't exist!: rc = -14
      fir-md1-s4: Oct 05 15:40:10 fir-md1-s4 kernel: LustreError: 23395:0:(mdt_open.c:1217:mdt_cross_open()) Skipped 2687 previous similar messages
      fir-md1-s4: Oct 05 15:40:42 fir-md1-s4 kernel: LustreError: 23445:0:(mdt_open.c:1217:mdt_cross_open()) fir-MDT0003: [0x28006d889:0x195c0:0x0] doesn't exist!: rc = -14
      fir-md1-s4: Oct 05 15:40:42 fir-md1-s4 kernel: LustreError: 23445:0:(mdt_open.c:1217:mdt_cross_open()) Skipped 6453 previous similar messages
      fir-md1-s4: Oct 05 15:41:46 fir-md1-s4 kernel: LustreError: 23017:0:(mdt_open.c:1217:mdt_cross_open()) fir-MDT0003: [0x28006d889:0x1cf16:0x0] doesn't exist!: rc = -14
      fir-md1-s4: Oct 05 15:41:46 fir-md1-s4 kernel: LustreError: 23017:0:(mdt_open.c:1217:mdt_cross_open()) Skipped 15651 previous similar messages
      fir-md1-s4: Oct 05 15:43:54 fir-md1-s4 kernel: LustreError: 23367:0:(mdt_open.c:1217:mdt_cross_open()) fir-MDT0003: [0x28006daa0:0xd855:0x0] doesn't exist!: rc = -14
      fir-md1-s4: Oct 05 15:43:54 fir-md1-s4 kernel: LustreError: 23367:0:(mdt_open.c:1217:mdt_cross_open()) Skipped 23918 previous similar messages
      

      However, these errors seem to be harmless; at least, we have not been able to find any problem so far. We have verified that those FIDs belong to files being automatically unlinked by Robinhood (we purge after 90 days), and the LustreError messages are logged in the same second as the unlink.
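The same-second check described above could be scripted roughly as follows. This is only a sketch, not the procedure actually used for this report: `parse_line`, `same_second_as_unlink` and the `unlink_times` mapping are hypothetical names, and the unlink records are assumed to have already been reduced to a FID-to-timestamp mapping in the same `Mon DD HH:MM:SS` form as the kernel log (how Robinhood reports its purges is not shown in this ticket). The two sample lines are copied from the log excerpt; the unlink timestamp is illustrative.

```python
import re

# Shape of the "host: timestamp host kernel: LustreError: pid:0:(file:line:func()) msg"
# lines shown in the clush/journalctl excerpt.
LINE_RE = re.compile(
    r"^(?P<host>\S+): (?P<stamp>\w{3} \d{2} \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"LustreError: \d+:\d+:\((?P<src>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"(?P<msg>.*)$"
)
FID_RE = re.compile(r"\[0x[0-9a-f]+:0x[0-9a-f]+:0x[0-9a-f]+\]")
SKIPPED_RE = re.compile(r"Skipped (\d+) previous similar messages")

# Two lines copied from the excerpt in this report.
SAMPLE = """
fir-md1-s2: Oct 05 15:35:57 fir-md1-s2 kernel: LustreError: 125135:0:(mdt_open.c:1217:mdt_cross_open()) fir-MDT0001: [0x24007e440:0x83be:0x0] doesn't exist!: rc = -14
fir-md1-s2: Oct 05 15:35:57 fir-md1-s2 kernel: LustreError: 125135:0:(mdt_open.c:1217:mdt_cross_open()) Skipped 605 previous similar messages
"""


def parse_line(line):
    """Return a dict for one LustreError line, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    msg = m.group("msg")
    fid = FID_RE.search(msg)
    skipped = SKIPPED_RE.search(msg)
    return {
        "host": m.group("host"),
        # Kept as a string: the journal's short format carries no year.
        "stamp": m.group("stamp"),
        "func": m.group("func"),
        "fid": fid.group(0) if fid else None,
        "skipped": int(skipped.group(1)) if skipped else 0,
    }


def same_second_as_unlink(events, unlink_times):
    """Split FID-bearing events by whether they share a timestamp with the unlink.

    unlink_times is a hypothetical {fid: "Mon DD HH:MM:SS"} mapping built
    from whatever purge records are available.
    """
    matched, unmatched = [], []
    for ev in events:
        if ev["fid"] is None:
            continue  # "Skipped N previous similar messages" lines carry no FID
        if unlink_times.get(ev["fid"]) == ev["stamp"]:
            matched.append(ev)
        else:
            unmatched.append(ev)
    return matched, unmatched


if __name__ == "__main__":
    events = [e for e in map(parse_line, SAMPLE.splitlines()) if e]
    # Illustrative unlink record, not taken from a real purge log.
    unlinks = {"[0x24007e440:0x83be:0x0]": "Oct 05 15:35:57"}
    matched, unmatched = same_second_as_unlink(events, unlinks)
    print(len(matched), "matched;", len(unmatched), "unmatched;",
          sum(e["skipped"] for e in events), "suppressed")
```

Because the kernel rate-limits these messages, only the FIDs that were actually printed can be checked this way; the "Skipped N" counters give the volume but not the identity of the suppressed ones.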

       

          People

            Assignee: WC Triage (wc-triage)
            Reporter: Stephane Thiell (sthiell)