Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-6719

(osd_object.c:410:osd_object_init()) lsd-MDT0000: lookup [0x200042d88:0x1885d:0x0]/0x45f89372 failed: rc = -17

Details

    • 3
    • 9223372036854775807

    Description

      On at least one of our MDTs, we have gotten the following error many times:

      2015-05-27 14:57:28 LustreError: 14118:0:(osd_object.c:410:osd_object_init()) lsd-MDT0000: lookup [0x200042d88:0x1885d:0x0]/0x45f89372 failed: rc = -17

      -17 is EEXIST.

      There are no other Lustre messages in the console logs anywhere near these messages.

      We are using ZFS OSDs.

      Lustre version 2.5.3-5chaos. (See github.com/chaos/lustre)

      If this is a bug, then please work on a fix. If this is not a bug, then please silence the scary message.

      Attachments

        Activity

          [LU-6719] (osd_object.c:410:osd_object_init()) lsd-MDT0000: lookup [0x200042d88:0x1885d:0x0]/0x45f89372 failed: rc = -17

          Could you please provide a backported patch for Lustre 2.5?

          marc@llnl.gov D. Marc Stearman (Inactive) added a comment - Could you please provide a backported patch for Lustre 2.5?

          Patch has landed for 2.8

          jgmitter Joseph Gmitter (Inactive) added a comment - Patch has landed for 2.8

          Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/18054/
          Subject: LU-6719 osd-zfs: Ignore EEXIST during object init
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: d96c38aa85d80f5e0630c28d242f8279987c8d5e

          gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/18054/ Subject: LU-6719 osd-zfs: Ignore EEXIST during object init Project: fs/lustre-release Branch: master Current Patch Set: Commit: d96c38aa85d80f5e0630c28d242f8279987c8d5e

          Nathaniel Clark (nathaniel.l.clark@intel.com) uploaded a new patch: http://review.whamcloud.com/18054
          Subject: LU-6719 obdclass: Retry lookup
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: f6c8ff3e40e8fa403d23192a3b8f819a6f8a0e77

          gerrit Gerrit Updater added a comment - Nathaniel Clark (nathaniel.l.clark@intel.com) uploaded a new patch: http://review.whamcloud.com/18054 Subject: LU-6719 obdclass: Retry lookup Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: f6c8ff3e40e8fa403d23192a3b8f819a6f8a0e77

          With +info added to debugging, the logs cycle rather quickly. Hopefully there is enough info to find where the code is having problems.

          marc@llnl.gov D. Marc Stearman (Inactive) added a comment - With +info added to debugging, the logs cycle rather quickly. Hopefully there is enough info to find where the code is having problems.

          Quick triage of debug_log attached, I find one instance of the message. I've pulled out all the references to that fid from the log:

          Sun Jan  3 20:38:12 2016 [23949] (mdt_handler.c:2488:mdt_object_new()) Allocate object for [0x20006c6de:0x1c752:0x0]
          Sun Jan  3 20:38:12 2016 [23949] (mdt_handler.c:5322:mdt_object_init()) object init, fid = [0x20006c6de:0x1c752:0x0]
          Sun Jan  3 20:38:12 2016 [23949] (ldlm_resource.c:1515:ldlm_resource_dump()) --- Resource: [0x20006c6de:0x1c752:0x0].0 (ffff880f801a1980) refcount = 1
          Sun Jan  3 20:38:12 2016 [23837] (mdt_handler.c:5342:mdt_object_free()) object free, fid = [0x20006c6de:0x1c752:0x0]
          Sun Jan  3 20:38:12 2016 [24058] (mdt_handler.c:2505:mdt_object_find()) Find object for [0x20006c6de:0x1c752:0x0]
          Sun Jan  3 20:38:12 2016 [24058] (mdt_handler.c:5322:mdt_object_init()) object init, fid = [0x20006c6de:0x1c752:0x0]
          Sun Jan  3 20:38:12 2016 [24058] (ldlm_resource.c:1515:ldlm_resource_dump()) --- Resource: [0x20006c6de:0x1c752:0x0].0 (ffff880f801a1980) refcount = 2
          Sun Jan  3 20:38:12 2016 [24058] (ldlm_resource.c:1521:ldlm_resource_dump()) ### ### ns: mdt-lse-MDT0000_UUID lock: ffff880856255d00/0xee63e92424c3a04e lrc: 2/0,0 mode: CR/CR res: [0x20006c6de:0x1c752:0x0].0 bits 0x9 rrc: 2 type: IBT flags: 0x40200000000000 nid: 192.168.115.26@o2ib10 remote: 0x1e9fdd93a70839d9 expref: 1571 pid: 23949 timeout: 0 lvb_type: 0 used 402457368
          Sun Jan  3 20:38:12 2016 [23949] (mdt_handler.c:2505:mdt_object_find()) Find object for [0x20006c6de:0x1c752:0x0]
          Sun Jan  3 20:38:12 2016 [23949] (ldlm_resource.c:1515:ldlm_resource_dump()) --- Resource: [0x20006c6de:0x1c752:0x0].0 (ffff880f801a1980) refcount = 2
          Sun Jan  3 20:38:12 2016 [23949] (ldlm_resource.c:1521:ldlm_resource_dump()) ### ### ns: mdt-lse-MDT0000_UUID lock: ffff880856255d00/0xee63e92424c3a04e lrc: 2/0,0 mode: CR/CR res: [0x20006c6de:0x1c752:0x0].0 bits 0x9 rrc: 2 type: IBT flags: 0x40200000000000 nid: 192.168.115.26@o2ib10 remote: 0x1e9fdd93a70839d9 expref: 1571 pid: 23949 timeout: 0 lvb_type: 0 used 402457368
          Sun Jan  3 20:38:12 2016 [23949] (mdt_lib.c:545:mdt_dump_lmm()) objid 0x1c752:444126, magic 0x0BD10BD0, pattern 0X1
          Sun Jan  3 20:38:12 2016 [23911] (mdt_handler.c:5342:mdt_object_free()) object free, fid = [0x20006c6de:0x1c752:0x0]
          Sun Jan  3 20:38:12 2016 [23837] (mdt_handler.c:2505:mdt_object_find()) Find object for [0x20006c6de:0x1c752:0x0]
          Sun Jan  3 20:38:12 2016 [23837] (mdt_handler.c:5322:mdt_object_init()) object init, fid = [0x20006c6de:0x1c752:0x0]
          Sun Jan  3 20:38:12 2016 [24058] (mdt_handler.c:2505:mdt_object_find()) Find object for [0x20006c6de:0x1c752:0x0]
          Sun Jan  3 20:38:12 2016 [24058] (mdt_handler.c:5322:mdt_object_init()) object init, fid = [0x20006c6de:0x1c752:0x0]
          Sun Jan  3 20:38:12 2016 [24058] (ldlm_resource.c:1515:ldlm_resource_dump()) --- Resource: [0x20006c6de:0x1c752:0x0].0 (ffff880f801a1980) refcount = 2
          Sun Jan  3 20:38:12 2016 [24058] (ldlm_resource.c:1521:ldlm_resource_dump()) ### ### ns: mdt-lse-MDT0000_UUID lock: ffff880856255d00/0xee63e92424c3a04e lrc: 2/0,0 mode: CR/CR res: [0x20006c6de:0x1c752:0x0].0 bits 0x9 rrc: 2 type: IBT flags: 0x40200000000000 nid: 192.168.115.26@o2ib10 remote: 0x1e9fdd93a70839d9 expref: 1570 pid: 23949 timeout: 0 lvb_type: 0 used 402457368
          Sun Jan  3 20:38:12 2016 [24058] (mdt_lib.c:545:mdt_dump_lmm()) objid 0x1c752:444126, magic 0x0BD10BD0, pattern 0X1
          Sun Jan  3 20:38:12 2016 [23813] (mdt_handler.c:5342:mdt_object_free()) object free, fid = [0x20006c6de:0x1c752:0x0]
          Sun Jan  3 20:38:12 2016 [24058] (mdt_handler.c:2505:mdt_object_find()) Find object for [0x20006c6de:0x1c752:0x0]
          Sun Jan  3 20:38:12 2016 [24058] (mdt_handler.c:5322:mdt_object_init()) object init, fid = [0x20006c6de:0x1c752:0x0]
          Sun Jan  3 20:38:12 2016 [21992] (ldlm_resource.c:1515:ldlm_resource_dump()) --- Resource: [0x20006c6de:0x1c752:0x0].0 (ffff880f801a1980) refcount = 2
          Sun Jan  3 20:38:12 2016 [24058] (mdt_handler.c:5342:mdt_object_free()) object free, fid = [0x20006c6de:0x1c752:0x0]
          Sun Jan  3 20:38:12 2016 [23837] (osd_object.c:410:osd_object_init()) lse-MDT0000: lookup [0x20006c6de:0x1c752:0x0]/0x4649efde failed: rc = -17
          Sun Jan  3 20:38:12 2016 [23837] (mdt_handler.c:5342:mdt_object_free()) object free, fid = [0x20006c6de:0x1c752:0x0]
          
          utopiabound Nathaniel Clark added a comment - Quick triage of debug_log attached, I find one instance of the message. I've pulled out all the references to that fid from the log: Sun Jan 3 20:38:12 2016 [23949] (mdt_handler.c:2488:mdt_object_new()) Allocate object for [0x20006c6de:0x1c752:0x0] Sun Jan 3 20:38:12 2016 [23949] (mdt_handler.c:5322:mdt_object_init()) object init, fid = [0x20006c6de:0x1c752:0x0] Sun Jan 3 20:38:12 2016 [23949] (ldlm_resource.c:1515:ldlm_resource_dump()) --- Resource: [0x20006c6de:0x1c752:0x0].0 (ffff880f801a1980) refcount = 1 Sun Jan 3 20:38:12 2016 [23837] (mdt_handler.c:5342:mdt_object_free()) object free, fid = [0x20006c6de:0x1c752:0x0] Sun Jan 3 20:38:12 2016 [24058] (mdt_handler.c:2505:mdt_object_find()) Find object for [0x20006c6de:0x1c752:0x0] Sun Jan 3 20:38:12 2016 [24058] (mdt_handler.c:5322:mdt_object_init()) object init, fid = [0x20006c6de:0x1c752:0x0] Sun Jan 3 20:38:12 2016 [24058] (ldlm_resource.c:1515:ldlm_resource_dump()) --- Resource: [0x20006c6de:0x1c752:0x0].0 (ffff880f801a1980) refcount = 2 Sun Jan 3 20:38:12 2016 [24058] (ldlm_resource.c:1521:ldlm_resource_dump()) ### ### ns: mdt-lse-MDT0000_UUID lock: ffff880856255d00/0xee63e92424c3a04e lrc: 2/0,0 mode: CR/CR res: [0x20006c6de:0x1c752:0x0].0 bits 0x9 rrc: 2 type: IBT flags: 0x40200000000000 nid: 192.168.115.26@o2ib10 remote: 0x1e9fdd93a70839d9 expref: 1571 pid: 23949 timeout: 0 lvb_type: 0 used 402457368 Sun Jan 3 20:38:12 2016 [23949] (mdt_handler.c:2505:mdt_object_find()) Find object for [0x20006c6de:0x1c752:0x0] Sun Jan 3 20:38:12 2016 [23949] (ldlm_resource.c:1515:ldlm_resource_dump()) --- Resource: [0x20006c6de:0x1c752:0x0].0 (ffff880f801a1980) refcount = 2 Sun Jan 3 20:38:12 2016 [23949] (ldlm_resource.c:1521:ldlm_resource_dump()) ### ### ns: mdt-lse-MDT0000_UUID lock: ffff880856255d00/0xee63e92424c3a04e lrc: 2/0,0 mode: CR/CR res: [0x20006c6de:0x1c752:0x0].0 bits 0x9 rrc: 2 type: IBT flags: 0x40200000000000 nid: 192.168.115.26@o2ib10 remote: 0x1e9fdd93a70839d9 expref: 1571 pid: 23949 timeout: 0 lvb_type: 0 used 402457368 Sun Jan 3 20:38:12 2016 [23949] (mdt_lib.c:545:mdt_dump_lmm()) objid 0x1c752:444126, magic 0x0BD10BD0, pattern 0X1 Sun Jan 3 20:38:12 2016 [23911] (mdt_handler.c:5342:mdt_object_free()) object free, fid = [0x20006c6de:0x1c752:0x0] Sun Jan 3 20:38:12 2016 [23837] (mdt_handler.c:2505:mdt_object_find()) Find object for [0x20006c6de:0x1c752:0x0] Sun Jan 3 20:38:12 2016 [23837] (mdt_handler.c:5322:mdt_object_init()) object init, fid = [0x20006c6de:0x1c752:0x0] Sun Jan 3 20:38:12 2016 [24058] (mdt_handler.c:2505:mdt_object_find()) Find object for [0x20006c6de:0x1c752:0x0] Sun Jan 3 20:38:12 2016 [24058] (mdt_handler.c:5322:mdt_object_init()) object init, fid = [0x20006c6de:0x1c752:0x0] Sun Jan 3 20:38:12 2016 [24058] (ldlm_resource.c:1515:ldlm_resource_dump()) --- Resource: [0x20006c6de:0x1c752:0x0].0 (ffff880f801a1980) refcount = 2 Sun Jan 3 20:38:12 2016 [24058] (ldlm_resource.c:1521:ldlm_resource_dump()) ### ### ns: mdt-lse-MDT0000_UUID lock: ffff880856255d00/0xee63e92424c3a04e lrc: 2/0,0 mode: CR/CR res: [0x20006c6de:0x1c752:0x0].0 bits 0x9 rrc: 2 type: IBT flags: 0x40200000000000 nid: 192.168.115.26@o2ib10 remote: 0x1e9fdd93a70839d9 expref: 1570 pid: 23949 timeout: 0 lvb_type: 0 used 402457368 Sun Jan 3 20:38:12 2016 [24058] (mdt_lib.c:545:mdt_dump_lmm()) objid 0x1c752:444126, magic 0x0BD10BD0, pattern 0X1 Sun Jan 3 20:38:12 2016 [23813] (mdt_handler.c:5342:mdt_object_free()) object free, fid = [0x20006c6de:0x1c752:0x0] Sun Jan 3 20:38:12 2016 [24058] (mdt_handler.c:2505:mdt_object_find()) Find object for [0x20006c6de:0x1c752:0x0] Sun Jan 3 20:38:12 2016 [24058] (mdt_handler.c:5322:mdt_object_init()) object init, fid = [0x20006c6de:0x1c752:0x0] Sun Jan 3 20:38:12 2016 [21992] (ldlm_resource.c:1515:ldlm_resource_dump()) --- Resource: [0x20006c6de:0x1c752:0x0].0 (ffff880f801a1980) refcount = 2 Sun Jan 3 20:38:12 2016 [24058] (mdt_handler.c:5342:mdt_object_free()) object free, fid = [0x20006c6de:0x1c752:0x0] Sun Jan 3 20:38:12 2016 [23837] (osd_object.c:410:osd_object_init()) lse-MDT0000: lookup [0x20006c6de:0x1c752:0x0]/0x4649efde failed: rc = -17 Sun Jan 3 20:38:12 2016 [23837] (mdt_handler.c:5342:mdt_object_free()) object free, fid = [0x20006c6de:0x1c752:0x0]

          People

            utopiabound Nathaniel Clark
            morrone Christopher Morrone (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: