[LU-6719] (osd_object.c:410:osd_object_init()) lsd-MDT0000: lookup [0x200042d88:0x1885d:0x0]/0x45f89372 failed: rc = -17 Created: 12/Jun/15  Updated: 18/Jul/16  Resolved: 05/Feb/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.3
Fix Version/s: Lustre 2.8.0

Type: Bug Priority: Minor
Reporter: Christopher Morrone Assignee: Nathaniel Clark
Resolution: Fixed Votes: 0
Labels: llnl, llnlfixready

Attachments: debug_log.porter-mds1.1451871492.gz, debug_log.vesta-mds1.1444422149
Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

On at least one of our MDTs, we have gotten the following error many times:

2015-05-27 14:57:28 LustreError: 14118:0:(osd_object.c:410:osd_object_init()) lsd-MDT0000: lookup [0x200042d88:0x1885d:0x0]/0x45f89372 failed: rc = -17

-17 is EEXIST.
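
For reference, errno 17 on Linux is EEXIST ("File exists"); a trivial userspace check, nothing Lustre-specific:

/* Trivial check that errno 17 is EEXIST on Linux; not Lustre code. */
#include <errno.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
        int rc = -17;                        /* as reported by the MDS */

        printf("rc = %d, EEXIST = %d, %s\n", rc, EEXIST, strerror(-rc));
        /* prints: rc = -17, EEXIST = 17, File exists */
        return 0;
}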

There are no other Lustre messages in the console logs anywhere near these messages.

We are using ZFS OSDs.

Lustre version 2.5.3-5chaos. (See github.com/chaos/lustre)

If this is a bug, then please work on a fix. If this is not a bug, then please silence the scary message.



 Comments   
Comment by Peter Jones [ 14/Jun/15 ]

Hi Nathaniel

Could you please help with this one?

Thanks

Peter

Comment by Nathaniel Clark [ 02/Jul/15 ]

Do you have the lustre debug logs from one of these occurrences? What version of ZFS/SPL are you running?

Are any of the clients getting errors?

Comment by D. Marc Stearman (Inactive) [ 26/Aug/15 ]

I do not have debug logs from any of the occurrences, but they are still happening today and we are running the following:
zfs-0.6.4.2-1.ch5.3.x86_64
spl-0.6.4.2-1.ch5.3.x86_64
lustre-2.5.4-4chaos_2.6.32_504.16.2.1chaos.ch5.3.x86_64.x86_64

On the MDS node:
2015-08-26 10:10:53 LustreError: 21134:0:(osd_object.c:410:osd_object_init()) fsv-MDT0000: lookup [0x20000f69c:0xca0c:0x0]/0x2c998679 failed: rc = -17

On a client node at the same time:
2015-08-26 10:10:53 LustreError: 11-0: fsv-MDT0000-mdc-c0000007adf95800: Communicating with 172.20.20.201@o2ib500, operation ldlm_enqueue failed with -17.
2015-08-26 10:10:53 LustreError: 30355:0:(mdc_locks.c:918:mdc_enqueue()) ldlm_cli_enqueue: -17

I see similar sets of messages elsewhere in the logs, where a client's ldlm_cli_enqueue receives a -17 error code and the MDS reports the lookup failing with rc = -17.

Comment by D. Marc Stearman (Inactive) [ 29/Sep/15 ]

Is there any progress on this? Are you waiting on something from LLNL?

Comment by John Fuchs-Chesney (Inactive) [ 02/Oct/15 ]

Marc – I've nudged the support engineer.

Thanks,
~ jfc.

Comment by Nathaniel Clark [ 05/Oct/15 ]

I was very much hoping for a Lustre debug log from the MDS from when it happens. (Side note: 2.5.4-llnl isn't up on github.com/chaos/lustre.)

Do you know what the client is doing when it gets this error?

The EEXIST is percolating up from the ZFS layer (sa_buf_hold()).
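
Nothing in between appears to rewrite the return code, so the -17 that osd_object_init() prints is the same value the ZFS-level hold returned. A stand-alone illustration of that shape (placeholder names only, not the real osd-zfs/ZFS call chain):

/* Placeholder names only -- this shows the shape of the error path,
 * not the real osd-zfs/ZFS code.  The lowest layer fails with -EEXIST
 * and each layer above returns it unchanged, so the top-level message
 * ends up reporting "rc = -17". */
#include <errno.h>
#include <stdio.h>

static int fake_sa_buf_hold(void)       /* stands in for the ZFS hold */
{
        return -EEXIST;
}

static int fake_fid_lookup(void)        /* stands in for the OSD lookup */
{
        return fake_sa_buf_hold();      /* rc passed straight up */
}

static int fake_object_init(void)
{
        int rc = fake_fid_lookup();

        if (rc != 0)
                fprintf(stderr, "lookup failed: rc = %d\n", rc);
        return rc;
}

int main(void)
{
        return fake_object_init() ? 1 : 0;
}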

Comment by D. Marc Stearman (Inactive) [ 05/Oct/15 ]

It is not happening frequently enough to warrant us watching for an occurrence and dumping the logs by hand. With the volume of traffic we have, the logs roll over pretty quickly. Also, with the frequency at which jobs cycle through, I'm not sure what the client is doing at the time either.

Chris, can you point Nathaniel to where our source code is held?

Comment by Christopher Morrone [ 05/Oct/15 ]

Our code is based on Intel's private FE branch. Be careful what you ask for, Nathaniel.

Comment by D. Marc Stearman (Inactive) [ 12/Oct/15 ]

I was able to modify our syslog-ng config to force a dump of the logs when a certain syslog message came through. Do you want just the MDS dump, or all the OSS nodes as well?

Comment by Nathaniel Clark [ 12/Oct/15 ]

Definitely the MDS. From your previous comments it seems like the OSS nodes don't report any errors, so I don't think I need those logs, unless it's free for you to collect them. The client log, if you can get it, might also prove helpful.

Comment by D. Marc Stearman (Inactive) [ 12/Oct/15 ]

Here is the MDS dump. If you want the OSS logs, I can tar them up and attach them as well. I cannot capture the client logs, as I don't have a suitable means to trigger a dump when it happens on the client side. Hopefully this helps.

Comment by D. Marc Stearman (Inactive) [ 28/Oct/15 ]

I have a few more log dumps if you need/want them. Were you able to get anything interesting from the logs I uploaded?

Comment by Nathaniel Clark [ 30/Oct/15 ]

There is amazingly little in the debug log of interest. I don't suppose the messages file or dmesg has anything more interesting when this happens?

Comment by D. Marc Stearman (Inactive) [ 30/Oct/15 ]

No, just what I've given you. It seems to happen pretty often. I could increase the debugging level on the servers for a few days and wait for another occurrence to trigger a dump. Do you want any specific levels added, rather than just a "-1"? We are running with:

ioctl neterror warning error emerg config console

Comment by Christopher Morrone [ 30/Oct/15 ]

-1 on the MDS is probably not a good idea for very long.

Nathaniel, what have you learned from the code so far? Can you explain why the server calls osd_object_init() on an existing object? Under what conditions does it fail to check if the object exists before calling the initialization function?

Comment by Nathaniel Clark [ 03/Nov/15 ]

It appears from my reading that lustre/obdclass/lu_object.c::lu_object_find_try() does the following:

  1. lock hash
    • Check to see if the fid already has an object in the hash
  2. unlock hash
  3. if the object was not in the hash, create a new object and initialize it
  4. lock hash
    • Check to see if the object is now in the hash; if not, add it; if it is, destroy the new object
  5. unlock hash
  6. return the object that is in the cache

It looks like multiple threads can attempt to create an object for the same FID concurrently: the first attempt succeeds, and the later ones fail to initialize and return an error (a stand-alone sketch of the race follows at the end of this comment).

If you traced with +info, I'm pretty sure you'd see either mdt_object_new() or mdt_object_find() directly before this error message.

Possible issue: neither lu_object_find_try() nor lu_object_find_at() above it will retry the lookup on EEXIST.
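
To make the window concrete, here is a stand-alone simulation of that pattern (not Lustre code; every name is made up for illustration). Two threads race on the same fid: both miss the cache, both try to initialize a backend object, and the loser gets -EEXIST, which is the same shape of failure osd_object_init() reports in this ticket:

/* Stand-alone simulation of the find-or-create race described above.
 * Not Lustre code; every name here is made up for illustration.
 * Build with: cc -std=gnu99 -o race race.c -lpthread */
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t hash_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t backend_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_barrier_t both_checked;
static bool in_hash;            /* the "object cache" for our one fid */
static bool backend_exists;     /* the backend (ZFS-like) object      */

/* First caller creates the backend object; later callers get -EEXIST. */
static int backend_object_init(void)
{
        int rc = 0;

        pthread_mutex_lock(&backend_lock);
        if (backend_exists)
                rc = -EEXIST;
        else
                backend_exists = true;
        pthread_mutex_unlock(&backend_lock);
        return rc;
}

static void *find_or_create(void *arg)
{
        long id = (long)arg;
        bool found;
        int rc = 0;

        pthread_mutex_lock(&hash_lock);           /* 1. lock hash       */
        found = in_hash;                          /*    check the cache */
        pthread_mutex_unlock(&hash_lock);         /* 2. unlock hash     */

        /* Make the race deterministic for the demo: both threads have
         * now missed the cache before either one creates anything.    */
        pthread_barrier_wait(&both_checked);

        if (!found) {
                rc = backend_object_init();       /* 3. create + init   */

                pthread_mutex_lock(&hash_lock);   /* 4. lock hash       */
                if (rc == 0 && !in_hash)
                        in_hash = true;           /*    insert winner   */
                pthread_mutex_unlock(&hash_lock); /* 5. unlock hash     */
        }

        printf("thread %ld: rc = %d%s\n", id, rc,
               rc == -EEXIST ? " (EEXIST: someone else created it)" : "");
        return NULL;                              /* 6. return cached obj */
}

int main(void)
{
        pthread_t t[2];

        pthread_barrier_init(&both_checked, NULL, 2);
        for (long i = 0; i < 2; i++)
                pthread_create(&t[i], NULL, find_or_create, (void *)i);
        for (int i = 0; i < 2; i++)
                pthread_join(t[i], NULL);
        pthread_barrier_destroy(&both_checked);
        return 0;
}

One thread prints rc = 0 and the other rc = -17; retrying the cache lookup on EEXIST would let the losing thread pick up the winner's object instead of returning the error.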

Comment by D. Marc Stearman (Inactive) [ 15/Dec/15 ]

This is happening fairly regularly. Would you like me to enable +info for a day and capture some dumps?

Comment by Nathaniel Clark [ 16/Dec/15 ]

Yes. That would be helpful to figure out how best to resolve this issue.

Comment by D. Marc Stearman (Inactive) [ 04/Jan/16 ]

I have attached a debug log with +info added to debugging.

Comment by Nathaniel Clark [ 05/Jan/16 ]

Quick triage of the attached debug log: I found one instance of the message. I've pulled out all the references to that FID from the log:

Sun Jan  3 20:38:12 2016 [23949] (mdt_handler.c:2488:mdt_object_new()) Allocate object for [0x20006c6de:0x1c752:0x0]
Sun Jan  3 20:38:12 2016 [23949] (mdt_handler.c:5322:mdt_object_init()) object init, fid = [0x20006c6de:0x1c752:0x0]
Sun Jan  3 20:38:12 2016 [23949] (ldlm_resource.c:1515:ldlm_resource_dump()) --- Resource: [0x20006c6de:0x1c752:0x0].0 (ffff880f801a1980) refcount = 1
Sun Jan  3 20:38:12 2016 [23837] (mdt_handler.c:5342:mdt_object_free()) object free, fid = [0x20006c6de:0x1c752:0x0]
Sun Jan  3 20:38:12 2016 [24058] (mdt_handler.c:2505:mdt_object_find()) Find object for [0x20006c6de:0x1c752:0x0]
Sun Jan  3 20:38:12 2016 [24058] (mdt_handler.c:5322:mdt_object_init()) object init, fid = [0x20006c6de:0x1c752:0x0]
Sun Jan  3 20:38:12 2016 [24058] (ldlm_resource.c:1515:ldlm_resource_dump()) --- Resource: [0x20006c6de:0x1c752:0x0].0 (ffff880f801a1980) refcount = 2
Sun Jan  3 20:38:12 2016 [24058] (ldlm_resource.c:1521:ldlm_resource_dump()) ### ### ns: mdt-lse-MDT0000_UUID lock: ffff880856255d00/0xee63e92424c3a04e lrc: 2/0,0 mode: CR/CR res: [0x20006c6de:0x1c752:0x0].0 bits 0x9 rrc: 2 type: IBT flags: 0x40200000000000 nid: 192.168.115.26@o2ib10 remote: 0x1e9fdd93a70839d9 expref: 1571 pid: 23949 timeout: 0 lvb_type: 0 used 402457368
Sun Jan  3 20:38:12 2016 [23949] (mdt_handler.c:2505:mdt_object_find()) Find object for [0x20006c6de:0x1c752:0x0]
Sun Jan  3 20:38:12 2016 [23949] (ldlm_resource.c:1515:ldlm_resource_dump()) --- Resource: [0x20006c6de:0x1c752:0x0].0 (ffff880f801a1980) refcount = 2
Sun Jan  3 20:38:12 2016 [23949] (ldlm_resource.c:1521:ldlm_resource_dump()) ### ### ns: mdt-lse-MDT0000_UUID lock: ffff880856255d00/0xee63e92424c3a04e lrc: 2/0,0 mode: CR/CR res: [0x20006c6de:0x1c752:0x0].0 bits 0x9 rrc: 2 type: IBT flags: 0x40200000000000 nid: 192.168.115.26@o2ib10 remote: 0x1e9fdd93a70839d9 expref: 1571 pid: 23949 timeout: 0 lvb_type: 0 used 402457368
Sun Jan  3 20:38:12 2016 [23949] (mdt_lib.c:545:mdt_dump_lmm()) objid 0x1c752:444126, magic 0x0BD10BD0, pattern 0X1
Sun Jan  3 20:38:12 2016 [23911] (mdt_handler.c:5342:mdt_object_free()) object free, fid = [0x20006c6de:0x1c752:0x0]
Sun Jan  3 20:38:12 2016 [23837] (mdt_handler.c:2505:mdt_object_find()) Find object for [0x20006c6de:0x1c752:0x0]
Sun Jan  3 20:38:12 2016 [23837] (mdt_handler.c:5322:mdt_object_init()) object init, fid = [0x20006c6de:0x1c752:0x0]
Sun Jan  3 20:38:12 2016 [24058] (mdt_handler.c:2505:mdt_object_find()) Find object for [0x20006c6de:0x1c752:0x0]
Sun Jan  3 20:38:12 2016 [24058] (mdt_handler.c:5322:mdt_object_init()) object init, fid = [0x20006c6de:0x1c752:0x0]
Sun Jan  3 20:38:12 2016 [24058] (ldlm_resource.c:1515:ldlm_resource_dump()) --- Resource: [0x20006c6de:0x1c752:0x0].0 (ffff880f801a1980) refcount = 2
Sun Jan  3 20:38:12 2016 [24058] (ldlm_resource.c:1521:ldlm_resource_dump()) ### ### ns: mdt-lse-MDT0000_UUID lock: ffff880856255d00/0xee63e92424c3a04e lrc: 2/0,0 mode: CR/CR res: [0x20006c6de:0x1c752:0x0].0 bits 0x9 rrc: 2 type: IBT flags: 0x40200000000000 nid: 192.168.115.26@o2ib10 remote: 0x1e9fdd93a70839d9 expref: 1570 pid: 23949 timeout: 0 lvb_type: 0 used 402457368
Sun Jan  3 20:38:12 2016 [24058] (mdt_lib.c:545:mdt_dump_lmm()) objid 0x1c752:444126, magic 0x0BD10BD0, pattern 0X1
Sun Jan  3 20:38:12 2016 [23813] (mdt_handler.c:5342:mdt_object_free()) object free, fid = [0x20006c6de:0x1c752:0x0]
Sun Jan  3 20:38:12 2016 [24058] (mdt_handler.c:2505:mdt_object_find()) Find object for [0x20006c6de:0x1c752:0x0]
Sun Jan  3 20:38:12 2016 [24058] (mdt_handler.c:5322:mdt_object_init()) object init, fid = [0x20006c6de:0x1c752:0x0]
Sun Jan  3 20:38:12 2016 [21992] (ldlm_resource.c:1515:ldlm_resource_dump()) --- Resource: [0x20006c6de:0x1c752:0x0].0 (ffff880f801a1980) refcount = 2
Sun Jan  3 20:38:12 2016 [24058] (mdt_handler.c:5342:mdt_object_free()) object free, fid = [0x20006c6de:0x1c752:0x0]
Sun Jan  3 20:38:12 2016 [23837] (osd_object.c:410:osd_object_init()) lse-MDT0000: lookup [0x20006c6de:0x1c752:0x0]/0x4649efde failed: rc = -17
Sun Jan  3 20:38:12 2016 [23837] (mdt_handler.c:5342:mdt_object_free()) object free, fid = [0x20006c6de:0x1c752:0x0]

Comment by D. Marc Stearman (Inactive) [ 06/Jan/16 ]

With +info added to debugging, the logs cycle rather quickly. Hopefully there is enough info to find where the code is having problems.

Comment by Gerrit Updater [ 20/Jan/16 ]

Nathaniel Clark (nathaniel.l.clark@intel.com) uploaded a new patch: http://review.whamcloud.com/18054
Subject: LU-6719 obdclass: Retry lookup
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: f6c8ff3e40e8fa403d23192a3b8f819a6f8a0e77

Comment by Gerrit Updater [ 05/Feb/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/18054/
Subject: LU-6719 osd-zfs: Ignore EEXIST during object init
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: d96c38aa85d80f5e0630c28d242f8279987c8d5e

Comment by Joseph Gmitter (Inactive) [ 05/Feb/16 ]

Patch has landed for 2.8

Comment by D. Marc Stearman (Inactive) [ 05/Feb/16 ]

Could you please provide a backported patch for Lustre 2.5?
