[LU-5630] mdt_getattr_name_lock()) ASSERTION( lock != NULL ) Created: 16/Sep/14 Updated: 01/Feb/22 Resolved: 14/Dec/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.2 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Christopher Morrone | Assignee: | Oleg Drokin |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | llnl | ||
| Environment: |
Lustre 2.4.2-14chaos (see github.com/chaos/lustre) |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 15745 | ||||||||
| Description |
2014-09-11 21:10:30 LustreError: 0:0:(ldlm_lockd.c:402:waiting_locks_callback()) ### lock callback timer expired after 100s: evicting client at 192.168.120.199@o2ib7 ns: mdt-lsd-MDT0000_UUID lock: ffff880321a4a480/0x6bd4680b789ee41f lrc: 4/0,0 mode: PR/PR res: [0x2000112f3:0xf:0x0].0 bits 0x13 rrc: 4 type: IBT flags: 0x200000000020 nid: 192.168.120.199@o2ib7 remote: 0xf350c14aff003b28 expref: 30 pid: 17248 timeout: 6838410913 lvb_type: 0 used 0
2014-09-11 21:10:30 LustreError: 15075:0:(mdt_handler.c:1423:mdt_getattr_name_lock()) ASSERTION( lock != NULL ) failed: Invalid lock handle 0x6bd4680b789ee41f
2014-09-11 21:10:30 LustreError: 15075:0:(mdt_handler.c:1423:mdt_getattr_name_lock()) LBUG
2014-09-11 21:10:30 Pid: 15075, comm: mdt00_069

The backtrace is:

PID: 15075 TASK: ffff880d7001f540 CPU: 2 COMMAND: "mdt00_069"
 #0 [ffff880d70021938] machine_kexec+0x18b at ffffffff810391ab
 #1 [ffff880d70021998] crash_kexec+0x72 at ffffffff810c5ee2
 #2 [ffff880d70021a68] panic+0xae at ffffffff8152b247
 #3 [ffff880d70021ae8] lbug_with_loc+0x9b at ffffffffa0601f4b [libcfs]
 #4 [ffff880d70021b08] mdt_getattr_name_lock+0x18d0 at ffffffffa0e99900 [mdt]
 #5 [ffff880d70021bc8] mdt_intent_getattr+0x29d at ffffffffa0e99c5d [mdt]
 #6 [ffff880d70021c28] mdt_intent_policy+0x39e at ffffffffa0e86fde [mdt]
 #7 [ffff880d70021c68] ldlm_lock_enqueue+0x361 at ffffffffa08b8911 [ptlrpc]
 #8 [ffff880d70021cc8] ldlm_handle_enqueue0+0x4ef at ffffffffa08e1a7f [ptlrpc]
 #9 [ffff880d70021d38] mdt_enqueue+0x46 at ffffffffa0e87466 [mdt]
#10 [ffff880d70021d58] mdt_handle_common+0x647 at ffffffffa0e8c0d7 [mdt]
#11 [ffff880d70021da8] mds_regular_handle+0x15 at ffffffffa0ec7c75 [mdt]
#12 [ffff880d70021db8] ptlrpc_server_handle_request+0x398 at ffffffffa0912188 [ptlrpc]
#13 [ffff880d70021eb8] ptlrpc_main+0xace at ffffffffa091351e [ptlrpc]
#14 [ffff880d70021f48] child_rip+0xa at ffffffff8100c24a

This looks like the same assertion as |
| Comments |
| Comment by Liang Zhen (Inactive) [ 16/Sep/14 ] |
|
I think this is an issue we also hit on master; Vitaly has already posted a patch on
| Comment by Peter Jones [ 16/Sep/14 ] |
|
Oleg,
Can you confirm whether this is a duplicate of
Thanks,
Peter
| Comment by Oleg Drokin [ 16/Sep/14 ] |
|
Yes, I think the bug is the same. This is not the final solution; I am starting to have my doubts that we should return ESTALE on resend, as the client is not really at fault here, and reprocessing the entire request might be a better idea.
| Comment by Christopher Morrone [ 16/Sep/14 ] |
|
How will the client behave when it gets ESTALE? |
| Comment by Oleg Drokin [ 17/Sep/14 ] |
|
I suspect ESTALE would propagate all the way up to userspace. On the other hand, if it's due to eviction of that same client, it does not matter due to a bunch of EIO and other stuff this client will get anyway. |
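
Illustrative note: the failure mode discussed above can be sketched in a few lines of C. This is a toy model and not Lustre code. Every name in it (toy_lock, toy_grant, toy_evict, toy_handle2lock, toy_getattr_resend) is hypothetical, and the fixed-size table is only an assumed stand-in for the server's lock handle lookup. It shows a resent request carrying a handle whose lock was destroyed by eviction, with the handler returning -ESTALE where the crashing code asserted lock != NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a server-side lock; not a Lustre structure. */
struct toy_lock {
    uint64_t cookie; /* handle value the client sends back on resend */
    int      valid;  /* cleared when the owning client is evicted */
};

#define TOY_NLOCKS 4
static struct toy_lock toy_table[TOY_NLOCKS];

/* Record a granted lock in the given table slot. */
void toy_grant(size_t slot, uint64_t cookie)
{
    assert(slot < TOY_NLOCKS);
    toy_table[slot].cookie = cookie;
    toy_table[slot].valid = 1;
}

/* Model eviction: every lock with this cookie becomes unusable. */
void toy_evict(uint64_t cookie)
{
    for (size_t i = 0; i < TOY_NLOCKS; i++)
        if (toy_table[i].cookie == cookie)
            toy_table[i].valid = 0;
}

/* Look up a handle; returns NULL if it is unknown or was invalidated. */
struct toy_lock *toy_handle2lock(uint64_t cookie)
{
    for (size_t i = 0; i < TOY_NLOCKS; i++)
        if (toy_table[i].valid && toy_table[i].cookie == cookie)
            return &toy_table[i];
    return NULL;
}

/*
 * Handle a resent getattr. The handle may name a lock that no longer
 * exists, so a NULL lookup is treated as an error the caller can see
 * rather than as an impossible state.
 */
int toy_getattr_resend(uint64_t cookie)
{
    struct toy_lock *lock = toy_handle2lock(cookie);

    if (lock == NULL)
        return -ESTALE; /* the original code asserted lock != NULL here */
    return 0;
}
```

In this sketch, calling toy_grant() and then toy_getattr_resend() with the same cookie returns 0, and repeating the call after toy_evict() returns -ESTALE. Whether ESTALE is the right answer for a real client, or whether the whole request should be reprocessed instead, is the open question raised in the comments above.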