Details
Bug
Resolution: Fixed
Blocker
Lustre 2.5.0
3
8634
Description
On 2.4.50-56-gc42672d, I'm seeing an LBUG in the call to req_capsule_server_get() from mdt_intent_opc(). It follows
mdt_intent_layout() returning -ESTALE.
static int mdt_intent_opc(long itopc, struct mdt_thread_info *info,
			  struct ldlm_lock **lockp, __u64 flags)
{
	...
	if (rc == 0 && flv->it_act != NULL) {
		struct ldlm_reply *rep;

		/* execute policy */
		rc = flv->it_act(opc, info, lockp, flags);

		rep = req_capsule_server_get(pill, &RMF_DLM_REP);
		rep->lock_policy_res2 =
			ptlrpc_status_hton(rep->lock_policy_res2);
	} else {
		rc = -EOPNOTSUPP;
	}

	RETURN(rc);
}
00000004:00000001:3.0:1370903334.569591:0:745:0:(mdt_handler.c:5048:mdt_object_free()) Process leaving
00000020:00000001:3.0:1370903334.569593:0:745:0:(lu_object.c:238:lu_object_alloc()) Process leaving (rc=18446744073709551500 : -116 : ffffffffffffff8c)
00000004:00000001:3.0:1370903334.569596:0:745:0:(mdt_handler.c:2388:mdt_object_find()) Process leaving (rc=18446744073709551500 : -116 : ffffffffffffff8c)
00000004:00000001:3.0:1370903334.569598:0:745:0:(mdt_handler.c:3757:mdt_intent_layout()) Process leaving (rc=18446744073709551500 : -116 : ffffffffffffff8c)
00000100:00040000:3.0:1370903334.569602:0:745:0:(layout.c:1916:__req_capsule_get()) ASSERTION( msg != ((void *)0) ) failed:
00000100:00040000:3.0:1370903334.572661:0:745:0:(layout.c:1916:__req_capsule_get()) LBUG
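For anyone reading the rc fields in the debug log above: each one prints the same value three ways (unsigned decimal, signed decimal, hex). A minimal sketch of that conversion follows. The helper names `decode_rc` and `decode_rc_selfcheck` are made up for illustration and are not part of Lustre.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a Lustre function): reinterpret an rc that
 * was printed as an unsigned 64-bit number as a signed errno-style code.
 * Written without relying on implementation-defined narrowing casts.
 */
int64_t decode_rc(uint64_t raw)
{
	if (raw > (uint64_t)INT64_MAX)
		return -(int64_t)(~raw) - 1;	/* two's complement negative */
	return (int64_t)raw;
}

/* The value from the log: 0xffffffffffffff8c is -116, the -ESTALE above. */
void decode_rc_selfcheck(void)
{
	assert(decode_rc(18446744073709551500ULL) == -116);
	assert(decode_rc(0xffffffffffffff8cULL) == -116);
}
```

Calling `decode_rc_selfcheck()` from a small test program confirms that all three representations in the log refer to the same return code.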
Lustre: DEBUG MARKER: == sanity test 34h: ftruncate file under grouplock should not block == 17:28:54 (1370903334)
LustreError: 745:0:(layout.c:1916:__req_capsule_get()) ASSERTION( msg != ((void *)0) ) failed:
LustreError: 745:0:(layout.c:1916:__req_capsule_get()) LBUG
Pid: 745, comm: mdt01_000
Call Trace:
 [<ffffffffa0f5b895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa0f5be97>] lbug_with_loc+0x47/0xb0 [libcfs]
 [<ffffffffa1232de2>] __req_capsule_get+0x632/0x700 [ptlrpc]
 [<ffffffffa0f66d88>] ? libcfs_log_return+0x28/0x40 [libcfs]
 [<ffffffffa0f66d88>] ? libcfs_log_return+0x28/0x40 [libcfs]
 [<ffffffffa1232fb8>] req_capsule_server_get+0x18/0x20 [ptlrpc]
 [<ffffffffa06faf71>] mdt_intent_policy+0x3d1/0x760 [mdt]
 [<ffffffffa11c23f1>] ldlm_lock_enqueue+0x361/0x8d0 [ptlrpc]
 [<ffffffffa11e939f>] ldlm_handle_enqueue0+0x4ef/0x10b0 [ptlrpc]
 [<ffffffffa06fb406>] mdt_enqueue+0x46/0xe0 [mdt]
 [<ffffffffa0701af8>] mdt_handle_common+0x648/0x1660 [mdt]
 [<ffffffffa073b185>] mds_regular_handle+0x15/0x20 [mdt]
 [<ffffffffa121b6a8>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
 [<ffffffffa0f5c5de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
 [<ffffffffa0f6dd8f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
 [<ffffffffa1212a09>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
 [<ffffffffa0f6c2c1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
 [<ffffffff81055ab3>] ? __wake_up+0x53/0x70
 [<ffffffffa121ca3e>] ptlrpc_main+0xace/0x1700 [ptlrpc]
 [<ffffffffa121bf70>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
 [<ffffffff8100c0ca>] child_rip+0xa/0x20
 [<ffffffffa121bf70>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
 [<ffffffffa121bf70>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
 [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
It's easily reproduced by running sanityn.sh in a loop.
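To make the failure pattern concrete, here is a small standalone mock of the sequence in the trace. It assumes the -ESTALE path returns before any reply buffer has been packed, which is why the unconditional reply lookup trips the `msg != NULL` assertion. Every `mock_*` name is a hypothetical stand-in and not Lustre code. The NULL check shown is only one possible guard and is not necessarily the fix that landed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real Lustre structures. */
struct mock_reply { unsigned long long lock_policy_res2; };
struct mock_pill  { struct mock_reply *rep_msg; /* NULL until packed */ };

static struct mock_reply mock_storage;

/* Mimics an intent handler that fails before packing a reply (assumed). */
int mock_it_act_stale(struct mock_pill *pill)
{
	(void)pill;
	return -ESTALE;
}

/* Mimics an intent handler that packs a reply and succeeds. */
int mock_it_act_ok(struct mock_pill *pill)
{
	mock_storage.lock_policy_res2 = 0;
	pill->rep_msg = &mock_storage;
	return 0;
}

/* Returns the reply buffer, or NULL if none was packed (no assertion). */
struct mock_reply *mock_server_get(struct mock_pill *pill)
{
	return pill->rep_msg;
}

/* Same shape as the reported code path, with a guard before the dereference. */
int mock_intent_opc(struct mock_pill *pill, int (*it_act)(struct mock_pill *))
{
	struct mock_reply *rep;
	int rc;

	assert(pill != NULL);
	if (it_act == NULL)
		return -EOPNOTSUPP;

	rc = it_act(pill);		/* execute policy */
	rep = mock_server_get(pill);
	if (rep == NULL)		/* handler failed before packing */
		return rc != 0 ? rc : -EPROTO;

	/* The real code converts rep->lock_policy_res2 here. */
	return rc;
}
```

With an empty `struct mock_pill`, `mock_intent_opc(&pill, mock_it_act_stale)` returns -ESTALE instead of dereferencing a missing reply, while the `mock_it_act_ok` path proceeds as before.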