[LU-4269] ldlm_lock_put()) ASSERTION( (((( lock))->l_flags & (1ULL << 50)) != 0) ) failed Created: 18/Nov/13 Updated: 27/Feb/14 Resolved: 21/Feb/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.6.0 |
| Fix Version/s: | Lustre 2.6.0, Lustre 2.5.1 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Cliff White (Inactive) | Assignee: | Zhenyu Xu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | MB, mn4 | ||
| Environment: |
Hyperion/LLNL |
||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 11723 |
| Description |
|
Running SWL - miranda IO test on Hyperion. Console log from client iwc48:

2013-11-16 17:27:14 LustreError: 105183:0:(ldlm_lock.c:222:ldlm_lock_put()) ASSERTION( (((( lock))->l_flags & (1ULL << 50)) != 0) ) failed:
2013-11-16 17:27:14 LustreError: 105183:0:(ldlm_lock.c:222:ldlm_lock_put()) LBUG
2013-11-16 17:27:14 Pid: 105183, comm: miranda_io
2013-11-16 17:27:14 Call Trace:
 [<ffffffffa056b895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa056be97>] lbug_with_loc+0x47/0xb0 [libcfs]
 [<ffffffffa080f94d>] ldlm_lock_put+0x44d/0x560 [ptlrpc]
 [<ffffffffa0822d22>] ldlm_cli_cancel_list+0xf2/0x3e0 [ptlrpc]
 [<ffffffffa0824952>] ldlm_cli_cancel+0x132/0x360 [ptlrpc]
 [<ffffffffa09fdbce>] osc_lock_cancel+0xfe/0x1c0 [osc]
 [<ffffffffa0708525>] cl_lock_cancel0+0x75/0x160 [obdclass]
 [<ffffffffa07090db>] cl_lock_cancel+0x13b/0x140 [obdclass]
 [<ffffffffa070d7e4>] cl_lock_enqueue_wait+0xc4/0x2d0 [obdclass]
 [<ffffffffa070aa17>] ? cl_lock_mutex_put+0x77/0x90 [obdclass]
 [<ffffffffa0a9126b>] lov_lock_enqueue+0x3fb/0x850 [lov]
 [<ffffffffa070c65c>] cl_enqueue_try+0xfc/0x300 [obdclass]
 [<ffffffffa070da5f>] cl_enqueue_locked+0x6f/0x1f0 [obdclass]
 [<ffffffffa070e6ae>] cl_lock_request+0x7e/0x270 [obdclass]
 [<ffffffffa071391c>] cl_io_lock+0x3cc/0x560 [obdclass]
 [<ffffffffa0713b52>] cl_io_loop+0xa2/0x1b0 [obdclass]
 [<ffffffffa0b0e290>] ll_file_io_generic+0x460/0x610 [lustre]
 [<ffffffffa0701e39>] ? cl_env_get+0x29/0x350 [obdclass]
 [<ffffffffa0b0ecb2>] ll_file_aio_write+0x142/0x2c0 [lustre]
 [<ffffffffa0b0ef9c>] ll_file_write+0x16c/0x2a0 [lustre]
 [<ffffffff81181398>] vfs_write+0xb8/0x1a0
 [<ffffffff81181c91>] sys_write+0x51/0x90
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

Lctl dump attached |
| Comments |
| Comment by Zhenyu Xu [ 11/Dec/13 ] |
|
What I observed in the log is at line 140698: ldlm_cli_cancel(). Cliff, what is the exact git commit of this code? I've seen cl_lock_discard_pages() in the log, while the current master branch does not have this function, so I need to know the exact code base you were using when you hit this problem. |
| Comment by Shuichi Ihara (Inactive) [ 16/Dec/13 ] |
|
We saw exactly the same problem on multiple clients while mdtest was running. The client was running the latest master branch.

<0>LustreError: 10212:0:(ldlm_lock.c:222:ldlm_lock_put()) ASSERTION( (((( lock))->l_flags & (1ULL << 50)) != 0) ) failed:
<0>LustreError: 10212:0:(ldlm_lock.c:222:ldlm_lock_put()) LBUG
<4>Pid: 10212, comm: mdtest
<4>Call Trace:
<4> [<ffffffffa05c5895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4> [<ffffffffa05c5e97>] lbug_with_loc+0x47/0xb0 [libcfs]
<4> [<ffffffffa0951e27>] ldlm_lock_put+0x547/0x630 [ptlrpc]
<4> [<ffffffffa09708f2>] ldlm_cli_cancel_list+0xf2/0x3e0 [ptlrpc]
<4> [<ffffffffa0972e0a>] ldlm_prep_elc_req+0x21a/0x4b0 [ptlrpc]
<4> [<ffffffffa1043637>] mdc_unlink+0xe7/0x500 [mdc]
<4> [<ffffffffa12ade7b>] lmv_unlink+0x1db/0x7a0 [lmv]
<4> [<ffffffffa11c4a1e>] ? ll_i2gids+0x2e/0xd0 [lustre]
<4> [<ffffffffa11cb228>] ll_unlink+0x158/0x610 [lustre]
<4> [<ffffffff8118fec0>] vfs_unlink+0xa0/0xf0
<4> [<ffffffff8118ebfa>] ? lookup_hash+0x3a/0x50
<4> [<ffffffff81192265>] do_unlinkat+0xf5/0x1b0
<4> [<ffffffff810d94e2>] ? unroll_tree_refs+0xe2/0x120
<4> [<ffffffff8100c535>] ? math_state_restore+0x45/0x60
<4> [<ffffffff81192336>] sys_unlink+0x16/0x20
<4> [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
<0>Kernel panic - not syncing: LBUG
<4>Pid: 10212, comm: mdtest Not tainted 2.6.32-358.23.2.el6_lustre.ge975b1c.x86_64 #1
<4>Call Trace:
<4> [<ffffffff8150deec>] ? panic+0xa7/0x16f
<4> [<ffffffffa05c5eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
<4> [<ffffffffa0951e27>] ? ldlm_lock_put+0x547/0x630 [ptlrpc]
<4> [<ffffffffa09708f2>] ? ldlm_cli_cancel_list+0xf2/0x3e0 [ptlrpc]
<4> [<ffffffffa0972e0a>] ? ldlm_prep_elc_req+0x21a/0x4b0 [ptlrpc]
<4> [<ffffffffa1043637>] ? mdc_unlink+0xe7/0x500 [mdc]
<4> [<ffffffffa12ade7b>] ? lmv_unlink+0x1db/0x7a0 [lmv]
<4> [<ffffffffa11c4a1e>] ? ll_i2gids+0x2e/0xd0 [lustre]
<4> [<ffffffffa11cb228>] ? ll_unlink+0x158/0x610 [lustre]
<4> [<ffffffff8118fec0>] ? vfs_unlink+0xa0/0xf0
<4> [<ffffffff8118ebfa>] ? lookup_hash+0x3a/0x50
<4> [<ffffffff81192265>] ? do_unlinkat+0xf5/0x1b0
<4> [<ffffffff810d94e2>] ? unroll_tree_refs+0xe2/0x120
<4> [<ffffffff8100c535>] ? math_state_restore+0x45/0x60
<4> [<ffffffff81192336>] ? sys_unlink+0x16/0x20
<4> [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b

We have a crash dump as well. Please let me know if it would help for debugging. |
| Comment by Zhenyu Xu [ 16/Dec/13 ] |
|
Hi Ihara-san, do you have a debug log of it? |
| Comment by Shuichi Ihara (Inactive) [ 16/Dec/13 ] |
|
Hi Bobijam, vmware-dmesg.txt attached. Please let me know if you need vmware itself. I can upload it as well. |
| Comment by Zhenyu Xu [ 16/Dec/13 ] |
|
Besides dmesg, do you have a debug log (lctl dk)? |
| Comment by Cliff White (Inactive) [ 19/Dec/13 ] |
|
The test was run Nov 18th. The exact commit was not in the version string (build: 2.5.51--PRISTINE-2.6.32-358.23.2.el6_lustre.x86_64), so it was likely the 2.5.51 tag. |
| Comment by Cliff White (Inactive) [ 02/Jan/14 ] |
|
Sysrq -t from hung client. |
| Comment by Cliff White (Inactive) [ 03/Jan/14 ] |
|
Attempting to attach file again |
| Comment by Jinshan Xiong (Inactive) [ 07/Jan/14 ] |
|
From Ihara's feedback, it looks like there is a race condition when accessing the flags of the DLM lock. Let's create a debug patch that adds a line to ldlm_lock_destroy_internal() to print the exact state of the lock.
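A sketch of what such a debug line could look like (assumed placement and message, not the actual debug patch; the real one is the review linked in the next comment):

 int ldlm_lock_destroy_internal(struct ldlm_lock *lock)
 {
 	...
+	/* dump the full lock state, including l_flags, before we test the
+	 * destroyed flag and again right after we set it */
+	LDLM_DEBUG(lock, "destroy");
 	if (ldlm_is_destroyed(lock)) {
 		LASSERT(cfs_list_empty(&lock->l_lru));
 		EXIT;
 		return 0;
 	}
 	ldlm_set_destroyed(lock);
+	LDLM_DEBUG(lock, "destroy");
 	...
 }
|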
| Comment by Zhenyu Xu [ 07/Jan/14 ] |
|
The debug patch is at http://review.whamcloud.com/8759. Please unset panic_on_lbug (echo 0 > /proc/sys/lnet/panic_on_lbug) and run this debug-patched version so that we can collect the debug log when the assertion is hit. |
| Comment by Li Xi (Inactive) [ 13/Jan/14 ] |
|
I've posted a patch for this. However, I am not sure whether it fixes this problem, since we don't have a reliable way to reproduce it yet. |
| Comment by Zhenyu Xu [ 13/Jan/14 ] |
|
Thank you Li Xi, but iwc48.lctl.log.txt shows at line 140770

00010000:00000001:0.0:1384651634.269628:0:105183:0:(ldlm_request.c:1760:ldlm_prepare_lru_list()) Process leaving (rc=0 : 0 : 0)

which means that when we cancelled the problematic lock, ldlm_prepare_lru_list() did not find any other LRU locks, so I think your patch would probably not affect this issue. |
| Comment by Li Xi (Inactive) [ 13/Jan/14 ] |
|
Ah, thank you for pointing that out, Zhenyu. I was just wondering why we can call ldlm_lock_remove_from_lru_nolock() without calling lock_res_and_lock() first. Is there anything I missed? |
| Comment by Zhenyu Xu [ 13/Jan/14 ] |
|
ldlm_prepare_lru_list() collects unused locks from namespace::ns_unused_list, and only unused locks are linked into that list (see ldlm_lock_decref_internal()->ldlm_lock_add_to_lru()). So in ldlm_prepare_lru_list() we can call ldlm_lock_remove_from_lru_nolock() without locking the resource and the lock, since the lock has no readers or writers referring to it. |
| Comment by Li Xi (Inactive) [ 13/Jan/14 ] |
|
Ah, thank you for the explanation! Now I understand. |
| Comment by Jinshan Xiong (Inactive) [ 14/Jan/14 ] |
|
How often do you see this problem? If it is quite often, please apply Bobijam's patch and collect some logs; otherwise, I'd like to drop the priority of this ticket. Jinshan |
| Comment by Shuichi Ihara (Inactive) [ 14/Jan/14 ] |
|
Jinshan, Bobijam, yes, we applied Bobijam's debug patch at the customer site, but we haven't reproduced the problem yet. I will post logs once we hit the same issue. |
| Comment by Shuichi Ihara (Inactive) [ 14/Jan/14 ] |
|
We got a crash again with the debug patch. Attached is the diagnostic information from the crash. |
| Comment by Zhenyu Xu [ 14/Jan/14 ] |
|
Did you collect any lctl log from it? |
| Comment by Shuichi Ihara (Inactive) [ 14/Jan/14 ] |
|
No, unfortunately. The client panicked even though we had panic_on_lbug=0 set. I don't know why we got a panic after the LBUG. |
| Comment by Patrick Farrell (Inactive) [ 23/Jan/14 ] |
|
We reproduced this running master plus a client lock debug patch on SLES11SP3 while reproducing another client lock bug. I believe we hit it with a special debug flag for cl_lock_trace turned on (so all calls to cl_lock_trace are being logged, but little or nothing else). I should be able to make the dump and logs available if you think it would be helpful. I'm not sure I can reproduce it again, with different debug settings or at all: we run the test I'm using as a reproducer a lot, and this is the first time we've seen this particular bug. (The test in question is mmstress from the Linux Test Project.) |
| Comment by Zhenyu Xu [ 24/Jan/14 ] |
|
Yes, please upload the logs, Patrick. Thank you. |
| Comment by Patrick Farrell (Inactive) [ 27/Jan/14 ] |
|
Console log & Lustre log. |
| Comment by Patrick Farrell (Inactive) [ 27/Jan/14 ] |
|
Logs attached as noted. I've also uploaded the dump (with logs, vmlinuz, ko files, etc.) to ftp.whamcloud.com at uploads/ The system is master from January 23rd on SLES11SP3, with a special debug patch that enables logging of all calls to cl_lock_trace. We reproduced this by running mmstress from the Linux Test Project; we ran multiple copies on multiple nodes. (This test also hits several cl_lock related bugs, so that may make this one harder to find. Those problems are more common than this one when running mmstress.) |
| Comment by Zhenyu Xu [ 29/Jan/14 ] |
|
Hi Jinshan,

When a top lock enqueue fails, the top lock goes to cl_lock_hold_release() and then unholds a lovsub lock, which goes to osc_lock_cancel(). If at this time the osc lock is used by another IO, the ldlm lock is not canceled. In osc_lock_cancel():

		do_cancel = (dlmlock->l_readers == 0 &&
			     dlmlock->l_writers == 0);
		ldlm_set_cbpending(dlmlock);
		unlock_res_and_lock(dlmlock);
		if (do_cancel)
			result = ldlm_cli_cancel(&olck->ols_handle, LCF_ASYNC);
	}

and the osc lock is detached from the lovsub lock in cl_lock_hold_release():

	if (lock->cll_holds == 0) {
		CL_LOCK_ASSERT(lock->cll_state != CLS_HELD, env, lock);
		if (lock->cll_descr.cld_mode == CLM_PHANTOM ||
		    lock->cll_descr.cld_mode == CLM_GROUP ||
		    lock->cll_state != CLS_CACHED)
			/*
			 * If lock is still phantom or grouplock when user is
			 * done with it---destroy the lock.
			 */
			lock->cll_flags |= CLF_CANCELPEND|CLF_DOOMED;
		if (lock->cll_flags & CLF_CANCELPEND) {
			lock->cll_flags &= ~CLF_CANCELPEND;
			cl_lock_cancel0(env, lock);
		}
		if (lock->cll_flags & CLF_DOOMED) {
			/* no longer doomed: it's dead... Jim. */
			lock->cll_flags &= ~CLF_DOOMED;
			cl_lock_delete0(env, lock);
		}
	}

Is it possible that the ldlm lock somehow missed the cancel process somewhere, so that its last dereference is still missing the destroy flag? |
| Comment by Jinshan Xiong (Inactive) [ 31/Jan/14 ] |
|
I happened to be able to reproduce this issue steadily while I was working on another one. After I applied the following patch:

diff --git a/lustre/ldlm/ldlm_lock.c b/lustre/ldlm/ldlm_lock.c
index 214ee8c..2e69ad0 100644
--- a/lustre/ldlm/ldlm_lock.c
+++ b/lustre/ldlm/ldlm_lock.c
@@ -372,12 +372,14 @@ int ldlm_lock_destroy_internal(struct ldlm_lock *lock)
 		LBUG();
 	}
 
+	LDLM_DEBUG(lock, "destroy");
 	if (ldlm_is_destroyed(lock)) {
 		LASSERT(cfs_list_empty(&lock->l_lru));
 		EXIT;
 		return 0;
 	}
 	ldlm_set_destroyed(lock);
+	LDLM_DEBUG(lock, "destroy");
 
 	if (lock->l_export && lock->l_export->exp_lock_hash) {
 		/* NB: it's safe to call cfs_hash_del() even lock isn't
@@ -2169,6 +2171,7 @@ void ldlm_lock_cancel(struct ldlm_lock *lock)
 	ldlm_resource_unlink_lock(lock);
 	ldlm_lock_destroy_nolock(lock);
+	LDLM_DEBUG(lock, "destroyed");
 
 	if (lock->l_granted_mode == lock->l_req_mode)
 		ldlm_pool_del(&ns->ns_pool, lock);
diff --git a/lustre/ldlm/ldlm_request.c b/lustre/ldlm/ldlm_request.c
index e140e0f..def3ee5 100644
--- a/lustre/ldlm/ldlm_request.c
+++ b/lustre/ldlm/ldlm_request.c
@@ -1141,6 +1141,7 @@ static __u64 ldlm_cli_cancel_local(struct ldlm_lock *lock)
 			rc = LDLM_FL_LOCAL_ONLY;
 		}
 		ldlm_lock_cancel(lock);
+		LDLM_DEBUG(lock, "client-side cancel completed, rc = %lld", rc);
 	} else {
 		if (ns_is_client(ldlm_lock_to_ns(lock))) {
 			LDLM_ERROR(lock, "Trying to cancel local lock");
diff --git a/lustre/llite/llite_internal.h b/lustre/llite/llite_internal.h
index 92b278e..e9e7053 100644
--- a/lustre/llite/llite_internal.h
+++ b/lustre/llite/llite_internal.h
@@ -1469,7 +1469,7 @@ static inline void cl_isize_unlock(struct inode *inode)
 static inline void cl_isize_write_nolock(struct inode *inode, loff_t kms)
 {
-	LASSERT(down_trylock(&ll_i2info(inode)->lli_size_sem) != 0);
+	//LASSERT(down_trylock(&ll_i2info(inode)->lli_size_sem) != 0);
 	i_size_write(inode, kms);
 }
diff --git a/lustre/llite/llite_lib.c b/lustre/llite/llite_lib.c
index df421d6..ae53f3d 100644
--- a/lustre/llite/llite_lib.c
+++ b/lustre/llite/llite_lib.c
@@ -1700,25 +1700,29 @@ int ll_statfs(struct dentry *de, struct kstatfs *sfs)
 void ll_inode_size_lock(struct inode *inode)
 {
+#if 0
 	struct ll_inode_info *lli;
 
 	LASSERT(!S_ISDIR(inode->i_mode));
 
 	lli = ll_i2info(inode);
 	LASSERT(lli->lli_size_sem_owner != current);
-	down(&lli->lli_size_sem);
+	//down(&lli->lli_size_sem);
 	LASSERT(lli->lli_size_sem_owner == NULL);
 	lli->lli_size_sem_owner = current;
+#endif
 }
 
 void ll_inode_size_unlock(struct inode *inode)
 {
+#if 0
 	struct ll_inode_info *lli;
 
 	lli = ll_i2info(inode);
 	LASSERT(lli->lli_size_sem_owner == current);
 	lli->lli_size_sem_owner = NULL;
-	up(&lli->lli_size_sem);
+	//up(&lli->lli_size_sem);
+#endif
 }
 
 void ll_update_inode(struct inode *inode, struct lustre_md *md)
diff --git a/lustre/obdclass/cl_lock.c b/lustre/obdclass/cl_lock.c
index d440da9..b8c2e04 100644
--- a/lustre/obdclass/cl_lock.c
+++ b/lustre/obdclass/cl_lock.c
@@ -493,6 +493,7 @@ static struct cl_lock *cl_lock_lookup(const struct lu_env *env,
 					const struct cl_io *io,
 					const struct cl_lock_descr *need)
 {
+#if 0
 	struct cl_lock *lock;
 	struct cl_object_header *head;
 
@@ -518,6 +519,7 @@ static struct cl_lock *cl_lock_lookup(const struct lu_env *env,
 			RETURN(lock);
 		}
 	}
+#endif
 	RETURN(NULL);
 }
 
@@ -1964,6 +1966,8 @@ static struct cl_lock *cl_lock_hold_mutex(const struct lu_env *env,
 			cl_lock_hold_mod(env, lock, +1);
 			lu_ref_add(&lock->cll_holders, scope, source);
 			lu_ref_add(&lock->cll_reference, scope, source);
+
+			cl_lock_cancel(env, lock);
 			break;
 		}
 		cl_lock_mutex_put(env, lock);

and used 32 processes to read the same file at the same time, a reading process hit the LBUG at the same place:

Call Trace:
 [<ffffffffa03fd895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa03fde97>] lbug_with_loc+0x47/0xb0 [libcfs]
 [<ffffffffa0774fb7>] ldlm_lock_put+0x547/0x630 [ptlrpc]
 [<ffffffffa0793df2>] ldlm_cli_cancel_list+0xf2/0x3e0 [ptlrpc]
 [<ffffffffa0795a22>] ldlm_cli_cancel+0x132/0x360 [ptlrpc]
 [<ffffffffa0a8e72e>] osc_lock_cancel+0x10e/0x1d0 [osc]
 [<ffffffffa059c045>] cl_lock_cancel0+0x75/0x160 [obdclass]
 [<ffffffffa059d9d6>] cl_lock_hold_release+0x1c6/0x2a0 [obdclass]
 [<ffffffffa059ecc7>] cl_lock_unhold+0x37/0x130 [obdclass]
 [<ffffffffa0af8a68>] lov_sublock_release+0x1a8/0x280 [lov]
 [<ffffffffa0afa4fe>] lov_lock_unuse+0x1be/0x290 [lov]
 [<ffffffffa059c205>] cl_unuse_try_internal+0x55/0xe0 [obdclass]
 [<ffffffffa059f0a9>] cl_unuse_try+0x199/0x320 [obdclass]
 [<ffffffffa059f267>] cl_unuse_locked+0x37/0x1a0 [obdclass]
 [<ffffffffa059f40e>] cl_unuse+0x3e/0x120 [obdclass]
 [<ffffffffa05a5e57>] cl_io_unlock+0x127/0x2b0 [obdclass]
 [<ffffffffa05a6aed>] cl_io_loop+0xcd/0x1b0 [obdclass]
 [<ffffffffa0f7c9d6>] ll_file_io_generic+0x2b6/0x710 [lustre]
 [<ffffffffa0596db9>] ? cl_env_get+0x29/0x350 [obdclass]
 [<ffffffffa0f7cf6f>] ll_file_aio_read+0x13f/0x2c0 [lustre]
 [<ffffffffa0f7d42c>] ll_file_read+0x16c/0x2a0 [lustre]
 [<ffffffff81181a95>] vfs_read+0xb5/0x1a0
 [<ffffffff81181bd1>] sys_read+0x51/0x90
 [<ffffffff810dc685>] ? __audit_syscall_exit+0x265/0x290
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
LustreError: dumping log to /tmp/lustre-log.1391207221.6283

However, from the log, I can see this:

00000020:00010000:5.0:1391207220.451167:0:6294:0:(cl_lock.c:151:cl_lock_trace0()) unuse lock: ffff88081acf48e8@(2 ffff8807e6fd0080 1 3 0 1 1 2)(ffff88080d4a0db8/0/1) at cl_unuse_try():1363
00000020:00010000:5.0:1391207220.451172:0:6294:0:(cl_lock.c:151:cl_lock_trace0()) enclosure lock: ffff88081acf4738@(1 (null) 0 3 0 1 1 2)(ffff88080d4a0db8/1/1) at cl_lock_enclosure():1691
00000020:00010000:5.0:1391207220.451175:0:6294:0:(cl_lock.c:151:cl_lock_trace0()) enclosure lock: ffff88081acf48e8@(3 ffff8807e6fd0080 1 4 0 2 1 2)(ffff88080d4a0db8/0/2) at cl_lock_enclosure():1691
00000020:00010000:5.0:1391207220.451178:0:6294:0:(cl_lock.c:151:cl_lock_trace0()) unuse lock: ffff88081acf4738@(2 ffff8807e6fd0080 1 3 0 1 1 2)(ffff88080d4a0c68/1/0) at cl_unuse_try():1363
00010000:00010000:5.0:1391207220.451185:0:6294:0:(ldlm_lock.c:852:ldlm_lock_decref_internal_nolock()) ### ldlm_lock_decref(PR) ns: lustre-OST0000-osc-ffff8808198eb000 lock: ffff8808152de340/0x6d44b209b7d2fe79 lrc: 4/1,0 mode: PR/PR res: [0x2:0x0:0x0].0 rrc: 15 type: EXT [0->18446744073709551615] (req 1146880->1150975) flags: 0x10020000000000 nid: local remote: 0x6d44b209b7d2fe95 expref: -99 pid: 6294 timeout: 0 lvb_type: 1
00010000:00010000:5.0:1391207220.451191:0:6294:0:(ldlm_lock.c:920:ldlm_lock_decref_internal()) ### add lock into lru list ns: lustre-OST0000-osc-ffff8808198eb000 lock: ffff8808152de340/0x6d44b209b7d2fe79 lrc: 3/0,0 mode: PR/PR res: [0x2:0x0:0x0].0 rrc: 15 type: EXT [0->18446744073709551615] (req 1146880->1150975) flags: 0x10020000000000 nid: local remote: 0x6d44b209b7d2fe95 expref: -99 pid: 6294 timeout: 0 lvb_type: 1
00000020:00010000:5.0:1391207220.451197:0:6294:0:(cl_lock.c:151:cl_lock_trace0()) hold release lock: ffff88081acf4738@(3 ffff8807e6fd0080 1 5 0 2 0 2)(ffff88080d4a0c68/1/0) at cl_lock_hold_release():903
00000020:00010000:5.0:1391207220.451200:0:6294:0:(cl_lock.c:151:cl_lock_trace0()) hold release lock: ffff88081acf4738@(2 ffff8807e6fd0080 1 5 0 1 0 2)(ffff88080d4a0db8/1/2) at cl_lock_hold_release():903
00000080:00200000:5.0:1391207220.451207:0:6294:0:(vvp_io.c:1163:vvp_io_init()) [0x200000400:0x2:0x0] ignore/verify layout 1/0, layout version 0 restore needed 0
00000080:00200000:5.0:1391207220.452460:0:6294:0:(vvp_io.c:153:vvp_io_fini()) [0x200000400:0x2:0x0] ignore/verify layout 1/0, layout version 0 restore needed 0
00010000:00010000:5.0:1391207220.452485:0:6294:0:(ldlm_request.c:1127:ldlm_cli_cancel_local()) ### client-side cancel ns: lustre-OST0000-osc-ffff8808198eb000 lock: ffff8808152de340/0x6d44b209b7d2fe79 lrc: 3/0,0 mode: PR/PR res: [0x2:0x0:0x0].0 rrc: 15 type: EXT [0->18446744073709551615] (req 1146880->1150975) flags: 0x28400000000 nid: local remote: 0x6d44b209b7d2fe95 expref: -99 pid: 6294 timeout: 0 lvb_type: 1
00000020:00010000:5.0:1391207220.452493:0:6294:0:(cl_lock.c:151:cl_lock_trace0()) cancel lock: ffff88081acf4738@(3 ffff8807e6fd0080 2 5 0 0 0 1)(ffff8807e0547920/1/1) at cl_lock_cancel():1836
00000020:00010000:5.0:1391207220.452496:0:6294:0:(cl_lock.c:151:cl_lock_trace0()) delete lock: ffff88081acf4738@(3 ffff8807e6fd0080 2 5 0 0 0 1)(ffff8807e0547920/1/1) at cl_lock_delete():1783
00010000:00010000:5.0:1391207220.452523:0:6294:0:(ldlm_lock.c:375:ldlm_lock_destroy_internal()) ### destroy ns: lustre-OST0000-osc-ffff8808198eb000 lock: ffff8808152de340/0x6d44b209b7d2fe79 lrc: 2/0,0 mode: PR/PR res: [0x2:0x0:0x0].0 rrc: 15 type: EXT [0->18446744073709551615] (req 1146880->1150975) flags: 0x10869400000000 nid: local remote: 0x6d44b209b7d2fe95 expref: -99 pid: 6294 timeout: 0 lvb_type: 1
00010000:00010000:5.0:1391207220.452529:0:6294:0:(ldlm_lock.c:382:ldlm_lock_destroy_internal()) ### destroy ns: lustre-OST0000-osc-ffff8808198eb000 lock: ffff8808152de340/0x6d44b209b7d2fe79 lrc: 2/0,0 mode: PR/PR res: [0x2:0x0:0x0].0 rrc: 15 type: EXT [0->18446744073709551615] (req 1146880->1150975) flags: 0x10869400000000 nid: local remote: 0x6d44b209b7d2fe95 expref: -99 pid: 6294 timeout: 0 lvb_type: 1
00010000:00010000:5.0:1391207220.452534:0:6294:0:(ldlm_lock.c:2174:ldlm_lock_cancel()) ### destroyed ns: lustre-OST0000-osc-ffff8808198eb000 lock: ffff8808152de340/0x6d44b209b7d2fe79 lrc: 1/0,0 mode: PR/PR res: [0x2:0x0:0x0].0 rrc: 15 type: EXT [0->18446744073709551615] (req 1146880->1150975) flags: 0x10869400000000 nid: local remote: 0x6d44b209b7d2fe95 expref: -99 pid: 6294 timeout: 0 lvb_type: 1
00010000:00010000:5.0:1391207220.452544:0:6294:0:(ldlm_request.c:1144:ldlm_cli_cancel_local()) ### client-side cancel completed, rc = 549755813888 ns: lustre-OST0000-osc-ffff8808198eb000 lock: ffff8808152de340/0x6d44b209b7d2fe79 lrc: 1/0,0 mode: --/PR res: [0x2:0x0:0x0].0 rrc: 15 type: EXT [0->18446744073709551615] (req 1146880->1150975) flags: 0x869400000000 nid: local remote: 0x6d44b209b7d2fe95 expref: -99 pid: 6294 timeout: 0 lvb_type: 1
00010000:00010000:5.0:1391207220.452554:0:6294:0:(ldlm_request.c:1187:ldlm_cancel_pack()) ### packing ns: lustre-OST0000-osc-ffff8808198eb000 lock: ffff8808152de340/0x6d44b209b7d2fe79 lrc: 1/0,0 mode: --/PR res: [0x2:0x0:0x0].0 rrc: 14 type: EXT [0->18446744073709551615] (req 1146880->1150975) flags: 0x869400000000 nid: local remote: 0x6d44b209b7d2fe95 expref: -99 pid: 6294 timeout: 0 lvb_type: 1
00010000:00010000:5.0:1391207220.452559:0:6294:0:(ldlm_request.c:1191:ldlm_cancel_pack()) 1 locks packed
00010000:00010000:5.0:1391207220.452565:0:6294:0:(ldlm_lock.c:219:ldlm_lock_put()) ### final lock_put on destroyed lock, freeing it. ns: lustre-OST0000-osc-ffff8808198eb000 lock: ffff8808152de340/0x6d44b209b7d2fe79 lrc: 0/0,0 mode: --/PR res: [0x2:0x0:0x0].0 rrc: 14 type: EXT [0->18446744073709551615] (req 1146880->1150975) flags: 0x869400000000 nid: local remote: 0x6d44b209b7d2fe95 expref: -99 pid: 6294 timeout: 0 lvb_type: 1
00010000:00040000:5.0:1391207220.452570:0:6294:0:(ldlm_lock.c:222:ldlm_lock_put()) ASSERTION( (((( lock))->l_flags & (1ULL << 50)) != 0) ) failed:
00010000:00040000:5.0:1391207220.577549:0:6294:0:(ldlm_lock.c:222:ldlm_lock_put()) LBUG

Both line 375 and line 382 were printed, and the source is as follows:

361 int ldlm_lock_destroy_internal(struct ldlm_lock *lock)
362 {
363 	ENTRY;
364 
365 	if (lock->l_readers || lock->l_writers) {
366 		LDLM_ERROR(lock, "lock still has references");
367 		LBUG();
368 	}
369 
370 	if (!cfs_list_empty(&lock->l_res_link)) {
371 		LDLM_ERROR(lock, "lock still on resource");
372 		LBUG();
373 	}
374 
375 	LDLM_DEBUG(lock, "destroy");
376 	if (ldlm_is_destroyed(lock)) {
377 		LASSERT(cfs_list_empty(&lock->l_lru));
378 		EXIT;
379 		return 0;
380 	}
381 	ldlm_set_destroyed(lock);
382 	LDLM_DEBUG(lock, "destroy");

However, the destroyed flag still failed to be set. This is beyond my understanding, and we should escalate this issue. |
| Comment by Jinshan Xiong (Inactive) [ 31/Jan/14 ] |
|
Full log is here |
| Comment by Oleg Drokin [ 31/Jan/14 ] |
|
I think the prime suspect would be some place that updates lock flags without taking the lock spinlock. There are several such places.
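For illustration only (the call sites and flag macro names below are assumptions based on this ticket, not a confirmed trace): a plain |=/&= on the shared 64-bit l_flags word is a non-atomic read-modify-write, so an unlocked update can overwrite a bit that was just set under the lock:

	/* CPU0: ldlm_lock_cancel(), res lock held */    /* CPU1: LRU path, only ns_lock held */
	old0 = lock->l_flags;
	                                                 old1 = lock->l_flags;
	lock->l_flags = old0 | LDLM_FL_DESTROYED;        /* ldlm_set_destroyed() */
	                                                 lock->l_flags = old1 & ~LDLM_FL_SKIPPED;
	                                                 /* ldlm_clear_skipped(): the stale old1 is
	                                                  * written back and the DESTROYED bit is lost */

The subsequent final ldlm_lock_put() then sees a lock with a zero refcount but without the destroyed bit, which is exactly the assertion at ldlm_lock.c:222.
|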
| Comment by Jinshan Xiong (Inactive) [ 01/Feb/14 ] |
|
I confirmed that changing the DLM flag operations to bitops fixed the problem. Check the attachment for the concept patch. The real patch should be produced by modifying lustre/contrib/bit-masks/lustre_dlm_flags.tpl.

diff --git a/lustre/include/lustre_dlm.h b/lustre/include/lustre_dlm.h
index 98e03c9..bf4fb65 100644
--- a/lustre/include/lustre_dlm.h
+++ b/lustre/include/lustre_dlm.h
@@ -772,7 +772,7 @@ struct ldlm_lock {
* Lock state flags. Protected by lr_lock.
* \see lustre_dlm_flags.h where the bits are defined.
*/
- __u64 l_flags;
+ volatile __u64 l_flags;
/**
* Lock r/w usage counters.
diff --git a/lustre/include/lustre_dlm_flags.h b/lustre/include/lustre_dlm_flags.h
index 283546d..4be532f 100644
--- a/lustre/include/lustre_dlm_flags.h
+++ b/lustre/include/lustre_dlm_flags.h
@@ -57,91 +57,91 @@
/** extent, mode, or resource changed */
#define LDLM_FL_LOCK_CHANGED 0x0000000000000001ULL // bit 0
-#define ldlm_is_lock_changed(_l) LDLM_TEST_FLAG(( _l), 1ULL << 0)
-#define ldlm_set_lock_changed(_l) LDLM_SET_FLAG(( _l), 1ULL << 0)
-#define ldlm_clear_lock_changed(_l) LDLM_CLEAR_FLAG((_l), 1ULL << 0)
+#define ldlm_is_lock_changed(_l) LDLM_TEST_FLAG(( _l), 0)
+#define ldlm_set_lock_changed(_l) LDLM_SET_FLAG(( _l), 0)
+#define ldlm_clear_lock_changed(_l) LDLM_CLEAR_FLAG((_l), 0)
......
......
/** test for ldlm_lock flag bit set */
-#define LDLM_TEST_FLAG(_l, _b) (((_l)->l_flags & (_b)) != 0)
+#define LDLM_TEST_FLAG(_l, _b) test_bit(_b, (volatile unsigned long *)&(_l)->l_flags)
/** multi-bit test: are any of mask bits set? */
#define LDLM_HAVE_MASK(_l, _m) (((_l)->l_flags & LDLM_FL_##_m##_MASK) != 0)
/** set a ldlm_lock flag bit */
-#define LDLM_SET_FLAG(_l, _b) ((_l)->l_flags |= (_b))
+#define LDLM_SET_FLAG(_l, _b) set_bit(_b, (volatile unsigned long *)&(_l)->l_flags)
/** clear a ldlm_lock flag bit */
-#define LDLM_CLEAR_FLAG(_l, _b) ((_l)->l_flags &= ~(_b))
+#define LDLM_CLEAR_FLAG(_l, _b) clear_bit(_b, (volatile unsigned long *)&(_l)->l_flags)
/** @} subgroup */
/** @} group */
|
| Comment by Andreas Dilger [ 01/Feb/14 ] |
|
Jinshan, according to the comment, the flags should be protected by lr_lock. Are there flag changes not protected by this lock? Also, I have some concern that using set_bit() will not set the same bits as the Lustre network protocol on all architectures. Do you know if there is a specific bit ordering used in all cases? |
| Comment by Jinshan Xiong (Inactive) [ 01/Feb/14 ] |
|
The flags should be protected by ldlm_lock::l_lock. When I saw this issue, my first reaction was to check whether there is any unprotected writing to l_flags, and the only place is:

/**
 * Removes LDLM lock \a lock from LRU. Assumes LRU is already locked.
 */
int ldlm_lock_remove_from_lru_nolock(struct ldlm_lock *lock)
{
	int rc = 0;

	if (!cfs_list_empty(&lock->l_lru)) {
		struct ldlm_namespace *ns = ldlm_lock_to_ns(lock);

		LASSERT(lock->l_resource->lr_type != LDLM_FLOCK);
		cfs_list_del_init(&lock->l_lru);
		ldlm_clear_skipped(lock);
		LASSERT(ns->ns_nr_unused > 0);
		ns->ns_nr_unused--;
		rc = 1;
	}
	return rc;
}

However, it is protected by ns_lock, and if the lock is being destroyed, it must already have been taken out of the LRU list.
|
| Comment by Li Xi (Inactive) [ 02/Feb/14 ] |
|
Hi Jinshan, according to your test results, this really looks like a race problem. Would you please try the following patch, which checks that the lock is always held when clearing/setting the flag? |
| Comment by Jinshan Xiong (Inactive) [ 02/Feb/14 ] |
|
Hi Li Xi, that looks true; somehow I read the code wrong. The DLM lock gets the destroyed flag set before it is taken out of the LRU, so the ldlm_clear_skipped() in ldlm_lock_remove_from_lru_nolock() is indeed the root cause of this problem. Your patch would cause a deadlock, because we should take the res lock first and then ns_lock. My idea for fixing this is to move ldlm_clear_skipped() out of ldlm_lock_remove_from_lru_nolock() and into ldlm_lock_add_to_lru_nolock(). I think that will fix the problem as well.
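A minimal sketch of that idea (conceptual only; surrounding lines are elided as "...", the body of ldlm_lock_add_to_lru_nolock() is assumed, and the actual fix is the patch reviewed later in this ticket): clear the SKIPPED bit while the lock is being added to the LRU, where the decref path is expected to still hold the res lock, instead of while removing it under only ns_lock:

--- a/lustre/ldlm/ldlm_lock.c
+++ b/lustre/ldlm/ldlm_lock.c
 int ldlm_lock_remove_from_lru_nolock(struct ldlm_lock *lock)
 {
 	...
 		cfs_list_del_init(&lock->l_lru);
-		ldlm_clear_skipped(lock);
 		LASSERT(ns->ns_nr_unused > 0);
 	...
 }

 static void ldlm_lock_add_to_lru_nolock(struct ldlm_lock *lock)
 {
 	...
+	/* the decref path is expected to hold the res lock here, so this
+	 * read-modify-write of l_flags cannot race with ldlm_set_destroyed() */
+	ldlm_clear_skipped(lock);
 	...
 }
|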
| Comment by Li Xi (Inactive) [ 02/Feb/14 ] |
|
Ah, yes, the correct lock order is lock->l_lock, res->lr_lock, ns->ns_lock. Thanks for pointing that error out. We are able to reproduce the problem steadily too, so if you push a patch, we can check whether it helps. I suspect there is more than one place where lock->l_flags is changed without the protection of lock->l_lock. |
| Comment by Patrick Farrell (Inactive) [ 12/Feb/14 ] |
|
Li, Jinshan - Are either of you able to share how you're reproducing this? |
| Comment by Jinshan Xiong (Inactive) [ 12/Feb/14 ] |
|
The root cause of this issue is pretty clear. Bobijam, can you please create a patch for this? |
| Comment by Zhenyu Xu [ 13/Feb/14 ] |
|
Updated patch: http://review.whamcloud.com/#/c/8772/ |
| Comment by Jodi Levi (Inactive) [ 21/Feb/14 ] |
|
Patch landed to Master. Please reopen ticket if more work is needed. |
| Comment by Bob Glossman (Inactive) [ 21/Feb/14 ] |
|
backport to b2_5: |