[LU-1306] LBUG at (ldlm_lock.c:213:ldlm_lock_add_to_lru_nolock()) ASSERTION(lock->l_resource->lr_type != LDLM_FLOCK) failed Created: 11/Apr/12 Updated: 06/Nov/13 Resolved: 06/Nov/13 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.3.0, Lustre 1.8.9 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Andriy Skulysh | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 4630 |
| Description |
|
The following bug occurred: It looks like the problem is in the following race: the ldlm_cb thread calls ldlm_run_cp_ast_work() : mds_intent_policy(), see bug 14225 */ while the original lock wait thread receives a signal: |
| Comments |
| Comment by Andriy Skulysh [ 11/Apr/12 ] |
| Comment by Peter Jones [ 08/May/12 ] |
|
Landed for 2.3 |
| Comment by Cory Spitz [ 09/May/12 ] |
|
Can this be pushed to b1_8? |
| Comment by Andriy Skulysh [ 09/May/12 ] |
|
The bug was originally detected on b1_8. The patch can also be applied to 1.8. |
| Comment by Andriy Skulysh [ 14/May/12 ] |
|
Patch for b1_8: http://review.whamcloud.com/2727 |
| Comment by Iurii Golovach (Inactive) [ 26/Jul/12 ] |
|
Since there have been no updates in the last few months, a new ticket was created to track landing into 1.8: |
| Comment by Cory Spitz [ 10/Oct/12 ] |
|
Change #2727 has landed on b1_8. |
| Comment by Nathan Rutman [ 21/Nov/12 ] |
|
Xyratex-bug-id: MRP-420 |
| Comment by Sarah Liu [ 14/Jan/13 ] |
|
Hit this LBUG again in the POSIX test during interop testing between a 2.3.0 server and a 2.4 client. The client runs build lustre-master #1142.
0:11:40:Lustre: DEBUG MARKER: Run POSIX test against lustre filesystem
20:20:53:LustreError: 12733:0:(ldlm_lock.c:1570:ldlm_fill_lvb()) ### Unexpected LVB type ns: lustre-MDT0000-mdc-ffff880061663400 lock: ffff88001f87b200/0x3aa7bfeb697ea484 lrc: 5/0,1 mode: --/PW res: 8589939620/4400 rrc: 4 type: FLK pid: 386 [10->29] flags: 0x0 nid: local remote: 0x3ba20de103a0d632 expref: -99 pid: 386 timeout: 0
20:21:35:LustreError: 386:0:(ldlm_lock.c:298:ldlm_lock_add_to_lru_nolock()) ASSERTION( lock->l_resource->lr_type != LDLM_FLOCK ) failed:
20:21:35:LustreError: 386:0:(ldlm_lock.c:298:ldlm_lock_add_to_lru_nolock()) LBUG
20:21:35:Pid: 386, comm: T.fcntl
20:21:35:
20:21:35:Call Trace:
20:21:35: [<ffffffffa0b63905>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
20:21:35: [<ffffffffa0b63f17>] lbug_with_loc+0x47/0xb0 [libcfs]
20:21:35: [<ffffffffa04eec02>] ldlm_lock_add_to_lru_nolock+0x112/0x120 [ptlrpc]
20:21:35: [<ffffffffa04ef023>] ldlm_lock_add_to_lru+0x43/0x120 [ptlrpc]
20:21:35: [<ffffffffa04f4b78>] ldlm_lock_decref_internal+0x338/0xad0 [ptlrpc]
20:21:35: [<ffffffffa05008fb>] failed_lock_cleanup+0x8b/0x220 [ptlrpc]
20:21:35: [<ffffffffa0500bbf>] ldlm_cli_enqueue_fini+0x12f/0xec0 [ptlrpc]
20:21:35: [<ffffffffa0b64bae>] ? cfs_free+0xe/0x10 [libcfs]
20:21:35: [<ffffffffa0501cfd>] ldlm_cli_enqueue+0x3ad/0x790 [ptlrpc]
20:21:35: [<ffffffffa050e160>] ? ldlm_flock_completion_ast+0x0/0xb40 [ptlrpc]
20:21:35: [<ffffffffa04341b4>] mdc_enqueue+0x694/0x1510 [mdc]
20:21:35: [<ffffffffa065227c>] lmv_enqueue+0x40c/0x1a20 [lmv]
20:21:35: [<ffffffffa07d1e05>] ll_file_flock+0x635/0x9f0 [lustre]
20:21:35: [<ffffffffa050e160>] ? ldlm_flock_completion_ast+0x0/0xb40 [ptlrpc]
20:21:35: [<ffffffff811c78c3>] vfs_lock_file+0x23/0x40
20:21:35: [<ffffffff811c7b17>] fcntl_setlk+0x177/0x320
20:21:35: [<ffffffff8118dd57>] sys_fcntl+0x197/0x530
20:21:35: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
20:21:35:
20:21:35:Kernel panic - not syncing: LBUG
20:21:35:Pid: 386, comm: T.fcntl Not tainted 2.6.32-279.14.1.el6.x86_64 #1
20:21:35:Call Trace:
20:21:35: [<ffffffff814fd98a>] ? panic+0xa0/0x168
20:21:35: [<ffffffffa0b63f6b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
20:21:35: [<ffffffffa04eec02>] ? ldlm_lock_add_to_lru_nolock+0x112/0x120 [ptlrpc]
20:21:35: [<ffffffffa04ef023>] ? ldlm_lock_add_to_lru+0x43/0x120 [ptlrpc]
20:21:35: [<ffffffffa04f4b78>] ? ldlm_lock_decref_internal+0x338/0xad0 [ptlrpc]
20:21:35: [<ffffffffa05008fb>] ? failed_lock_cleanup+0x8b/0x220 [ptlrpc]
20:21:35: [<ffffffffa0500bbf>] ? ldlm_cli_enqueue_fini+0x12f/0xec0 [ptlrpc]
20:21:35: [<ffffffffa0b64bae>] ? cfs_free+0xe/0x10 [libcfs]
20:21:35: [<ffffffffa0501cfd>] ? ldlm_cli_enqueue+0x3ad/0x790 [ptlrpc]
20:21:35: [<ffffffffa050e160>] ? ldlm_flock_completion_ast+0x0/0xb40 [ptlrpc]
20:21:35: [<ffffffffa04341b4>] ? mdc_enqueue+0x694/0x1510 [mdc]
20:21:35: [<ffffffffa065227c>] ? lmv_enqueue+0x40c/0x1a20 [lmv]
20:21:35: [<ffffffffa07d1e05>] ? ll_file_flock+0x635/0x9f0 [lustre]
20:21:36: [<ffffffffa050e160>] ? ldlm_flock_completion_ast+0x0/0xb40 [ptlrpc]
20:21:36: [<ffffffff811c78c3>] ? vfs_lock_file+0x23/0x40
20:21:36: [<ffffffff811c7b17>] ? fcntl_setlk+0x177/0x320
20:21:36: [<ffffffff8118dd57>] ? sys_fcntl+0x197/0x530
20:21:36: [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
20:21:36:Initializing cgroup subsys cpuset
20:21:36:Initializing cgroup subsys cpu |
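For readers unfamiliar with the invariant behind this assertion, the call path in the trace (failed_lock_cleanup → ldlm_lock_decref_internal → ldlm_lock_add_to_lru → ldlm_lock_add_to_lru_nolock) can be modeled with a small standalone sketch. Everything below is a toy: the `toy_*` names, the struct layout, and the "destroy instead of parking on the LRU" guard are assumptions made for illustration. It is not Lustre code and not the patch that landed for this ticket.

```c
#include <stdbool.h>

/* Toy stand-ins for lock resource types; hypothetical, not the Lustre enum. */
enum toy_lock_type { TOY_PLAIN, TOY_EXTENT, TOY_FLOCK, TOY_IBITS };

/* Minimal lock state needed to show the invariant. */
struct toy_lock {
    enum toy_lock_type type;
    int  refs;       /* references still held by callers */
    bool on_lru;     /* parked on the unused-lock LRU */
    bool destroyed;  /* torn down without ever reaching the LRU */
};

/*
 * Models the check in the assertion: a flock-type lock must never be
 * placed on the LRU. Returns false where the real code would LBUG.
 */
static bool toy_add_to_lru(struct toy_lock *lk)
{
    if (lk->type == TOY_FLOCK)
        return false;
    lk->on_lru = true;
    return true;
}

/*
 * Unguarded release: when the last reference is dropped, the lock is
 * always handed to the LRU. For a flock lock this trips the invariant,
 * which is the shape of the failure in the trace.
 */
static bool toy_decref_unguarded(struct toy_lock *lk)
{
    if (--lk->refs > 0)
        return true;
    return toy_add_to_lru(lk);
}

/*
 * Guarded release (one possible approach, assumed for illustration):
 * flock locks are destroyed on last release instead of being cached.
 */
static bool toy_decref_guarded(struct toy_lock *lk)
{
    if (--lk->refs > 0)
        return true;
    if (lk->type == TOY_FLOCK) {
        lk->destroyed = true;
        return true;
    }
    return toy_add_to_lru(lk);
}
```

In this model, dropping the last reference on a `TOY_FLOCK` lock through `toy_decref_unguarded()` returns false (the analogue of the LBUG), while `toy_decref_guarded()` succeeds and leaves the lock off the LRU. Other lock types reach the LRU on either path.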
| Comment by Sarah Liu [ 15/Jan/13 ] |
|
Another instance was seen with a 2.1.4 server vs. a 2.4 client: |
| Comment by Andreas Dilger [ 06/Nov/13 ] |
|
Patches were landed for b1_8 and master. |