[LU-1188] mkwrite issue : ASSERTION(!cl_env_info(env)->clt_counters[CNL_TOP].ctc_nr_locks_acquired) Created: 05/Mar/12  Updated: 23/Jun/12  Resolved: 23/Jun/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.0, Lustre 2.2.0
Fix Version/s: Lustre 2.3.0

Type: Bug Priority: Major
Reporter: Alexey Lyashkov Assignee: Jinshan Xiong (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

RHEL6/64, 8CPU.


Attachments: File kernel-config    
Severity: 3
Rank (Obsolete): 4246

 Description   

Mar 5 16:54:29 rhel6-64 kernel: Lustre: DEBUG MARKER: test message ID 29780 32002
Lustre: DEBUG MARKER: == sanity test 61: mmap() writes don't make sync hang ================== 16:54:30 (1330959270)
Mar 5 16:54:30 rhel6-64 kernel: Lustre: DEBUG MARKER: == sanity test 61: mmap() writes don't make sync hang ================== 16:54:30 (1330959270)
Lustre: DEBUG MARKER: cancel_lru_locks osc start
Mar 5 16:54:30 rhel6-64 kernel: Lustre: DEBUG MARKER: cancel_lru_locks osc start
Lustre: DEBUG MARKER: cancel_lru_locks osc stop
Mar 5 16:54:30 rhel6-64 kernel: Lustre: DEBUG MARKER: cancel_lru_locks osc stop
LustreError: 24138:0:(cl_io.c:523:cl_io_unlock()) ASSERTION(!cl_env_info(env)->clt_counters[CNL_TOP].ctc_nr_locks_acquired) failed
LustreError: 24138:0:(cl_io.c:523:cl_io_unlock()) LBUG

Mar 5 16:54:30 rhel6-64 kernel: LustreError: 24138:0:(cl_io.c:523:cl_io_unlock()) LBUG
Mar 5 16:54:30 rhel6-64 kernel: Pid: 24138, comm: multiop
Mar 5 16:54:30 rhel6-64 kernel:
Mar 5 16:54:30 rhel6-64 kernel: Call Trace:
Mar 5 16:54:30 rhel6-64 kernel: [<ffffffffa04e1865>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
Mar 5 16:54:30 rhel6-64 kernel: [<ffffffffa04e1ea5>] lbug_with_loc+0x75/0xe0 [libcfs]
Mar 5 16:54:30 rhel6-64 kernel: [<ffffffffa04ecf46>] libcfs_assertion_failed+0x66/0x70 [libcfs]
Mar 5 16:54:30 rhel6-64 kernel: [<ffffffffa0655a0d>] cl_io_unlock+0x23d/0x270 [obdclass]
Mar 5 16:54:30 rhel6-64 kernel: [<ffffffffa0651b80>] ? cl_io_end+0x60/0x120 [obdclass]
Mar 5 16:54:30 rhel6-64 kernel: [<ffffffffa0656769>] cl_io_loop+0x129/0x1c0 [obdclass]
Mar 5 16:54:30 rhel6-64 kernel: [<ffffffffa0b4c346>] ll_page_mkwrite+0x96/0x7d0 [lustre]
Mar 5 16:54:30 rhel6-64 kernel: [<ffffffffa0b4bfb9>] ? ll_fault+0x189/0x480 [lustre]
Mar 5 16:54:30 rhel6-64 kernel: [<ffffffffa04f1ce2>] ? cfs_hash_rw_lock+0x12/0x30 [libcfs]
Mar 5 16:54:30 rhel6-64 kernel: [<ffffffff81151e14>] __do_fault+0xd4/0x4f0
Mar 5 16:54:30 rhel6-64 kernel: [<ffffffff8151f2ab>] ? _spin_unlock+0x2b/0x40
Mar 5 16:54:30 rhel6-64 kernel: [<ffffffff811522c0>] handle_pte_fault+0x90/0xa90
Mar 5 16:54:30 rhel6-64 kernel: [<ffffffff81156398>] ? vma_link+0x58/0xf0
Mar 5 16:54:30 rhel6-64 kernel: [<ffffffff81152ea4>] handle_mm_fault+0x1e4/0x2b0
Mar 5 16:54:30 rhel6-64 kernel: [<ffffffff81043c23>] __do_page_fault+0x163/0x4e0
Mar 5 16:54:30 rhel6-64 kernel: [<ffffffff8115905a>] ? do_mmap_pgoff+0x33a/0x380
Mar 5 16:54:30 rhel6-64 kernel: [<ffffffff81522cae>] do_page_fault+0x3e/0xa0
Mar 5 16:54:30 rhel6-64 kernel: [<ffffffff8151fde5>] page_fault+0x25/0x30
Mar 5 16:54:30 rhel6-64 kernel:



 Comments   
Comment by Alexey Lyashkov [ 05/Mar/12 ]

I have the full log file but cannot attach it because it is ~24 MB. If someone starts working on this, ping me on Skype and I will send the log file.

Comment by Jinshan Xiong (Inactive) [ 05/Mar/12 ]

Hi Shadow, can you please send the log file to my email address at jinshan.xiong@whamcloud.com, or you can upload it to ftp.whamcloud.com/uploads, thanks.

Comment by Alexey Lyashkov [ 06/Mar/12 ]

ftp> put lu_1188-lustre-log.1330960026.26106.gz
local: lu_1188-lustre-log.1330960026.26106.gz remote: lu_1188-lustre-log.1330960026.26106.gz

It will be ready in ~15 minutes. Ask me if you need anything else; I am able to reproduce this issue on every sanity run.

Comment by Peter Jones [ 12/Mar/12 ]

Jinshan is looking into this one

Comment by Alexey Lyashkov [ 13/Mar/12 ]

Peter,

Why isn't this a blocker? It is a real regression between 2.1 and 2.2 and is hit on every sanity run.

Comment by Jinshan Xiong (Inactive) [ 13/Mar/12 ]

Shadow, can you please tell me the specific configuration of your kernel?

Comment by Alexey Lyashkov [ 13/Mar/12 ]

A generic kernel config with debug options enabled, just to find lock issues.

Comment by Alexey Lyashkov [ 13/Mar/12 ]

One note: my test environment has 8 CPUs.

Comment by Alexey Lyashkov [ 13/Mar/12 ]

Jay, would it help you if I provided a crash dump?

Comment by Jinshan Xiong (Inactive) [ 14/Mar/12 ]

No, I don't need it, thanks for offering.

Comment by Johann Lombardi (Inactive) [ 30/Mar/12 ]

This problem is related to CONFIG_LOCKDEP enabled in the kernel and Jinshan is working on a patch.
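
For readers unfamiliar with this kind of failure, here is a toy model of how unbalanced lock bookkeeping can trip a "counter must be zero at unlock" check like the one in the assertion. It is not Lustre code. All names (toy_counters, toy_lock_peek, and so on) are hypothetical, and the mechanism shown is only one plausible reading of the patch title "acquire lockdep for cl_lock_peek()": a path that obtains a lock without recording it, followed by a release that does record it, leaves the counter non-zero.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-thread counters named in the assertion. */
struct toy_counters {
    int nr_locks_acquired;
};

/* Normal acquisition path: records one acquired lock. */
void toy_lock_acquire(struct toy_counters *c)
{
    c->nr_locks_acquired++;
}

/*
 * "Peek" path that picks up an already-existing lock.
 * record == false models a path that skips the bookkeeping (assumption);
 * record == true models the path after bookkeeping has been added.
 */
void toy_lock_peek(struct toy_counters *c, bool record)
{
    if (record)
        c->nr_locks_acquired++;
}

/* Release path: always undoes one recorded acquisition. */
void toy_lock_release(struct toy_counters *c)
{
    c->nr_locks_acquired--;
}

/* Same shape as the failed check: the counter must be zero at unlock time. */
bool toy_io_unlock_ok(const struct toy_counters *c)
{
    return c->nr_locks_acquired == 0;
}
```

In this model, a peek with record == false followed by a release leaves the counter at -1, so toy_io_unlock_ok() returns false. The same sequence with record == true returns the counter to zero.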

Comment by Build Master (Inactive) [ 30/Mar/12 ]

Integrated in lustre-reviews » x86_64,client,el6,inkernel #4596
LU-1188 clio: acquire lockdep for cl_lock_peek() (Revision 08038971ff2253fa23305b94d74b0d683dc968fc)

Result = SUCCESS
Jinshan Xiong : 08038971ff2253fa23305b94d74b0d683dc968fc
Files :

  • lustre/obdclass/cl_lock.c
Comment by Build Master (Inactive) [ 30/Mar/12 ]

Integrated in lustre-reviews » x86_64,client,el5,inkernel #4596
LU-1188 clio: acquire lockdep for cl_lock_peek() (Revision 08038971ff2253fa23305b94d74b0d683dc968fc)

Result = SUCCESS
Jinshan Xiong : 08038971ff2253fa23305b94d74b0d683dc968fc
Files :

  • lustre/obdclass/cl_lock.c
Comment by Build Master (Inactive) [ 30/Mar/12 ]

Integrated in lustre-reviews » i686,client,el6,inkernel #4596
LU-1188 clio: acquire lockdep for cl_lock_peek() (Revision 08038971ff2253fa23305b94d74b0d683dc968fc)

Result = SUCCESS
Jinshan Xiong : 08038971ff2253fa23305b94d74b0d683dc968fc
Files :

  • lustre/obdclass/cl_lock.c
Comment by Build Master (Inactive) [ 30/Mar/12 ]

Integrated in lustre-reviews » i686,server,el5,inkernel #4596
LU-1188 clio: acquire lockdep for cl_lock_peek() (Revision 08038971ff2253fa23305b94d74b0d683dc968fc)

Result = SUCCESS
Jinshan Xiong : 08038971ff2253fa23305b94d74b0d683dc968fc
Files :

  • lustre/obdclass/cl_lock.c
Comment by Build Master (Inactive) [ 30/Mar/12 ]

Integrated in lustre-reviews » x86_64,server,el6,inkernel #4596
LU-1188 clio: acquire lockdep for cl_lock_peek() (Revision 08038971ff2253fa23305b94d74b0d683dc968fc)

Result = SUCCESS
Jinshan Xiong : 08038971ff2253fa23305b94d74b0d683dc968fc
Files :

  • lustre/obdclass/cl_lock.c
Comment by Build Master (Inactive) [ 30/Mar/12 ]

Integrated in lustre-reviews » i686,server,el6,inkernel #4596
LU-1188 clio: acquire lockdep for cl_lock_peek() (Revision 08038971ff2253fa23305b94d74b0d683dc968fc)

Result = SUCCESS
Jinshan Xiong : 08038971ff2253fa23305b94d74b0d683dc968fc
Files :

  • lustre/obdclass/cl_lock.c
Comment by Build Master (Inactive) [ 30/Mar/12 ]

Integrated in lustre-reviews » x86_64,server,el5,inkernel #4596
LU-1188 clio: acquire lockdep for cl_lock_peek() (Revision 08038971ff2253fa23305b94d74b0d683dc968fc)

Result = SUCCESS
Jinshan Xiong : 08038971ff2253fa23305b94d74b0d683dc968fc
Files :

  • lustre/obdclass/cl_lock.c
Comment by Build Master (Inactive) [ 30/Mar/12 ]

Integrated in lustre-reviews » i686,client,el5,inkernel #4596
LU-1188 clio: acquire lockdep for cl_lock_peek() (Revision 08038971ff2253fa23305b94d74b0d683dc968fc)

Result = SUCCESS
Jinshan Xiong : 08038971ff2253fa23305b94d74b0d683dc968fc
Files :

  • lustre/obdclass/cl_lock.c
Comment by Build Master (Inactive) [ 31/Mar/12 ]

Integrated in lustre-reviews » i686,client,el6,inkernel #4604
LU-1188 clio: acquire lockdep for cl_lock_peek() (Revision 168a535a8582cd56b2832fd689bdad10219c40ca)

Result = SUCCESS
Jinshan Xiong : 168a535a8582cd56b2832fd689bdad10219c40ca
Files :

  • lustre/obdclass/cl_lock.c
Comment by Build Master (Inactive) [ 31/Mar/12 ]

Integrated in lustre-reviews » x86_64,client,el5,inkernel #4604
LU-1188 clio: acquire lockdep for cl_lock_peek() (Revision 168a535a8582cd56b2832fd689bdad10219c40ca)

Result = SUCCESS
Jinshan Xiong : 168a535a8582cd56b2832fd689bdad10219c40ca
Files :

  • lustre/obdclass/cl_lock.c
Comment by Build Master (Inactive) [ 31/Mar/12 ]

Integrated in lustre-reviews » x86_64,client,el6,inkernel #4604
LU-1188 clio: acquire lockdep for cl_lock_peek() (Revision 168a535a8582cd56b2832fd689bdad10219c40ca)

Result = SUCCESS
Jinshan Xiong : 168a535a8582cd56b2832fd689bdad10219c40ca
Files :

  • lustre/obdclass/cl_lock.c
Comment by Build Master (Inactive) [ 31/Mar/12 ]

Integrated in lustre-reviews » i686,server,el5,inkernel #4604
LU-1188 clio: acquire lockdep for cl_lock_peek() (Revision 168a535a8582cd56b2832fd689bdad10219c40ca)

Result = SUCCESS
Jinshan Xiong : 168a535a8582cd56b2832fd689bdad10219c40ca
Files :

  • lustre/obdclass/cl_lock.c
Comment by Build Master (Inactive) [ 31/Mar/12 ]

Integrated in lustre-reviews » i686,server,el6,inkernel #4604
LU-1188 clio: acquire lockdep for cl_lock_peek() (Revision 168a535a8582cd56b2832fd689bdad10219c40ca)

Result = SUCCESS
Jinshan Xiong : 168a535a8582cd56b2832fd689bdad10219c40ca
Files :

  • lustre/obdclass/cl_lock.c
Comment by Build Master (Inactive) [ 31/Mar/12 ]

Integrated in lustre-reviews » x86_64,server,el5,inkernel #4604
LU-1188 clio: acquire lockdep for cl_lock_peek() (Revision 168a535a8582cd56b2832fd689bdad10219c40ca)

Result = SUCCESS
Jinshan Xiong : 168a535a8582cd56b2832fd689bdad10219c40ca
Files :

  • lustre/obdclass/cl_lock.c
Comment by Build Master (Inactive) [ 31/Mar/12 ]

Integrated in lustre-reviews » i686,client,el5,inkernel #4604
LU-1188 clio: acquire lockdep for cl_lock_peek() (Revision 168a535a8582cd56b2832fd689bdad10219c40ca)

Result = SUCCESS
Jinshan Xiong : 168a535a8582cd56b2832fd689bdad10219c40ca
Files :

  • lustre/obdclass/cl_lock.c
Comment by Build Master (Inactive) [ 31/Mar/12 ]

Integrated in lustre-reviews » x86_64,server,el6,inkernel #4604
LU-1188 clio: acquire lockdep for cl_lock_peek() (Revision 168a535a8582cd56b2832fd689bdad10219c40ca)

Result = SUCCESS
Jinshan Xiong : 168a535a8582cd56b2832fd689bdad10219c40ca
Files :

  • lustre/obdclass/cl_lock.c
Comment by Cory Spitz [ 16/May/12 ]

Fix under review at http://review.whamcloud.com/#change,2422
