[LU-811] possible recursive locking detected on client Created: 01/Nov/11  Updated: 04/Jun/12  Resolved: 04/Jun/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Ned Bass Assignee: Jinshan Xiong (Inactive)
Resolution: Cannot Reproduce Votes: 0
Labels: None
Environment:

RHEL 6.2
kernel 2.6.32-207.1chaos.ch5.x86_64.debug


Issue Links:
Related
is related to LU-619 Recursive locking in ldlm_lock_change... Resolved
Severity: 3
Rank (Obsolete): 6531

 Description   

Running a Lustre 2.1 client on a debug kernel, we got the following warning from the lock validator (lockdep). I suspect this may be a false alarm, since we didn't deadlock and the cl_lockset comments suggest that holding multiple locks is unavoidable in some cases. Reporting it here just in case this is a real bug.

=============================================
[ INFO: possible recursive locking detected ]
2.6.32-207.1chaos.ch5.x86_64.debug #1
---------------------------------------------
sh/5936 is trying to acquire lock:
 (EXT){+.+.+.}, at: [<ffffffffa05f18c0>] cl_lock_lockdep_acquire+0x0/0x50 [obdclass]

but task is already holding lock:
 (EXT){+.+.+.}, at: [<ffffffffa05f18c0>] cl_lock_lockdep_acquire+0x0/0x50 [obdclass]

other info that might help us debug this:
2 locks held by sh/5936:
 #0:  (&lli->lli_trunc_sem){.+.+.+}, at: [<ffffffffa08ca118>] ll_file_io_generic+0x2c8/0x580 [lustre]
 #1:  (EXT){+.+.+.}, at: [<ffffffffa05f18c0>] cl_lock_lockdep_acquire+0x0/0x50 [obdclass]

stack backtrace:
Pid: 5936, comm: sh Not tainted 2.6.32-207.1chaos.ch5.x86_64.debug #1
Call Trace:
 [<ffffffff810af570>] ? __lock_acquire+0x11c0/0x1570
 [<ffffffff81013753>] ? native_sched_clock+0x13/0x60
 [<ffffffff81012c29>] ? sched_clock+0x9/0x10
 [<ffffffff8109d37d>] ? sched_clock_cpu+0xcd/0x110
 [<ffffffff810af9c4>] ? lock_acquire+0xa4/0x120
 [<ffffffffa05f18c0>] ? cl_lock_lockdep_acquire+0x0/0x50 [obdclass]
 [<ffffffffa05f18fd>] ? cl_lock_lockdep_acquire+0x3d/0x50 [obdclass]
 [<ffffffffa05f18c0>] ? cl_lock_lockdep_acquire+0x0/0x50 [obdclass]
 [<ffffffffa05f68e9>] ? cl_lock_request+0x1e9/0x200 [obdclass]
 [<ffffffff810adc9d>] ? trace_hardirqs_on_caller+0x14d/0x190
 [<ffffffffa0918d40>] ? cl_glimpse_lock+0x180/0x390 [lustre]
 [<ffffffffa08df942>] ? ll_inode_size_unlock+0x52/0xf0 [lustre]
 [<ffffffff8151fabb>] ? _spin_unlock+0x2b/0x40
 [<ffffffffa091cda6>] ? ccc_prep_size+0x1c6/0x280 [lustre]
 [<ffffffff810adc9d>] ? trace_hardirqs_on_caller+0x14d/0x190
 [<ffffffffa091b691>] ? cl2ccc_io+0x21/0x80 [lustre]
 [<ffffffffa0921faf>] ? vvp_io_read_start+0xbf/0x3d0 [lustre]
 [<ffffffffa05f3525>] ? cl_wait+0xb5/0x290 [obdclass]
 [<ffffffffa05f6ec8>] ? cl_io_start+0x68/0x170 [obdclass]
 [<ffffffffa05fb930>] ? cl_io_loop+0x110/0x1c0 [obdclass]
 [<ffffffffa08ca217>] ? ll_file_io_generic+0x3c7/0x580 [lustre]
 [<ffffffffa04c9c22>] ? cfs_hash_rw_unlock+0x12/0x30 [libcfs]
 [<ffffffffa04c8754>] ? cfs_hash_dual_bd_unlock+0x34/0x60 [libcfs]
 [<ffffffffa05ea8f9>] ? cl_env_get+0x29/0x350 [obdclass]
 [<ffffffffa08cf37c>] ? ll_file_aio_read+0x13c/0x310 [lustre]
 [<ffffffffa05eaa6d>] ? cl_env_get+0x19d/0x350 [obdclass]
 [<ffffffff81042c54>] ? __do_page_fault+0x244/0x4e0
 [<ffffffffa08cf6c1>] ? ll_file_read+0x171/0x310 [lustre]
 [<ffffffff8109d37d>] ? sched_clock_cpu+0xcd/0x110
 [<ffffffff810aa27d>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff8109d4af>] ? cpu_clock+0x6f/0x80
 [<ffffffff81192875>] ? vfs_read+0xb5/0x1a0
 [<ffffffff811929b1>] ? sys_read+0x51/0x90
 [<ffffffff8100b0b2>] ? system_call_fastpath+0x16/0x1b
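
For context on why lockdep complains here even though nothing deadlocked: lockdep tracks lock classes rather than lock instances, so holding two EXT cl_locks at once (cl_glimpse_lock() requests a second EXT lock while one is already held for the read) looks to it exactly like re-acquiring the same lock. The kernel's usual way of declaring such same-class nesting as intentional is the *_nested locking primitives. The following is a minimal, illustrative sketch of that pattern using plain mutexes in a toy module, not Lustre's cl_lock code; the demo_obj structure is made up, and it assumes a kernel built with CONFIG_PROVE_LOCKING.

 /*
  * Illustrative sketch only -- not Lustre code.  lockdep groups locks by
  * class, not by instance: both mutexes below share one class because
  * they are initialized from a single mutex_init() call site, so taking
  * the second one while the first is held is reported as "possible
  * recursive locking" even though the instances differ.  The
  * mutex_lock_nested() annotation declares the nesting as intentional.
  * Assumes a kernel built with CONFIG_PROVE_LOCKING; demo_obj is made up.
  */
 #include <linux/module.h>
 #include <linux/mutex.h>
 #include <linux/lockdep.h>

 struct demo_obj {
         struct mutex lock;
 };

 static struct demo_obj objs[2];

 static int __init demo_init(void)
 {
         int i;

         /* One init call site => one lockdep class for both mutexes. */
         for (i = 0; i < 2; i++)
                 mutex_init(&objs[i].lock);

         mutex_lock(&objs[0].lock);
         /*
          * A plain mutex_lock(&objs[1].lock) here would trigger the same
          * splat as above; the subclass tells lockdep that this
          * same-class nesting is deliberate.
          */
         mutex_lock_nested(&objs[1].lock, SINGLE_DEPTH_NESTING);

         mutex_unlock(&objs[1].lock);
         mutex_unlock(&objs[0].lock);
         return 0;
 }

 static void __exit demo_exit(void)
 {
 }

 module_init(demo_init);
 module_exit(demo_exit);
 MODULE_LICENSE("GPL");

cl_lock_lockdep_acquire() in the trace appears to be Lustre's own lockdep wrapper for cl_locks; whether the glimpse path needs an extra nesting annotation of this kind, or whether lockdep is simply raising a false alarm, is essentially what this report is asking.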

This is the test that was running at the time (while debugging a cgroup-related problem):

 CGROUP_DIR=$(lssubsys -m memory | cut -d' ' -f2)
 P=$CGROUP_DIR/test

 move_current_to_cgroup() {
        echo > $1/tasks
 }

 clean_up_all() {
        move_current_to_cgroup $CGROUP_DIR
        rm ./tmpfile
        rmdir $CGROUP_DIR/test/A
        rmdir $CGROUP_DIR/test
        exit 1
 }

 trap clean_up_all INT

 mkdir $P
 echo > $P/tasks

 while sleep 1; do
        date
        T=$P/A       
        mkdir $T
        move_current_to_cgroup $T
        echo 300M > $T/memory.limit_in_bytes
        cat /proc/self/cgroup
        dd if=/dev/zero of=./tmpfile bs=4096 count=100000
        move_current_to_cgroup $P
        cat /proc/self/cgroup
        echo 0 > $T/memory.force_empty
        rmdir $T
        rm ./tmpfile
 done


 Comments   
Comment by Jinshan Xiong (Inactive) [ 01/Nov/11 ]

Can you please apply the patch set at http://review.whamcloud.com/#change,1281,patchset=3 and try again? Thanks.

Comment by Ned Bass [ 02/Nov/11 ]

Hi Jinshan,

I'm testing with the patch set now. So far I haven't been able to reproduce the exact warning shown above, with or without the patch. However, I have seen the following similar warning at mount time, both with and without the patch. (I've also encountered a circular locking warning that should probably be tracked in a separate issue.)

=============================================
[ INFO: possible recursive locking detected ]
2.6.32-207.1chaos.ch5.x86_64.debug #1
---------------------------------------------
ll_cfg_requeue/3038 is trying to acquire lock:
 (&cli->cl_sem){.+.+.+}, at: [<ffffffffa068120e>] sptlrpc_conf_client_adapt+0x6e/0x200 [ptlrpc]

but task is already holding lock:
 (&cli->cl_sem){.+.+.+}, at: [<ffffffffa08750d9>] do_requeue+0x39/0x190 [mgc]

other info that might help us debug this:
2 locks held by ll_cfg_requeue/3038:
 #0:  (&cli->cl_sem){.+.+.+}, at: [<ffffffffa08750d9>] do_requeue+0x39/0x190 [mgc]
 #1:  (&cld->cld_lock){+.+.+.}, at: [<ffffffffa08734bd>] mgc_process_log+0x5d/0x1430 [mgc]

stack backtrace:
Pid: 3038, comm: ll_cfg_requeue Tainted: G        W  ----------------   2.6.32-207.1chaos.ch5.x86_64.debug #1
Call Trace:
 [<ffffffff810af570>] ? __lock_acquire+0x11c0/0x1570
 [<ffffffff81012c29>] ? sched_clock+0x9/0x10
 [<ffffffff8109d245>] ? sched_clock_local+0x25/0x90
 [<ffffffff8109d368>] ? sched_clock_cpu+0xb8/0x110
 [<ffffffff810af9c4>] ? lock_acquire+0xa4/0x120
 [<ffffffffa068120e>] ? sptlrpc_conf_client_adapt+0x6e/0x200 [ptlrpc]
 [<ffffffff81038458>] ? pvclock_clocksource_read+0x58/0xd0
 [<ffffffff8151e4c1>] ? down_read+0x51/0xa0
 [<ffffffffa068120e>] ? sptlrpc_conf_client_adapt+0x6e/0x200 [ptlrpc]
 [<ffffffff81012c29>] ? sched_clock+0x9/0x10
 [<ffffffffa068120e>] ? sptlrpc_conf_client_adapt+0x6e/0x200 [ptlrpc]
 [<ffffffff8109d368>] ? sched_clock_cpu+0xb8/0x110
 [<ffffffffa07ab7cc>] ? mdc_set_info_async+0x67c/0x9d0 [mdc]
 [<ffffffff810ad39d>] ? lock_release_holdtime+0x3d/0x190
 [<ffffffffa0523272>] ? class_notify_sptlrpc_conf+0x212/0x540 [obdclass]
 [<ffffffffa08740f7>] ? mgc_process_log+0xc97/0x1430 [mgc]
 [<ffffffffa086fae0>] ? mgc_blocking_ast+0x0/0x4b0 [mgc]
 [<ffffffffa0620e20>] ? ldlm_completion_ast+0x0/0x740 [ptlrpc]
 [<ffffffffa0875106>] ? do_requeue+0x66/0x190 [mgc]
 [<ffffffffa0875438>] ? mgc_requeue_thread+0x208/0x640 [mgc]
 [<ffffffff810adc9d>] ? trace_hardirqs_on_caller+0x14d/0x190
 [<ffffffff81062370>] ? default_wake_function+0x0/0x20
 [<ffffffffa0875230>] ? mgc_requeue_thread+0x0/0x640 [mgc]
 [<ffffffff810adc9d>] ? trace_hardirqs_on_caller+0x14d/0x190
 [<ffffffffa0875230>] ? mgc_requeue_thread+0x0/0x640 [mgc]
 [<ffffffff8100c20a>] ? child_rip+0xa/0x20
 [<ffffffff8100bb50>] ? restore_args+0x0/0x30
 [<ffffffffa0875230>] ? mgc_requeue_thread+0x0/0x640 [mgc]
 [<ffffffff8100c200>] ? child_rip+0x0/0x20
Comment by Jinshan Xiong (Inactive) [ 07/Nov/11 ]

Hi Ned,

It seems all right, because the two cl_sems being acquired belong to different obds: the first is the MGC's and the second is the MDC's. This may not be ideal, but let's ignore it unless we find another path that grabs the cl_sems in the reverse order.
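
If that nesting (taking an MDC's cl_sem while the MGC's cl_sem is held) is considered legitimate, the conventional way to keep lockdep quiet is to annotate the inner rw_semaphore acquisition, roughly as in the sketch below. This is only an illustration of the down_read_nested() primitive, not the actual Lustre code path; the struct layout and the helper name are hypothetical.

 /*
  * Sketch only -- not the real Lustre change.  The caller is assumed to
  * already hold the MGC's cl_sem; taking the MDC's cl_sem (same lockdep
  * class, different instance) with down_read_nested() marks the nesting
  * as intentional so lockdep does not report it as recursion.
  */
 #include <linux/rwsem.h>
 #include <linux/lockdep.h>

 struct client_obd {
         struct rw_semaphore cl_sem;     /* field name follows the trace */
 };

 /* Hypothetical helper, called with the MGC's cl_sem already held. */
 static void notify_mdc_sptlrpc_conf(struct client_obd *mdc_cli)
 {
         down_read_nested(&mdc_cli->cl_sem, SINGLE_DEPTH_NESTING);

         /* ... adapt the MDC to the new sptlrpc configuration ... */

         up_read(&mdc_cli->cl_sem);
 }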

Comment by Peter Jones [ 04/Jun/12 ]

Does not seem to be a bug. Will reopen if this occurs again and causes a problem in production.
