
[LU-6409] sleeping while atomic in nodemap_destroy

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.8.0
    • Affects Version/s: None
    • Severity: 3

    Description

      nodemap_destroy appears to take mutexes while being called under a spinlock/rwlock (a minimal sketch of the pattern follows the trace below).

      Here's the relevant error message:

      [ 6003.169249] Lustre: server umount lustre-OST0001 complete
      [ 6007.190173] BUG: sleeping function called from invalid context at kernel/mutex.c:576
      [ 6007.190482] in_atomic(): 1, irqs_disabled(): 0, pid: 20641, name: rmmod
      [ 6007.190659] 1 lock held by rmmod/20641:
      [ 6007.190805]  #0:  (&new_bkts[i]->hsb_lock.rw){......}, at: [<ffffffffa015151d>] cfs_hash_rw_lock+0xd/0x20 [libcfs]
      [ 6007.191177] CPU: 3 PID: 20641 Comm: rmmod Tainted: GF       W  O--------------   3.10.0-debug #5
      [ 6007.191480] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [ 6007.191648]  ffff880083aac540 000000008428f159 ffff8800967e7d50 ffffffff816ccb68
      [ 6007.191954]  ffff8800967e7d68 ffffffff810a8fd9 ffff8800a8d2b0c0 ffff8800967e7de8
      [ 6007.192275]  ffffffff816d0dd5 0000000000000000 ffff8800967e7de0 0000000000000246
      [ 6007.192582] Call Trace:
      [ 6007.192716]  [<ffffffff816ccb68>] dump_stack+0x19/0x1b
      [ 6007.192879]  [<ffffffff810a8fd9>] __might_sleep+0xe9/0x110
      [ 6007.193063]  [<ffffffff816d0dd5>] mutex_lock_nested+0x55/0x490
      [ 6007.193281]  [<ffffffffa057cfa4>] ? nodemap_destroy+0x74/0x1f0 [ptlrpc]
      [ 6007.193491]  [<ffffffffa05812ab>] nm_member_reclassify_nodemap+0x1b/0x40 [ptlrpc]
      [ 6007.193799]  [<ffffffffa057cfbc>] nodemap_destroy+0x8c/0x1f0 [ptlrpc]
      [ 6007.193993]  [<ffffffffa057d14d>] nodemap_putref+0x2d/0xa0 [ptlrpc]
      [ 6007.194201]  [<ffffffffa057e1f5>] nodemap_hs_put_locked+0x15/0x20 [ptlrpc]
      [ 6007.194392]  [<ffffffffa0151f25>] cfs_hash_bd_del_locked+0x75/0xf0 [libcfs]
      [ 6007.194578]  [<ffffffffa015579c>] cfs_hash_putref+0x1cc/0x500 [libcfs]
      [ 6007.194771]  [<ffffffffa057e22a>] nodemap_cleanup_all+0x2a/0x30 [ptlrpc]
      [ 6007.194967]  [<ffffffffa057e23e>] nodemap_mod_exit+0xe/0x20 [ptlrpc]
      [ 6007.195177]  [<ffffffffa0583bbd>] ptlrpc_exit+0x9/0x44c [ptlrpc]
      [ 6007.195354]  [<ffffffff810e769b>] SyS_delete_module+0x16b/0x2d0
      [ 6007.195526]  [<ffffffff816df369>] system_call_fastpath+0x16/0x1b
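
      For context, here is a minimal sketch of the anti-pattern the splat reports (hypothetical names, not the actual Lustre code): mutex_lock() may sleep, so taking it while a spinlock is held runs a sleeping function in atomic context, which CONFIG_DEBUG_ATOMIC_SLEEP turns into the "BUG: sleeping function called from invalid context" message.

      #include <linux/spinlock.h>
      #include <linux/mutex.h>

      static DEFINE_SPINLOCK(table_lock);   /* stands in for the cfs_hash bucket lock */
      static DEFINE_MUTEX(member_mutex);    /* stands in for the nodemap member mutex */

      static void put_entry_locked(void)
      {
              /* called with table_lock held, i.e. in atomic context */
              mutex_lock(&member_mutex);     /* BAD: may sleep under a spinlock */
              /* ... reclassify/destroy members ... */
              mutex_unlock(&member_mutex);
      }

      static void cleanup_all(void)
      {
              spin_lock(&table_lock);
              put_entry_locked();            /* triggers the __might_sleep() splat */
              spin_unlock(&table_lock);
      }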
      

    Activity
            pjones Peter Jones added a comment -

            Landed for 2.8


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14275/
            Subject: LU-6409 nodemap: don't destroy nodemaps in cfs_hash_for_each
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 6299d78713ef2b6f69f6dd046256aa91a85147a8

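            The patch subject points at the usual remedy for this class of bug: don't run sleeping destructors from inside the locked hash iteration. A hedged sketch of that pattern, using a plain kernel list as a stand-in for cfs_hash (all names here are illustrative, not the actual fix):

            #include <linux/list.h>
            #include <linux/spinlock.h>

            struct nodemap_stub {                      /* hypothetical stand-in */
                    struct list_head nm_list;
            };

            static void destroy_stub(struct nodemap_stub *nm); /* may sleep */

            static void cleanup_all_deferred(spinlock_t *lock, struct list_head *table)
            {
                    LIST_HEAD(zombies);
                    struct nodemap_stub *nm, *tmp;

                    spin_lock(lock);
                    list_splice_init(table, &zombies); /* unhash under the lock... */
                    spin_unlock(lock);

                    /* ...but run the sleeping destructor only after dropping it */
                    list_for_each_entry_safe(nm, tmp, &zombies, nm_list) {
                            list_del(&nm->nm_list);
                            destroy_stub(nm);
                    }
            }
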
            green Oleg Drokin added a comment -

            ah, I see it's likely a bug in http://review.whamcloud.com/#/c/11813/8

            green Oleg Drokin added a comment -

            I tried the three patches from LU-5092 plus the one patch from this ticket and hit a deadlock when running sanity 133g:

            [15921.648643] LustreError: 7937:0:(lproc_fid.c:182:lprocfs_server_fid_width_seq_write()) Skipped 17 previous similar messages
            [16080.884140] INFO: task badarea_io:7981 blocked for more than 120 seconds.
            [16080.884666]       Tainted: P           ---------------    2.6.32-rhe6.6-debug #1
            [16080.885381] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
            [16080.886109] badarea_io    D 0000000000000002  2792  7981   7003 0x00000000
            [16080.886582]  ffff880046b5fd78 0000000000000086 0000000000000000 ffffffff8114e4a0
            [16080.887360]  0000000000000000 ffffffff8104e658 00000e7b1ce2fe1b ffff880046b5fd58
            [16080.888121]  ffffffffa1395a20 00000001003b97ab ffff88005f2708f8 ffff880046b5ffd8
            [16080.888904] Call Trace:
            [16080.889234]  [<ffffffff8114e4a0>] ? handle_pte_fault+0x90/0xa50
            [16080.889663]  [<ffffffff8104e658>] ? __change_page_attr_set_clr+0x808/0xcc0
            [16080.890154]  [<ffffffffa1395a20>] ? lprocfs_single_release+0x0/0x10 [obdclass]
            [16080.890876]  [<ffffffff8151ff50>] __mutex_lock_slowpath+0x120/0x2e0
            [16080.891364]  [<ffffffffa166d8d0>] ? nodemap_del_nodemap_write+0x0/0xc0 [ptlrpc]
            [16080.892098]  [<ffffffff81520141>] mutex_lock+0x31/0x50
            [16080.892598]  [<ffffffffa16699bc>] nodemap_del+0x3c/0x1c0 [ptlrpc]
            [16080.893118]  [<ffffffffa166d8d0>] ? nodemap_del_nodemap_write+0x0/0xc0 [ptlrpc]
            [16080.896192]  [<ffffffffa166d975>] nodemap_del_nodemap_write+0xa5/0xc0 [ptlrpc]
            [16080.896810]  [<ffffffff811fb8c5>] proc_reg_write+0x85/0xc0
            [16080.897094]  [<ffffffff8118f4a8>] vfs_write+0xb8/0x1a0
            [16080.897388]  [<ffffffff8118fde1>] sys_write+0x51/0x90
            [16080.897719]  [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
            
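            The hung task above is consistent with a lock-ordering (ABBA) inversion: two paths take the same pair of locks in opposite orders. A hedged illustration with standalone names (not the actual Lustre locks):

            #include <linux/mutex.h>

            static DEFINE_MUTEX(lock_a);        /* e.g. a proc-write side mutex */
            static DEFINE_MUTEX(lock_b);        /* e.g. a nodemap-internal mutex */

            static void thread_one(void)
            {
                    mutex_lock(&lock_a);
                    mutex_lock(&lock_b);        /* order: A then B */
                    mutex_unlock(&lock_b);
                    mutex_unlock(&lock_a);
            }

            static void thread_two(void)
            {
                    mutex_lock(&lock_b);
                    mutex_lock(&lock_a);        /* order: B then A; if thread_one()
                                                 * already holds A and is waiting on B,
                                                 * both threads block forever */
                    mutex_unlock(&lock_a);
                    mutex_unlock(&lock_b);
            }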

            kit.westneat Kit Westneat (Inactive) added a comment -

            Ok, I reported the potential lock inversion as LU-6700.

            I updated the nodemap patches and no longer get the warning messages.

            green Oleg Drokin added a comment -

            Sure, feel free to report the other ones you see too.


            kit.westneat Kit Westneat (Inactive) added a comment -

            I finally got the debug kernel working and reproduced the warning you saw. Should I report any other non-nodemap warnings I find? There is a lock inversion warning with s_dquot.dqio_mutex and i_data_sem that I couldn't find any Jira tickets about, but I'm only getting it during mkfs so far.

            green Oleg Drokin added a comment -

            You need to use rhel7 for this; there's a new kernel option you need to enable:

            CONFIG_DEBUG_ATOMIC_SLEEP=y
            

            Turn this on and run something like sanity, you'll get a bunch of warnings. Also apply patch from LU-6447 (I know it's imperfect) to quiet down unrelated warnings.

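            As a hedged aside, a throwaway module like the one below (hypothetical, not part of Lustre) can confirm the option is active in a rebuilt kernel: loading it should immediately produce the same "BUG: sleeping function called from invalid context" splat. Don't load it on a production machine.

            #include <linux/module.h>
            #include <linux/spinlock.h>
            #include <linux/delay.h>

            static DEFINE_SPINLOCK(test_lock);

            static int __init sleep_test_init(void)
            {
                    spin_lock(&test_lock);
                    msleep(1);               /* sleeping call under a spinlock -> splat */
                    spin_unlock(&test_lock);
                    return 0;
            }

            static void __exit sleep_test_exit(void)
            {
            }

            module_init(sleep_test_init);
            module_exit(sleep_test_exit);
            MODULE_LICENSE("GPL");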

            kit.westneat Kit Westneat (Inactive) added a comment -

            Ah ok, I see what's going on, I'll update the patches I have in gerrit.

            Can you describe what tests/config options you are running to get these warnings? I am not getting them on my build and I don't want to waste your time if I can catch them before I submit.

            green Oleg Drokin added a comment - - edited

            Hm, so I have several patches from you:
            93ccc4a LU-5092 nodemap: add structure to hold nodemap config
            e4d2197 LU-5092 nodemap: convert member hash to a list
            3f0bb87 LU-6409 nodemap: don't destroy nodemaps in cfs_hash_for_each

            And this is not enough.

            I still get this in the logs:

            BUG: sleeping function called from invalid context at kernel/mutex.c:576
            May 30 22:19:09 centos6-0 kernel: in_atomic(): 1, irqs_disabled(): 0, pid: 9196, name: mdt00_002
            May 30 22:19:09 centos6-0 kernel: 2 locks held by mdt00_002/9196:
            May 30 22:19:09 centos6-0 kernel: #0:  (active_config_lock){......}, at: [<ffffffffa05cb147>] nodemap_lock_active_ranges+0x17/0x30 [ptlrpc]
            May 30 22:19:09 centos6-0 kernel: #1:  (&config->nmc_range_tree_lock){......}, at: [<ffffffffa05cb157>] nodemap_lock_active_ranges+0x27/0x30 [ptlrpc]
            PID: 9196 Comm: mdt00_002 Tainted: GF       W  O--------------   3.10.0-debug #
            Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
            ffff880019366680 00000000b6a08141 ffff88003ac9b9a0 ffffffff816ccb68
            ffff88003ac9b9b8 ffffffff810a8fd9 ffff8800aa500e00 ffff88003ac9ba40
            ffffffff816d0dd5 0000000000000000 0000000000000002 0000000000000000
            Call Trace:
            [<ffffffff816ccb68>] dump_stack+0x19/0x1b
            [<ffffffff810a8fd9>] __might_sleep+0xe9/0x110
            [<ffffffff816d0dd5>] mutex_lock_nested+0x55/0x490
            [<ffffffffa05cb157>] ? nodemap_lock_active_ranges+0x27/0x30 [ptlrpc]
            [<ffffffffa05cfbe8>] nm_member_add+0x78/0x180 [ptlrpc]
            [<ffffffffa05cc80d>] nodemap_add_member+0x2d/0x50 [ptlrpc]
            [<ffffffffa0b147f3>] mdt_obd_connect+0x3b3/0x760 [mdt]
            [<ffffffffa01c4a87>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
            [<ffffffffa051ded3>] target_handle_connect+0x12c3/0x2be0 [ptlrpc]
            [<ffffffff81363263>] ? number.isra.2+0x323/0x360
            [<ffffffffa02dbcbd>] ? keys_fill+0x1d/0x130 [obdclass]
            [<ffffffffa02dbcd2>] ? keys_fill+0x32/0x130 [obdclass]
            [<ffffffffa05b445f>] tgt_request_handle+0x3ef/0x1070 [ptlrpc]
            [<ffffffffa0565c01>] ptlrpc_server_handle_request+0x231/0xab0 [ptlrpc]
            [<ffffffffa0563768>] ? ptlrpc_wait_event+0xb8/0x370 [ptlrpc]
            [<ffffffffa0569b70>] ptlrpc_main+0xae0/0x1ee0 [ptlrpc]
            [<ffffffffa0569090>] ? ptlrpc_register_service+0xf30/0xf30 [ptlrpc]
            [<ffffffff8109c00a>] kthread+0xea/0xf0
            [<ffffffff8109bf20>] ? kthread_create_on_node+0x140/0x140
            [<ffffffff816df2bc>] ret_from_fork+0x7c/0xb0
            [<ffffffff8109bf20>] ? kthread_create_on_node+0x140/0x140

            seemingly from every target.

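            The remaining splat shows nm_member_add() taking a mutex while active_config_lock and nmc_range_tree_lock are held. When the sleeping work cannot simply be hoisted out of the locked section, one common alternative (a sketch under assumed names, not the eventual Lustre fix) is to defer it to process context via a workqueue:

            #include <linux/workqueue.h>
            #include <linux/slab.h>

            struct member_work {                  /* hypothetical deferred-add job */
                    struct work_struct mw_work;
                    /* ... whatever state the deferred add needs ... */
            };

            static void member_add_worker(struct work_struct *work)
            {
                    struct member_work *mw =
                            container_of(work, struct member_work, mw_work);

                    /* process context: taking mutexes here is safe */
                    kfree(mw);
            }

            static int member_add_atomic(void)
            {
                    struct member_work *mw = kzalloc(sizeof(*mw), GFP_ATOMIC);

                    if (!mw)
                            return -ENOMEM;
                    INIT_WORK(&mw->mw_work, member_add_worker);
                    schedule_work(&mw->mw_work);  /* callable under a spinlock */
                    return 0;
            }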

            People

              Assignee: wc-triage WC Triage
              Reporter: green Oleg Drokin
