[LU-13088] sleeping function in target_recovery_overseer Created: 19/Dec/19  Updated: 09/Dec/20  Resolved: 10/Jan/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.14.0, Lustre 2.12.6

Type: Bug
Priority: Minor
Reporter: Neil Brown
Assignee: Neil Brown
Resolution: Fixed
Votes: 0
Labels: patch

Severity: 3

 Description   

Testing sometimes triggers the kernel "BUG" report below.

The problem was introduced in commit b32e55b600ca ("LU-7638 recovery: do not abort update recovery."), which waits while a spinlock is held. It is easily fixed by dropping the spinlock while waiting.

[ 1705.879495] BUG: sleeping function called from invalid context at /home/green/git/lustre-release/lustre/ptlrpc/../../lustre/ldlm/ldlm_lib.c:2124
[ 1705.885402] in_atomic(): 1, irqs_disabled(): 0, pid: 6663, name: tgt_recover_0
[ 1705.886874] CPU: 3 PID: 6663 Comm: tgt_recover_0 Kdump: loaded Tainted: P OE ------------ 3.10.0-7.7-debug #1
[ 1705.889723] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 1705.890830] Call Trace:
[ 1705.891378] [<ffffffff817d1711>] dump_stack+0x19/0x1b
[ 1705.892744] [<ffffffff810c71f9>] __might_sleep+0xd9/0x100
[ 1705.894273] [<ffffffffa061af13>] target_recovery_overseer+0x4a3/0x6d0 [ptlrpc]
[ 1705.895782] [<ffffffffa0618140>] ? libcfs_nid2str+0x20/0x20 [ptlrpc]
[ 1705.897142] [<ffffffffa01efce7>] ? __cfs_fail_timeout_set+0x1a7/0x220 [libcfs]
[ 1705.899177] [<ffffffffa0623172>] replay_request_or_update.isra.22+0xf2/0x8c0 [ptlrpc]
[ 1705.900844] [<ffffffffa0623940>] ? replay_request_or_update.isra.22+0x8c0/0x8c0 [ptlrpc]
[ 1705.903035] [<ffffffffa0623fa5>] target_recovery_thread+0x665/0x10c0 [ptlrpc]
[ 1705.904882] [<ffffffffa0623940>] ? replay_request_or_update.isra.22+0x8c0/0x8c0 [ptlrpc]
[ 1705.906752] [<ffffffff810b8254>] kthread+0xe4/0xf0
[ 1705.907788] [<ffffffff810b8170>] ? kthread_create_on_node+0x140/0x140
[ 1705.908970] [<ffffffff817e5ddd>] ret_from_fork_nospec_begin+0x7/0x21
[ 1705.910492] [<ffffffff810b8170>] ? kthread_create_on_node+0x140/0x140
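
The fix described above (dropping the spinlock while waiting) can be illustrated with a minimal userspace sketch. This is not the Lustre patch: every name here (`recovery_state`, `rs_lock`, `wait_for_event`, `overseer_buggy`, `overseer_fixed`) is hypothetical, a pthread mutex stands in for the kernel spinlock, and a simple flag stands in for the `in_atomic()` check that the kernel's `__might_sleep()` performs. The demo is single-threaded, so each "wait" is assumed to complete one pending event.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names come from Lustre. */
struct recovery_state {
	pthread_mutex_t lock;   /* stand-in for the kernel spinlock */
	bool atomic;            /* true while the "spinlock" is held */
	int pending;            /* events the overseer still waits for */
	int atomic_sleeps;      /* times we "slept" with the lock held */
};

static void rs_init(struct recovery_state *rs, int pending)
{
	pthread_mutex_init(&rs->lock, NULL);
	rs->atomic = false;
	rs->pending = pending;
	rs->atomic_sleeps = 0;
}

static void rs_lock(struct recovery_state *rs)
{
	pthread_mutex_lock(&rs->lock);
	rs->atomic = true;
}

static void rs_unlock(struct recovery_state *rs)
{
	assert(rs->atomic);     /* unlocking a lock we do not hold is a bug */
	rs->atomic = false;
	pthread_mutex_unlock(&rs->lock);
}

/* Models a function that may sleep. In the kernel, calling such a
 * function in atomic context is what produces the BUG report. */
static void wait_for_event(struct recovery_state *rs)
{
	if (rs->atomic)
		rs->atomic_sleeps++;    /* would be reported as a BUG */
	rs->pending--;                  /* demo only: one event arrives */
}

/* Buggy shape: waits while still holding the lock. */
static int overseer_buggy(struct recovery_state *rs)
{
	rs_lock(rs);
	while (rs->pending > 0)
		wait_for_event(rs);
	rs_unlock(rs);
	return rs->atomic_sleeps;
}

/* Fixed shape: drop the lock around the wait, retake it to recheck. */
static int overseer_fixed(struct recovery_state *rs)
{
	rs_lock(rs);
	while (rs->pending > 0) {
		rs_unlock(rs);
		wait_for_event(rs);
		rs_lock(rs);
	}
	rs_unlock(rs);
	return rs->atomic_sleeps;
}
```

In this model, `overseer_buggy()` counts one violation per wait, while `overseer_fixed()` counts none because the condition is only checked under the lock and the wait itself runs unlocked. Retaking the lock before rechecking the condition keeps the protected state consistent, which is why the loop relocks rather than simply unlocking once.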



 Comments   
Comment by Gerrit Updater [ 19/Dec/19 ]

Neil Brown (neilb@suse.de) uploaded a new patch: https://review.whamcloud.com/37063
Subject: LU-13088 ldlm: Fix sleeping function called in atomic
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: b554aa54f75336f86010ca662a2785376ebfad7b

Comment by Gerrit Updater [ 10/Jan/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37063/
Subject: LU-13088 ldlm: Fix sleeping function called in atomic
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: b29b9310dafe17ba78e1db490b79b89d2d6fdcd1

Comment by Peter Jones [ 10/Jan/20 ]

Landed for 2.14

Comment by Gerrit Updater [ 06/Jul/20 ]

Li Dongyang (dongyangli@ddn.com) uploaded a new patch: https://review.whamcloud.com/39283
Subject: LU-13088 ldlm: Fix sleeping function called in atomic
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: 53f8e7492fe20b76760af813fb43e03158a99d2f

Comment by Gerrit Updater [ 07/Aug/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39283/
Subject: LU-13088 ldlm: Fix sleeping function called in atomic
Project: fs/lustre-release
Branch: b2_12
Current Patch Set:
Commit: 8506d320af48aa16dab1c60d1ff134af17040ffc
