[LU-16734] kernel warning in key_task_permission() leading to stuck resources Created: 13/Apr/23  Updated: 30/May/23  Resolved: 22/Apr/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.16.0

Type: Bug Priority: Minor
Reporter: Aurelien Degremont Assignee: Aurelien Degremont
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

On Ubuntu 22.04, when some keyring resources are cleared, a kernel warning is emitted instead and the clearing fails. This leads to a misbehaving filesystem and to stack traces being printed regularly in the logs.

WARNING: CPU: 44 PID: 305468 at security/keys/permission.c:35 key_task_permission+0xa5/0x150
CPU: 44 PID: 305468 Comm: kworker/u448:1 Tainted: P        W  OE     5.15.0-69-generic #76-Ubuntu
Workqueue: ptlrpc_pinger ptlrpc_pinger_main [ptlrpc]
RIP: 0010:key_task_permission+0xa5/0x150
Call Trace:
 <TASK>
 lookup_user_key+0xf4/0x700
 ? key_validate+0x50/0x50
 request_key_unlink+0x230/0x330 [ptlrpc_gss]
 gss_sec_lookup_ctx_kr+0xa0c/0xd0c [ptlrpc_gss]
 get_my_ctx+0x5f/0x140 [ptlrpc]
 sptlrpc_req_get_ctx+0x15a/0x280 [ptlrpc]
 ptlrpc_request_bufs_pack+0x283/0x6a0 [ptlrpc]
 ptlrpc_request_alloc_pack+0x3a/0x70 [ptlrpc]
 ptlrpc_pinger_main+0x893/0xab0 [ptlrpc]
 process_one_work+0x228/0x3d0
 worker_thread+0x53/0x420
 ? process_one_work+0x3d0/0x3d0
 kthread+0x127/0x150
 ? set_kthread_struct+0x50/0x50
 ret_from_fork+0x1f/0x30
 </TASK>

This is because Linux 5.8 commit 8c0637e950d68933a67f7438f779d79b049b5e5c changed the lookup_user_key() API, which now requires different parameters.

 Comments   
Comment by Gerrit Updater [ 13/Apr/23 ]

"Aurelien Degremont <adegremont@nvidia.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50623
Subject: LU-16734 gss: fix lookup_user_key() bug
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 4e98d42e68191b8078779432ec2870baa0477431

Comment by Andreas Dilger [ 13/Apr/23 ]

API changes that don't break the compilation are evil, and this is really a bug in the original patch, but hindsight is 20/20. 

Comment by Aurelien Degremont [ 13/Apr/23 ]

>  API changes that don't break the compilation are evil, and this is really a bug in the original patch, but hindsight is 20/20.

Actually, it would have broken compilation, but we have

/* from Linux security/keys/internal.h: */
#ifndef KEY_LOOKUP_FOR_UNLINK
#define KEY_LOOKUP_FOR_UNLINK           0x04
#endif

in the code, which masked it. We have that because using the kernel's own define would have required us to include an internal header.
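
To illustrate the masking effect, here is a minimal userspace sketch, not the actual kernel or Lustre code. Everything prefixed with model_ or MODEL_ is a hypothetical stand-in. The sketch assumes the post-5.8 calling convention is lookup flags followed by a permission enum, and that an unrecognised permission value produces a warning and -EACCES. Only the KEY_LOOKUP_FOR_UNLINK fallback is taken from the code quoted above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of a permission enum; the values are illustrative only. */
enum model_key_need_perm {
	MODEL_KEY_NEED_UNSPECIFIED = 0,
	MODEL_KEY_NEED_UNLINK = 1,
};

/* Hypothetical stand-in for a lookup flag that still exists after the change. */
#define MODEL_KEY_LOOKUP_PARTIAL	0x02

/* Fallback define as quoted above: it keeps old-style callers compiling
 * even when the header no longer provides the macro. */
#ifndef KEY_LOOKUP_FOR_UNLINK
#define KEY_LOOKUP_FOR_UNLINK		0x04
#endif

/* Counts how often the modelled permission check "warns". */
static int model_warnings;

/* Assumed behaviour: an unknown permission request warns and is refused. */
static int model_lookup_user_key(int id, unsigned long flags,
				 enum model_key_need_perm need)
{
	(void)id;
	(void)flags;

	switch (need) {
	case MODEL_KEY_NEED_UNLINK:
		return 0;
	default:
		model_warnings++;
		return -EACCES;
	}
}

/* Old-style call: still compiles thanks to the fallback define, but the
 * literal 0 is now read as "no permission specified". */
static int model_unlink_old_style(int id)
{
	return model_lookup_user_key(id, KEY_LOOKUP_FOR_UNLINK, 0);
}

/* New-style call under the assumed convention: flags plus an explicit need. */
static int model_unlink_new_style(int id)
{
	return model_lookup_user_key(id, MODEL_KEY_LOOKUP_PARTIAL,
				     MODEL_KEY_NEED_UNLINK);
}
```

In this model the old-style call compiles without any diagnostic, yet fails at run time on every attempt. That matches the symptom in the description, a repeated warning and keys that never get unlinked. Without the fallback define, the missing macro would have been a build error instead.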

Comment by Gerrit Updater [ 22/Apr/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50623/
Subject: LU-16734 gss: fix lookup_user_key() bug
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 013a6711503045b9e7154b8ff786ee85cdc3ecdd

Comment by Peter Jones [ 22/Apr/23 ]

Landed for 2.16
