[LU-16144] OST crash at umount in ptlrpc_nrs_req_stop_nolock (with TBF policy). Created: 08/Sep/22 Updated: 19/Jun/23 Resolved: 04/Oct/22 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.16.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Etienne Aujames | Assignee: | Etienne Aujames |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | tbf | ||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
OST calltrace:

[5839915.258394] BUG: unable to handle kernel NULL pointer dereference at 0000000000000114
[5839915.260256] IP: [<ffffffffc0d9e965>] ptlrpc_nrs_req_stop_nolock+0x5/0x150 [ptlrpc]
.....
[5839915.319008] [<ffffffffc0d6861b>] ? ptlrpc_server_finish_active_request+0x2b/0x140 [ptlrpc]
[5839915.320846] [<ffffffffc0d68867>] ptlrpc_service_purge_all+0x137/0x920 [ptlrpc]
[5839915.322159] [<ffffffffc0d6ac37>] ptlrpc_unregister_service+0xe7/0x6f0 [ptlrpc]
[5839915.323521] [<ffffffffc09090f2>] ost_cleanup+0x52/0x1b0 [ost]
[5839915.324585] [<ffffffffc0a4db2d>] class_free_dev+0x21d/0x720 [obdclass]
[5839915.325761] [<ffffffffc0a4e220>] class_export_put+0x1f0/0x2c0 [obdclass]
[5839915.327088] [<ffffffffc0a4fc95>] class_unlink_export+0x135/0x170 [obdclass]
[5839915.328496] [<ffffffffc0a659e0>] class_decref+0x80/0x160 [obdclass]
[5839915.329883] [<ffffffffc0a65e43>] class_detach+0x1b3/0x2e0 [obdclass]
[5839915.331131] [<ffffffffc0a6ca48>] class_process_config+0x1a38/0x2830 [obdclass]
[5839915.332602] [<ffffffffb08d3b0a>] ? complete+0x4a/0x60
[5839915.333756] [<ffffffffb0ba14fd>] ? list_del+0xd/0x30
[5839915.334904] [<ffffffffb0f814fe>] ? wait_for_completion+0x4e/0x140
[5839915.336336] [<ffffffffc0a6da20>] class_manual_cleanup+0x1e0/0x710 [obdclass]
[5839915.337972] [<ffffffffc0a99835>] server_stop_servers+0xd5/0x160 [obdclass]
[5839915.339302] [<ffffffffc0a9ef9d>] server_put_super+0x12d/0xd00 [obdclass]
[5839915.340450] [<ffffffffb0a4d53d>] generic_shutdown_super+0x6d/0x100
[5839915.341528] [<ffffffffb0a4d942>] kill_anon_super+0x12/0x20
[5839915.342542] [<ffffffffc0a70852>] lustre_kill_super+0x32/0x50 [obdclass]
[5839915.343693] [<ffffffffb0a4dd1e>] deactivate_locked_super+0x4e/0x70
[5839915.344791] [<ffffffffb0a4e4a6>] deactivate_super+0x46/0x60
[5839915.345863] [<ffffffffb0a6d03f>] cleanup_mnt+0x3f/0x80
[5839915.346952] [<ffffffffb0a6d0d2>] __cleanup_mnt+0x12/0x20
[5839915.347897] [<ffffffffb08c2e5b>] task_work_run+0xbb/0xe0
[5839915.348805] [<ffffffffb082cc65>] do_notify_resume+0xa5/0xc0
[5839915.349916] [<ffffffffb0f8e23b>] int_signal+0x12/0x17

ptlrpc_server_request_get() returns a NULL pointer in ptlrpc_service_purge_all(), and the result is passed on without a check:

ptlrpc_service_purge_all(struct ptlrpc_service *svc)
....
        while (ptlrpc_server_request_pending(svcpt, true)) {
                req = ptlrpc_server_request_get(svcpt, true);
                ptlrpc_server_finish_active_request(svcpt, req);
        }

It seems that nrs_tbf_req_get() does not implement force mode:

static struct ptlrpc_nrs_request *nrs_tbf_req_get(struct ptlrpc_nrs_policy *policy,
                                                  bool peek, bool force)
{
        struct nrs_tbf_head *head = policy->pol_private;
        struct ptlrpc_nrs_request *nrq = NULL;
        struct nrs_tbf_client *cli;
        struct binheap_node *node;

        assert_spin_locked(&policy->pol_nrs->nrs_svcpt->scp_req_lock);

        if (!peek && policy->pol_nrs->nrs_throttling)   <---------
                return NULL;
        .... |
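The interaction described above can be reproduced in a small user-space model. This is only a sketch: every `toy_*` name is hypothetical and stands in for the real Lustre structures, and the merged patch is not quoted in this ticket, so the `toy_req_get_forced()` variant is an assumption about one plausible way to honor `force`, not the actual fix. The model shows that a getter which ignores `force` while throttling returns NULL even though requests are still pending, which is the condition the purge loop does not expect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued request. */
struct toy_request {
	int id;
};

/* Hypothetical stand-in for the policy state. */
struct toy_policy {
	bool throttling;           /* plays the role of nrs_throttling */
	struct toy_request *queue; /* pending requests */
	size_t count;              /* number of pending requests */
};

/* Return the last pending request; remove it unless peeking. */
static struct toy_request *toy_take(struct toy_policy *p, bool peek)
{
	assert(p != NULL);
	if (p->count == 0)
		return NULL;
	return peek ? &p->queue[p->count - 1] : &p->queue[--p->count];
}

/* Mirrors the early return quoted in the ticket: 'force' is ignored. */
static struct toy_request *toy_req_get(struct toy_policy *p, bool peek,
				       bool force)
{
	(void)force;
	if (!peek && p->throttling)
		return NULL;
	return toy_take(p, peek);
}

/* Assumed variant: throttling is bypassed when 'force' is set. */
static struct toy_request *toy_req_get_forced(struct toy_policy *p, bool peek,
					      bool force)
{
	if (!peek && !force && p->throttling)
		return NULL;
	return toy_take(p, peek);
}

typedef struct toy_request *(*toy_get_fn)(struct toy_policy *, bool, bool);

/*
 * Model of the purge loop. Returns the number of drained requests, or -1
 * if the getter returned NULL while requests were still pending (the point
 * where the real code would pass NULL on and dereference it).
 */
static int toy_purge_all(struct toy_policy *p, toy_get_fn get)
{
	int drained = 0;

	while (p->count > 0) {
		struct toy_request *req = get(p, false, true);

		if (req == NULL)
			return -1;
		drained++;
	}
	return drained;
}
```

With `throttling` set and three queued requests, `toy_purge_all()` reports the NULL case for `toy_req_get()` and drains all three requests with `toy_req_get_forced()`.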
| Comments |
| Comment by Etienne Aujames [ 08/Sep/22 ] |
|
This issue could be linked to |
| Comment by Gerrit Updater [ 09/Sep/22 ] |
|
"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/48494 |
| Comment by Gerrit Updater [ 04/Oct/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/48494/ |
| Comment by Peter Jones [ 04/Oct/22 ] |
|
Landed for 2.16 |
| Comment by Gerrit Updater [ 19/Dec/22 ] |
|
"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49448 |
| Comment by Gerrit Updater [ 19/Jun/23 ] |
|
"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51363 |