Details
Type: Bug
Resolution: Duplicate
Priority: Major
Fix Version/s: None
Affects Version/s: Lustre 2.15.0
Labels: None
Environment: CentOS Linux release 8.5.2111
Kernel: 4.18.0-348.7.1.el8_5.x86_64
Severity: 3
Rank: 9223372036854775807
Description
Dear Devs,
Under heavy workload we are seeing a kernel crash caused by a page fault in lnet_process_id_hash():
<pre>
[ 520.767199] BUG: unable to handle kernel paging request at 00000000deadbf1f
[ 520.775831] PGD 0 P4D 0
[ 520.785037] Oops: 0000 [#1] SMP NOPTI
[ 520.785037] CPU: 10 PID: 492691 Comm: ll_ost00_016 Kdump: loaded Tainted: P OE --------- - - 4.18.0-348.7.1.el8_5.x86_64 #1
[ 520.800422] Hardware name: HPE ProLiant DL325 Gen10 Plus/ProLiant DL325 Gen10 Plus, BIOS A43 12/03/2021
[ 520.812168] RIP: 0010:lnet_process_id_hash+0x5/0x50 [ptlrpc]
[ 520.820123] Code: 7e 28 39 7a 0c 75 d4 8b 7e 2c 39 7a 10 75 cc 8b 46 30 39 42 14 0f 94 c0 0f b6 c0 8d 44 40 fd c3 0f 1f 44 00 00 0f 1f 44 00 00 <33> 57 14 be ff ff ff ff 69 ca 47 86 c8 61 48 85 ff 74 18 0f b6 47
[ 520.842104] RSP: 0018:ffffaa79b40c3be0 EFLAGS: 00010202
[ 520.848959] RAX: ffffffffc1a36690 RBX: 5a5a5a5a5a5a5a5a RCX: 00000000deadbeef
[ 520.857883] RDX: 000000000cdd1d51 RSI: 0000000000000001 RDI: 00000000deadbf0b
[ 520.866635] RBP: ffffaa79b40c3c70 R08: ffff8ac3fecaabf8 R09: 00000000000003e8
[ 520.875303] R10: 0000000000000000 R11: ffff8ac3feca8ec4 R12: ffff8a4e59d5c000
[ 520.884015] R13: ffffffffc1baa580 R14: fffffffffffffff0 R15: 00000000deadbeef
[ 520.893240] FS: 0000000000000000(0000) GS:ffff8ac3fec80000(0000) knlGS:0000000000000000
[ 520.903294] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 520.911071] CR2: 00000000deadbf1f CR3: 0000000c407c6000 CR4: 0000000000350ee0
[ 520.920020] Call Trace:
[ 520.924182] ptlrpc_connection_get+0x27f/0x920 [ptlrpc]
[ 520.931034] target_handle_connect+0x6de/0x29d0 [ptlrpc]
[ 520.937816] ? internal_add_timer+0x42/0x60
[ 520.943593] tgt_request_handle+0x565/0x1a40 [ptlrpc]
[ 520.950382] ? ptlrpc_nrs_req_get_nolock0+0xfb/0x1f0 [ptlrpc]
[ 520.957780] ptlrpc_server_handle_request+0x323/0xbd0 [ptlrpc]
[ 520.965373] ptlrpc_main+0xc06/0x1560 [ptlrpc]
[ 520.971430] ? __schedule+0x2c5/0x760
[ 520.976758] ? ptlrpc_wait_event+0x590/0x590 [ptlrpc]
[ 520.983264] kthread+0x116/0x130
[ 520.987811] ? kthread_flush_work_fn+0x10/0x10
[ 520.993636] ret_from_fork+0x22/0x40
</pre>
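For anyone triaging this, the register dump can be cross-checked with a little arithmetic. The sketch below is only a reading of the oops above, not a confirmed root cause: it decodes the faulting bytes `<33> 57 14` as an x86-64 `xor` with a memory operand at `0x14(%rdi)`, checks that RDI plus that displacement equals CR2, and notes that RDI sits a small offset above the 0xdeadbeef value also seen in RCX and R15. The variable and function names are illustrative and do not come from the Lustre source.

```python
# Register values copied verbatim from the oops above.
CR2 = 0x00000000DEADBF1F  # faulting address
RDI = 0x00000000DEADBF0B  # first argument to lnet_process_id_hash()
RCX = 0x00000000DEADBEEF  # same value also appears in R15

# Faulting instruction bytes from the "Code:" line: <33> 57 14
# Opcode 0x33 is XOR r32, r/m32; 0x57 is the ModRM byte; 0x14 is a disp8.
MODRM = 0x57
DISP8 = 0x14

mod = MODRM >> 6        # 1 -> memory operand with 8-bit displacement
reg = (MODRM >> 3) & 7  # 2 -> edx (destination register)
rm = MODRM & 7          # 7 -> rdi (base register)


def effective_address(base: int, disp: int) -> int:
    """Return base + disp truncated to 64 bits."""
    return (base + disp) & 0xFFFFFFFFFFFFFFFF


# The instruction reads 4 bytes at 0x14(%rdi); that address matches CR2.
fault = effective_address(RDI, DISP8)

# RDI itself is 0xdeadbeef plus a small offset, which suggests (as an
# assumption) a pointer computed from a poison or garbage value.
offset_from_deadbeef = RDI - RCX

print(f"mod={mod} reg={reg} rm={rm}")
print(f"RDI + 0x{DISP8:x} = 0x{fault:x} (CR2 = 0x{CR2:x})")
print(f"RDI - 0xdeadbeef = 0x{offset_from_deadbeef:x}")
```

If this reading is right, the pointer passed to lnet_process_id_hash() was derived from 0xdeadbeef rather than from a valid structure, and the 0x5a5a5a5a5a5a5a5a pattern in RBX would be consistent with nearby memory having been poisoned. Both points are inferences from the register values only.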