Details
- Bug
- Resolution: Fixed
- Critical
- Lustre 2.15.0
- None
- 3
- 9223372036854775807
Description
This looks like a regression introduced by the following commit:
commit 959304eac7ec5b156b4bfa57f47cbbf9ef3c8315
Author: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Date: Mon Feb 7 18:02:14 2022 +0300
    LU-15189 lnet: fix memory mapping.
[ 2699.061116] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
[ 2699.062873] IP: [<ffffffffc053c0eb>] lnet_find_best_ni_on_spec_net+0x6b/0x4d0 [lnet]
[ 2699.077584] RIP: 0010:[<ffffffffc053c0eb>] [<ffffffffc053c0eb>] lnet_find_best_ni_on_spec_net+0x6b/0x4d0 [lnet]
[ 2699.079020] RSP: 0018:ffff8acfd8bcfaa8 EFLAGS: 00010286
[ 2699.079979] RAX: ffff8acfe93fe000 RBX: ffff8acfea501c00 RCX: 0000000000000000
[ 2699.080961] RDX: 00000000000002be RSI: 0000000000000000 RDI: 0000000000000000
[ 2699.082054] RBP: ffff8acfd8bcfb60 R08: 000000000000000a R09: 000000000000fffe
[ 2699.082981] R10: 0000000000000000 R11: 000000000000000f R12: ffff8acff9812480
[ 2699.083784] R13: ffff8acfd88ba000 R14: 0000000000000000 R15: 0000000000000000
[ 2699.084448] FS: 0000000000000000(0000) GS:ffff8acfffc00000(0000) knlGS:0000000000000000
[ 2699.085469] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2699.086065] CR2: 0000000000000050 CR3: 000000007aa98000 CR4: 00000000000606f0
[ 2699.086875] Call Trace:
[ 2699.087557] [<ffffffffc053dfcf>] lnet_select_pathway+0x50f/0x18d0 [lnet]
[ 2699.088830] [<ffffffffc053f401>] lnet_send+0x71/0x200 [lnet]
[ 2699.089727] [<ffffffffc053008b>] lnet_finalize+0x51b/0x9f0 [lnet]
[ 2699.090606] [<ffffffffc05d0585>] ksocknal_process_receive+0x665/0xe30 [ksocklnd]
[ 2699.091929] [<ffffffffc05d11ca>] ksocknal_scheduler+0x1fa/0xd00 [ksocklnd]
[ 2699.092817] [<ffffffff898c7780>] ? wake_up_atomic_t+0x30/0x30
[ 2699.093869] [<ffffffffc05d0fd0>] ? ksocknal_recv+0x280/0x280 [ksocklnd]
[ 2699.094758] [<ffffffff898c6691>] kthread+0xd1/0xe0
[ 2699.095614] [<ffffffff898c65c0>] ? insert_kthread_work+0x40/0x40
[ 2699.096393] [<ffffffff89f92d37>] ret_from_fork_nospec_begin+0x21/0x21
[ 2699.097336] [<ffffffff898c65c0>] ? insert_kthread_work+0x40/0x40
crash_x86_64> lnet_msg.msg_md ffff8acfd88ba000
  msg_md = 0x0
crash_x86_64>
The oops occurs on this line:
bool gpu = md->md_flags & LNET_MD_FLAG_GPU;
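The oops was hit on an LNet router. The crash output shows msg_md is NULL for the message at ffff8acfd88ba000 (the value in R13), so md is NULL when md_flags is read; the faulting address 0x50 in CR2 is consistent with a small field offset from a NULL pointer.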
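The failure mode above can be illustrated with a minimal, self-contained C sketch. This is not the actual fix that landed for this ticket: the `demo_` types, the `DEMO_MD_FLAG_GPU` value, and the helper `demo_msg_is_gpu()` are hypothetical stand-ins for the real LNet definitions, chosen only to show a NULL check before the flags are read.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the LNet structures. The real definitions
 * live in the Lustre tree and contain many more fields. */
#define DEMO_MD_FLAG_GPU (1u << 5)	/* illustrative value only */

struct demo_libmd {
	unsigned int md_flags;
};

struct demo_msg {
	struct demo_libmd *msg_md;	/* was NULL in the crash dump */
};

/* Return true only when an MD is attached and carries the GPU flag.
 * A message without an MD is treated as non-GPU instead of being
 * dereferenced. */
bool demo_msg_is_gpu(const struct demo_msg *msg)
{
	const struct demo_libmd *md = msg->msg_md;

	if (md == NULL)
		return false;

	return (md->md_flags & DEMO_MD_FLAG_GPU) != 0;
}
```

With this shape, a message whose `msg_md` is NULL yields `false` rather than faulting, while a message with an attached MD is still classified by its flags.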
Issue Links
- is related to LU-16211 o2iblnd NULL md deref (Resolved)