Bug
Resolution: Fixed
Medium
3
cfs_cpt_set_node() returns a boolean (1 on success), but that return
value was left in rc after the for_each_node_mask loop that builds
the IO CPT table when cpu_npartitions=1.

In mds_start_ptlrpc_service(), the stale rc=1 makes the function
treat success as failure: all just-registered services are stopped
via mds_stop_ptlrpc_service(), and rc=1 propagates up through
mds_device_alloc() as ERR_PTR(1) = (void *)0x1. IS_ERR() only matches
the top MAX_ERRNO addresses (negative errnos), so it does not catch
small positive values. obd_setup() therefore treats 0x1 as a valid
struct lu_device pointer and faults on the store to ld_obd at
offset 0x28, i.e. address 0x29:

    BUG: unable to handle kernel NULL pointer dereference at 0x29
    RIP: obd_setup+0x238/0x470 [obdclass]

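The pointer arithmetic behind that fault address can be reproduced in user space. The sketch below is only an illustration: the ERR_PTR()/IS_ERR() macros are simplified stand-ins for the kernel helpers, struct fake_lu_device is a hypothetical layout in which only the 0x28 offset of ld_obd is taken from the description, and fault_address() is an invented helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095
#define ERR_PTR(err) ((void *)(intptr_t)(err))
#define IS_ERR(ptr)  ((uintptr_t)(ptr) >= (uintptr_t)-MAX_ERRNO)

/* Hypothetical layout: only the ld_obd offset (0x28) comes from the report. */
struct fake_lu_device {
	char  pad[0x28];  /* placeholder for the fields before ld_obd */
	void *ld_obd;     /* the field obd_setup() stores into */
};

/*
 * Address a store to dev->ld_obd would touch, computed without
 * dereferencing dev (so it is safe to call with a bogus pointer).
 */
uintptr_t fault_address(const void *dev)
{
	/* IS_ERR() lets small positive values through, as in the bug. */
	assert(!IS_ERR(dev));
	return (uintptr_t)dev + offsetof(struct fake_lu_device, ld_obd);
}
```

With these definitions, IS_ERR(ERR_PTR(-12)) is true while IS_ERR(ERR_PTR(1)) is false, and fault_address(ERR_PTR(1)) evaluates to 0x29, matching the CR2 value in the oops.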
The same pattern exists in oss_device_init(). The OSS success path
explicitly returns 0, which masks the bug, but reset rc there too to
prevent future regressions.

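The stale-rc pattern and the reset described above can be sketched as follows. This is not the Lustre code: set_node_sketch() is a hypothetical stand-in for cfs_cpt_set_node(), start_service_sketch() and its reset_rc and services_running parameters are invented for illustration, and returning -EINVAL when a node cannot be added is an assumption.

```c
#include <errno.h>

/* Hypothetical stand-in for cfs_cpt_set_node(): 1 on success, 0 on failure. */
static int set_node_sketch(int node)
{
	return node >= 0;
}

/*
 * Simplified service start. With reset_rc == 0 the boolean 1 left in rc by
 * the loop is later mistaken for an error; with reset_rc != 0 it is cleared.
 * *services_running reports whether the services survived the cleanup check.
 */
int start_service_sketch(const int *nodes, int count, int reset_rc,
			 int *services_running)
{
	int rc = 0;
	int i;

	for (i = 0; i < count; i++) {
		rc = set_node_sketch(nodes[i]);  /* boolean, not an errno */
		if (!rc)
			return -EINVAL;          /* assumed error handling */
	}

	if (reset_rc)
		rc = 0;  /* the fix: drop the stale boolean */

	*services_running = 1;  /* services registered successfully */

	if (rc)                 /* stale rc == 1 lands here */
		*services_running = 0;  /* cleanup path stops them again */

	return rc;
}
```

Called with two valid nodes and reset_rc == 0, the function returns 1 and leaves the services stopped; with reset_rc != 0 it returns 0 and the services stay up.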
[337261.506854] BUG: unable to handle kernel NULL pointer dereference at 0000000000000029
[337261.514776] PGD 130588067 P4D 130588067 PUD 14524e067 PMD 0
[337261.520519] Oops: 0002 [#1] SMP NOPTI
[337261.524271] CPU: 0 PID: 435181 Comm: lt-mount.lustre Kdump: loaded Tainted: G OE -------- - - 4.18.0-553.36.1.el8_10.x86_64 #1
[337261.537036] Hardware name: Viking Enterprise Solutions VSSEP1EA/VSSEP1EA, BIOS 10.09.02 10/26/2020
[337261.546076] RIP: 0010:obd_setup+0x238/0x470 [obdclass]
[337261.551464] Code: 85 c0 0f 84 81 00 00 00 4c 89 ea 48 89 ee 48 89 df e8 4c 2e b3 f2 49 89 c5 48 3d 00 f0 ff ff 0f 87 3d 01 00 00 49 89 44 24 10 <4c> 89 60 28 48 89 68 08 48 85 c0 0f 84 37 01 00 00 f6 45 00 02 0f
[337261.570293] RSP: 0018:ffffab642795f878 EFLAGS: 00010203
[337261.575598] RAX: 0000000000000001 RBX: ffffab642795f8a8 RCX: 00000000002a0009
[337261.582815] RDX: 00000000002a000a RSI: 00000000002a0009 RDI: ffff987400004e00
[337261.590026] RBP: ffffffffc1fb2da0 R08: 0000000000000001 R09: 0000000000000000
[337261.597244] R10: ffff9874344e4480 R11: ffff987567092500 R12: ffff98756702d9c0
[337261.604462] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
[337261.611672] FS: 00007f40c573da80(0000) GS:ffff9892ede00000(0000) knlGS:0000000000000000
[337261.619845] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[337261.625676] CR2: 0000000000000029 CR3: 0000000264c02000 CR4: 0000000000350ef0
[337261.632895] Call Trace:
[337261.635439] ? __die_body+0x1a/0x60
[337261.639020] ? no_context+0x1ba/0x3f0
[337261.642771] ? libcfs_debug_msg+0x907/0xc00 [libcfs]
[337261.647835] ? __bad_area_nosemaphore+0x157/0x180
[337261.652624] ? do_page_fault+0x37/0x12d
[337261.656542] ? page_fault+0x1e/0x30
[337261.660123] ? obd_setup+0x238/0x470 [obdclass]
[337261.664783] ? obd_setup+0x224/0x470 [obdclass]
[337261.669445] class_setup+0x5b7/0x760 [obdclass]
[337261.674109] lustre_start_simple+0x397/0x6f0 [obdclass]
[337261.679465] server_start_targets+0x1360/0x2b20 [ptlrpc]
[337261.685081] ? cpumask_next_wrap+0x2d/0x80
[337261.689264] ? string+0x44/0x60
[337261.692495] ? vsnprintf+0x340/0x520
[337261.696162] ? snprintf+0x49/0x70
[337261.699567] ? libcfs_debug_msg+0x907/0xc00 [libcfs]
[337261.704621] ? lustre_start_mgc+0xd6d/0x1ed0 [obdclass]
[337261.709975] ? kfree+0xd3/0x250
[337261.713207] ? lustre_start_mgc+0xd6d/0x1ed0 [obdclass]
[337261.718564] server_fill_super+0xd11/0x11a0 [ptlrpc]
[337261.723709] ? obd_zombie_barrier+0x38/0xb0 [obdclass]
[337261.728982] ? ll_alloc_inode+0x110/0x110 [lustre]
[337261.733943] lustre_fill_super+0x2eb/0x400 [lustre]
[337261.738936] vfs_get_super+0x7f/0x110
[337261.742688] vfs_get_tree+0x25/0xc0
[337261.746267] do_mount+0x2e9/0x950
[337261.749671] ksys_mount+0xbe/0xe0
[337261.753067] __x64_sys_mount+0x21/0x30
[337261.756900] do_syscall_64+0x5b/0x1a0
[337261.760652] entry_SYSCALL_64_after_hwframe+0x66/0xcb