Details
- Type: Bug
- Resolution: Fixed
- Priority: Blocker
Description
The Lustre hash functions have been broken since the following commit landed:
commit 239e826876e5e20405e14a180a8fd4377d6553b2
Author: Timothy Day <timday@amazon.com>
Date: Mon Feb 6 20:02:15 2023 +0000
LU-16518 misc: use fixed hash code.
static __always_inline u32 cfs_hash_64(u64 val, unsigned int bits)
{
#if BITS_PER_LONG == 64
	/* 64x64-bit multiply is efficient on all 64-bit processors */
	return val * GOLDEN_RATIO_64 >> (64 - bits);
#else
	/* Hash 64 bits using only 32x32-bit multiply. */
	return cfs_hash_32(((u32)val ^ ((val >> 32) * GOLDEN_RATIO_32)), bits);
#endif
}

static unsigned ldlm_export_flock_hash(struct cfs_hash *hs, const void *key, unsigned mask)
{
-	return cfs_hash_u64_hash(*(__u64 *)key, mask);
+	return cfs_hash_64(*(__u64 *)key, 0) & mask;
}
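With bits == 0, this change makes cfs_hash_64() shift the 64-bit product right by 64 - 0 = 64 bits, which is undefined for a 64-bit type. As a result the hash returns zero or 0xfffff... for any input, and a debug kernel reports the following UBSAN warning: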
[10939.945272] ================================================================================
[10939.946792] UBSAN: Undefined behaviour in include/linux/hash.h:81:31
[10939.948193] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
[10939.949869] CPU: 2 PID: 384127 Comm: ll_mgs_0002 Tainted: G B W OE ---------r- - 4.18.0-305.25.1.el8_4.x86_64+debug #1
[10939.952333] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.16.0-3.module_el8.7.0+3346+68867adb 04/01/2014
[10939.954274] Call Trace:
[10939.954823] dump_stack+0x8e/0xd0
[10939.955543] ubsan_epilogue+0x5/0x21
[10939.956329] __ubsan_handle_shift_out_of_bounds.cold.13+0x14/0x98
[10939.957581] ? rcu_read_unlock+0x50/0x50
[10939.958418] ? lock_acquired+0x6c6/0xe60
[10939.959367] ? lprocfs_stats_lock+0x15d/0x1b0 [obdclass]
[10939.960699] ldlm_export_lock_hash+0x49/0x4d [ptlrpc]
[10939.961715] cfs_hash_bd_from_key+0x88/0x2e0 [libcfs]
[10939.962821] cfs_hash_add+0xef/0xb60 [libcfs]
[10939.963830] ? class_handle_hash+0x274/0x5f0 [obdclass]
[10939.964961] ? cfs_hash_rehash+0x7a0/0x7a0 [libcfs]
[10939.966167] ? ldlm_lock_create+0x734/0x1e20 [ptlrpc]
[10939.967287] ldlm_handle_enqueue+0x8dc/0x48e0 [ptlrpc]
[10939.968352] ? do_raw_spin_unlock+0x14b/0x230
[10939.969377] ? ldlm_setup+0x1af0/0x1af0 [ptlrpc]
[10939.970509] ? __req_capsule_get+0x7ff/0x11f0 [ptlrpc]
[10939.971710] ? lustre_swab_ldlm_lock_desc+0x230/0x230 [ptlrpc]
[10939.972973] tgt_enqueue+0x148/0x5a0 [ptlrpc]
[10939.974139] tgt_request_handle+0x179c/0x3ff0 [ptlrpc]
[10939.975403] ? tgt_brw_write+0x5a00/0x5a00 [ptlrpc]
[10939.976594] ptlrpc_server_handle_request+0xa34/0x1f50 [ptlrpc]
[10939.977936] ? lu_context_exit+0x15a/0x2c0 [obdclass]
[10939.979045] ptlrpc_main+0x1ae0/0x2f40 [ptlrpc]
[10939.980062] ? __kthread_parkme+0xc4/0x190
[10939.981068] ? ptlrpc_wait_event+0xf40/0xf40 [ptlrpc]
[10939.982107] kthread+0x344/0x410
[10939.982811] ? kthread_insert_work_sanity_check+0xd0/0xd0
[10939.983941] ret_from_fork+0x3a/0x50
[10939.984776] ================================================================================
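The shift arithmetic behind the warning can be sketched in user-space C. This is only an illustration, not the fix that landed for this ticket: hash_64_shift(), hash_64_guarded() and flock_hash_sketch() are hypothetical names, and hashing to 32 bits before applying the mask is an assumed workaround. The constant matches GOLDEN_RATIO_64 from the kernel's include/linux/hash.h.

```c
#include <stdint.h>

/* Same value as GOLDEN_RATIO_64 in the Linux kernel's include/linux/hash.h. */
#define GOLDEN_RATIO_64 0x61C8864680B583EBull

/* Shift amount a hash_64-style function uses for a given bit count.
 * For bits == 0 this is 64, which is out of range for a 64-bit operand. */
static unsigned int hash_64_shift(unsigned int bits)
{
	return 64 - bits;
}

/*
 * Hypothetical guarded variant (an assumption, not the upstream code):
 * keep the top `bits` bits of the product and refuse bit counts whose
 * shift would be undefined or whose result would not fit in 32 bits.
 */
static uint32_t hash_64_guarded(uint64_t val, unsigned int bits)
{
	if (bits == 0 || bits > 32)
		return 0;
	return (uint32_t)((val * GOLDEN_RATIO_64) >> hash_64_shift(bits));
}

/*
 * Hypothetical stand-in for the changed call site: hash to a fixed
 * 32-bit width first, then apply the bucket mask.
 */
static unsigned int flock_hash_sketch(uint64_t key, unsigned int mask)
{
	return hash_64_guarded(key, 32) & mask;
}
```

With bits == 0 the computed shift is 64, the exponent UBSAN reports above. Passing a non-zero width keeps the shift in the range 32..63, so the masked result depends on the key.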
Issue Links
- is related to LU-17680 performance regression in sanity test_123ac (Resolved)