| author | Jiamei Xie <xiejiamei@hygon.cn> | 2025-10-14 20:14:11 +0800 |
|---|---|---|
| committer | Sam James <sam@gentoo.org> | 2025-11-04 12:23:17 +0000 |
| commit | ba7c1682eae563b49baeaebcd0f39538e029c4bb (patch) | |
| tree | b428c16befaa9be5a755a9bcf25503b826582943 | |
| parent | df5072615e9fb1c6dfd06bd35f20f28d1ac2808d (diff) | |
x86: fix wmemset ifunc stray '!' (bug 33542)
release/2.37/master
The ifunc selector for wmemset had a stray '!' in the
X86_ISA_CPU_FEATURES_ARCH_P(...) check:
    if (X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX2)
        && X86_ISA_CPU_FEATURES_ARCH_P (cpu_features,
                                        AVX_Fast_Unaligned_Load, !))
This effectively negated the predicate and caused the AVX2/AVX512
paths to be skipped, making the dispatcher fall back to the SSE2
implementation even on CPUs where AVX2/AVX512 are available. The
regression leads to noticeable throughput loss for wmemset.
Remove the stray '!' so the AVX_Fast_Unaligned_Load capability is
tested as intended and the correct AVX2/EVEX variants are selected.
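
The effect of the stray '!' can be illustrated with a minimal sketch. It assumes, in line with the description above, that the third macro argument is applied as a prefix operator to the feature test. Every name here (`sketch_features`, `SKETCH_ARCH_P`, `sketch_select_buggy`, `sketch_select_fixed`) is hypothetical and stands in for the glibc macros and selector; it is not their actual definition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CPU feature data; not glibc's layout.  */
struct sketch_features
{
  bool avx2_usable;
  bool avx_fast_unaligned_load;
};

/* Assumption: the third argument is pasted in front of the test, so
   passing `!` inverts it and passing nothing leaves it unchanged.  */
#define SKETCH_ARCH_P(ptr, name, prefix) (prefix ((ptr)->name))

enum sketch_impl { SKETCH_SSE2, SKETCH_AVX2 };

/* Before the fix: the stray `!` negates the capability test.  */
enum sketch_impl
sketch_select_buggy (const struct sketch_features *f)
{
  if (f->avx2_usable && SKETCH_ARCH_P (f, avx_fast_unaligned_load, !))
    return SKETCH_AVX2;
  return SKETCH_SSE2;
}

/* After the fix: the empty third argument tests the capability as is.  */
enum sketch_impl
sketch_select_fixed (const struct sketch_features *f)
{
  if (f->avx2_usable && SKETCH_ARCH_P (f, avx_fast_unaligned_load, ))
    return SKETCH_AVX2;
  return SKETCH_SSE2;
}

/* On a modeled CPU with both capabilities set, only the fixed selector
   reaches the AVX2 path.  */
void
sketch_self_check (void)
{
  const struct sketch_features cpu = { true, true };
  assert (sketch_select_buggy (&cpu) == SKETCH_SSE2);
  assert (sketch_select_fixed (&cpu) == SKETCH_AVX2);
}
```

The sketch models only the AVX2 branch. The real selector also distinguishes the AVX512VL/EVEX variants inside that branch, so the negation makes them unreachable as well.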
Impact:
- On AVX2/AVX512-capable x86_64, wmemset no longer incorrectly
falls back to SSE2; perf now shows __wmemset_evex/avx2 variants.
Testing:
- benchtests/bench-wmemset shows improved bandwidth across sizes.
- perf confirms that the selected symbol is no longer the SSE2 variant.
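
A small workload such as the following could be profiled to see which wmemset variant the dynamic linker resolved. This is a hedged sketch, not the benchtests harness: `sketch_fill`, `sketch_buf`, `SKETCH_LEN` and `sketch_workload_check` are hypothetical names, and the buffer size and round count are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

enum { SKETCH_LEN = 4096 };          /* arbitrary buffer size */
static wchar_t sketch_buf[SKETCH_LEN];

/* Fill the buffer ROUNDS times through the public wmemset entry point
   and return how many elements hold VALUE afterwards.  */
size_t
sketch_fill (wchar_t value, unsigned int rounds)
{
  for (unsigned int i = 0; i < rounds; i++)
    wmemset (sketch_buf, value, SKETCH_LEN);

  size_t matching = 0;
  for (size_t i = 0; i < SKETCH_LEN; i++)
    matching += (sketch_buf[i] == value);
  return matching;
}

/* Correctness check only; it cannot tell which variant was selected.  */
void
sketch_workload_check (void)
{
  assert (sketch_fill (L'x', 1000) == SKETCH_LEN);
}
```

The check only confirms that the buffer was filled. Which implementation ran has to be read from a profiler's symbol output, as in the perf check above.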
Signed-off-by: xiejiamei <xiejiamei@hygon.com>
Signed-off-by: Li jing <lijing@hygon.cn>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 4d86b6cdd8132e0410347e07262239750f86dfb4)
| -rw-r--r-- | sysdeps/x86_64/multiarch/ifunc-wmemset.h | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/sysdeps/x86_64/multiarch/ifunc-wmemset.h b/sysdeps/x86_64/multiarch/ifunc-wmemset.h
index 1fbbd3d68e..7f68fabfc8 100644
--- a/sysdeps/x86_64/multiarch/ifunc-wmemset.h
+++ b/sysdeps/x86_64/multiarch/ifunc-wmemset.h
@@ -35,7 +35,7 @@ IFUNC_SELECTOR (void)
   if (X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX2)
       && X86_ISA_CPU_FEATURES_ARCH_P (cpu_features,
-                                      AVX_Fast_Unaligned_Load, !))
+                                      AVX_Fast_Unaligned_Load,))
     {
       if (X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX512VL))
         {
