path: root/benchtests
Age | Commit message | Author
2025-11-21 | benchtests: Fix bench-build after cd748a63ab | Adhemerval Zanella
The benchtests do not define _LIBC.
2025-11-21 | bench-malloc-thread: Add libm for powf | Adhemerval Zanella
The compiler might not constant-fold the call, which results in a linker error.
Reviewed-by: Sam James <sam@gentoo.org>
2025-11-21 | benchtests: Remove clang warnings | Adhemerval Zanella
clang warns about the implicit conversion of RAND_MAX to float:
  error: implicit conversion from 'int' to 'float' changes value from 2147483647 to 2147483648 [-Werror,-Wimplicit-const-int-float-conversion]
So make the conversion explicit.
Reviewed-by: Sam James <sam@gentoo.org>
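A minimal sketch of the kind of explicit conversion the commit describes. The helper name `random_unit_float` is hypothetical and is not taken from the benchtests sources; it only illustrates why the cast silences the warning without changing the computed value.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not from the benchtests sources: returns a
   pseudo-random float in [0, 1].  Where RAND_MAX is 2147483647 it is
   not exactly representable as float and rounds to 2147483648.0f.
   Writing the cast makes that rounding explicit, which is what
   -Wimplicit-const-int-float-conversion asks for.  */
float
random_unit_float (void)
{
  float r = (float) rand () / (float) RAND_MAX;

  /* Conversion to float is monotonic, so the quotient cannot exceed 1.  */
  assert (r >= 0.0f && r <= 1.0f);
  return r;
}
```

The implicit form `rand () / RAND_MAX` in a float expression performs the same rounding; the cast only states it in the source.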
2025-11-21 | benchtests: Add attribute_optimize | Adhemerval Zanella
Similar to tst-printf-bz18872.sh, add attribute_optimize to avoid build failures with compilers that do not support the "GCC optimize" pragma.
Reviewed-by: Sam James <sam@gentoo.org>
2025-11-21 | benchtests: Use __f128 on ilogbf128-inputs constants | Adhemerval Zanella
f128 is not a valid floating constant suffix on clang.
Reviewed-by: Sam James <sam@gentoo.org>
2025-11-21 | benchtests: Add fmaf benchtests | Adhemerval Zanella
Random inputs in the range [0,10].
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2025-11-21 | benchtests: Add fma benchtests | Adhemerval Zanella
Random inputs in the range [0,10].
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2025-11-13 | htl: Drop pthread-functions infrastructure | Samuel Thibault
All previously forwarded functions are now called directly (either via local call in libc, or through a __export).
2025-11-12 | hurd: Drop remnants of cthreads | Samuel Thibault
These have not been used in GNU/Hurd for a very long time.
2025-11-10 | benchtests: Add benchmarks for frexp functions | Osama Abdelkader
Add benchmark support for frexp, frexpf, and frexpl to measure the performance improvement of the fast path optimization.
- Created frexp-inputs, frexpf-inputs, frexpl-inputs with random test values
- Added frexp, frexpf, frexpl to bench-math list
- Added CFLAGS to disable builtins for accurate benchmarking
These benchmarks will be used to quantify the performance gains from the fast path optimization for normal floating-point numbers.
Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com>
2025-10-22 | plot_strings.py: Replace np.complex with complex | H.J. Lu
Replace np.complex with complex to fix numpy error:
  AttributeError: module 'numpy' has no attribute 'complex'.
  `np.complex` was a deprecated alias for the builtin `complex`. To avoid this error in existing code, use `complex` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.complex128` here.
  The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
  https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Collin Funk <collin.funk1@gmail.com>
2025-10-08 | benchtests: Improve fmod benchmark | Adhemerval Zanella
GCC implements fmod as a built-in on x86, so disable the built-in to benchmark the C implementation. Also, make fmod and fmodf use the workload directive to measure the reciprocal throughput.
2025-10-08 | benchtests: Add lgammaf_r benchmark | Adhemerval Zanella
Random inputs in the range [-20.0,20.0].
2025-10-01 | benchtests: Add remainderf benchtest | Adhemerval Zanella
The inputs are based on fmodf-inputs.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2025-10-01 | benchtests: Add remainder benchtest | Adhemerval Zanella
The inputs are based on fmod-inputs.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2025-09-27 | AArch64: Implement AdvSIMD and SVE log10p1(f) routines | Luna Lamb
Vector variants of the new C23 log10p1 routines.
Note: Benchmark inputs for log10p1(f) are identical to log1p(f).
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2025-09-27 | AArch64: Implement AdvSIMD and SVE log2p1(f) routines | Luna Lamb
Vector variants of the new C23 log2p1 routines.
Note: Benchmark inputs for log2p1(f) are identical to log1p(f).
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2025-09-23 | benchtests: Add workload for tgammaf-inputs | Adhemerval Zanella
2025-09-22 | benchtests: Fix warning in bench-strchr.c | Wilco Dijkstra
Ensure benchtests compile with trunk GCC.
2025-09-11 | benchtests: Add workload directive for tgamma | Adhemerval Zanella
2025-09-11 | benchtests: Add workload directive for erf and erfc | Adhemerval Zanella
2025-09-11 | benchtests: Add workload for lgamma | Adhemerval Zanella
Random inputs in range [-20.00,20.00].
Reviewed-by: DJ Delorie <dj@redhat.com>
2025-09-11 | benchtests: Add workload for asinh | Adhemerval Zanella
Random inputs in range [-10,10].
Reviewed-by: DJ Delorie <dj@redhat.com>
2025-09-11 | benchtests: Add workload for acosh | Adhemerval Zanella
Random inputs in range [1.00,21.00].
Reviewed-by: DJ Delorie <dj@redhat.com>
2025-09-02 | AArch64: Implement exp2m1 and exp10m1 routines | Hasaan Khan
Vector variants of the new C23 exp2m1 and exp10m1 routines.
Note: Benchmark inputs for exp2m1 and exp10m1 are identical to those for exp2 and exp10 respectively; this also applies to the float variants.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2025-08-27 | added benchmark inputs for rsqrtf and rsqrt | Paul Zimmermann
Changes with respect to v1:
- added missing rsqrt and rsqrtf in bench-math
2025-08-26 | add missing benchmark files for several C23 binary64 functions | Paul Zimmermann
These files were prepared together with Saban Houssein.
2025-08-12 | benchtests: Avoid truncation in random memcpy/memset benchmarks | Wilco Dijkstra
Use uint16_t rather than uint8_t for the size arrays.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
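The truncation this commit avoids can be illustrated with a small sketch. The helper names below are hypothetical and the sizes are placeholders; the commit message does not state which sizes the benchmarks actually store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, not from the benchtests sources: store a copy
   length in a narrow table slot and read it back.  */

size_t
stored_size_u8 (size_t len)
{
  uint8_t slot = (uint8_t) len;   /* Keeps only the low 8 bits.  */
  return slot;
}

size_t
stored_size_u16 (size_t len)
{
  assert (len <= UINT16_MAX);     /* Still limited, but to 65535.  */
  uint16_t slot = (uint16_t) len;
  return slot;
}
```

With these definitions a length of 300 comes back as 44 from the 8-bit slot (300 modulo 256) and as 300 from the 16-bit slot.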
2025-08-04 | Revert "benchtests: Avoid overflow in random memcpy/memset benchmarks" | Wilco Dijkstra
This reverts commit 09604542d31abf1e35cd00c1db8d9bee9568bdd0.
2025-08-04 | benchtests: Avoid overflow in random memcpy/memset benchmarks | Wilco Dijkstra
Use uint16_t rather than uint8_t for the size arrays.
2025-08-02 | benchtests: Cleanup bench-malloc-thread | Wilco Dijkstra
Change duration to 3 seconds. Add spaces before '('.
Reviewed-by: DJ Delorie <dj@redhat.com>
2025-07-29 | replace atan2-inputs with more meaningful inputs | Paul Zimmermann
Commit 934d88d used inputs whose exponents were generated at random over the whole binary64 exponent range, which yields essentially very large or very small values of |y/x|. Instead, this commit generates x and y at random in [-10,10], which should correspond better to real applications.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
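One way to draw values uniformly from a closed interval such as [-10,10] is sketched below. This is an assumption for illustration only: the commit does not say how its inputs were generated, and `random_in_range` is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical generator, not the tool used for atan2-inputs: returns
   a pseudo-random double in [lo, hi].  */
double
random_in_range (double lo, double hi)
{
  assert (lo <= hi);

  /* u lies in [0, 1] because rand () returns a value in [0, RAND_MAX].  */
  double u = (double) rand () / (double) RAND_MAX;
  return lo + u * (hi - lo);
}
```

Calling `random_in_range (-10.0, 10.0)` twice gives one (x, y) pair whose ratio |y/x| is usually of moderate size, unlike pairs whose exponents are drawn from the whole binary64 range.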
2025-06-24 | benchtests: Add IPv6 inet_ntop benchmark | Adhemerval Zanella
Random IP addresses in the full range. There is no extra workload to check the effectiveness of the '::' optimization for runs of zero groups (although that would be a possible workload).
Reviewed-by: DJ Delorie <dj@redhat.com>
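For readers unfamiliar with the '::' shortening mentioned above, the following sketch shows the call being benchmarked. The wrapper `format_ipv6` is a hypothetical name; only `inet_ntop` itself is the real interface.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper, not from the benchtests sources: formats 16
   raw address bytes as IPv6 text.  Returns buf on success and NULL on
   failure, like inet_ntop.  */
const char *
format_ipv6 (const unsigned char bytes[16], char *buf, size_t buflen)
{
  struct in6_addr addr;

  assert (buflen >= INET6_ADDRSTRLEN);
  memcpy (&addr, bytes, sizeof addr);
  return inet_ntop (AF_INET6, &addr, buf, (socklen_t) buflen);
}
```

An all-zero address is printed as `::` and the loopback address as `::1`, so inputs with long zero runs exercise a different path from uniformly random addresses.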
2025-06-24 | benchtests: Add IPv4 inet_ntop benchmark | Adhemerval Zanella
Random IP addresses in the full range.
Reviewed-by: Collin Funk <collin.funk1@gmail.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
2025-06-13 | benchtests: Improve modf benchtest | Adhemerval Zanella
It adds four ranges, which is how the generic implementation handles normal numbers:
1. Random inputs in the range [0.0, 1.0];
2. Random inputs in the range [1.0, (double)(UINT64_C(1) << 52)];
3. Random inputs in the range [(double)(UINT64_C(1) << 52), DBL_MAX];
4. Random integral inputs in the range [0.0, (double)(UINT64_C(1) << 52)].
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
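The boundary at 2^52 in the ranges above follows from the binary64 format: from 2^52 upwards every representable double is an integer, so modf has no fractional part to extract. A minimal sketch, with the hypothetical helper `has_no_fraction` and finite inputs assumed:

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical helper, not from the benchtests sources: returns 1 if
   modf reports a zero fractional part for the finite value x.  */
int
has_no_fraction (double x)
{
  double ipart;
  double frac = modf (x, &ipart);

  /* For finite x the two parts add back to x exactly.  */
  assert (ipart + frac == x);
  return frac == 0.0;
}
```

Under these assumptions `has_no_fraction (0.25)` is 0, while `has_no_fraction ((double) (UINT64_C (1) << 52))` is 1, matching the split between ranges 1 and 2 on one side and range 3 on the other.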
2025-06-13 | benchtests: Add modff benchtest | Adhemerval Zanella
It adds four ranges, which is how the generic implementation handles normal numbers:
1. Random inputs in the range [0.0, 1.0];
2. Random inputs in the range [1.0, (float)(1U << 23)];
3. Random inputs in the range [(float)(1U << 23), FLT_MAX];
4. Random integral inputs in the range [0.0, (float)(1U << 23)].
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2025-05-23 | libmvec: Add inputs for asinpi(f), acospi(f), atanpi(f) and atan2pi(f) | Wilco Dijkstra
Add initial inputs for asinpi(f), acospi(f), atanpi(f) and atan2pi(f) based on existing asin/acos/atan inputs. Benchtests now work on the new libmvec functions.
Reviewed-by: Yury Khrustalev <yury.khrustalev@arm.com>
2025-05-15 | benchtest: malloc tcache hotpath benchtest | Cupertino Miranda
The existing malloc benchtests are rather generic and measure the performance of the malloc implementation as a whole. This new benchtest focuses on reducing side effects unrelated to tcache, so that the performance impact of tcache code changes can be predicted more realistically.
The test was inspired by the bench-[cm]alloc-thread code, with severe simplifications:
- It forces single-threaded execution, reducing concurrency side effects such as cache incoherence penalties due to simultaneous writes to the same cache pages.
- It focuses on allocating and deallocating a single size for the whole duration of the benchmark. Since all it does is allocate and deallocate, it measures the tcache hotpath without any side effects.
- It allows the allocation size to be specified as an input argument.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
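The shape of such a measurement loop might look like the sketch below. It is not the code added by the commit: the function name, the use of `clock_gettime`, and the volatile pointer are all assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical sketch, not the benchtest from the commit: average
   nanoseconds per malloc/free pair for one fixed size, measured on the
   calling thread only.  */
double
time_malloc_free_pairs (size_t size, size_t iters)
{
  struct timespec start, end;

  assert (size > 0 && iters > 0);

  clock_gettime (CLOCK_MONOTONIC, &start);
  for (size_t i = 0; i < iters; i++)
    {
      /* The volatile pointer keeps the compiler from removing the
         allocation and release as a dead pair.  */
      void *volatile p = malloc (size);
      free (p);
    }
  clock_gettime (CLOCK_MONOTONIC, &end);

  double ns = (double) (end.tv_sec - start.tv_sec) * 1e9
              + (double) (end.tv_nsec - start.tv_nsec);
  return ns / (double) iters;
}
```

Because each block is freed immediately and the size never changes, a thread-caching allocator can typically serve the next request from the block just released, which is the path the commit wants to isolate.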
2025-05-13 | added benchtest inputs for log2l | Paul Zimmermann
2025-05-13 | added benchtest inputs for expl | Paul Zimmermann
2025-05-13 | added benchtest inputs for powl | Paul Zimmermann
changes in v2:
* fixed the missing Makefile entry in the first version
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2025-05-13 | added benchtest inputs for fmal | Paul Zimmermann
2025-04-25 | benchtest: Correct shell script related to bench-malloc-thread | Cupertino Miranda
This patch changes the shell script that selects which arguments are used for the execution of bench-malloc-thread. The problem seems to have been introduced in commit:
  commit 2d6427a63cad8056ba6bcaaaa8df21977c8dde3d
  Author: Wangyang Guo <wangyang.guo@intel.com>
  Date: Fri Nov 29 16:05:35 2024 +0800
      benchtests: Add calloc test
With the current condition, the error "/bin/sh: 3: [[: not found" occurs when executing `make bench BENCHSET="malloc-thread"`, and the else path is taken, using incorrect arguments for the bench test execution. The error is reproducible on Debian-based distros.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2025-03-18 | benchtests: Increase iterations of bench-malloc-simple | Wilco Dijkstra
Increase iterations so it runs for ~1 second on modern CPUs.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2025-03-13 | x86_64: Add atanh with FMA | Sunil K Pandey
On SPR, it improves atanh bench performance by:
                        Before    After     Improvement
reciprocal-throughput   15.1715   14.8628   2%
latency                 57.1941   56.1883   2%
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2025-03-13 | x86_64: Add sinh with FMA | Sunil K Pandey
On SPR, it improves sinh bench performance by:
                        Before    After     Improvement
reciprocal-throughput   14.2017   11.815    17%
latency                 36.4917   35.2114   4%
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2025-03-13 | benchtests: Remove wrong snippet from 360cce0b06 | Adhemerval Zanella
2025-03-13 | nptl: Check if thread is already terminated in sigcancel_handler (BZ 32782) | Adhemerval Zanella
The SIGCANCEL signal handler should not issue __syscall_do_cancel, which calls __do_cancel and __pthread_unwind, if the cancellation is already in progress (and libgcc unwind is not reentrant). Any cancellation signal received afterwards is ignored.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2025-03-05 | benchtests: Add random strlen benchmark | Wilco Dijkstra
Add a new randomized strlen test similar to bench-random-memcpy. Instead of repeating the same call to strlen over and over again, it times a large number of different strings. The distribution of the string length and alignment is based on SPEC2017.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
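The idea of varying length and alignment on every call can be sketched as follows. This is a simplified illustration, not the benchmark from the commit: it uses a uniform distribution as a placeholder for the SPEC2017-based one, omits the timing, and `check_random_strlens` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch, not from the benchtests sources: call strlen on
   count strings whose start offset and length change on every
   iteration.  Returns the number of wrong results (0 is expected).  */
size_t
check_random_strlens (size_t count, size_t max_len, size_t max_align)
{
  size_t buf_size = max_len + max_align + 1;
  char *buf = malloc (buf_size);
  size_t mismatches = 0;

  assert (buf != NULL);
  memset (buf, 'a', buf_size);

  for (size_t i = 0; i < count; i++)
    {
      /* Placeholder distribution: uniform, not SPEC2017-derived.  */
      size_t align = (size_t) rand () % (max_align + 1);
      size_t len = (size_t) rand () % (max_len + 1);

      buf[align + len] = '\0';        /* Terminate this sample.  */
      if (strlen (buf + align) != len)
        mismatches++;
      buf[align + len] = 'a';         /* Restore for the next sample.  */
    }

  free (buf);
  return mismatches;
}
```

A timing harness would wrap the loop in clock reads and would normally prepare the strings in advance so that the terminator writes are not part of the measured work.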
2025-03-05 | benchtests: Improve large memcpy/memset benchmarks | Wilco Dijkstra
Adjust sizes between 64KB and 16MB and iterations based on length. Remove incorrect uses of alloc_bufs since we're not interested in measuring Linux clear_page time. Use getpagesize() - 1 instead of 4095 when aligning within a page.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
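The last point, using the runtime page size instead of a hard-coded 4095, can be sketched like this. The helper `offset_in_page` is a hypothetical name and is not taken from the benchmark sources.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper, not from the benchtests sources: offset of ptr
   within its page, using the page size reported at run time rather
   than assuming 4 KiB pages.  */
size_t
offset_in_page (const void *ptr)
{
  size_t page_mask = (size_t) getpagesize () - 1;

  /* Masking is only valid when the page size is a power of two.  */
  assert (((page_mask + 1) & page_mask) == 0);
  return (size_t) ((uintptr_t) ptr & page_mask);
}
```

On a system with 4 KiB pages the mask equals 4095, so the result is unchanged there; on systems with 16 KiB or 64 KiB pages the hard-coded constant would give offsets relative to the wrong boundary.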