| author | Adhemerval Zanella <adhemerval.zanella@linaro.org> | 2025-10-10 15:15:22 -0300 |
|---|---|---|
| committer | Adhemerval Zanella <adhemerval.zanella@linaro.org> | 2025-10-27 09:34:04 -0300 |
| commit | d1509f2ce333cc638074f04650030ce897dca47f (patch) | |
| tree | abc3e1cfc8d40ee1214990deb8e2cb9f512e4993 /SHARED-FILES | |
| parent | 3d20d746c3fc98092b364c198245ae7d2b81ac09 (diff) | |
math: Use acosh from CORE-MATH
The current implementation shows the following accuracy on two
ranges ([1,21) and [21, DBL_MAX)), measured with 10e9 uniformly
distributed random inputs (the first column is the error in ULP, with
'0' meaning correctly rounded; the second is the number of samples
with that error):
* Range [1,21)
  * FE_TONEAREST
      0: 8931139411 89.31%
      1: 1068697545 10.69%
      2:     163044  0.00%
  * FE_UPWARD
      0: 7936620731 79.37%
      1: 2062594522 20.63%
      2:     783977  0.01%
      3:        770  0.00%
  * FE_DOWNWARD
      0: 7936459794 79.36%
      1: 2062734117 20.63%
      2:     805312  0.01%
      3:        777  0.00%
  * FE_TOWARDZERO
      0: 7910345595 79.10%
      1: 2088584522 20.89%
      2:    1069106  0.01%
      3:        777  0.00%
* Range [21, DBL_MAX)
  * FE_TONEAREST
      0: 5163888431 51.64%
      1: 4836111569 48.36%
  * FE_UPWARD
      0: 4835951885 48.36%
      1: 5164048115 51.64%
  * FE_DOWNWARD
      0: 5164048432 51.64%
      1: 4835951568 48.36%
  * FE_TOWARDZERO
      0: 5164058042 51.64%
      1: 4835941958 48.36%
The CORE-MATH implementation is correctly rounded for any rounding mode.
The code was adapted to glibc style and to use the definitions from
math_config.h (to handle errno, overflow, and underflow).
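For readers who want to see what an ULP histogram like the ones above measures, here is a minimal sketch in Python. It is not the tool used for this commit: it tests whatever acosh the platform libm exposes through math.acosh, it covers round-to-nearest only (Python cannot portably switch the rounding mode), and it uses far fewer samples. The helper names (acosh_reference, bits, ulp_error, histogram) and the 80-digit working precision are assumptions made for illustration.

```python
import math
import random
import struct
from collections import Counter
from decimal import Decimal, getcontext

# Assumption: 80 decimal digits is ample headroom over binary64 (~16 digits).
getcontext().prec = 80


def acosh_reference(x: float) -> Decimal:
    """High-precision acosh(x) = ln(x + sqrt(x*x - 1)) for x >= 1."""
    d = Decimal(x)  # exact conversion of the binary64 input
    return (d + (d * d - 1).sqrt()).ln()


def bits(v: float) -> int:
    """Bit pattern of a double; monotonic in v for non-negative doubles."""
    return struct.unpack("<q", struct.pack("<d", v))[0]


def ulp_error(x: float) -> int:
    """Number of doubles between math.acosh(x) and the rounded reference."""
    computed = math.acosh(x)
    correct = float(acosh_reference(x))  # reference rounded to nearest double
    return abs(bits(computed) - bits(correct))


def histogram(lo: float, hi: float, samples: int, seed: int = 0) -> Counter:
    """Count ULP errors over uniformly distributed inputs from lo to hi."""
    rng = random.Random(seed)
    return Counter(ulp_error(rng.uniform(lo, hi)) for _ in range(samples))


if __name__ == "__main__":
    samples = 10_000
    counts = histogram(1.0, 21.0, samples)
    for err in sorted(counts):
        # Same layout as the commit message: error, sample count, share.
        print(f"{err}: {counts[err]} {100.0 * counts[err] / samples:.2f}%")
```

A row with error 0 counts inputs where the library result equals the correctly rounded value, so a correctly rounded implementation would be expected to print a single "0:" row.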
Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1) shows:
reciprocal-throughput   master    patched   improvement
x86_64                 20.9131    47.2187      -125.79%
x86_64v2               20.8823    41.1042       -96.84%
x86_64v3               19.0282    25.8045       -35.61%
aarch64                14.7419    18.1535       -23.14%
power10                8.98341    11.0423       -22.92%

Latency                 master    patched   improvement
x86_64                 75.5494    89.5979       -18.60%
x86_64v2               74.4443    87.6292       -17.71%
x86_64v3               71.8558    70.7086         1.60%
aarch64                30.3361    29.2709         3.51%
power10                20.9263    19.2482         8.02%
For x86_64/x86_64-v2, most of the performance hit comes from the fma
call through the ifunc mechanism.
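The improvement column in the tables above is consistent with (master - patched) / master, expressed in percent, so a negative value means the patched build is slower. A minimal Python sketch that recomputes a few rows from the reported numbers (the function name improvement and the dictionary layout are illustrative assumptions, not part of the benchtest tooling):

```python
def improvement(master: float, patched: float) -> float:
    """Relative change in percent; negative means patched is slower."""
    return (master - patched) / master * 100.0


# (master, patched) pairs copied from the reciprocal-throughput table.
RECIPROCAL_THROUGHPUT = {
    "x86_64": (20.9131, 47.2187),
    "x86_64v2": (20.8823, 41.1042),
    "x86_64v3": (19.0282, 25.8045),
    "aarch64": (14.7419, 18.1535),
    "power10": (8.98341, 11.0423),
}

if __name__ == "__main__":
    for arch, (master, patched) in RECIPROCAL_THROUGHPUT.items():
        print(f"{arch:<10} {improvement(master, patched):8.2f}%")
```

Both metrics are lower-is-better, which is why the x86_64 reciprocal-throughput row, where the patched value is more than double the master value, shows a change below -100%.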
Checked on x86_64-linux-gnu, aarch64-linux-gnu, and
powerpc64le-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>
Diffstat (limited to 'SHARED-FILES')
| -rw-r--r-- | SHARED-FILES | 2 |
1 files changed, 2 insertions, 0 deletions
diff --git a/SHARED-FILES b/SHARED-FILES
index ba80026eb9..ee9b291010 100644
--- a/SHARED-FILES
+++ b/SHARED-FILES
@@ -239,6 +239,8 @@ tzdata:
 # The project is distribute here:
 # https://gitlab.inria.fr/core-math/core-math/
 core-math:
+  # src/binary64/acosh/acosh.c, revision 69062c4d
+  sysdeps/ieee754/dbl-64/e_acosh.c
   # src/binary32/acos/acosf.c, revision 56dd347
   sysdeps/ieee754/flt-32/e_acosf.c
   # src/binary32/acosh/acoshf.c, revision d0b9ddd
