path: root/llvm/lib/Support/APFloat.cpp
Age | Commit message | Author
2025-11-01[Support] Remove redundant declarations (NFC) (#165971)Kazu Hirata
In C++17, static constexpr members are implicitly inline, so they no longer require an out-of-line definition. Identified with readability-redundant-declaration.
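The C++17 rule can be seen in a small standalone sketch. `Limits`, `MaxParts` and `clampParts` are hypothetical names for illustration and are not taken from APFloat.cpp.

```cpp
#include <algorithm>

// Hypothetical stand-in for a class with a static constexpr data member.
struct Limits {
  static constexpr int MaxParts = 4; // implicitly inline since C++17
};

// Before C++17, ODR-using Limits::MaxParts (for example by binding it to a
// reference) required an out-of-line definition in one translation unit:
//   constexpr int Limits::MaxParts;
// In C++17 that line is a redundant redeclaration and can be dropped.

// std::min takes its arguments by const reference, which ODR-uses MaxParts.
inline int clampParts(int Requested) {
  return std::min(Requested, Limits::MaxParts);
}
```

When built as C++17 or later, this links without any out-of-line definition of `MaxParts`.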
2025-10-26[llvm] Migrate away from a soft-deprecated constructor of APInt (NFC) (#165164)Kazu Hirata
We have:

  /// Once all uses of this constructor are migrated to other constructors,
  /// consider marking this overload "= delete" to prevent calls from being
  /// incorrectly bound to the APInt(unsigned, uint64_t, bool) constructor.
  LLVM_ABI APInt(unsigned numBits, unsigned numWords, const uint64_t bigVal[]);

This patch migrates away from this soft-deprecated constructor.
2025-10-19[ADT, Support] Use std::min and std::max (NFC) (#164145)Kazu Hirata
2025-10-18[APFloat] Outline special member functions (#164073)Yingwei Zheng
As discussed in https://github.com/llvm/llvm-project/pull/111544#issuecomment-3405281695, large special member functions in APFloat prevent function inlining and cause compile-time regression. This patch moves them into the cpp file. Compile-time improvement (-0.1%): https://llvm-compile-time-tracker.com/compare.php?from=0f68dc6cffd93954188f73bff8aced93aab63687&to=d3105c0860920651a7e939346e67c040776b2278&stat=instructions:u
2025-10-18[APFloat] Inline static getters (#163794)Yingwei Zheng
This patch exposes the declaration of fltSemantics to inline PPCDoubleDouble() calls in the IEEEFloat/DoubleAPFloat dispatch. It slightly improves the compile time: https://llvm-compile-time-tracker.com/compare.php?from=f4359301c033694d36865c7560714164d2050240&to=68de94d77d5bd33603193e8769829345b18fbae3&stat=instructions:u With https://github.com/llvm/llvm-project/pull/111544, the improvement is more significant: https://llvm-compile-time-tracker.com/compare.php?from=e438bae71d1fd55640d942b9ad795de2f60e44f2&to=04751477940890c092dc4edb74e284de8f746d5a&stat=instructions:u Addresses the comment at https://github.com/llvm/llvm-project/pull/111544#issuecomment-3405281695. If breaking changes are allowed, we can encode all the properties of fltSemantics within a 64-bit integer. Then we don't need the `Semantics <-> const fltSemantics` conversion.
2025-10-02[ADT] Fix a bug in DoubleAPFloat::frexp (#161625)Kazu Hirata
Without this patch, we call APFloat::makeQuiet() in frexp like so: Quiet.getFirst().makeQuiet(); The problem is that makeQuiet returns a new value instead of modifying "*this" in place, so we end up discarding the newly returned value. This patch fixes the problem by assigning the result back to Quiet.getFirst(). We should put [[nodiscard]] on APFloat::makeQuiet, but I'll do that in another patch.
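The discarded-result pattern described above can be reproduced in a few lines. This is a sketch with a hypothetical `Value` type, not the real APFloat class. It only assumes, as the commit message states, that the method returns a new value instead of modifying `*this`.

```cpp
#include <utility>

// Hypothetical miniature of a value type whose method returns a modified copy.
struct Value {
  bool Signaling = true;

  // Returns a quieted copy; *this is left untouched.
  [[nodiscard]] Value makeQuiet() const {
    Value Result = *this;
    Result.Signaling = false;
    return Result;
  }
};

// Buggy pattern: the returned copy is dropped, so the first element stays
// signaling. The (void) cast only keeps this sketch free of warnings; without
// it, [[nodiscard]] encourages the compiler to diagnose the call.
inline bool buggyStillSignaling(std::pair<Value, Value> Quiet) {
  (void)Quiet.first.makeQuiet();
  return Quiet.first.Signaling;
}

// Fixed pattern: assign the result back.
inline bool fixedStillSignaling(std::pair<Value, Value> Quiet) {
  Quiet.first = Quiet.first.makeQuiet();
  return Quiet.first.Signaling;
}
```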
2025-08-24[APFloat] Properly implement DoubleAPFloat::convertFromAPIntDavid Majnemer
The old implementation converted to the legacy semantics, inducing rounding and not properly handling inputs like (2^1000 + 2^200) which have more precision than the legacy semantics can represent. Instead, we convert the integer into two floats and an error. The error is used to implement the rounding behavior. Remove related dead, untested code: convertFrom*ExtendedInteger
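A scaled-down analogue of "two floats and an error" can be written with `float` standing in for the components. This is a sketch under stated assumptions: it is not the LLVM implementation, `splitInteger` and `SplitResult` are hypothetical names, and the input is limited to values below 2^53 so that every `double` intermediate is exact.

```cpp
#include <cassert>
#include <cstdint>

struct SplitResult {
  float Hi;   // nearest float to the input
  float Lo;   // nearest float to the remainder after removing Hi
  double Err; // what the (Hi, Lo) pair still fails to capture
};

// Assumes X < 2^53 so that the double arithmetic below is exact.
inline SplitResult splitInteger(std::uint64_t X) {
  assert(X < (std::uint64_t{1} << 53) && "sketch only handles X < 2^53");
  double Exact = static_cast<double>(X);
  float Hi = static_cast<float>(Exact);         // rounds to 24 significant bits
  double Rem = Exact - static_cast<double>(Hi); // exact, may be negative
  float Lo = static_cast<float>(Rem);           // rounds again
  double Err = Rem - static_cast<double>(Lo);   // exact leftover
  return {Hi, Lo, Err};
}
```

For X = 2^52 + 2^27 + 1 the pair alone cannot hold the value: `Hi` is 2^52, `Lo` is 2^27, and the remaining 1 ends up in `Err`, which is the quantity a caller would consult to decide the final rounding.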
2025-08-21[APFloat] Properly implement DoubleAPFloat::compareAbsoluteValueDavid Majnemer
The prior implementation would treat X+Y and X-Y as having equal magnitude. Rework the implementation to be more resilient.
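One way to avoid treating X+Y and X-Y as equal in magnitude is to take the sign of the low component relative to the high component into account. The sketch below is not the code from this commit. It uses a hypothetical `DoubleDouble` pair of `double`s and assumes finite, normalized inputs, where Hi equals Hi + Lo rounded to nearest.

```cpp
#include <cmath>

struct DoubleDouble {
  double Hi, Lo; // represented value is Hi + Lo
};

// Returns -1, 0 or 1 for |A| <, ==, > |B|.
// Assumes finite, normalized pairs (Hi is Hi + Lo rounded to nearest).
inline int compareMagnitude(DoubleDouble A, DoubleDouble B) {
  double AbsHiA = std::fabs(A.Hi), AbsHiB = std::fabs(B.Hi);
  if (AbsHiA != AbsHiB)
    return AbsHiA < AbsHiB ? -1 : 1;
  // Same |Hi|: Lo grows the magnitude when it shares Hi's sign and shrinks
  // it otherwise, so compare Lo after flipping it for negative Hi.
  double TailA = std::signbit(A.Hi) ? -A.Lo : A.Lo;
  double TailB = std::signbit(B.Hi) ? -B.Lo : B.Lo;
  if (TailA == TailB)
    return 0;
  return TailA < TailB ? -1 : 1;
}
```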
2025-08-20Reapply "[APFloat] Fix getExactInverse for DoubleAPFloat"David Majnemer
The previous implementation of getExactInverse used the following check to identify powers of two:

  // Check that the number is a power of two by making sure that only the
  // integer bit is set in the significand.
  if (significandLSB() != semantics->precision - 1)
    return false;

This condition verifies that the only set bit in the significand is the integer bit, which is correct for normal numbers. However, this logic is not correct for subnormal values. APFloat represents subnormal numbers by shifting the significand right while holding the exponent at its minimum value. For a power of two in the subnormal range, its single set bit will therefore be at a position lower than precision - 1. The original check would consequently fail, causing the function to determine that these numbers do not have an exact multiplicative inverse.

The new logic calculated this correctly but it seems that test/CodeGen/Thumb2/mve-vcvt-fixed-to-float.ll expected the old behavior. Seeing as how getExactInverse does not have tests or documentation, we conservatively maintain (and document) this behavior.

This reverts commit 47e62e846beb267aad50eb9195dfd855e160483e.
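The difference between normal and subnormal powers of two is visible at the bit level of an ordinary IEEE-754 `float`. The sketch below is not APFloat code, and `isPowerOfTwoMagnitude` is a hypothetical helper. It illustrates only the power-of-two recognition, not the conservative behavior that the reapplied commit keeps for subnormals.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559, "sketch assumes IEEE-754 binary32");

// True when |F| is an exact power of two, including subnormal powers of two.
inline bool isPowerOfTwoMagnitude(float F) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  std::uint32_t Exp = (Bits >> 23) & 0xFF; // biased exponent field
  std::uint32_t Frac = Bits & 0x7FFFFF;    // stored fraction bits
  if (Exp == 0xFF)
    return false; // infinity or NaN
  if (Exp != 0)
    return Frac == 0; // normal: only the implicit integer bit is set
  // Zero or subnormal: exactly one fraction bit set, at any position.
  return Frac != 0 && (Frac & (Frac - 1)) == 0;
}
```

A check that looks for the set bit only at the integer-bit position corresponds to the normal branch and rejects every subnormal, which matches the behavior the message describes.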
2025-08-14Revert "[APFloat] Fix getExactInverse for DoubleAPFloat"Aiden Grossman
This reverts commit f4941319cba19d7691baa6ec783c84be4d847637. This broke llvm/test/CodeGen/Thumb2/mve-vcvt-fixed-to-float.ll which took out a ton of buildbots and also broke premerge.
2025-08-13[APFloat] Fix getExactInverse for DoubleAPFloatDavid Majnemer
Some background: getExactInverse()'s callers expect that the result is not subnormal. DoubleAPFloat implemented getExactInverse() by going through semPPCDoubleDoubleLegacy. This means that numbers like 0x1p1022 which would have a normal inverse in semPPCDoubleDouble would not in semPPCDoubleDoubleLegacy. This commit refactors the logic into a single method on APFloat which uses getExactLog2Abs() and scalbn() to calculate the inverse without having to compute a reciprocal and test if it is inexact. This approach works for both IEEEFloat and DoubleAPFloat.
2025-08-12[APFloat] Remove some overly optimistic assertionsDavid Majnemer
An earlier draft of DoubleAPFloat::convertToSignExtendedInteger had arranged for overflow to be handled in a different way. However, these assertions are now possible if Hi+Lo are out of range and Lo != 0. A test has been added to defend against a regression.
2025-08-12[APFloat] Properly implement frexp(DoubleAPFloat)David Majnemer
The prior implementation did not consider that the Lo component may underflow when it undergoes scaling. This means that we need to carefully handle things like binade crossings or how to handle roundTowardZero when Hi and Lo have different signs. Particularly annoying is roundTiesToAway when Hi and Lo have different signs. It basically requires us to implement roundTiesTowardZero.
2025-08-12Reapply "[APFloat] Properly implement DoubleAPFloat::convertToSignExtendedInteger"David Majnemer
This reverts commit 8b44945a9231d4d7be0858a1c5d9c13d397bc512. The compilation failure under !NDEBUG has been fixed.
2025-08-10Revert "[APFloat] Properly implement DoubleAPFloat::convertToSignExtendedInteger"Kazu Hirata
This reverts commit 052c38be824d9dabb1e8fb64c1c7c3908d786e83. I'm getting: llvm/lib/Support/APFloat.cpp:5627:29: error: use of undeclared identifier 'Parts' 5627 | assert(DstPartsCount <= Parts.size() && "Integer too big"); | ^ 1 error generated.
2025-08-09[APFloat] Properly implement DoubleAPFloat::convertToSignExtendedIntegerDavid Majnemer
Use DoubleAPFloat::roundToIntegral to get a pair of APFloat values which hold integral values. Then we sum the pair, taking care to make sure that we handle edge cases like (hi=2^128, lo=-1) and ensuring that they fit in an unsigned i128.
2025-08-06[APFloat] Properly implement DoubleAPFloat::roundToIntegralDavid Majnemer
The previous implementation did not correctly handle double-doubles like 0x1p100 + 0x1p1 as the low order component would need more than a 106-bit significand to represent.
2025-08-01[APFloat] Properly implement next for DoubleAPFloatDavid Majnemer
Rather than converting to the legacy 106-bit format, perform next() on the low APFloat. Of course, we need to renormalize the two APFloats if either of the following two constraints is violated: (1) abs(low) <= ulp(high)/2; (2) high = rtne(high + low). Should renormalization be needed, it will increment the high component and set low to the smallest value which obeys these rules.
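The two constraints can be restored for a pair of `double`s with the classic fast two-sum step. This is a sketch of that general technique and not the renormalization code in this commit. `renormalize` is a hypothetical helper and assumes finite inputs, |Hi| >= |Lo|, and the default round-to-nearest-even mode.

```cpp
#include <cassert>
#include <cmath>

struct DoubleDouble {
  double Hi, Lo; // represented value is Hi + Lo
};

// Fast two-sum: afterwards Hi == rtne(Hi + Lo) and |Lo| <= ulp(Hi) / 2.
// Requires |Hi| >= |Lo| on entry.
inline DoubleDouble renormalize(double Hi, double Lo) {
  assert(std::fabs(Hi) >= std::fabs(Lo) && "fast two-sum precondition");
  double Sum = Hi + Lo;         // rtne(Hi + Lo)
  double Err = Lo - (Sum - Hi); // exact rounding error of the addition
  return {Sum, Err};
}
```

For example, the pair (1, 3 * 2^-53) violates the first constraint. The sum is a tie that rounds to the even neighbor 1 + 2^-51, and the low component becomes -2^-53.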
2025-05-21[NFC][ADT/Support] Add {} for else when if body has {} (#140758)Rahul Joshi
2025-05-11[llvm] Use StringRef::consume_front (NFC) (#139458)Kazu Hirata
2025-04-14[InstCombine] Fold fneg/fabs patterns with ppc_f128 (#130557)Yingwei Zheng
This patch is needed by https://github.com/llvm/llvm-project/pull/130496.
2025-03-10[APFloat] Fix `IEEEFloat::addOrSubtractSignificand` and `IEEEFloat::normalize` (#98721)beetrees
Fixes #63895 Fixes #104984 Before this PR, `addOrSubtractSignificand` presumed that the loss came from the side being subtracted, and didn't handle the case where lhs == rhs and there was loss. This can occur during FMA. This PR fixes the situation by correctly determining where the loss came from and handling it appropriately. Additionally, `normalize` failed to adjust the exponent when the significand is zero but `lost_fraction != lfExactlyZero`. This meant that the test case from #63895 was rounded incorrectly as the loss wasn't adjusted to account for the exponent being below the minimum exponent. This PR fixes this by only skipping the exponent adjustment if the significand is zero and there was no lost fraction. (Note to reviewer: I don't have commit access)
2025-03-05ADT: Switch to a raw pointer for DoubleAPFloat::Floats.Peter Collingbourne
In order for the union APFloat::Storage to permit access to the semantics field when another union member is stored there, all members of Storage must be standard layout. This is not necessarily the case for DoubleAPFloat which may be non-standard layout because there is no requirement that its std::unique_ptr member is standard layout. Fix this by converting Floats to a raw pointer. Reviewers: arsenm Reviewed By: arsenm Pull Request: https://github.com/llvm/llvm-project/pull/129981
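The layout requirement can be checked at compile time. The sketch below uses hypothetical stand-in types (`Semantics`, `IEEELike`, `DoubleLike`, `Storage`) rather than the real APFloat declarations. It relies on the common-initial-sequence rule for standard-layout structs in a union to read the semantics pointer through an inactive member.

```cpp
#include <type_traits>

struct Semantics {};

// Both alternatives start with the same member, forming a common initial sequence.
struct IEEELike {
  const Semantics *Sem;
  unsigned long long Significand;
};

struct DoubleLike {
  const Semantics *Sem;
  IEEELike *Floats; // raw pointer; ownership is handled manually in this sketch
};

// A raw pointer member keeps the struct standard layout. The standard makes
// no such promise for a std::unique_ptr member.
static_assert(std::is_standard_layout_v<IEEELike>);
static_assert(std::is_standard_layout_v<DoubleLike>);

union Storage {
  IEEELike IEEE;
  DoubleLike Double;
};

// Reads the shared leading member regardless of which alternative is active.
inline const Semantics *semanticsOf(const Storage &S) { return S.IEEE.Sem; }
```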
2025-02-25[X86][DAGCombiner] Skip x87 fp80 values in `combineFMulOrFDivWithIntPow2` (#128618)Yingwei Zheng
f80 is not a valid IEEE floating-point type. Closes https://github.com/llvm/llvm-project/issues/128528.
2025-01-14[APFloat][NFC]extract `fltSemantics::isRepresentableBy` to header (#122636)Congcong Cai
isRepresentableBy is useful for checking floating-point type compatibility.
2024-12-04[llvm][NFC] `APFloat`: Add missing semantics to enum (#117291)Matthias Springer
* Add missing semantics to the `Semantics` enum.
* Move all documentation of the semantics to the header file.
* Also rename some functions for consistency.
2024-11-16[llvm] `APFloat`: Add helpers to query NaN/inf semantics (#116315)Matthias Springer
`APFloat` changes extracted from #116176 as per reviewer comments.
2024-11-15[llvm] `APFloat`: Query `hasNanOrInf` from semantics (#116158)Matthias Springer
Whether a floating point type supports NaN or infinity can be queried from its semantics. No need to hard-code a list of types.
2024-11-12[llvm] Remove redundant control flow statements (NFC) (#115831)Kazu Hirata
Identified with readability-redundant-control-flow.
2024-10-22Fix bitcasting E8M0 APFloat to APInt (#113298)Sergey Kozub
Fixes a bug in APFloat handling of E8M0 type (zero mantissa).

Related PRs:
- https://github.com/llvm/llvm-project/pull/107127
- https://github.com/llvm/llvm-project/pull/111028
2024-10-09[APFloat] add predicates to fltSemantics for hasZero and hasSignedRepr (#111451)Ariel-Burton
We add static methods to APFloatBase to allow the hasZero and hasSignedRepr properties of fltSemantics to be obtained.
2024-10-09[ADT][APFloat] Make sure EBO is performed on APFloat (#111641)Yingwei Zheng
Since both APFloat and (Double)IEEEFloat inherit from APFloatBase, empty base optimization is not performed by GCC/Clang (Minimal reproducer: https://godbolt.org/z/dY8cM3Wre). This patch removes inheritance relation between (Double)IEEEFloat and APFloatBase to make sure EBO is performed on APFloat. After this patch, the size of `ConstantFPRange` will be reduced from 72 to 56. Address comment https://github.com/llvm/llvm-project/pull/111544#discussion_r1792398427.
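The size effect is easy to reproduce outside LLVM. The types below are hypothetical stand-ins, and the exact sizes are those produced by GCC and Clang on common 64-bit targets, which is an assumption of this sketch. When a class and its first member share the same empty base type, the two base subobjects need distinct addresses, so the member is pushed away from offset 0.

```cpp
struct Base {}; // empty, like a base class that only provides static helpers

struct Inner : Base {
  void *P;
};

// Base appears twice: once as a direct base and once inside the member.
struct WithSharedBase : Base {
  Inner I;
};

// Dropping the inheritance lets the member sit at offset 0.
struct WithoutSharedBase {
  Inner I;
};

static_assert(sizeof(Inner) == sizeof(void *), "EBO applies to Inner itself");
static_assert(sizeof(WithoutSharedBase) == sizeof(Inner));
static_assert(sizeof(WithSharedBase) > sizeof(Inner), "padding added to separate the two Base subobjects");
```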
2024-10-02[APFloat] Add APFloat support for E8M0 type (#107127)Durgadoss R
This patch adds an APFloat type for unsigned E8M0 format. This format is used for representing the "scale-format" in the MX specification: (section 5.4) https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf

This format does not support {Inf, denorms, zeroes}. Like FP32, this format's exponents are 8 bits (all bits here) and the bias value is 127. However, it differs from IEEE-FP32 in that the minExponent is -127 (instead of -126). There are updates done in the APFloat utility functions to handle these constraints for this format.

* The bias calculation is different and convertIEEE* APIs are updated to handle this.
* Since there are no significand bits, the isSignificandAll{Zeroes/Ones} methods are updated accordingly.
* Although the format does not have any precision, the precision bit in the fltSemantics is set to 1 for consistency with APFloat's internal representation.
* Many utility functions are updated to handle the fact that this format does not support Zero.
* Provide a separate initFromAPInt() implementation to handle the quirks of the format.
* Add specific tests to verify the range of values for this format.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
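The value encoded by an E8M0 byte follows directly from the bias of 127. The sketch below is a standalone decoder, not APFloat code, and `decodeE8M0` is a hypothetical name. Treating 0xFF as NaN follows my reading of the MX specification and is not stated in the commit message.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Decodes an E8M0 scale byte: value = 2^(Bits - 127).
// Assumption from the MX specification: the all-ones pattern 0xFF is NaN.
inline double decodeE8M0(std::uint8_t Bits) {
  if (Bits == 0xFF)
    return std::numeric_limits<double>::quiet_NaN();
  return std::ldexp(1.0, static_cast<int>(Bits) - 127);
}
```

The smallest encoding, 0x00, decodes to 2^-127 and not to zero, which is why the utility functions listed above have to cope with a format that has no zero.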
2024-09-25[LLVM][IR] Add constant range support for floating-point types (#86483)Yingwei Zheng
This patch adds basic constant range support for floating-point types to enable range-based optimizations.
2024-08-25Revert "Enable logf128 constant folding for hosts with 128bit long double (#104929)"NAKAMURA Takumi
ConstantFolding behaves differently depending on host's `HAS_IEE754_FLOAT128`. LLVM should not change the behavior depending on host configurations. This reverts commit 14c7e4a1844904f3db9b2dc93b722925a8c66b27. (llvmorg-20-init-3262-g14c7e4a18449 and llvmorg-20-init-3498-g001e423ac626)
2024-08-22Enable logf128 constant folding for hosts with 128bit long double (#104929)Matthew Devereau
This is a reland of (#96287). This patch attempts to reduce the reverted patch's clang compile time by removing #includes of float128.h and inlining convertToQuad functions instead.
2024-08-14Revert "Reland logf128 constant folding (#103217)"Nikita Popov
This reverts commit 3cab7c555ad6451f2b1b4dc918a4b4f4e4a3e45d. The modified test fails on ppc64le buildbots.
2024-08-14Reland logf128 constant folding (#103217)Matthew Devereau
This is a reland of #96287. This change makes tests in logf128.ll ignore the sign of NaNs for negative value tests and moves an #include <cmath> to be blocked behind #ifndef _GLIBCXX_MATH_H.
2024-08-09Revert "Enable logf128 constant folding for hosts with 128bit floats (#96287)"Nikita Popov
This reverts commit ccb2b011e577e861254f61df9c59494e9e122b38. Causes buildbot failures, e.g. on ppc64le builders.
2024-08-09Enable logf128 constant folding for hosts with 128bit floats (#96287)Matthew Devereau
Hosts which support a float size of 128 bits can benefit from constant fp128 folding.
2024-07-30[APFloat] Add support for f8E3M4 IEEE 754 type (#99698)Alexander Pivovarov
This PR adds `f8E3M4` type to APFloat. `f8E3M4` type follows IEEE 754 convention
```c
f8E3M4 (IEEE 754)
- Exponent bias: 3
- Maximum stored exponent value: 6 (binary 110)
- Maximum unbiased exponent value: 6 - 3 = 3
- Minimum stored exponent value: 1 (binary 001)
- Minimum unbiased exponent value: 1 − 3 = −2
- Precision specifies the total number of bits used for the significand (mantissa), including implicit leading integer bit = 4 + 1 = 5
- Follows IEEE 754 conventions for representation of special values
- Has Positive and Negative zero
- Has Positive and Negative infinity
- Has NaNs

Additional details:
- Max exp (unbiased): 3
- Min exp (unbiased): -2
- Infinities (+/-): S.111.0000
- Zeros (+/-): S.000.0000
- NaNs: S.111.{0,1}⁴ except S.111.0000
- Max normal number: S.110.1111 = +/-2^(6-3) x (1 + 15/16) = +/-2^3 x 31 x 2^(-4) = +/-15.5
- Min normal number: S.001.0000 = +/-2^(1-3) x (1 + 0) = +/-2^(-2)
- Max subnormal number: S.000.1111 = +/-2^(-2) x 15/16 = +/-2^(-2) x 15 x 2^(-4) = +/-15 x 2^(-6)
- Min subnormal number: S.000.0001 = +/-2^(-2) x 1/16 = +/-2^(-2) x 2^(-4) = +/-2^(-6)
```
Related PRs:
- [PR-97179](https://github.com/llvm/llvm-project/pull/97179) [APFloat] Add support for f8E4M3 IEEE 754 type
2024-07-17[APFloat] Add support for f8E4M3 IEEE 754 type (#97179)Alexander Pivovarov
This PR adds `f8E4M3` type to APFloat. `f8E4M3` type follows IEEE 754 convention
```c
f8E4M3 (IEEE 754)
- Exponent bias: 7
- Maximum stored exponent value: 14 (binary 1110)
- Maximum unbiased exponent value: 14 - 7 = 7
- Minimum stored exponent value: 1 (binary 0001)
- Minimum unbiased exponent value: 1 − 7 = −6
- Precision specifies the total number of bits used for the significand (mantissa), including implicit leading integer bit = 3 + 1 = 4
- Follows IEEE 754 conventions for representation of special values
- Has Positive and Negative zero
- Has Positive and Negative infinity
- Has NaNs

Additional details:
- Max exp (unbiased): 7
- Min exp (unbiased): -6
- Infinities (+/-): S.1111.000
- Zeros (+/-): S.0000.000
- NaNs: S.1111.{001, 010, 011, 100, 101, 110, 111}
- Max normal number: S.1110.111 = +/-2^(7) x (1 + 0.875) = +/-240
- Min normal number: S.0001.000 = +/-2^(-6)
- Max subnormal number: S.0000.111 = +/-2^(-6) x 0.875 = +/-2^(-9) x 7
- Min subnormal number: S.0000.001 = +/-2^(-6) x 0.125 = +/-2^(-9)
```
Related PRs:
- [PR-97118](https://github.com/llvm/llvm-project/pull/97118) Add f8E4M3 IEEE 754 type to mlir
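The encodings listed for the two IEEE-style 8-bit types can be cross-checked with a small decoder. This is a standalone sketch, not APFloat code. `decodeFloat8` is a hypothetical helper that assumes one sign bit, `ExpBits` exponent bits, `7 - ExpBits` mantissa bits, and the usual IEEE bias of 2^(ExpBits-1) - 1.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Decodes an 8-bit IEEE-754-style value; ExpBits = 4 gives f8E4M3 (bias 7),
// ExpBits = 3 gives f8E3M4 (bias 3).
inline double decodeFloat8(std::uint8_t Bits, int ExpBits) {
  assert(ExpBits >= 2 && ExpBits <= 6 && "sketch covers 2..6 exponent bits");
  const int ManBits = 7 - ExpBits;
  const int Bias = (1 << (ExpBits - 1)) - 1;
  const int ExpMask = (1 << ExpBits) - 1;
  const bool Negative = (Bits >> 7) != 0;
  const int Exp = (Bits >> ManBits) & ExpMask;
  const int Man = Bits & ((1 << ManBits) - 1);

  double Magnitude;
  if (Exp == ExpMask) {
    // All-ones exponent: infinity when the mantissa is zero, NaN otherwise.
    Magnitude = Man == 0 ? std::numeric_limits<double>::infinity()
                         : std::numeric_limits<double>::quiet_NaN();
  } else if (Exp == 0) {
    // Zero or subnormal: no implicit integer bit, exponent pinned at 1 - Bias.
    Magnitude = std::ldexp(static_cast<double>(Man), 1 - Bias - ManBits);
  } else {
    // Normal: implicit integer bit in front of the stored mantissa.
    Magnitude = std::ldexp(static_cast<double>((1 << ManBits) + Man),
                           Exp - Bias - ManBits);
  }
  return Negative ? -Magnitude : Magnitude;
}
```

With this decoder, S.1110.111 (0x77) under f8E4M3 gives 240 and S.110.1111 (0x6F) under f8E3M4 gives 15.5, matching the maximum normal values in the two commit messages.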
2024-07-04[NFC] [APFloat] Refactor IEEEFloat::toString (#97117)Ariel-Burton
This PR lifts the body of IEEEFloat::toString out to a standalone function. We do this to facilitate code sharing with other floating point types, e.g., the forthcoming support for HexFloat. There is no change in functionality.
2024-06-29Rename f8E4M3 to f8E4M3FN in mlir.extras.types py package (#97102)Alexander Pivovarov
Currently `f8E4M3` is mapped to `Float8E4M3FNType`. This PR renames `f8E4M3` to `f8E4M3FN` to accurately reflect the actual type. This PR is needed to avoid a name conflict in the upcoming PR which will add IEEE 754 `Float8E4M3Type`. https://github.com/llvm/llvm-project/pull/97118 Add f8E4M3 IEEE 754 type Maksim, can you review this PR? @makslevental ?
2024-06-14[APFloat] Add APFloat support for FP4 data type (#95392)Durgadoss R
This patch adds APFloat type support for the E2M1 FP4 datatype. The definitions for this format are detailed in section 5.3.3 of the OCP specification, which can be accessed here: https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2024-06-11[APFloat] Add APFloat support for FP6 data types (#94735)Durgadoss R
This patch adds APFloat type support for two FP6 data types, E2M3 and E3M2. The definitions for the two formats are detailed in section 5.3.2 of the OCP specification, which can be accessed here: https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2024-05-29Constant Fold logf128 calls (#90611)Matthew Devereau
This is a second attempt to land #84501, which failed on several targets. This patch adds the HAS_IEE754_FLOAT128 define, which makes the check for typedef'ing float128 more precise by checking whether __uint128_t is available and whether the host uses __ibm128, which is prevalent on PowerPC targets and replaces IEEE 754 float128s.
2024-05-14[APFloat] Replace partsCount array with single variable (NFC) (#91910)Nikita Popov
We only ever use the last element of this array, so there shouldn't be a need to store the preceding elements as well.
2024-05-04[Support] Use StringRef::operator== instead of StringRef::equals (NFC) (#91042)Kazu Hirata
I'm planning to remove StringRef::equals in favor of StringRef::operator==.

- StringRef::operator== outnumbers StringRef::equals by a factor of 25 under llvm/ in terms of their usage.
- The elimination of StringRef::equals brings StringRef closer to std::string_view, which has operator== but not equals.
- S == "foo" is more readable than S.equals("foo"), especially for !Long.Expression.equals("str") vs Long.Expression != "str".
2024-05-01Revert "Constant Fold logf128 calls"Matt Devereau
This reverts commit 088aa81a545421933254f19cd3c8914a0373b493.