path: root/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
Age | Commit message | Author
2025-11-19 | CodeGen: Add subtarget to TargetLoweringBase constructor (#168620) | Matt Arsenault
Currently LibcallLoweringInfo is defined inside of TargetLowering, which is owned by the subtarget. Pass in the subtarget so we can construct LibcallLoweringInfo with the subtarget. This is a temporary step that should be revertable in the future, after LibcallLoweringInfo is moved out of TargetLowering.
2025-11-12 | DAG: Move expandMultipleResultFPLibCall to TargetLowering (NFC) (#166988) | Matt Arsenault
This kind of helper is higher level and not general enough to go directly in SelectionDAG. Most similar utilities are in TargetLowering.
2025-11-07 | Add `llvm.vector.partial.reduce.fadd` intrinsic (#159776) | Damian Heaton
With this intrinsic, and supporting SelectionDAG nodes, we can better make use of instructions such as AArch64's `FDOT`.
2025-10-31 | [SDAG] Set InBounds when computing offsets into memory objects (#165425) | Fabian Ritter
When a load or store accesses N bytes starting from a pointer P, and we want to compute an offset pointer within these N bytes after P, we know that the arithmetic to add the offset must be inbounds. This is for example relevant when legalizing too-wide memory accesses, when lowering memcpy&Co., or when optimizing "vector-load -> extractelement" into an offset load. For SWDEV-516125.
2025-10-25 | [DAGCombine] Improve bswap lowering for machines that support bit rotates (#164848) | AZero13
Source: Hacker's Delight.
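The commit message does not spell out the instruction sequence, so the following is only a minimal sketch of one well-known rotate-based byte swap for 32-bit values. It is not necessarily the exact expansion the patch emits. The helper names (`rotl32`, `rotr32`, `bswap32_via_rotates`, `bswap32_reference`, `check_bswap`) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Rotate helpers; valid for 0 < n < 32.
constexpr uint32_t rotl32(uint32_t x, unsigned n) {
  return (x << n) | (x >> (32 - n));
}
constexpr uint32_t rotr32(uint32_t x, unsigned n) {
  return (x >> n) | (x << (32 - n));
}

// Byte swap built from two rotates, two masks, and one OR.
// rotl by 8 puts bytes 2 and 0 in their final positions,
// rotr by 8 puts bytes 1 and 3 in their final positions.
constexpr uint32_t bswap32_via_rotates(uint32_t x) {
  return (rotl32(x, 8) & 0x00FF00FFu) | (rotr32(x, 8) & 0xFF00FF00u);
}

// Reference implementation: move each byte individually.
constexpr uint32_t bswap32_reference(uint32_t x) {
  return (x << 24) | ((x << 8) & 0x00FF0000u) |
         ((x >> 8) & 0x0000FF00u) | (x >> 24);
}

// Small self-check comparing both forms on a few inputs.
inline void check_bswap() {
  const uint32_t samples[] = {0u, 1u, 0x01020304u, 0xAABBCCDDu, 0xFFFFFFFFu};
  for (uint32_t x : samples)
    assert(bswap32_via_rotates(x) == bswap32_reference(x));
}
```

The rotate form needs fewer operations than the shift-and-mask reference when the target has a native rotate instruction, which appears to be the motivation stated in the commit title.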
2025-10-13 | Wasm fmuladd relaxed (#163177) | Sam Parker
Reland #161355, after fixing up the cross-projects-tests for the wasm simd intrinsics. Original commit message: Lower v4f32 and v2f64 fmuladd calls to relaxed_madd instructions. If we have FP16, then lower v8f16 fmuladds to FMA. I've introduced an ISD node for fmuladd to maintain the rounding ambiguity through legalization / combine / isel.
2025-10-13 | Revert "[WebAssembly] Lower fmuladd to madd and nmadd" (#163171) | Sam Parker
Reverts llvm/llvm-project#161355. Looks like I've broken some intrinsic code generation.
2025-10-13 | [WebAssembly] Lower fmuladd to madd and nmadd (#161355) | Sam Parker
Lower v4f32 and v2f64 fmuladd calls to relaxed_madd instructions. If we have FP16, then lower v8f16 fmuladds to FMA. I've introduced an ISD node for fmuladd to maintain the rounding ambiguity through legalization / combine / isel.
2025-09-29 | [TargetLowering] Remove NoSignedZerosFPMath uses (#160975) | paperchalice
Remove the NoSignedZerosFPMath uses in TargetLowering; users should always use instruction-level fast-math flags.
2025-09-26 | [SelectionDAG] Improve v2f16 maximumnum expansion (#160723) | Lewis Crawford
On targets where f32 maximumnum is legal, but maximumnum on vectors of smaller types is not legal (e.g. v2f16), try unrolling the vector first as part of the expansion. Only fall back to expanding the full maximumnum computation into compares + selects if maximumnum on the scalar element type cannot be supported.
2025-09-25 | [TargetLowering][ExpandABD] Prefer selects over usubo if we do the same for ucmp (#159889) | AZero13
This is the same check we use for determining ucmp vs scmp. Using selects on platforms that prefer selects is better than using usubo. Rename the function to be more general, fitting this new description.
2025-09-19 | [KnownBits] Add setAllConflict to set all bits in Zero and One. NFC (#159815) | Craig Topper
This is a common pattern to initialize KnownBits that occurs before loops that call intersectWith.
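The pattern can be illustrated with a toy 8-bit model. This is a sketch only: `ToyKnownBits`, `constant`, and `commonBits` are hypothetical names, and the real `llvm::KnownBits` operates on `APInt` values of arbitrary width. The all-conflict state works as the identity element for intersection, so the loop needs no special case for its first iteration.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Toy model: a bit set in Zero is known to be 0, a bit set in One is known
// to be 1. A bit set in both is a conflict; a bit set in neither is unknown.
struct ToyKnownBits {
  uint8_t Zero = 0;
  uint8_t One = 0;

  // Fully known value: every bit is either known zero or known one.
  static ToyKnownBits constant(uint8_t v) {
    ToyKnownBits k;
    k.Zero = static_cast<uint8_t>(~v);
    k.One = v;
    return k;
  }

  // Mark every bit as both known zero and known one.
  void setAllConflict() {
    Zero = 0xFF;
    One = 0xFF;
  }

  // Keep only the facts that hold in both operands.
  ToyKnownBits intersectWith(const ToyKnownBits &o) const {
    ToyKnownBits r;
    r.Zero = static_cast<uint8_t>(Zero & o.Zero);
    r.One = static_cast<uint8_t>(One & o.One);
    return r;
  }
};

// Bits that are known and identical across all the given constants.
inline ToyKnownBits commonBits(std::initializer_list<uint8_t> values) {
  ToyKnownBits known;
  known.setAllConflict();  // identity for intersectWith
  for (uint8_t v : values)
    known = known.intersectWith(ToyKnownBits::constant(v));
  return known;
}
```

For the inputs 0x0C and 0x08, bit 3 stays known one, bit 2 becomes unknown, and all other bits stay known zero.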
2025-09-19 | [AMDGPU][SDAG] Handle ISD::PTRADD in various special cases (#145330) | Fabian Ritter
There are more places in SIISelLowering.cpp and AMDGPUISelDAGToDAG.cpp that check for ISD::ADD in a pointer context, but as far as I can tell those are only relevant for 32-bit pointer arithmetic (like frame indices/scratch addresses and LDS), for which we don't enable PTRADD generation yet. For SWDEV-516125.
2025-09-17 | [SelectionDAG] Deal with POISON for INSERT_VECTOR_ELT/INSERT_SUBVECTOR (#143102) | Björn Pettersson
As reported in https://github.com/llvm/llvm-project/issues/141034, SelectionDAG::getNode had some unexpected behaviors when trying to create vectors with UNDEF elements. Since we treat both UNDEF and POISON as undefined (when using isUndef()), we can't just fold away INSERT_VECTOR_ELT/INSERT_SUBVECTOR based on isUndef(), as that could make the resulting vector more poisonous. The same kind of bug existed in DAGCombiner::visitINSERT_SUBVECTOR. Here are some examples:
This fold was done even if vec[idx] was POISON:
  INSERT_VECTOR_ELT vec, UNDEF, idx -> vec
This fold was done even if any of vec[idx..idx+size] was POISON:
  INSERT_SUBVECTOR vec, UNDEF, idx -> vec
This fold was done even if the elements not extracted from vec could be POISON:
  sub = EXTRACT_SUBVECTOR vec, idx
  INSERT_SUBVECTOR UNDEF, sub, idx -> vec
With this patch we avoid such folds unless we can prove that the result isn't more poisonous when eliminating the insert. Fixes https://github.com/llvm/llvm-project/issues/141034
2025-09-05 | [SelectionDAG] Clean up SCALAR_TO_VECTOR handling in SimplifyDemandedVectorElts (#157027) | Björn Pettersson
This patch reverts changes from commit 585e65d3307f5f0 (https://reviews.llvm.org/D104250), as they no longer seem to be needed. The removed code made a recursive call to SimplifyDemandedVectorElts to try to simplify the vector %vec when finding things like (SCALAR_TO_VECTOR (EXTRACT_VECTOR_ELT %vec, 0)). I figure that (EXTRACT_VECTOR_ELT %vec, 0) would be simplified based on only demanding element zero, regardless of whether it is used in a SCALAR_TO_VECTOR operation. It would have been different if the code had tried to simplify the whole expression as %vec; that could also have motivated making element zero a special case. But it only simplified %vec without folding away the SCALAR_TO_VECTOR.
2025-08-31 | [SelectionDAG] Return std::optional<unsigned> from getValidShiftAmount and friends. NFC (#156224) | Craig Topper
Return std::optional<unsigned> instead of std::optional<uint64_t>. Shift amounts must be less than or equal to our maximum supported bit width, which fits in unsigned. Most of the callers already assumed it fit in unsigned.
2025-08-31 | [TargetLowering] Only freeze LHS and RHS if they are used multiple times in expandABD (#156193) | AZero13
Not all paths in expandABD are using LHS and RHS twice.
2025-08-28 | [ValueTracking][SelectionDAG] Use KnownBits::reverseBits/byteSwap. NFC (#155847) | Craig Topper
2025-08-28 | [KnownBits] Add operator<<=(unsigned) and operator>>=(unsigned). NFC (#155751) | Craig Topper
Add operators to shift left or right and insert unknown bits.
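As a rough illustration of what "insert unknown bits" means, the sketch below uses a toy 8-bit model. `ToyKnownBits8` and `unknownMask` are hypothetical names, not the real `llvm::KnownBits` API, and the shift amount is assumed to be less than 8. Shifting both masks by the same amount leaves the vacated positions clear in both, and a bit that is clear in both masks is by definition unknown.

```cpp
#include <cassert>
#include <cstdint>

// Toy model: a bit set in Zero is known 0, a bit set in One is known 1,
// and a bit set in neither is unknown.
struct ToyKnownBits8 {
  uint8_t Zero = 0;
  uint8_t One = 0;

  // Shift left; the vacated low bits end up unknown.
  ToyKnownBits8 &operator<<=(unsigned n) {
    Zero = static_cast<uint8_t>(Zero << n);
    One = static_cast<uint8_t>(One << n);
    return *this;
  }

  // Shift right; the vacated high bits end up unknown.
  ToyKnownBits8 &operator>>=(unsigned n) {
    Zero = static_cast<uint8_t>(Zero >> n);
    One = static_cast<uint8_t>(One >> n);
    return *this;
  }

  // Bits about which nothing is known.
  uint8_t unknownMask() const { return static_cast<uint8_t>(~(Zero | One)); }
};
```

Starting from the fully known value 0x0F (Zero = 0xF0, One = 0x0F), a left shift by 4 gives a high nibble of known ones and a low nibble of unknown bits.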
2025-08-18 | [CodeGen][Mips] Remove fp128 libcall list (#153798) | Nikita Popov
Mips requires fp128 args/returns to be passed differently than i128. It handles this by inspecting the pre-legalization type. However, for soft float libcalls, the original type is currently not provided (it will look like an i128 call). To work around that, MIPS maintains a list of libcalls working on fp128. This patch removes that list by providing the original, pre-softening type to calling convention lowering. This is done by carrying additional information in CallLoweringInfo, as we unfortunately do need both types (we want the un-softened type for OrigTy, but we need the softened type for the actual register assignment etc.). This is in preparation for completely removing all the custom pre-analysis code in the Mips backend and replacing it with use of OrigTy.
2025-08-15 | [CodeGen] Give ArgListEntry a proper constructor (NFC) (#153817) | Nikita Popov
This ensures that the required fields are set, and also makes the construction more convenient.
2025-08-14 | [CodeGen] Remove unnecessary setTypeListBeforeSoften() parameter (NFC) | Nikita Popov
It does not make sense to set the softening type list without setting IsSoften=true.
2025-08-10 | [DAGCombine] Correctly extend the constant RHS in `TargetLowering::SimplifySetCC` (#152862) | Yingwei Zheng
In https://github.com/llvm/llvm-project/pull/150270, when the predicate is eq/ne and the trunc has only an nsw flag, the RHS is incorrectly zero-extended. Closes https://github.com/llvm/llvm-project/issues/152630.
2025-08-05 | [DAGCombiner] Fold setcc of trunc, generalizing some NVPTX isel logic (#150270) | Alex MacLean
This change adds support for folding a SETCC when one or both of the operands are a TRUNCATE with the appropriate no-wrap flags. This pattern can occur when promoting i8 operations in NVPTX, and we currently have some ISel rules to try to handle it.
2025-08-05 | [DAG] visitFREEZE - replace multiple frozen/unfrozen uses of an SDValue with just the frozen node (#150017) | Simon Pilgrim
Similar to InstCombinerImpl::freezeOtherUses, attempt to ensure that we merge multiple frozen/unfrozen uses of an SDValue. This fixes a number of hasOneUse() problems when trying to push FREEZE nodes through the DAG. Remove SimplifyMultipleUseDemandedBits handling of FREEZE nodes as we now want to keep the common node, and not bypass for some nodes just because of DemandedElts. Fixes #149799
2025-08-04 | [TargetLowering][RISCV] Use sra for (X & -256) == 256 -> (X >> 8) == 1 if it yields a better icmp constant. (#151762) | Craig Topper
If using srl does not produce a legal constant for the RHS of the final compare, try to use sra instead. Because the AND constant is negative, the sign bits participate in the compare. Using an arithmetic shift right duplicates that bit.
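The equivalence behind this fold can be checked with a small sketch. The helper names are hypothetical, the shift amount is fixed at 8 to match the example in the title, and the compare constant is assumed to have its low 8 bits clear. With a constant of -256, the logical-shift form compares against 0x00FFFFFF while the arithmetic-shift form compares against -1; which of those is a legal immediate is target-specific and not modelled here.

```cpp
#include <cassert>
#include <cstdint>

// Portable arithmetic shift right: negative inputs go through their
// complement so that only non-negative values are ever shifted.
constexpr int32_t sra32(int32_t x, unsigned n) {
  return x < 0 ? ~(~x >> n) : (x >> n);
}

// Logical shift right on the same bit pattern.
constexpr uint32_t srl32(int32_t x, unsigned n) {
  return static_cast<uint32_t>(x) >> n;
}

// Original form: mask off the low 8 bits, then compare.
constexpr bool masked_eq(int32_t x, int32_t c) { return (x & -256) == c; }

// Shifted forms. Both assume the low 8 bits of c are zero.
constexpr bool srl_eq(int32_t x, int32_t c) {
  return srl32(x, 8) == srl32(c, 8);
}
constexpr bool sra_eq(int32_t x, int32_t c) {
  return sra32(x, 8) == sra32(c, 8);
}

// Check that all three forms agree on a handful of sample inputs.
inline bool forms_agree(int32_t c) {
  const int32_t samples[] = {0,  255,  256,  511,       512,      -1,
                             -255, -256, -257, INT32_MIN, INT32_MAX};
  for (int32_t x : samples)
    if (masked_eq(x, c) != srl_eq(x, c) || masked_eq(x, c) != sra_eq(x, c))
      return false;
  return true;
}
```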
2025-08-02 | [TargetLowering] Use getShiftAmountConstant in buildSDIVPow2WithCMov. | Craig Topper
2025-07-29 | [TargetLowering] Use getShiftAmountConstant in CTTZTableLookup. NFC | Craig Topper
2025-07-22 | [DAG] expandVECTOR_COMPRESS - remove superfluous getFreeze. NFC. (#150062) | Simon Pilgrim
freeze(freeze(extract_vector_elt(x,i))) -> freeze(extract_vector_elt(x,i))
2025-07-22 | [SelectionDAG] Pass SDNodeFlags through getNode instead of setFlags. (#149852) | Craig Topper
getNode updates flags correctly for CSE. Calling setFlags after getNode may set the flags where they don't apply. I've added a Flags argument to getSelectCC and the signature of getNode that takes an ArrayRef of EVTs.
2025-07-22 | [DAG] isNonZeroModBitWidthOrUndef - fix bugprone-argument-comment analyzer warning. NFC. | Simon Pilgrim
The matchUnaryPredicate argument is AllowUndefs, not AllowUndef.
2025-07-20 | [DAG] Add missing Depth argument to isGuaranteedNotToBeUndefOrPoison calls inside SimplifyDemanded methods (#149550) | Simon Pilgrim
Ensure we don't exceed the maximum recursion depth.
2025-07-11 | [NFC] Correct typo: invertion -> inversion (#147995) | Fraser Cormack
2025-07-10 | [TargetLowering] Change getOptimalMemOpType and findOptimalMemOpLowering to take LLVM Context (#147664) | Boyao Wang
Add LLVM Context to getOptimalMemOpType and findOptimalMemOpLowering so that we can use EVT::getVectorVT to generate an EVT in getOptimalMemOpType. Related to [#146673](https://github.com/llvm/llvm-project/pull/146673).
2025-07-09 | RuntimeLibcalls: Remove table of soft float compare cond codes (#146082) | Matt Arsenault
Previously we had a table of entries for every Libcall for the comparison to use against an integer 0 if it was a soft float compare function. This was only relevant to a handful of opcodes, so it was wasteful. Now that we can distinguish the abstract libcall for the compare with the concrete implementation, we can just directly hardcode the comparison against the libcall impl without this configuration system.
2025-07-07 | DAG: Remove verifyReturnAddressArgumentIsConstant (#147240) | Matt Arsenault
The intrinsic argument is already marked with immarg so non-constant values are rejected by the IR verifier.
2025-07-07 | [TargetLowering] hasAndNotCompare should be checking for X, not Y (#146935) | AZero13
Y is the operand being bitwise-negated, so it should not be the one passed to hasAndNotCompare; X should be passed instead.
2025-06-27 | [TargetLowering] Fold (a | b) ==/!= b -> (a & ~b) ==/!= 0 when and-not exists (#145368) | AZero13
This is especially helpful for AArch64, which simplifies ands + cmp to tst. Alive2: https://alive2.llvm.org/ce/z/LLgcJJ Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
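The identity behind this fold says that `(a | b) == b` holds exactly when `a` has no bit set outside of `b`, which is the same as `(a & ~b) == 0`. A minimal sketch that checks it exhaustively for 8-bit values follows; the function names are hypothetical, and the Alive2 link above remains the authoritative proof for the general case.

```cpp
#include <cassert>
#include <cstdint>

// Original form: OR, then compare against b.
constexpr bool or_form(uint8_t a, uint8_t b) { return (a | b) == b; }

// Folded form: and-not, then compare against zero.
constexpr bool andn_form(uint8_t a, uint8_t b) { return (a & ~b) == 0; }

// Exhaustive check over all 65536 pairs of 8-bit inputs.
inline bool forms_agree_exhaustively() {
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b)
      if (or_form(static_cast<uint8_t>(a), static_cast<uint8_t>(b)) !=
          andn_form(static_cast<uint8_t>(a), static_cast<uint8_t>(b)))
        return false;
  return true;
}
```

The `!=` variant follows by negating both sides.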
2025-06-27 | DAG: Check libcall function is supported before emission (#144314) | Matt Arsenault
2025-06-25 | [SelectionDAG] Fold undemanded operand to UNDEF for VECTOR_SHUFFLE (#145524) | Björn Pettersson
Always let SimplifyDemandedVectorElts fold either side of a VECTOR_SHUFFLE to UNDEF if no elements are demanded from that side. For a single use this could be done by SimplifyDemandedVectorElts already, but in case the operand had multiple uses we did not eliminate the use.
2025-06-22 | [SelectionDAG] Handle `fneg`/`fabs`/`fcopysign` in `SimplifyDemandedBits` (#139239) | Iris Shi
2025-06-20 | [LLVM][CodeGen][SVE] Add isel for bfloat unordered reductions. (#143540) | Paul Walker
The omissions are VECREDUCE_SEQ_* and MUL. The former goes down a different code path and the latter is unsupported across all element types.
2025-06-17 | DAG: Move soft float predicate management into RuntimeLibcalls (#142905) | Matt Arsenault
Work towards making RuntimeLibcalls the centralized location for all libcall information. This requires changing the encoding from tracking the ISD::CondCode to using CmpInst::Predicate.
2025-06-10 | DAG: Assert fcmp uno runtime calls are boolean values (#142898) | Matt Arsenault
This saves 2 instructions in the ARM soft float case for fcmp ueq. This code is written in a confusingly overly general way. The point of getCmpLibcallCC is to express that the compiler-rt implementations of the FP compares are different aliases around functions which may return -1 in some cases. This does not apply to the call for unordered, which returns a normal boolean. Also stop overriding the default value for the unordered compare for ARM. This was setting it to the same value as the default, which is now assumed.
2025-06-09 | [SDAG] Add partial_reduce_sumla node (#141267) | Philip Reames
We have recently added the partial_reduce_smla and partial_reduce_umla nodes to represent Acc += ext(a) * ext(b) where the two extends have to have the same source type, and have the same extend kind. For riscv64 w/zvqdotq, we have the vqdot and vqdotu instructions which correspond to the existing nodes, but we also have vqdotsu which represents the case where the two extends are sign and zero respectively (i.e. not the same type of extend). This patch adds a partial_reduce_sumla node which has sign extension for A, and zero extension for B. The addition is somewhat mechanical.
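A scalar sketch of the mixed-extension accumulate may help. It assumes 16 input elements reduced into 4 accumulator lanes, with lane i taking elements 4i to 4i+3. That grouping is an assumption made for illustration, since the commit message does not say which input elements feed which lane. The function name and the fixed vector sizes are likewise hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Acc += sext(a) * zext(b), reducing 16 products into 4 accumulator lanes.
// Assumed grouping: lane i accumulates input elements 4*i .. 4*i+3.
inline std::array<int32_t, 4>
partial_reduce_sumla(std::array<int32_t, 4> acc,
                     const std::array<int8_t, 16> &a,
                     const std::array<uint8_t, 16> &b) {
  for (std::size_t i = 0; i < a.size(); ++i) {
    int32_t sa = static_cast<int32_t>(a[i]);  // sign extension of A
    int32_t zb = static_cast<int32_t>(b[i]);  // zero extension of B
    acc[i / 4] += sa * zb;
  }
  return acc;
}
```

With every element of A set to -1 and every element of B set to 255, each product is -255 and each lane accumulates four of them. If both operands were sign-extended instead, 255 would be read as -1 and each product would be +1, which is why the extension kinds have to be tracked separately.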
2025-06-04 | Revert "[SDAG] Fix fmaximum legalization errors (#142170)" | Nikita Popov
This reverts commit 58cc1675ec7b4aa5bc2dab56180cb7af1b23ade5. I also made the incorrect assumption that we know both values are +/-0.0 here as well. Revert for now.
2025-06-04 | Revert "[SelectionDAG] Avoid one comparison when legalizing fmaximum (#142732)" | Nikita Popov
This reverts commit 54da543a14da6dd0e594875241494949cb659b08. I made a logic error here with the assumption that both values are known to be +/-0.0.
2025-06-04 | [SelectionDAG] Avoid one comparison when legalizing fmaximum (#142732) | Nikita Popov
When ordering signed zero, only check the sign of one of the values. We already know at this point that both values must be +/-0.0, so it is sufficient to check one of them to correctly order them. For example, for fmaximum, if we know LHS is `+0.0` then we can always select LHS, value of RHS does not matter. If LHS is `-0.0` we can always select RHS, value of RHS doesn't matter.
2025-06-04 | expandFMINIMUMNUM_FMAXIMUMNUM: Quiet is not needed for NaN vs NaN (#139237) | YunQiang Su
The new LangRef doesn't require quieting for NaN vs NaN, i.e. the result may be sNaN for sNaN vs NaN. See: https://github.com/llvm/llvm-project/pull/139228
2025-06-02 | [SDAG] Fix fmaximum legalization errors (#142170) | Nikita Popov
FMAXIMUM is currently legalized via IS_FPCLASS for the signed zero handling. This is problematic, because it assumes the equivalent integer type is legal. Many targets have legal fp128, but illegal i128, so this results in legalization failures. Fix this by replacing IS_FPCLASS with checking the bitcast to integer instead. In that case it is sufficient to use any legal integer type, as we're just interested in the sign bit. This can be obtained via a stack temporary cast. There is existing FloatSignAsInt functionality used for legalization of FABS and similar we can use for this purpose. Fixes https://github.com/llvm/llvm-project/issues/139380. Fixes https://github.com/llvm/llvm-project/issues/139381. Fixes https://github.com/llvm/llvm-project/issues/140445.
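The idea of reading the sign from the integer bit pattern can be sketched in scalar C++. This is not the SelectionDAG expansion, and the log above shows this commit was reverted on 2025-06-04. The sketch assumes `float` is IEEE-754 binary32, and the names `sign_bit_set` and `fmaximum_sketch` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Read the sign bit from the integer bit pattern (assumes binary32 float).
inline bool sign_bit_set(float v) {
  static_assert(sizeof(float) == sizeof(uint32_t), "assumes 32-bit float");
  uint32_t bits;
  std::memcpy(&bits, &v, sizeof bits);  // well-defined bit cast
  return (bits >> 31) != 0;
}

// Maximum that propagates NaN and orders -0.0 below +0.0.
inline float fmaximum_sketch(float a, float b) {
  if (std::isnan(a) || std::isnan(b))
    return std::numeric_limits<float>::quiet_NaN();
  if (a == b)                         // equal values, including +0.0 vs -0.0
    return sign_bit_set(a) ? b : a;   // prefer the operand without a sign bit
  return a > b ? a : b;
}
```

An ordinary comparison cannot distinguish the two zeros because `0.0f == -0.0f` is true, which is why the sign has to come from the bit pattern. In the sketch the single sign check is safe only because it sits behind the `a == b` guard.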