path: root/llvm/lib/CodeGen
Age  Commit message  Author
2025-11-23Revert "[RegAlloc] Fix the terminal rule check for interfere with DstReg ↵Aiden Grossman
(#168661)" This reverts commit 0859ac5866a0228f5607dd329f83f4a9622dedcc. This caused a couple test failures, likely due to a mid-air collision. Reverting for now to get the tree back to green and allow the original author to run UTC/friends and verify the output.
2025-11-23[RegAlloc] Fix the terminal rule check for interfere with DstReg (#168661)hstk30-hw
This may be a bug introduced by commit 6749ae36b4a33769e7a77cf812d7cd0a908ae3b9 that has been present ever since. In this case, `OtherReg` always overlaps with `DstReg` because both come from the `Copy`.
2025-11-22[llvm] Use llvm::equal (NFC) (#169173)Kazu Hirata
While I am at it, this patch uses const l-value references for std::shared_ptr. We don't need to increment the reference count by passing std::shared_ptr by value. Identified with llvm-use-ranges.
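A minimal standalone sketch of the two points in this message, using only the standard library. `rangesEqual` is a hypothetical stand-in for a range-based helper such as `llvm::equal`, and the two `useCount*` functions are illustrative names, not code from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <memory>
#include <vector>

// Hypothetical stand-in for a range-based equality helper: compares two
// whole ranges instead of spelling out begin()/end() pairs at each call.
template <typename R1, typename R2>
bool rangesEqual(const R1 &A, const R2 &B) {
  return std::equal(std::begin(A), std::end(A), std::begin(B), std::end(B));
}

// By value: the parameter is a copy, so the reference count is incremented
// for the duration of the call.
long useCountByValue(std::shared_ptr<int> P) { return P.use_count(); }

// By const l-value reference: no copy is made and the count is unchanged.
long useCountByRef(const std::shared_ptr<int> &P) { return P.use_count(); }
```

With a single owner, `useCountByValue` observes a count of 2 while `useCountByRef` observes 1, which is the reference-count traffic the patch avoids.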
2025-11-22[CallBrPrepare] Prefer Function &F over Function &FnAiden Grossman
Function &F is the more standard abbreviation (~4000 uses in llvm versus ~300 uses).
2025-11-22[DAGCombiner] Don't optimize insert_vector_elt into shuffle if implicit ↵Hongyu Chen
truncation exists (#169022) Fixes #169017
2025-11-20TargetLowering: Avoid hardcoding OpenBSD + __guard_local name (#167744)Matt Arsenault
Query RuntimeLibcalls for the support and the name. The check that the implementation is exactly __guard_local instead of unsupported feels a bit strange.
2025-11-20[DAGCombiner] Remove unneeded m_BitReverse from visitBITREVERSE. NFC (#168918)Craig Topper
We already know we're looking at BITREVERSE, we can match on the source operand.
2025-11-20Reapply "DAG: Allow select ptr combine for non-0 address spaces" (#168292) ↵Matt Arsenault
(#168786) This reverts commit 6d5f87fc4284c4c22512778afaf7f2ba9326ba7b. Previously this failed due to treating the unknown MachineMemOperand value as known uniform.
2025-11-20[SDAG] Fix whitespace errors (NFC) (#168897)Ramkumar Ramachandra
To make life easier for future contributors. Note that formatting changes are due to git clang-format on the touched whitespace-error lines.
2025-11-20[DebugInfo] Force early line-zero calls to have meaningful locations (#156850)Jeremy Morse
In functions that have been seriously deformed during optimisation, there can be call instructions with line-zero immediately after frame setup (see C reproducer in the test added). Our previous algorithms for prologue_end ignored these, meaning someone entering a function at prologue_end would break-in after a function call had completed. Prefer instead to place prologue_end and the function scope-line on the line zero call: this isn't false (it's the first meaningful instruction of the function) and is approximately true. Given a less than ideal function, this is an OK solution.
2025-11-19[CFIInserter] Turn a reachable llvm_unreachable into a report_fatal_error. ↵Craig Topper
(#168777) This prevents it from being optimized out in non-asserts builds. Update X86 test to remove REQUIRES: asserts and check for LLVM ERROR. Add FileCheck to RISC-V test and remove UNSUPPORTED. This is the more complete fix for #168772 and #168525.
2025-11-20DAG: Fix constructing a temporary TargetTransformInfo instance (#168480)Matt Arsenault
2025-11-20RenameIndependentSubregs: try to only implicit def used subregs (#167486)Carl Ritson
Attempt to only define used subregisters when creating IMPLICIT_DEF fix ups for live interval subranges. This avoids the appearance at the MIR level of entire (wide) registers becoming live rather than relying only on transient LiveIntervals dead definitions for unused subregisters.
2025-11-19DAG: Use poison for some vector result widening (#168290)Matt Arsenault
2025-11-19CodeGen: Add subtarget to TargetLoweringBase constructor (#168620)Matt Arsenault
Currently LibcallLoweringInfo is defined inside of TargetLowering, which is owned by the subtarget. Pass in the subtarget so we can construct LibcallLoweringInfo with the subtarget. This is a temporary step that should be revertable in the future, after LibcallLoweringInfo is moved out of TargetLowering.
2025-11-19DAG: Use poison when splitting vector_shuffle results (#168176)Matt Arsenault
2025-11-19[AArch64][GlobalISel] Check unmergeSrc is a vector in ↵Ryan Cowan
matchCombineBuildUnmerge (#168692) This aims to fix the crash in #168495: my combine rule was missing a check that the source was in fact a vector. This caused the legality check to fail in this example, as the concat was trying to concatenate a non-vector. I have also gated the bitcast of the concat to only work on non-scalable vectors, as the mutation calls `getNumElements`, which crashes when called on a scalable vector. Fixes #168495
2025-11-19[DAG] Update canCreateUndefOrPoison to handle ISD::VECTOR_COMPRESS (#168010)陈子昂
Fixes #167710
2025-11-18Introduce DwarfUnit::addBlock helper method (#168446)Tom Tromey
This patch is just a small cleanup that unifies the various spots that add a DWARF expression to the output.
2025-11-18[GISel] Use getScalarSizeInBits in LegalizerHelper::lowerBitCount (#168584)Craig Topper
For vectors, CTLZ, CTTZ, CTPOP all operate on individual elements. The lowering should be based on the element width. I noticed this by inspection. No tests in tree are currently affected, but I thought it would be good to fix so someone doesn't have to debug it in the future.
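The following is a rough standalone sketch of why the element width matters, not the `LegalizerHelper` code. It assumes a classic bit-trick population count whose masks and final shift depend on the width of the value; `popcountBits` and `ctpopPerElement` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bit-trick population count for a value that is `Bits` wide.
// The masks and the final shift are all derived from `Bits`.
uint64_t popcountBits(uint64_t V, unsigned Bits) {
  assert(Bits == 8 || Bits == 16 || Bits == 32 || Bits == 64);
  const uint64_t Mask = Bits == 64 ? ~0ULL : ((1ULL << Bits) - 1);
  V &= Mask;
  V = V - ((V >> 1) & (0x5555555555555555ULL & Mask));
  V = (V & (0x3333333333333333ULL & Mask)) +
      ((V >> 2) & (0x3333333333333333ULL & Mask));
  V = (V + (V >> 4)) & (0x0F0F0F0F0F0F0F0FULL & Mask);
  // Sum the per-byte counts into the top byte of the Bits-wide product.
  V = (V * (0x0101010101010101ULL & Mask)) & Mask;
  return V >> (Bits - 8);
}

// Per-element count for a vector of 8-bit lanes: the width passed down is
// the scalar (element) width, 8, not the total vector width N * 8.
template <std::size_t N>
std::array<uint64_t, N> ctpopPerElement(const std::array<uint8_t, N> &Vec) {
  std::array<uint64_t, N> Out{};
  for (std::size_t I = 0; I != N; ++I)
    Out[I] = popcountBits(Vec[I], 8);
  return Out;
}
```

If the total vector width were used instead, the masks and the final shift would be computed for the wrong size and the per-lane results would be incorrect.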
2025-11-18[RISCV] Legalize misaligned unmasked vp.load/vp.store to vle8/vse8. (#167745)Craig Topper
If vector-unaligned-mem support is not enabled, we should not generate loads/stores that are not aligned to their element size. We already do this for non-VP vector loads/stores. This code has been in our downstream for about a year and a half after finding the vectorizer generating misaligned loads/stores. I don't think that is unique to our downstream. Doing this for masked vp.load/store requires widening the mask as well which is harder to do. NOTE: Because we have to scale the VL, this will introduce additional vsetvli and the VL optimizer will not be effective at optimizing any arithmetic that is consumed by the store.
2025-11-18[GISel][RISCV] Compute CTPOP of small odd-sized integer correctly (#168559)Hongyu Chen
Fixes the assertion in #168523. This patch widens the small, odd-sized integer to 8 bits, ensuring that the following lowering code behaves correctly.
2025-11-18[AArch64][GISel] Don't crash in known-bits when copying from vectors to ↵Nathan Corbyn
non-vectors (#168081) Updates the demanded elements before recursing through copies in case the type of the source register changes from a non-vector register to a vector register. Fixes #167842.
2025-11-18[CGP]: Optimize mul.overflow. (#148343)Hassnaa Hamdi
- Detect cases where the LHS and RHS values cannot cause overflow (when the high halves are zero).
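A small sketch of the idea for the unsigned 64-bit case, written as plain C++ rather than as the CodeGenPrepare transform. `umulOverflow` is a hypothetical name, and the division-based slow path is just one possible fallback, assumed here for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Returns {low 64 bits of A * B, overflow flag} for unsigned operands.
std::pair<uint64_t, bool> umulOverflow(uint64_t A, uint64_t B) {
  // Fast path: if both high 32-bit halves are zero, each operand is below
  // 2^32, so the product is below 2^64 and cannot overflow.
  if ((A >> 32) == 0 && (B >> 32) == 0)
    return {A * B, false};
  // Slow path (illustrative): general check using a division.
  const bool Overflow = A != 0 && B > UINT64_MAX / A;
  return {A * B, Overflow};
}
```

The fast path needs only two shifts, two compares, and an ordinary multiply, which is what makes the check worthwhile when small operands are common.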
2025-11-18[AArch64][GlobalISel] Add better basic legalization for llround. (#168427)David Green
This adds handling for f16 and f128 lround/llround under LP64 targets, promoting the f16 where needed and using a libcall for f128. This codegen is now identical to the selection dag version.
2025-11-18[DAGCombiner] Fold select into partial.reduce.add operands. (#167857)Sander de Smalen
This generates more optimal codegen when using partial reductions with predication.
```
partial_reduce_*mla(acc, sel(p, mul(*ext(a), *ext(b)), splat(0)), splat(1))
  -> partial_reduce_*mla(acc, sel(p, a, splat(0)), b)
partial.reduce.*mla(acc, sel(p, *ext(op), splat(0)), splat(1))
  -> partial.reduce.*mla(acc, sel(p, op, splat(0)), splat(trunc(1)))
```
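The first rewrite can be checked with a simplified scalar model. The sketch below assumes the reduction is a full sum of element-wise products and ignores the extends and the partial (narrowing) nature of the real node; all names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t N = 4;
using Vec = std::array<int32_t, N>;
using Mask = std::array<bool, N>;

// Scalar model of a multiply-accumulate reduction: Acc + sum(X[i] * Y[i]).
int32_t reduceMla(int32_t Acc, const Vec &X, const Vec &Y) {
  for (std::size_t I = 0; I != N; ++I)
    Acc += X[I] * Y[I];
  return Acc;
}

// Before the fold: the select wraps the product, multiplied by splat(1).
int32_t beforeFold(int32_t Acc, const Mask &P, const Vec &A, const Vec &B) {
  Vec Sel{};
  Vec Ones;
  Ones.fill(1);
  for (std::size_t I = 0; I != N; ++I)
    Sel[I] = P[I] ? A[I] * B[I] : 0;
  return reduceMla(Acc, Sel, Ones);
}

// After the fold: the select wraps one operand and B is used directly.
int32_t afterFold(int32_t Acc, const Mask &P, const Vec &A, const Vec &B) {
  Vec Sel{};
  for (std::size_t I = 0; I != N; ++I)
    Sel[I] = P[I] ? A[I] : 0;
  return reduceMla(Acc, Sel, B);
}
```

Masked-off lanes contribute `0 * B[i] = 0` after the fold, the same as the `0 * 1 = 0` they contributed before it, so the two forms agree.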
2025-11-17[MLGO] Fully Remove MLRegalloc Experimental Features (#168252)Aiden Grossman
20a22a45e96bc94c3a8295cccc9031bd87552725 was supposed to fully remove these, but left around the functionality to actually compute them and a unittest that ensured they worked. These are not development features in the sense of features used in development mode, but experimental features that have been superseded by MIR2Vec.
2025-11-17[AArch64][GlobalISel] Add combine for build_vector(unmerge, unmerge, undef, ↵Ryan Cowan
undef) (#165539) This PR adds a new combine to the `post-legalizer-combiner` pass. The new combine checks for vectors being unmerged and subsequently padded with `G_IMPLICIT_DEF` values by building a new vector. If such a case is found, the vector being unmerged is instead just concatenated with a `G_IMPLICIT_DEF` that is as wide as the vector being unmerged. This removes unnecessary `mov` instructions in a few places.
2025-11-17[DAG] Add strictfp implicit def reg after metadata. (#168282)David Green
This prevents a machine verifier error, where it "Expected implicit register after groups". Fixes #158661
2025-11-17[MachinePipeliner] Detect a cycle in PHI dependencies early on (#167095)Abinaya Saravanan
- This patch detects cycles formed by PHIs and bails out if one is found.
- This prevents violating the DAG restrictions.

Abort pipelining in the case below:

  %1 = phi i32 [ %a, %entry ], [ %3, %loop ]
  %2 = phi i32 [ %a, %entry ], [ %1, %loop ]
  %3 = phi i32 [ %b, %entry ], [ %2, %loop ]

---------
Co-authored-by: Ryotaro Kasuga <kasuga.ryotaro@fujitsu.com>
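One way to detect such a cycle is a depth-first search over the PHI-to-PHI dependencies. The sketch below is a generic illustration, not the MachinePipeliner implementation; `hasPhiCycle` and the adjacency-list encoding are assumptions.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Deps[I] lists the PHIs whose values PHI I reads along the loop back-edge.
// Returns true if following those edges ever leads back to a PHI that is
// still being visited, i.e. the PHIs form a cycle.
bool hasPhiCycle(const std::vector<std::vector<int>> &Deps) {
  enum State { Unvisited, InProgress, Done };
  std::vector<State> St(Deps.size(), Unvisited);
  std::function<bool(int)> Visit = [&](int Node) {
    if (St[Node] == InProgress)
      return true; // Back edge: cycle found.
    if (St[Node] == Done)
      return false;
    St[Node] = InProgress;
    for (int D : Deps[Node])
      if (Visit(D))
        return true;
    St[Node] = Done;
    return false;
  };
  for (int I = 0, E = static_cast<int>(Deps.size()); I != E; ++I)
    if (Visit(I))
      return true;
  return false;
}
```

For the example in the message, numbering the PHIs 0, 1, 2 gives the edges 0 -> 2, 1 -> 0, 2 -> 1, which form a cycle, so a pipeliner using this check would bail out.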
2025-11-17[InlineAsmLowering] unsigned -> TypeSize for getTypeStoreSize resultpvanhout
2025-11-17[GlobalMerge]Prefer use global-merge-max-offset instead of the ↵hstk30-hw
target-specific constant offset. (#165591) In the Dhrystone benchmark, I found that some adjacent globals are not merged, whereas GCC's anchor optimization does handle them. Using global-merge-max-offset to set the maximum offset yields similar results (still slightly different, but at least the offset can be controlled).
2025-11-16Revert "DAG: Allow select ptr combine for non-0 address spaces" (#168292)ronlieb
Reverts llvm/llvm-project#167909
2025-11-16[CodeGen] Remove a redundant declaration (NFC) (#168285)Kazu Hirata
EnableFSDiscriminator is declared in DebugInfoMetadata.h. Identified with readability-redundant-declaration.
2025-11-16DAG: Preserve poison in combineConcatVectorOfScalars (#168220)Matt Arsenault
2025-11-16[CodeGen] Turn MCRegUnit into an enum class (NFC) (#167943)Sergei Barannikov
This changes `MCRegUnit` type from `unsigned` to `enum class : unsigned` and inserts necessary casts. The added `MCRegUnitToIndex` functor is used with `SparseSet`, `SparseMultiSet` and `IndexedMap` in a few places. `MCRegUnit` is opaque to users, so it didn't seem worth making it a full-fledged class like `Register`. Static type checking has detected one issue in `PrologueEpilogueInserter.cpp`, where `BitVector` created for `MCRegister` is indexed by both `MCRegister` and `MCRegUnit`. The number of casts could be reduced by using `IndexedMap` in more places and/or adding a `BitVector` adaptor, but the number of casts *per file* is still small and `IndexedMap` has limitations, so it didn't seem worth the effort. Pull Request: https://github.com/llvm/llvm-project/pull/167943
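The effect of the type change can be sketched outside LLVM. The types below are hypothetical stand-ins (`RegUnit`, `RegUnitToIndex`, `RegUnitSet`), not the LLVM definitions; they only show how an `enum class` with an `unsigned` underlying type blocks accidental mixing with plain register numbers.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for an opaque register-unit type.
enum class RegUnit : unsigned {};

// Plain integers no longer convert implicitly, so passing a register
// number where a unit is expected is a compile-time error.
static_assert(!std::is_convertible<unsigned, RegUnit>::value,
              "unsigned must not convert implicitly to RegUnit");

// Functor mapping the opaque unit to a container index, in the spirit of
// the index functor the message describes.
struct RegUnitToIndex {
  unsigned operator()(RegUnit U) const { return static_cast<unsigned>(U); }
};

// A bit set that can only be indexed by RegUnit values.
class RegUnitSet {
  std::vector<bool> Bits;

public:
  explicit RegUnitSet(unsigned NumUnits) : Bits(NumUnits) {}
  void set(RegUnit U) { Bits[RegUnitToIndex()(U)] = true; }
  bool test(RegUnit U) const { return Bits[RegUnitToIndex()(U)]; }
};
```

Callers must now write an explicit `static_cast<RegUnit>(...)`, which is the kind of cast the message mentions and also what makes a mixed-index mistake visible to the compiler.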
2025-11-16[SelectionDAG] Verify SDTCisVT and SDTCVecEltisVT constraints (#150125)Sergei Barannikov
Teach `SDNodeInfoEmitter` TableGen backend to process `SDTypeConstraint` records and emit tables for them. The tables are used by `SDNodeInfo::verifyNode()` to validate a node being created. This PR only adds validation code for `SDTCisVT` and `SDTCVecEltisVT` constraints to keep it smaller. Pull Request: https://github.com/llvm/llvm-project/pull/150125
2025-11-15[SelectionDAG] Fix AArch64 machine verifier bug when expanding ↵AZero13
LOOP_DEPENDENCE_MASK (#168221) TargetConstant nodes don't match TableGen ImmLeaf patterns during instruction selection. When this zero constant flows into the AArch64 CCMP formation code, the machine verifier hits an assertion in expensive checks. Fixes: #168227
2025-11-16[revert][CodeGen] add a command to force global merge (#168230)Austin
sorry, this was my mistake
2025-11-16[CodeGen] add a command to force global mergeAustin
I found that in some performance scenarios, such as under O2, this PR can help with a series of loads of global variables.
2025-11-15DAG: Use poison in SplitVecRes_VP_LOAD_FF (#167753)Matt Arsenault
2025-11-15DAG: Use poison when legalizing scalar_to_vector results (#167751)Matt Arsenault
2025-11-14[AArch64][GlobalISel] Improve lowering of vector fp16 fpext (#165554)Ryan Cowan
This PR improves the lowering of vectors of fp16 when using fpext. Previously vectors of fp16 were scalarized leading to lots of extra instructions. Now, vectors of fp16 will be lowered when extended to fp64 via the preexisting lowering logic for extends. To make use of the existing logic, we need to add elements until we reach the next power of 2.
2025-11-14[SelectionDAGBuilder] Propagate fast-math flags to fpext (#167574)Mikołaj Piróg
As in the title. Without this, fpext in SelectionDAG always behaves as if it had no fast-math flags.
2025-11-14[RDF] Rename RegisterId field in RegisterRef Reg->Id. NFC (#168154)Craig Topper
Not all RegisterId values are registers, so Id is a more appropriate name. Use asMCReg() in some places that assumed it was a register.
2025-11-15[GlobalISel] Return byte offsets from computeValueLLTs (NFC) (#166747)Sergei Barannikov
To avoid scaling offsets back and forth. This is also what the SelectionDAG equivalent (ComputeValueVTs) does, and it will allow reusing ComputeValueTypes with less effort.
2025-11-14opt: Fix bad merge of #167996 (#168110)Matt Arsenault
After the base branch was moved to main, this somehow ended up adding a second definition of RTLCI, instead of modifying the existing one. Also fix other build error with gcc bots.
2025-11-14RuntimeLibcalls: Move VectorLibrary handling into TargetOptions (#167996)Matt Arsenault
This fixes the -fveclib flag getting lost on its way to the backend. Previously this was its own cl::opt with a random boolean. Move the flag handling into CommandFlags with other backend ABI-ish options, and have clang directly set it, rather than forcing it to go through command line parsing. Prior to de68181d7f, codegen used TargetLibraryInfo to find the vector function. Clang has special handling for TargetLibraryInfo, where it would directly construct one with the vector library in the pass pipeline. RuntimeLibcallsInfo currently is not used as an analysis in codegen, and needs to know the vector library when constructed. RuntimeLibraryAnalysis could follow the same trick that TargetLibraryInfo is using in the future, but a lot more boilerplate changes are needed to thread that analysis through codegen. Ideally this would come from an IR module flag, and nothing would be in TargetOptions. For now, it's better for all of these sorts of controls to be consistent.
2025-11-14[RDF] RegisterRef/RegisterId improvements. NFC (#168030)Craig Topper
RegisterId can represent a physical register, an MCRegUnit, or an index into a side structure that stores register masks. These 3 types were encoded by using the physical reg, stack slot, and virtual register encoding partitions from the Register class. This aliasing of encodings wasn't well contained, so Register::index2StackSlot and Register::stackSlotIndex appeared in multiple places. This patch gives RegisterRef its own encoding defines and separates it from Register. I've removed the generic idx() method in favor of getAsMCReg(), getAsMCRegUnit(), and getMaskIdx() for some degree of type safety. Some places used the RegisterId field of RegisterRef directly as a register. Those have been updated to use getAsMCReg. Some special cases for RegisterId 0 have been removed, as it can be treated like an MCRegister by existing code. I think I want to rename the Reg field of RegisterRef to Id, but I'll do that in another patch. Additionally, callers of the RegisterRef constructor need to be audited for implicit conversions from Register/MCRegister to unsigned.
2025-11-14[GlobalISel] Add support for value/constants as inline asm memory operand ↵Pierre van Houtryve
(#161501) InlineAsmLowering rejected inline assembly with memory reference inputs if the values passed to the inline asm weren't pointers. The DAG lowering however handled them just fine. This patch updates InlineAsmLowering to store such values on the stack, and then use the stack pointer as the "indirect" version of the operand.