path: root/llvm/lib/Transforms/Vectorize
Age  Commit message  Author
2025-11-22[VPlan] Share PreservesUniformity logic between isSingleScalar and isUniformAcrossVFsAndUFs Florian Hahn
Extract the PreservesUniformity logic from isSingleScalar into a shared static helper function. Update isUniformAcrossVFsAndUFs to use this logic for VPWidenRecipe and VPInstruction, so that any opcode that preserves uniformity is considered uniform-across-vf-and-uf if its operands are. This unifies the uniformity checking logic and makes it easier to extend in the future. This should effectively be NFC currently.
2025-11-22[VPlan] Create resume phis in scalar preheader early. (NFC) (#166099)Florian Hahn
Create phi recipes for scalar resume value up front in addInitialSkeleton during initial construction. This will allow moving the remaining code dealing with resume values to VPlan transforms/construction. PR: https://github.com/llvm/llvm-project/pull/166099
2025-11-21[VPlan] Cast to VPIRMetadata in getMemoryLocation (NFC) (#169028)Ramkumar Ramachandra
This allows us to strip an unnecessary TypeSwitch.
2025-11-21[VPlan] Only apply forced cost to recipes with underlying values. (#168372)Florian Hahn
Only apply forced instruction costs to recipes with underlying values to match the legacy cost model. A VPlan may have a number of additional VPInstructions without underlying values that are not considered for its cost, and assigning forced costs to them would incorrectly inflate its cost. This fixes a cost divergence between legacy and VPlan-based cost models with forced instruction costs. PR: https://github.com/llvm/llvm-project/pull/168372
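The rule in this entry can be illustrated with a minimal sketch. `RecipeSketch` and `planCostSketch` are hypothetical names, not LLVM types, and the sketch assumes that recipes without an underlying value simply keep whatever cost the model computes for them.

```cpp
#include <vector>

// Hypothetical stand-in for a VPlan recipe; not an LLVM type.
struct RecipeSketch {
  bool HasUnderlyingValue; // true if the recipe wraps an original IR instruction
  unsigned ComputedCost;   // cost the model would assign without forcing
};

// Sum the recipe costs. When ForcedCost is non-null it overrides the cost of
// recipes that have an underlying value; all other recipes keep their
// computed cost (an assumption of this sketch).
unsigned planCostSketch(const std::vector<RecipeSketch> &Recipes,
                        const unsigned *ForcedCost) {
  unsigned Total = 0;
  for (const RecipeSketch &R : Recipes) {
    if (ForcedCost && R.HasUnderlyingValue)
      Total += *ForcedCost;
    else
      Total += R.ComputedCost;
  }
  return Total;
}
```

With two recipes that have underlying values and one helper recipe that has none, a forced cost of 1 gives a total of 2 rather than 3, so the helper does not inflate the plan's cost.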
2025-11-21[VPlan] Drop poison-generating flags on induction trunc (#168922)Ramkumar Ramachandra
After truncating an integer-induction, neither nuw nor nsw hold. Fixes #168902. Co-authored-by: Florian Hahn <flo@fhahn.com>
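Why the flags stop holding can be seen from plain integer arithmetic. The helpers below are illustrative only (hypothetical names, not LLVM code): an increment that cannot wrap at 64 bits may still wrap once both operands are truncated to 8 bits.

```cpp
#include <cstdint>
#include <limits>

// True if A + B wraps in unsigned 64-bit arithmetic (would violate nuw on i64).
bool addWrapsU64(uint64_t A, uint64_t B) {
  return A > std::numeric_limits<uint64_t>::max() - B;
}

// True if A + B wraps in unsigned 8-bit arithmetic (would violate nuw on i8).
bool addWrapsU8(uint8_t A, uint8_t B) {
  return static_cast<unsigned>(A) + static_cast<unsigned>(B) > 0xFFu;
}

// True if A + B overflows in signed 8-bit arithmetic (would violate nsw on i8).
bool addWrapsS8(int8_t A, int8_t B) {
  int Sum = static_cast<int>(A) + static_cast<int>(B);
  return Sum < -128 || Sum > 127;
}
```

For an induction stepping by 1, the step from 255 to 256 does not wrap at 64 bits, but the truncated 8-bit step from 255 wraps to 0; the step from 127 likewise overflows as a signed 8-bit add.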
2025-11-20[VPlan] Remove PtrIV::IsScalarAfterVectorization, use VPlan analysis. (#168289)Florian Hahn
Remove `VPWidenPointerInductionRecipe::IsScalarAfterVectorization` and replace it with `onlyScalarValuesUsed`. This removes the need to carry state from the legacy cost model through VPlan, and the VPlan-based analysis gives more accurate results, avoiding a number of extracts. PR: https://github.com/llvm/llvm-project/pull/168289
2025-11-20[SLP]Check if the non-schedulable phi parent node has unique operandsAlexey Bataev
Need to check if the non-schedulable phi parent node has unique operands, if the incoming node has copyables, and the node is commutative. Otherwise, there might be issues with the correct calculation of the dependencies. Fixes #168589
2025-11-20[LV] Check full partial reduction chains in order. (#168036)Florian Hahn
https://github.com/llvm/llvm-project/pull/162822 added another validation step to check if entries in a partial reduction chain have the same scale factor. But the validation was still dependent on the order of entries in PartialReductionChains, and would fail to reject some cases (e.g. if the first link matched the scale of the second link, but the second link is invalidated later). To fix that, group chains by their starting phi nodes, then perform the validation for each chain, and if it fails, invalidate the whole chain for the phi. Fixes https://github.com/llvm/llvm-project/issues/167243. Fixes https://github.com/llvm/llvm-project/issues/167867. PR: https://github.com/llvm/llvm-project/pull/168036
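The grouping step described in this entry might look roughly like the sketch below. `ChainLinkSketch` and `phisWithValidChains` are hypothetical names, and the validity test is reduced to "all links of a phi share one scale factor"; the real validation in LoopVectorize checks more than that.

```cpp
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-in for one link of a partial reduction chain.
struct ChainLinkSketch {
  int PhiId;            // identifies the starting phi of the chain
  unsigned ScaleFactor; // scale factor recorded for this link
};

// Group links by their starting phi, then validate each group as a whole.
// A phi is reported as valid only if every link in its chain has the same
// scale factor, so the result does not depend on the order of Links.
std::set<int> phisWithValidChains(const std::vector<ChainLinkSketch> &Links) {
  std::map<int, std::vector<unsigned>> ScalesByPhi;
  for (const ChainLinkSketch &L : Links)
    ScalesByPhi[L.PhiId].push_back(L.ScaleFactor);

  std::set<int> Valid;
  for (const auto &Entry : ScalesByPhi) {
    const std::vector<unsigned> &Scales = Entry.second;
    bool AllMatch = true;
    for (unsigned S : Scales)
      AllMatch = AllMatch && S == Scales.front();
    if (AllMatch)
      Valid.insert(Entry.first); // otherwise the whole chain is dropped
  }
  return Valid;
}
```

Because a failing link invalidates the entire group for its phi, interleaving links from different phis in the input cannot leave a partially validated chain behind.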
2025-11-20[LV] Allow partial reductions with an extended bin op (#165536)Sam Tebbs
A pattern of the form reduce.add(ext(mul)) is valid for a partial reduction as long as the mul and its operands fulfill the requirements of a normal partial reduction. The mul's extend operands will be optimised to the wider extend, and we already have oneUse checks in place to make sure the mul and operands can be modified safely. 1. -> https://github.com/llvm/llvm-project/pull/165536 2. https://github.com/llvm/llvm-project/pull/165543
2025-11-19Re-land [Transform][LoadStoreVectorizer] allow redundant in Chain (#168135)Gang Chen
This is the fixed version of https://github.com/llvm/llvm-project/pull/163019
2025-11-19[SLP]Fix insertion point for setting for the nodesAlexey Bataev
Many of the def-use chain problems in the SLP vectorizer stem from the fact that some nodes reuse the same instruction as their insertion point. An insertion point is not an instruction, but the place between instructions. To set it correctly, it is better to generate a pseudo instruction immediately after the last instruction and use it as the insertion point. This resolves the issues in most cases. Fixes #168512 #168576
2025-11-19[VPlan] Collect FMFs for in-loop reduction chain in VPlan. (NFC)Florian Hahn
Replace retrieving FMFs for in-loop reduction via underlying instruction + legal by collecting the flags during reduction chain traversal in VPlan.
2025-11-19[SLPVectorizer] Widen constant strided loads. (#162324)Mikhail Gudim
Given a set of pointers, check if they can be rearranged as follows (%s is a constant):
  %b + 0 * %s + 0
  %b + 0 * %s + 1
  %b + 0 * %s + 2
  ...
  %b + 0 * %s + w
  %b + 1 * %s + 0
  %b + 1 * %s + 1
  %b + 1 * %s + 2
  ...
  %b + 1 * %s + w
  ...
If the pointers can be rearranged in the above pattern, it means that the memory can be accessed with strided loads of width `w` and stride `%s`.
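A check of this shape could be sketched as follows. This is not the SLPVectorizer implementation: `matchesStridedPattern` is a hypothetical helper that works on element offsets relative to a common base, and it assumes each group holds exactly `Width` consecutive lanes (0 to `Width - 1`) and that `Stride >= Width`, so that sorting recovers the group order.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if the offsets can be rearranged as
//   Base + Group * Stride + Lane,  with Lane in [0, Width)
// for consecutive Group = 0, 1, 2, ...
bool matchesStridedPattern(std::vector<int64_t> Offsets, int64_t Stride,
                           int64_t Width) {
  if (Width <= 0 || Stride < Width || Offsets.empty() ||
      Offsets.size() % static_cast<std::size_t>(Width) != 0)
    return false;

  // Sorting yields the rearranged order because groups do not overlap.
  std::sort(Offsets.begin(), Offsets.end());
  const int64_t Base = Offsets.front();
  for (std::size_t K = 0; K < Offsets.size(); ++K) {
    const int64_t Group = static_cast<int64_t>(K) / Width;
    const int64_t Lane = static_cast<int64_t>(K) % Width;
    if (Offsets[K] != Base + Group * Stride + Lane)
      return false;
  }
  return true;
}
```

For example, the offsets 9, 0, 17, 1, 8, 16 match with stride 8 and width 2, since they sort into the groups (0, 1), (8, 9) and (16, 17).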
2025-11-19[NFC][LLVM] Namespace cleanup in SLPVectorizer (#168623)Rahul Joshi
- Remove file local functions out of `llvm` or anonymous namespace and make them static. - Use namespace qualifier to define `BoUpSLP` class and several template specializations.
2025-11-19[LV] Consolidate shouldOptimizeForSize and remove unused BFI/PSI. NFC (#168697)Luke Lau
#158690 plans on passing BFI as a lazy lambda to avoid computing BlockFrequencyInfo when not needed. In preparation for that, this PR removes BFI and PSI from some constructors that aren't used. It also consolidates the two calls to llvm::shouldOptimizeForSize so that the result is computed once and passed where needed. This also renames OptForSize in LoopVectorizationLegality to clarify that it's to prevent runtime SCEV checks, see https://reviews.llvm.org/D68082
2025-11-19[VPLan] Reduce duplication in VPHeaderPHIRecipe::classof. (NFCI)Florian Hahn
Implement VPHeaderPHIRecipe::classof(const VPValue *V) in terms of the variant taking VPRecipeBase. Reduces some duplication, split off from https://github.com/llvm/llvm-project/pull/141431.
2025-11-19[VPlan] Print debug info for all recipes. (#168454)Florian Hahn
Use the recently refactored VPRecipeBase::print to print debug location for all recipes. PR: https://github.com/llvm/llvm-project/pull/168454
2025-11-19[LV]: Skip Epilogue scalable VF greater than RemainingIterations. (#156724)Hassnaa Hamdi
Skip scalable epilogue VFs that are greater than RemainingIterations, as is already done for fixed VFs. Exclude scalable RemainingIterations from that comparison, because SCEV currently can't evaluate non-canonical vscale-based expressions.
2025-11-19[TTI] Use MemIntrinsicCostAttributes for getMaskedMemoryOpCost (#168029)Shih-Po Hung
- Split from #165532. This is a step toward a unified interface for masked/gather-scatter/strided/expand-compress cost modeling.
- Replace the ad-hoc parameter list with a single attributes object.

API change:
```
- InstructionCost getMaskedMemoryOpCost(Opcode, Src, Alignment,
-                                       AddressSpace, CostKind);
+ InstructionCost getMaskedMemoryOpCost(MemIntrinsicCostAttributes,
+                                       CostKind);
```

Notes:
- NFCI intended: callers populate MemIntrinsicCostAttributes with the same information as before.
- Follow-up: migrate gather/scatter, strided, and expand/compress cost queries to the same attributes-based entry point.
2025-11-18[VPlan] VPIRFlags kind for FCmp with predicate + fast-math flags (NFCI).Florian Hahn
FCmp instructions have both a predicate and fast-math flags. Introduce a new FCmp kind, that combines both to model this correctly in the current system. This should be NFC modulo VPlan printing which now includes the correct fast-math flags.
2025-11-18[VPlan] Fix OpType-mismatch in getFlagsFromIndDesc (#168560)Ramkumar Ramachandra
Follow up on a cse OpType-mismatch crash reported due to ef023cae388d (Reland [VPlan] Expand WidenInt inductions with nuw/nsw), setting the OpType correctly when returning from getFlagsFromIndDesc.
2025-11-18[VPlan] Populate and use VPIRFlags from initial VPInstruction. (#168450)Florian Hahn
Update VPlan to populate VPIRFlags during VPInstruction construction and use it when creating widened recipes, instead of constructing VPIRFlags from the underlying IR instruction each time. The VPRecipeWithIRFlags constructor taking an underlying instruction and setting the flags based on it has been removed. This centralizes initial VPIRFlags creation, ensures flags are consistently available throughout VPlan transformations, and makes sure we don't accidentally re-add flags from the underlying instruction that already got dropped during transformations. Follow-up to https://github.com/llvm/llvm-project/pull/167253, which did the same for VPIRMetadata. Should be NFC w.r.t. the generated IR. PR: https://github.com/llvm/llvm-project/pull/168450
2025-11-18[VPlan] Support isa/dyn_cast from VPRecipeBase to VPIRMetadata (NFC). (#166245)Florian Hahn
Implement CastInfo from VPRecipeBase to VPIRMetadata to support isa/dyn_cast. This is similar to CastInfoVPPhiAccessors, supporting dyn_cast by down-casting to the concrete recipe types inheriting from VPIRMetadata. Can be used for more generalized VPIRMetadata printing following https://github.com/llvm/llvm-project/pull/165825. PR: https://github.com/llvm/llvm-project/pull/166245
2025-11-18[VPlan] Hoist loads with invariant addresses using noalias metadata. (#166247)Florian Hahn
This patch implements a transform to hoist single-scalar replicated loads with invariant addresses out of the vector loop to the preheader when scoped noalias metadata proves they cannot alias with any stores in the loop. This enables hoisting of loads we can prove do not alias any stores in the loop due to memory runtime checks added during vectorization. PR: https://github.com/llvm/llvm-project/pull/166247
2025-11-18[SLP] Invariant loads cannot have a memory dependency on stores. (#167929)Michael Bedy
2025-11-17[VPlan] Populate and use VPIRMetadata from VPInstructions (NFC) (#167253)Florian Hahn
Update VPlan to populate VPIRMetadata during VPInstruction construction and use it when creating widened recipes, instead of constructing VPIRMetadata from the underlying IR instruction each time. This centralizes VPIRMetadata in VPInstructions and ensures metadata is consistently available throughout VPlan transformations. PR: https://github.com/llvm/llvm-project/pull/167253
2025-11-17[VPlan] Replace VPIRMetadata::addMetadata with setMetadata. (NFC)Florian Hahn
Replace addMetadata with setMetadata, which sets metadata, updating existing entries or adding a new entry otherwise. This isn't strictly needed at the moment, but will be needed for follow-up patches.
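The "update or add" behaviour described here can be sketched with a small container. `MetadataListSketch` is a hypothetical illustration, not the VPIRMetadata class; it stores metadata as (kind, node) pairs and uses strings in place of metadata nodes.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a list of (kind, node) metadata entries.
struct MetadataListSketch {
  std::vector<std::pair<unsigned, std::string>> Entries;

  // Set the node for Kind: overwrite the existing entry if there is one,
  // otherwise append a new entry.
  void set(unsigned Kind, const std::string &Node) {
    for (auto &Entry : Entries) {
      if (Entry.first == Kind) {
        Entry.second = Node;
        return;
      }
    }
    Entries.emplace_back(Kind, Node);
  }
};
```

An append-only interface would leave two entries for the same kind after a second call; with set semantics each kind appears at most once.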
2025-11-17Reland [VPlan] Expand WidenInt inductions with nuw/nsw (#168354)Ramkumar Ramachandra
Changes: The previous patch had to be reverted due to a mismatching-OpType assert in cse. A reduced test corresponding to an RVV pointer-induction has now been added, and the pointer-induction case has been updated to use createOverflowingBinaryOp. While at it, record VPIRFlags in VPWidenInductionRecipe.
2025-11-17[VPlan] Add printRecipe, prepare printing metadata in ::print (NFC) (#166244)Florian Hahn
Add a new printRecipe which handles printing the recipe without common info like debug info or metadata. Prepares to print them once, in ::print(), after/in combination with https://github.com/llvm/llvm-project/pull/165825. PR: https://github.com/llvm/llvm-project/pull/166244
2025-11-17[VPlan] Fix LastActiveLane assertion on scalar VF (#167897)Luke Lau
For a scalar only VPlan with tail folding, if it has a phi live out then legalizeAndOptimizeInductions will scalarize the widened canonical IV feeding into the header mask:

    <x1> vector loop: {
      vector.body:
        EMIT vp<%4> = CANONICAL-INDUCTION ir<0>, vp<%index.next>
        vp<%5> = SCALAR-STEPS vp<%4>, ir<1>, vp<%0>
        EMIT vp<%6> = icmp ule vp<%5>, vp<%3>
        EMIT vp<%index.next> = add nuw vp<%4>, vp<%1>
        EMIT branch-on-count vp<%index.next>, vp<%2>
      No successors
    }
    Successor(s): middle.block

    middle.block:
      EMIT vp<%8> = last-active-lane vp<%6>
      EMIT vp<%9> = extract-lane vp<%8>, vp<%5>
    Successor(s): ir-bb<exit>

The verifier complains about this but this should still generate the correct last active lane, so this fixes the assert by handling this case in isHeaderMask. There is a similar pattern already there for ActiveLaneMask, which also expects a VPScalarIVSteps recipe. Fixes #167813
2025-11-17[VPlan] Mark getPredicatedMask static (NFC) (#168067)Ramkumar Ramachandra
2025-11-17[VPlan] Improve code in RemoveMask_match (NFC) (#168065)Ramkumar Ramachandra
2025-11-16[VPlan] Delegate to other VPInstruction constructors. (NFCI)Florian Hahn
Update VPInstruction constructor to delegate to constructor with more comprehensive checking and validation. This required updating some unit tests, to make sure the constructed VPInstructions are valid.
2025-11-16[SLP]Do not consider split nodes, when checking parent PHI-based nodesAlexey Bataev
The compiler should not consider split vectorize nodes when checking for non-schedulable PHI-based parent nodes. Only pure PHI nodes must be considered: only they can be treated as explicit users, whereas split nodes cannot. Fixes #168268
2025-11-15[LV] Use VPlan pattern matching in adjustRecipesForReductions (NFC)Florian Hahn
Replace the assert checking if CurrentLinkI is a CmpInst with a pattern matching check in the if condition. This uses VPlan-level pattern matching instead of inspecting the underlying instruction type.
2025-11-15[VPlan] Always set trip count when creating plan for unit tests (NFC).Florian Hahn
Simplifies some tests, which now do not need to pass TC; future changes will require a trip count to always be available.
2025-11-15[llvm] Delete pointers without null checks (NFC) (#168183)Kazu Hirata
Identified with readability-delete-null-pointer.
2025-11-15[VPlan] Support VPWidenIntOrFpInduction in getSCEVExprForVPValue. (NFCI)Florian Hahn
Construct SCEVs for VPWidenIntOrFpInductionRecipe analogous to VPCanonicalInductionPHIRecipe: create an AddRec with start + step from the recipe. Currently the only impact should be computing more costs of replicating stores directly in VPlan.
2025-11-15[VPlan] Strip outdated comment in optimizeForVFAndUF (NFC) (#168068)Ramkumar Ramachandra
2025-11-14[SLP]Check if the copyable element is a sub instruciton with abs in isCommutableAlexey Bataev
Need to check if the non-copyable element is an instruction before actually trying to check its NSW attribute.
2025-11-14Revert "[SLP]Check if the copyable element is a sub instruciton with abs in isCommutable" Alexey Bataev
This reverts commit ddf5bb0a2e2d2dd77bce66173387d62ab7174d9f to fix buildbots https://lab.llvm.org/buildbot/#/builders/11/builds/28083.
2025-11-14[SLP]Check if the copyable element is a sub instruciton with abs in isCommutableAlexey Bataev
Need to check if the non-copyable element is an instruction before actually trying to check its NSW attribute.
2025-11-14Revert "[Transform][LoadStoreVectorizer] allow redundant in Chain (#1… ↵Gang Chen
(#168105) …63019)" This reverts commit 92e5608ffa6ff39ac3707f29418cc9482471f5d9.
2025-11-14[SLP]Enable Sub as a base instruction in copyablesAlexey Bataev
Patch adds support for sub instructions as main instruction in copyables elements. Also, adds a check if the base instruction is not profitable for the selection if at least one instruction with the main opcode is used as an immediate operand. Reviewers: RKSimon, hiraditya Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/163231
2025-11-14Revert "[VPlan] Expand WidenInt inductions with nuw/nsw" (#168080)Alex Bradbury
Reverts llvm/llvm-project#163538 This is causing build failures on the two-stage RVV buildbots. e.g. https://lab.llvm.org/buildbot/#/builders/214/builds/1363. I've shared a reproducer and more information at https://github.com/llvm/llvm-project/pull/163538#issuecomment-3533482822 This reverts commit 355e0f94af5adabe90ac57110ce1b47596afd4cd.
2025-11-14[VPlan] Expand WidenInt inductions with nuw/nsw (#163538)Ramkumar Ramachandra
While at it, record VPIRFlags in VPWidenInductionRecipe.
2025-11-14[LV] Explicitly disable in-loop reductions for AnyOf and FindIV. nfc (#163541)Mel Chen
Currently, in-loop reductions for AnyOf and FindIV are not supported. They were implicitly blocked. This happened because RecurrenceDescriptor::getReductionOpChain could not detect their recurrence chain. The reason is that RecurrenceDescriptor::getOpcode was set to Instruction::Or, but the recurrence chains of AnyOf and FindIV do not actually contain an Instruction::Or. This patch explicitly disables in-loop reductions for AnyOf and FindIV instead of relying on getReductionOpChain to implicitly prevent them.
2025-11-14[VPlan] Disable partial reductions again with EVL tail folding (#167863)Luke Lau
VPPartialReductionRecipe doesn't yet support an EVL variant, and we guard against this by not calling convertToAbstractRecipes when we're tail folding with EVL. However recently some things got shuffled around which means we may detect some scaled reductions in collectScaledReductions and store them in ScaledReductionMap, where outside of convertToAbstractRecipes we may look them up and start e.g. adding a scale factor to an otherwise regular VPReductionPHI. This fixes it by skipping collectScaledReductions, and fixes #167861
2025-11-13[VPlan] Add findComputeReductionResult helper. (NFC)Florian Hahn
Move utility to helper for re-use in follow-up patches.
2025-11-13Revert "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042)" Florian Hahn
This reverts commit 62d1a080e69e3c5e98840e000135afa7c688a77b. This appears to be causing some runtime failures on RISCV https://lab.llvm.org/buildbot/#/builders/210/builds/5221