path: root/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
Age Commit message Author
2025-11-22 [VPlan] Create resume phis in scalar preheader early. (NFC) (#166099) Florian Hahn
Create phi recipes for scalar resume value up front in addInitialSkeleton during initial construction. This will allow moving the remaining code dealing with resume values to VPlan transforms/construction. PR: https://github.com/llvm/llvm-project/pull/166099
2025-11-21 [VPlan] Drop poison-generating flags on induction trunc (#168922) Ramkumar Ramachandra
After truncating an integer-induction, neither nuw nor nsw hold. Fixes #168902. Co-authored-by: Florian Hahn <flo@fhahn.com>
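Why the flags stop holding can be seen with plain integer arithmetic. The following is a minimal sketch in standalone C++, not LLVM API; the helper name `addWrapsUnsigned` is hypothetical. It checks whether an unsigned add wraps once both operands are truncated to a given bit width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns true if A + B wraps when both operands are
// truncated to an unsigned integer of the given width (1 <= Bits <= 32).
bool addWrapsUnsigned(uint64_t A, uint64_t B, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 32 && "sketch only models widths up to 32 bits");
  const uint64_t Mask = (uint64_t{1} << Bits) - 1;
  // Truncate the operands, as a trunc of the induction would.
  const uint64_t NarrowA = A & Mask;
  const uint64_t NarrowB = B & Mask;
  // The narrow add wraps iff the exact sum exceeds the largest narrow value.
  return NarrowA + NarrowB > Mask;
}
```

With a start of 200 and a step of 100, the add does not wrap at 32 bits but does wrap at 8 bits, so a `nuw` flag that was valid on the wide induction would be wrong on the truncated one.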
2025-11-18 [VPlan] Populate and use VPIRFlags from initial VPInstruction. (#168450) Florian Hahn
Update VPlan to populate VPIRFlags during VPInstruction construction and use it when creating widened recipes, instead of constructing VPIRFlags from the underlying IR instruction each time. The VPRecipeWithIRFlags constructor taking an underlying instruction and setting the flags based on it has been removed. This centralizes initial VPIRFlags creation, ensures flags are consistently available throughout VPlan transformations, and makes sure we don't accidentally re-add flags from the underlying instruction that already got dropped during transformations. Follow-up to https://github.com/llvm/llvm-project/pull/167253, which did the same for VPIRMetadata. Should be NFC w.r.t. the generated IR. PR: https://github.com/llvm/llvm-project/pull/168450
2025-11-18 [VPlan] Hoist loads with invariant addresses using noalias metadata. (#166247) Florian Hahn
This patch implements a transform to hoist single-scalar replicated loads with invariant addresses out of the vector loop to the preheader when scoped noalias metadata proves they cannot alias with any stores in the loop. This enables hoisting of loads we can prove do not alias any stores in the loop due to memory runtime checks added during vectorization. PR: https://github.com/llvm/llvm-project/pull/166247
2025-11-17 [VPlan] Populate and use VPIRMetadata from VPInstructions (NFC) (#167253) Florian Hahn
Update VPlan to populate VPIRMetadata during VPInstruction construction and use it when creating widened recipes, instead of constructing VPIRMetadata from the underlying IR instruction each time. This centralizes VPIRMetadata in VPInstructions and ensures metadata is consistently available throughout VPlan transformations. PR: https://github.com/llvm/llvm-project/pull/167253
2025-11-17 [VPlan] Replace VPIRMetadata::addMetadata with setMetadata. (NFC) Florian Hahn
Replace addMetadata with setMetadata, which sets metadata, updating existing entries or adding a new entry otherwise. This isn't strictly needed at the moment, but will be needed for follow-up patches.
2025-11-17 Reland [VPlan] Expand WidenInt inductions with nuw/nsw (#168354) Ramkumar Ramachandra
Changes: The previous patch had to be reverted due to a mismatching-OpType assert in cse. A reduced test corresponding to an RVV pointer-induction has now been added, and the pointer-induction case has been updated to use createOverflowingBinaryOp. While at it, record VPIRFlags in VPWidenInductionRecipe.
2025-11-17 [VPlan] Mark getPredicatedMask static (NFC) (#168067) Ramkumar Ramachandra
2025-11-17 [VPlan] Improve code in RemoveMask_match (NFC) (#168065) Ramkumar Ramachandra
2025-11-15 [VPlan] Strip outdated comment in optimizeForVFAndUF (NFC) (#168068) Ramkumar Ramachandra
2025-11-14 Revert "[VPlan] Expand WidenInt inductions with nuw/nsw" (#168080) Alex Bradbury
Reverts llvm/llvm-project#163538. This is causing build failures on the two-stage RVV buildbots, e.g. https://lab.llvm.org/buildbot/#/builders/214/builds/1363. I've shared a reproducer and more information at https://github.com/llvm/llvm-project/pull/163538#issuecomment-3533482822 This reverts commit 355e0f94af5adabe90ac57110ce1b47596afd4cd.
2025-11-14 [VPlan] Expand WidenInt inductions with nuw/nsw (#163538) Ramkumar Ramachandra
While at it, record VPIRFlags in VPWidenInductionRecipe.
2025-11-13 Revert "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042)" Florian Hahn
This reverts commit 62d1a080e69e3c5e98840e000135afa7c688a77b. This appears to be causing some runtime failures on RISC-V: https://lab.llvm.org/buildbot/#/builders/210/builds/5221
2025-11-13 [VPlan] Simplify ExplicitVectorLength(%AVL) -> %AVL when AVL <= VF (#167647) Luke Lau
[`llvm.experimental.get.vector.length`](https://llvm.org/docs/LangRef.html#id2399) has the property that if the AVL (%cnt) is less than or equal to VF (%max_lanes) then the return value is just AVL. This patch uses SCEV to simplify this in optimizeForVFAndUF, and adds `ExplicitVectorLength` to `VPInstruction::opcodeMayReadOrWriteFromMemory` so it gets removed once dead.
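The property used by this fold can be sketched in standalone C++. This is a simplified model, not LLVM code: it assumes the intrinsic returns exactly `min(AVL, VF)`, whereas a real target may return a smaller value when AVL exceeds VF. The names `explicitVectorLength` and `canFoldEVLToAVL` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Simplified model of the intrinsic (assumption: result is min(AVL, VF)).
uint64_t explicitVectorLength(uint64_t AVL, uint64_t VF) {
  assert(VF > 0 && "VF must be positive");
  return std::min(AVL, VF);
}

// Hypothetical legality check for the fold: if a known upper bound on AVL
// is no larger than VF, ExplicitVectorLength(AVL) can be replaced by AVL.
bool canFoldEVLToAVL(uint64_t MaxAVL, uint64_t VF) {
  assert(VF > 0 && "VF must be positive");
  return MaxAVL <= VF;
}
```

In the patch the upper bound comes from SCEV; here it is simply passed in as `MaxAVL`.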
2025-11-12 [VPlan] Fix assert in store-user in narrowToSingleScalars (#167686) Ramkumar Ramachandra
Follow up on c2d4c7c18b96 ([VPlan] Permit more users in narrowToSingleScalars) to fix an assert related to WidenStore users of the recipe being narrowed in narrowToSingleScalars.
2025-11-12 [LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042) Florian Hahn
Building on top of https://github.com/llvm/llvm-project/pull/148817, introduce a new abstract LastActiveLane opcode that gets lowered to Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1). When folding the tail, update all extracts for uses outside the loop to extract the value of the last active lane. See also https://github.com/llvm/llvm-project/issues/148603 PR: https://github.com/llvm/llvm-project/pull/149042
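The lowering chain can be modelled on a plain boolean mask. The sketch below is standalone C++, not VPlan code. It assumes that FirstActiveLane returns the number of lanes when no lane is set, and that the mask is a prefix mask with lane 0 active, as a tail-folding header mask would be.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index of the first set lane, or the number of lanes if none is set
// (the "none set" result is an assumption of this sketch).
std::size_t firstActiveLane(const std::vector<bool> &Mask) {
  for (std::size_t I = 0; I < Mask.size(); ++I)
    if (Mask[I])
      return I;
  return Mask.size();
}

// LastActiveLane modelled as Not(Mask) -> FirstActiveLane(NotMask) -> Sub 1.
// Only meaningful for prefix masks with at least one active lane.
std::size_t lastActiveLane(const std::vector<bool> &Mask) {
  assert(!Mask.empty() && Mask[0] && "expects a prefix mask with lane 0 active");
  std::vector<bool> NotMask(Mask.size());
  for (std::size_t I = 0; I < Mask.size(); ++I)
    NotMask[I] = !Mask[I];
  return firstActiveLane(NotMask) - 1;
}
```

For the mask `{1, 1, 1, 0}` the inverted mask has its first set lane at index 3, so the last active lane is 2. For an all-true mask the inverted mask has no set lane, and the result is the last lane of the vector.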
2025-11-12 [VPlan] Plumb scalable register size through narrowInterleaveGroups (#167505) Luke Lau
On RISC-V narrowInterleaveGroups doesn't kick in because the wrong VectorRegWidth is passed to isConsecutiveInterleaveGroup. narrowInterleaveGroups is always passed the RGK_FixedWidthVector register size, but on RISC-V the RGK_ScalableVector size is twice as large because we want to use LMUL 2. This causes the `GroupSize == VectorRegWidth` check to fail. This fixes it by using the scalable register size whenever the VF is scalable and plumbing it through as a potentially scalable TypeSize. Note that this only makes a difference when tail folding is disabled, as narrowInterleaveGroups can't handle EVL based IVs yet.
2025-11-12 [VPlan] Merge fcmp uno feeding Or. (#167251) Florian Hahn
Fold or (fcmp uno %A, %A), (fcmp uno %B, %B), ... -> or (fcmp uno %A, %B), ... This pattern is generated to check if any vector lane is NaN, and combining multiple compares is beneficial on architectures that have dedicated instructions. Alive2 Proof: https://alive2.llvm.org/ce/z/vA_aoM Combine suggested as part of #161735 PR: https://github.com/llvm/llvm-project/pull/167251
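The equivalence behind this fold can be checked on scalars. This is a minimal standalone C++ sketch, not the VPlan pattern matcher. `std::isunordered` plays the role of `fcmp uno`, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// fcmp uno X, Y: true if either operand is NaN.
bool fcmpUno(double X, double Y) { return std::isunordered(X, Y); }

// Original pattern: or (fcmp uno A, A), (fcmp uno B, B).
bool unfolded(double A, double B) { return fcmpUno(A, A) || fcmpUno(B, B); }

// Folded pattern: fcmp uno A, B.
bool folded(double A, double B) { return fcmpUno(A, B); }

// Asserts that both forms agree for one pair of inputs.
void checkFold(double A, double B) {
  assert(unfolded(A, B) == folded(A, B) && "fold must preserve the result");
}
```

Both forms are true exactly when A or B is NaN, which is why a single compare can replace two compares and an `or`.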
2025-11-12 [LV][EVL] Replace VPInstruction::Select with vp.merge for predicated div/rem (#154072) Mel Chen
Since div/rem operations don’t support a mask operand, the lanes of the divisor that are masked out are currently replaced with 1 using VPInstruction::Select before the predicated div/rem operation. This patch replaces `VPInstruction::Select(logical_and(header_mask, conditional_mask), LHS, RHS)` with `vp.merge(conditional_mask, LHS, RHS, EVL)` so that the header mask can be replaced by EVL in this usage scenario when tail folding with EVL.
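A lane-by-lane model shows why the two forms agree. The sketch below is standalone C++ over `std::vector`, not LLVM code. It assumes the header mask of an EVL-based tail-folded loop is "lane index < EVL", and it follows the documented `vp.merge` behaviour that lanes at or beyond the pivot take the on-false operand. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;
using MaskVec = std::vector<bool>;

// Select(logical_and(HeaderMask, Cond), LHS, RHS), with the header mask
// modelled as "lane index < EVL" (assumption of this sketch).
Vec selectWithHeaderMask(const MaskVec &Cond, const Vec &LHS, const Vec &RHS,
                         std::size_t EVL) {
  assert(Cond.size() == LHS.size() && LHS.size() == RHS.size() &&
         "all operands must have the same number of lanes");
  Vec Result(LHS.size());
  for (std::size_t I = 0; I < LHS.size(); ++I) {
    const bool HeaderMask = I < EVL;
    Result[I] = (HeaderMask && Cond[I]) ? LHS[I] : RHS[I];
  }
  return Result;
}

// vp.merge(Cond, LHS, RHS, EVL): lanes at or beyond EVL take RHS, the
// remaining lanes select between LHS and RHS based on Cond.
Vec vpMerge(const MaskVec &Cond, const Vec &LHS, const Vec &RHS,
            std::size_t EVL) {
  assert(Cond.size() == LHS.size() && LHS.size() == RHS.size() &&
         "all operands must have the same number of lanes");
  Vec Result(LHS.size());
  for (std::size_t I = 0; I < LHS.size(); ++I) {
    if (I >= EVL)
      Result[I] = RHS[I];
    else
      Result[I] = Cond[I] ? LHS[I] : RHS[I];
  }
  return Result;
}
```

With divisors `{7, 8, 9, 10}`, a safe value of 1 in every lane, a condition of `{1, 0, 1, 1}` and an EVL of 3, both functions produce `{7, 1, 9, 1}`.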
2025-11-11 [VPlan] Remove unneeded getDefiningRecipe with isa/cast/dyn_cast. (NFC) Florian Hahn
Classof for most recipes directly supports VPValue, so there is no need to call getDefiningRecipe when using isa/cast/dyn_cast.
2025-11-11 [VPlan] Add getSingleUser helper (NFC). Florian Hahn
Add helper to make it easier to retrieve the single user of a VPUser.
2025-11-11 Revert "[VPlan] Handle WidenGEP in narrowToSingleScalars" (#167509) Ramkumar Ramachandra
This reverts commit fdd52f5fe130fb8b98f4aed3d15aa0789cce6b40, as it causes buildbot failures. This will give us time to investigate the failure. https://lab.llvm.org/buildbot/#/builders/210/builds/5160
2025-11-11 [LV] Consider interleaving when -enable-wide-lane-mask=true (#163387) Kerry McLaughlin
Currently the only way to enable the use of wide active lane masks is to pass -enable-wide-lane-mask and force both interleaving & tail-folding with additional flags. This patch changes selectInterleaveCount to consider interleaving if wide lane masks were requested, although the feature remains off by default.
2025-11-11 [VPlan] Handle WidenGEP in narrowToSingleScalars (#166740) Ramkumar Ramachandra
This allows us to strip a special case in VPWidenGEP::execute.
2025-11-10 [VPlan] Update canNarrowLoad to check WidenMember0's op first (NFCI). Florian Hahn
This hardens the code to check based on WideMember0's operands. This ensures each call will go through the same check. Should be NFC currently but needed when generalizing in follow-up patches.
2025-11-10 [VPlan] Permit more users in narrowToSingleScalars (#166559) Ramkumar Ramachandra
narrowToSingleScalarRecipes can permit users that are WidenStore, or a VPInstruction that has a suitable opcode. This is a generalization and extension of the existing code.
2025-11-10 [VPlan] Simplify branch-cond with getVectorTripCount (#155604) Ramkumar Ramachandra
Call getVectorTripCount first, and call getTripCount failing that, in simplifyBranchConditionForVFAndUF, to simplify missed cases. While at it, strip the dead check for a zero TC.
2025-11-07 [VPlan] Convert redundant isSingleScalar check into assert (NFC). Florian Hahn
Follow-up to post-commit suggestion in https://github.com/llvm/llvm-project/pull/165506. C must be a single-scalar, so turn the check into an assert.
2025-11-06 [VPlan] Rename onlyFirst(Lane|Part)Used (NFC) (#166562) Ramkumar Ramachandra
Rename onlyFirst(Lane|Part)Used to usesFirst(Lane|Part)Only, in line with usesScalars, for clarity.
2025-11-06 [VPlan] Retrieve alignment from Load/StoreInst in constructors. nfc (#165722) Mel Chen
This patch removes the explicit Alignment parameter from VPWidenLoadRecipe and VPWidenStoreRecipe constructors. Instead, these recipes now directly retrieve the alignment from their LoadInst/StoreInst.
2025-11-05 [VPlan] Move code narrowing ops feeding an interleave group to helper (NFCI) Florian Hahn
Move and combine the code to narrow ops feeding interleave groups to a single unified static helper. NFC, as legalization logic has not changed.
2025-11-05 [VPlan] Handle single-scalar conds in VPWidenSelectRecipe. (#165506) Florian Hahn
Generalize VPWidenSelectRecipe codegen to consider single-scalar conditions instead of just loop-invariant ones. If the condition is a single-scalar, we can simply use a scalar condition. PR: https://github.com/llvm/llvm-project/pull/165506
2025-11-05 [VPlan] Avoid sinking allocas in sinkScalarOperands (#166135) Ramkumar Ramachandra
Use cannotHoistOrSinkRecipe to forbid sinking allocas.
2025-11-04 [VPlan] Fix first-lane comment in sinkScalarOperands (NFC) (#166347) Ramkumar Ramachandra
Follow-up to a post-commit review.
2025-11-04 [VPlan] Shorten insert-idiom in sinkScalarOperands (NFC) (#166343) Ramkumar Ramachandra
Follow-up to a post-commit review.
2025-11-03 [VPlanTransform] Specialize simplifyRecipe for VPSingleDefRecipe pointer. nfc (#165568) Mel Chen
The function simplifyRecipe now takes a VPSingleDefRecipe pointer since it only simplifies single-def recipes for now.
2025-11-03 [VPlan] Perform optimizeMaskToEVL in terms of pattern matching (#155394) Luke Lau
Currently in optimizeMaskToEVL we convert every widened load, store or reduction to a VP predicated recipe with EVL, regardless of whether or not it uses the header mask. So currently we have to be careful when working on other parts of VPlan to make sure that the EVL transform doesn't break or transform something incorrectly, because it's not a semantics-preserving transform. Forgetting to do so has caused miscompiles before, like the case that was fixed in #113667. This PR rewrites it to work in terms of pattern matching, so it now only converts a recipe to a VP predicated recipe if it is exactly masked with the header mask. After this the transform should be a true optimisation and not change any semantics, so it shouldn't miscompile things if other parts of VPlan change. This fixes #152541, and allows us to move addExplicitVectorLength into tryToBuildVPlanWithVPRecipes in #153144. It also splits out the load/store transforms into separate patterns for reversed and non-reversed, which should make #146525 easier to implement and reason about.
2025-11-03 [VPlan] Rewrite sinkScalarOperands (NFC) (#151696) Ramkumar Ramachandra
Rewrite sinkScalarOperands in VPlanTransforms for clarity, in preparation for follow-up work to extend it to handle more recipes.
2025-11-02 [VPlan] Mark BranchOnCount and BranchOnCond as having side effects (NFC) Florian Hahn
BranchOnCount and BranchOnCond do not read memory, but cannot be moved. Mark them as having side-effects, but not reading/writing memory, which models this more accurately. This allows removing some special checking for branches both in the current code and future patches.
2025-11-01 [VPlan] Convert BuildVector with all-equal values to Broadcast. (#165826) Florian Hahn
Fold BuildVector where all operands are equal to Broadcast of the first operand. This will subsequently make it easier to remove additional buildvectors/broadcasts, e.g. via https://github.com/llvm/llvm-project/pull/165506. PR: https://github.com/llvm/llvm-project/pull/165826
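The condition for this fold is simple to state in code. The sketch below is standalone C++, not the VPlan implementation. Operands are modelled as plain integers standing in for VPValues, and `foldBuildVectorToBroadcast` is a hypothetical name.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Returns the operand to broadcast if every BuildVector operand is equal to
// the first one, and std::nullopt if the fold does not apply.
std::optional<int> foldBuildVectorToBroadcast(const std::vector<int> &Ops) {
  assert(!Ops.empty() && "BuildVector needs at least one operand");
  for (int Op : Ops)
    if (Op != Ops.front())
      return std::nullopt;
  return Ops.front();
}
```

A BuildVector of `{5, 5, 5, 5}` folds to a broadcast of 5, while `{5, 6, 5, 5}` is left alone.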
2025-11-01 [VPlan] Add getConstantInt helpers for constant int creation (NFC). Florian Hahn
Add getConstantInt helper methods to VPlan to simplify the common pattern of creating constant integer live-ins. Suggested as follow-up in https://github.com/llvm/llvm-project/pull/164127.
2025-10-31 [VPlan] Add VPRegionBlock::getCanonicalIVType (NFC). (#164127) Florian Hahn
Split off from https://github.com/llvm/llvm-project/pull/156262. Similar to VPRegionBlock::getCanonicalIV, add helper to get the type of the canonical IV, in preparation for removing VPCanonicalIVPHIRecipe. PR: https://github.com/llvm/llvm-project/pull/164127
2025-10-31 [VPlan] Remove original recipe after narrowing to single-scalar. Florian Hahn
Directly remove RepOrWidenR after replacing all uses. Removing the dead user early unlocks additional opportunities for further narrowing.
2025-10-29 [VPlan] Don't preserve LCSSA in expandSCEVs. (#165505) Florian Hahn
This follows similar reasoning as 45ce88758d24 (https://github.com/llvm/llvm-project/pull/159556): LV does not preserve LCSSA, it constructs it just before processing a loop to vectorize. Runtime check expressions are invariant to that loop, so expanding them should not break LCSSA form for the loop we are about to vectorize. LV creates SCEV and memory runtime checks early on and then disconnects the blocks temporarily. The patch fixes a mis-compile, where previously LCSSA construction during SCEV expand may replace uses in currently unreachable SCEV/memory check blocks. Fixes https://github.com/llvm/llvm-project/issues/162512 PR: https://github.com/llvm/llvm-project/pull/165505
2025-10-28 [LV] Bundle (partial) reductions with a mul of a constant (#162503) Sam Tebbs
A reduction (including partial reductions) with a multiply of a constant value can be bundled by first converting it from `reduce.add(mul(ext, const))` to `reduce.add(mul(ext, ext(const)))` as long as it is safe to extend the constant. This PR adds such bundling by first truncating the constant to the source type of the other extend, then extending it to the destination type of the extend. The first truncate is necessary so that the types of each extend's operand are then the same, and the call to canConstantBeExtended proves that the extend following a truncate is safe to do. The truncate is removed by optimisations. This is a stacked PR, 1a and 1b can be merged in any order: 1a. https://github.com/llvm/llvm-project/pull/147302 1b. https://github.com/llvm/llvm-project/pull/163175 2. -> https://github.com/llvm/llvm-project/pull/162503
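The safety condition on the constant can be illustrated with a range check. The sketch below is standalone C++ and is not the actual canConstantBeExtended implementation. The hypothetical helper `survivesTruncThenExtend` asks whether truncating a constant to the narrow source type and then sign- or zero-extending it reproduces the original value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical check: true if truncating C to NarrowBits and then extending
// it again (sign-extend if IsSigned, zero-extend otherwise) yields C.
bool survivesTruncThenExtend(int64_t C, unsigned NarrowBits, bool IsSigned) {
  assert(NarrowBits >= 1 && NarrowBits <= 32 &&
         "sketch only models narrow types up to 32 bits");
  if (IsSigned) {
    // Signed range of the narrow type: [-2^(N-1), 2^(N-1) - 1].
    const int64_t Min = -(int64_t{1} << (NarrowBits - 1));
    const int64_t Max = (int64_t{1} << (NarrowBits - 1)) - 1;
    return C >= Min && C <= Max;
  }
  // Unsigned range of the narrow type: [0, 2^N - 1].
  const int64_t Max = (int64_t{1} << NarrowBits) - 1;
  return C >= 0 && C <= Max;
}
```

For an 8-bit source type, 127 survives a sign-extend round trip and 128 does not, while 255 survives a zero-extend round trip and -1 does not.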
2025-10-28 [VPlan] Introduce cannotHoistOrSinkRecipe, fix miscompile (#162674) Ramkumar Ramachandra
Factor out common code to determine legality of hoisting and sinking. The patch has the side-effect of fixing an underlying bug, where a load/store pair is reordered.
2025-10-28 [VPlan] Store memory alignment in VPWidenMemoryRecipe. nfc (#165255) Mel Chen
Add a member Alignment to VPWidenMemoryRecipe to store memory alignment directly in the recipe. Update constructors, clone(), and relevant methods to use this stored alignment instead of querying the IR instruction. This allows VPWidenLoadRecipe/VPWidenStoreRecipe to be constructed without relying on the original IR instruction in the future.
2025-10-24 [VPlan] Extend tryToFoldLiveIns to fold binary intrinsics (#161703) Ramkumar Ramachandra
InstSimplifyFolder can fold binary intrinsics, so take the opportunity to unify code with getOpcodeOrIntrinsicID, and handle the case. The additional handling of WidenGEP is non-functional, as the GEP is simplified before it is widened, as the included test shows.
2025-10-23 [VPlan] Limit narrowInterleaveGroups to single block regions for now. Florian Hahn
Currently only regions with a single block are supported by the legality checks.
2025-10-23 [LV] Bundle partial reductions inside VPExpressionRecipe (#147302) Sam Tebbs
This PR bundles partial reductions inside the VPExpressionRecipe class. Stacked PRs: 1. https://github.com/llvm/llvm-project/pull/147026 2. https://github.com/llvm/llvm-project/pull/147255 3. https://github.com/llvm/llvm-project/pull/156976 4. https://github.com/llvm/llvm-project/pull/160154 5. -> https://github.com/llvm/llvm-project/pull/147302 6. https://github.com/llvm/llvm-project/pull/162503 7. https://github.com/llvm/llvm-project/pull/147513