| Age | Commit message (Collapse) | Author |
|
isUniformAcrossVFsAndUFs
Extract the PreservesUniformity logic from isSingleScalar into a shared
static helper function. Update isUniformAcrossVFsAndUFs to use this
logic for VPWidenRecipe and VPInstruction, so that any opcode that
preserves uniformity is considered uniform-across-vf-and-uf if its
operands are.
This unifies the uniformity checking logic and makes it easier to extend
in the future.
This should effectively by NFC currently.
|
|
Create phi recipes for scalar resume value up front in addInitialSkeleton during initial construction. This will allow moving the remaining code dealing with resume values to VPlan transforms/construction.
PR: https://github.com/llvm/llvm-project/pull/166099
|
|
This allows us to strip an unnecessary TypeSwitch.
|
|
Only apply forced instruction costs to recipes with underlying values to
match the legacy cost model. A VPlan may have a number of additional
VPInstructions without underlying values that are not considered for its
cost, and assigning forced costs to them would incorrectly inflate its
cost.
This fixes a cost divergence between legacy and VPlan-based cost models
with forced instruction costs.
PR: https://github.com/llvm/llvm-project/pull/168372
|
|
After truncating an integer-induction, neither nuw nor nsw hold.
Fixes #168902.
Co-authored-by: Florian Hahn <flo@fhahn.com>
|
|
Remove `VPWidenPointerInductionRecipe::IsScalarAfterVectorization` and
replace it with `onlyScalarValuesUsed`. This removes the need to carry
state from the legacy cost model through VPlan, and the VPlan-based
analysis gives more accurate results, avoiding a number of extracts.
PR: https://github.com/llvm/llvm-project/pull/168289
|
|
Need to check if the non-schedulable phi parent node has unique
operands, if the incoming node has copyables, and the node is
commutative. Otherwise, there might be issues with the correct
calculation of the dependencies.
Fixes #168589
|
|
https://github.com/llvm/llvm-project/pull/162822 added another
validation step to check if entries in a partial reduction chain have
the same scale factor. But the validation was still dependent on the
order of entries in PartialReductionChains, and would fail to reject
some cases (e.g. if the first first link matched the scale of the second
link, but the second link is invalidated later).
To fix that, group chains by their starting phi nodes, then perform the
validation for each chain, and if it fails, invalidate the whole chain
for the phi.
Fixes https://github.com/llvm/llvm-project/issues/167243.
Fixes https://github.com/llvm/llvm-project/issues/167867.
PR: https://github.com/llvm/llvm-project/pull/168036
|
|
A pattern of the form reduce.add(ext(mul)) is valid for a partial
reduction as long as the mul and its operands fulfill the requirements
of a normal partial reduction. The mul's extend operands will be
optimised to the wider extend, and we already have oneUse checks in
place to make sure the mul and operands can be modified safely.
1. -> https://github.com/llvm/llvm-project/pull/165536
2. https://github.com/llvm/llvm-project/pull/165543
|
|
This is the fixed version of
https://github.com/llvm/llvm-project/pull/163019
|
|
The problem with the many def-use chain problems in SLP vectorizer are
related to the fact that some nodes reuse the same instruction as
insertion point. Insertion point is not the instruction, but the place
between instructions. To set it correctly, better to generate pseudo
instruction immediately after the last instruction, and use it as
insertion point. It resolves the issues in most cases.
Fixes #168512 #168576
|
|
Replace retrieving FMFs for in-loop reduction via underlying instruction
+ legal by collecting the flags during reduction chain traversal in
VPlan.
|
|
Given a set of pointers, check if they can be rearranged as follows (%s is a constant):
%b + 0 * %s + 0
%b + 0 * %s + 1
%b + 0 * %s + 2
...
%b + 0 * %s + w
%b + 1 * %s + 0
%b + 1 * %s + 1
%b + 1 * %s + 2
...
%b + 1 * %s + w
...
If the pointers can be rearanged in the above pattern, it means that the
memory can be accessed with a strided loads of width `w` and stride `%s`.
|
|
- Remove file local functions out of `llvm` or anonymous namespace and
make them static.
- Use namespace qualifier to define `BoUpSLP` class and several template
specializations.
|
|
#158690 plans on passing BFI as a lazy lambda to avoid computing
BlockFrequencyInfo when not needed.
In preparation for that, this PR removes BFI and PSI from some
constructors that aren't used. It also consolidates the two calls to
llvm::shouldOptimizeForSize so that the result is computed once and
passed where needed.
This also renames OptForSize in LoopVectorizationLegality to clarify
that it's to prevent runtime SCEV checks, see
https://reviews.llvm.org/D68082
|
|
Implement VPHeaderPHIRecipe::classof(const VPValue *V) in terms of the
variant taking VPRecipeBase.
Reduces some duplication, split off from
https://github.com/llvm/llvm-project/pull/141431.
|
|
Use the recently refactored VPRecipeBase::print to print debug location
for all recipes.
PR: https://github.com/llvm/llvm-project/pull/168454
|
|
Consider skipping epilogue scalable VF when they are greater than
RemainingIterations same as fixed VF.
And skip scalable RemainingIterations from that comparison because
SCEV ATM can't evaluate non-canonical vscale-based expressions.
|
|
- Split from #165532. This is a step toward a unified interface for
masked/gather-scatter/strided/expand-compress cost modeling.
- Replace the ad-hoc parameter list with a single attributes object.
API change:
```
- InstructionCost getMaskedMemoryOpCost(Opcode, Src, Alignment,
- AddressSpace, CostKind);
+ InstructionCost getMaskedMemoryOpCost(MemIntrinsicCostAttributes,
+ CostKind);
```
Notes:
- NFCI intended: callers populate MemIntrinsicCostAttributes with the
same information as before.
- Follow-up: migrate gather/scatter, strided, and expand/compress cost
queries to the same attributes-based entry point.
|
|
FCmp instructions have both a predicate and fast-math flags. Introduce a
new FCmp kind, that combines both to model this correctly in the current
system.
This should be NFC modulo VPlan printing which now includes the correct
fast-math flags.
|
|
Follow up on a cse OpType-mismatch crash reported due to ef023cae388d
(Reland [VPlan] Expand WidenInt inductions with nuw/nsw), setting the
OpType correctly when returning from getFlagsFromIndDesc.
|
|
Update VPlan to populate VPIRFlags during VPInstruction construction and
use it when creating widened recipes, instead of constructing VPIRFlags
from the underlying IR instruction each time. The VPRecipeWithIRFlags
constructor taking an underlying instruction and setting the flags based
on it has been removed.
This centralizes initial VPIRFlags creation and ensures flags are
consistently available throughout VPlan transformations and makes sure
we don't accidentally re-add flags from the underlying instruction that
already got dropped during transformations.
Follow-up to https://github.com/llvm/llvm-project/pull/167253, which did
the same for VPIRMetadata.
Should be NFC w.r.t. to the generated IR.
PR: https://github.com/llvm/llvm-project/pull/168450
|
|
Implement CastInfo from VPRecipeBase to VPIRMetadata to support
isa/dyn_Cast. This is similar to CastInfoVPPhiAccessors, supporting
dyn_cast by down-casting to the concrete recipe types inheriting from
VPIRMetadata.
Can be used for more generalized VPIRMetadata printing following
https://github.com/llvm/llvm-project/pull/165825.
PR: https://github.com/llvm/llvm-project/pull/166245
|
|
This patch implements a transform to hoists single-scalar replicated
loads with invariant addresses out of the vector loop to the preheader
when scoped noalias metadata proves they cannot alias with any stores in
the loop.
This enables hosting of loads we can prove do not alias any stores in
the loop due to memory runtime checks added during vectorization.
PR: https://github.com/llvm/llvm-project/pull/166247
|
|
|
|
Update VPlan to populate VPIRMetadata during VPInstruction construction
and use it when creating widened recipes, instead of constructing
VPIRMetadata from the underlying IR instruction each time.
This centralizes VPIRMetadata in VPInstructions and ensures metadata is
consistently available throughout VPlan transformations.
PR: https://github.com/llvm/llvm-project/pull/167253
|
|
Replace addMetadata with setMetadata, which sets metadata, updating
existing entries or adding a new entry otherwise.
This isn't strictly needed at the moment, but will be needed for
follow-up patches.
|
|
Changes: The previous patch had to be reverted to a mismatching-OpType
assert in cse. The reduced-test has now been added corresponding to a
RVV pointer-induction, and the pointer-induction case has been updated
to use createOverflowingBinaryOp.
While at it, record VPIRFlags in VPWidenInductionRecipe.
|
|
Add a new pinrRecipe which handles printing the recipe without common
info like debug info or metadata.
Prepares to print them once, in ::print(), after/in combination with
https://github.com/llvm/llvm-project/pull/165825.
PR: https://github.com/llvm/llvm-project/pull/166244
|
|
For a scalar only VPlan with tail folding, if it has a phi live out then
legalizeAndOptimizeInductions will scalarize the widened canonical IV
feeding into the header mask:
<x1> vector loop: {
vector.body:
EMIT vp<%4> = CANONICAL-INDUCTION ir<0>, vp<%index.next>
vp<%5> = SCALAR-STEPS vp<%4>, ir<1>, vp<%0>
EMIT vp<%6> = icmp ule vp<%5>, vp<%3>
EMIT vp<%index.next> = add nuw vp<%4>, vp<%1>
EMIT branch-on-count vp<%index.next>, vp<%2>
No successors
}
Successor(s): middle.block
middle.block:
EMIT vp<%8> = last-active-lane vp<%6>
EMIT vp<%9> = extract-lane vp<%8>, vp<%5>
Successor(s): ir-bb<exit>
The verifier complains about this but this should still generate the
correct last active lane, so this fixes the assert by handling this case
in isHeaderMask. There is a similar pattern already there for
ActiveLaneMask, which also expects a VPScalarIVSteps recipe.
Fixes #167813
|
|
|
|
|
|
Update VPInstruction constructor to delegate to constructor with more
comprehensive checking and validation.
This required updating some unit tests, to make sure the constructed
VPInstructions are valid.
|
|
The compiler should not consider split vectorize nodes, when checking
for non-schedulable PHI-based parent nodes. Only pure PHI nodes must be
considered, they only can be considered as explicit users, split nodes
are not.
Fixes #168268
|
|
Replace the assert checking if CurrentLinkI is a CmpInst with a pattern
matching check in the if condition. This uses VPlan-level pattern matching
instead of inspecting the underlying instruction type.
|
|
Simplifies some tests which no do not need to pass TC, and future
changes will require to always have a trip count available.
|
|
Identified with readability-delete-null-pointer.
|
|
Construct SCEVs for VPWidenIntOrFpInductionRecipe analogous to
VPCanonicalInductionPHIRecipe: create an AddRec with start + step from
the recipe.
Currently the only impact should be computing more costs of replicating
stores directly in VPlan.
|
|
|
|
Need to check if the non-copyable element is an instruction before actually
trying to check its NSW attribute.
|
|
isCommutable"
This reverts commit ddf5bb0a2e2d2dd77bce66173387d62ab7174d9f to fix
buildbots https://lab.llvm.org/buildbot/#/builders/11/builds/28083.
|
|
Need to check if the non-copyable element is an instruction before actually
trying to check its NSW attribute.
|
|
(#168105)
…63019)"
This reverts commit 92e5608ffa6ff39ac3707f29418cc9482471f5d9.
|
|
Patch adds support for sub instructions as main instruction in copyables
elements. Also, adds a check if the base instruction is not profitable
for the selection if at least one instruction with the main opcode is
used as an immediate operand.
Reviewers: RKSimon, hiraditya
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/163231
|
|
Reverts llvm/llvm-project#163538
This is causing build failures on the two-stage RVV buildbots. e.g.
https://lab.llvm.org/buildbot/#/builders/214/builds/1363. I've shared a
reproducer and more information at
https://github.com/llvm/llvm-project/pull/163538#issuecomment-3533482822
This reverts commit 355e0f94af5adabe90ac57110ce1b47596afd4cd.
|
|
While at it, record VPIRFlags in VPWidenInductionRecipe.
|
|
Currently, in-loop reductions for AnyOf and FindIV are not supported.
They were implicitly blocked. This happened because
RecurrenceDescriptor::getReductionOpChain could not detect their
recurrence chain. The reason is that RecurrenceDescriptor::getOpcode was
set to Instruction::Or, but the recurrence chains of AnyOf and FindIV do
not actually contain an Instruction::Or.
This patch explicitly disables in-loop reductions for AnyOf and FindIV
instead of relying on getReductionOpChain to implicitly prevent them.
|
|
VPPartialReductionRecipe doesn't yet support an EVL variant, and we
guard against this by not calling convertToAbstractRecipes when we're
tail folding with EVL.
However recently some things got shuffled around which means we may
detect some scaled reductions in collectScaledReductions and store them
in ScaledReductionMap, where outside of convertToAbstractRecipes we may
look them up and start e.g. adding a scale factor to an otherwise
regular VPReductionPHI.
This fixes it by skipping collectScaledReductions, and fixes #167861
|
|
Move utility to helper for re-use in follow-up patches.
|
|
(#149042)"
This reverts commit 62d1a080e69e3c5e98840e000135afa7c688a77b.
This appears to be causing some runtime failures on RISCV
https://lab.llvm.org/buildbot/#/builders/210/builds/5221
|