llvm-project.git/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp, branch users/shawbyoung/spr/main.boltnfc-refactoring-callgraph

[VPlan] Generalize type inference for binary VPInstructions (NFC).

2024-06-10T20:57:14+00:00

Generalize the logic that sets the result type for ops whose result type
matches the types of all operands, and use it to support all unary and
binary ops.
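
The rule described above can be sketched in isolation. This is a minimal
toy model, not the VPTypeAnalysis code: ToyOpcode, ToyType and
inferSameTypeResult are hypothetical names, and types are plain strings.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are not the real VPlan classes.
enum class ToyOpcode { Add, Mul, FNeg, ICmp };
using ToyType = std::string; // e.g. "i32" or "float"

// For opcodes whose result type equals the type of every operand, return
// that shared type. Other opcodes need their own rule, so return nullopt.
std::optional<ToyType>
inferSameTypeResult(ToyOpcode Op, const std::vector<ToyType> &OperandTypes) {
  switch (Op) {
  case ToyOpcode::Add:
  case ToyOpcode::Mul:
  case ToyOpcode::FNeg:
    break;
  case ToyOpcode::ICmp:
    return std::nullopt; // a compare yields a boolean, not the operand type
  }
  if (OperandTypes.empty())
    return std::nullopt;
  const ToyType &ResTy = OperandTypes.front();
  for (const ToyType &Ty : OperandTypes)
    assert(Ty == ResTy && "operand types must match the result type");
  return ResTy;
}
```

In this sketch a binary Add over two "i32" operands and a unary FNeg over
one "float" operand go through the same code path.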

VPlan: add missing case for LogicalAnd; fix crash (#93553)

2024-06-04T07:58:16+00:00

VPTypeAnalysis::inferScalarTypeForRecipe is missing the case for
VPInstruction::LogicalAnd, due to which the test
vplan-incomplete-cases.ll crashes. Add this missing case, and move the
test in vplan-infer-not-or-type.ll to vplan-incomplete-cases.ll, showing
correct codegen for trip-counts 2 and 3.

[VPlan] Model FOR extract of exit value in VPlan. (#93395)

2024-06-03T19:20:30+00:00

This patch introduces a new ExtractFromEnd VPInstruction opcode to
extract the value of a FOR for users outside the loop (i.e. in the
scalar loop's exits). This moves the first part of fixing first order
recurrences to VPlan, and removes some additional code to patch up
live-outs, which is now handled automatically.
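
As a rough illustration of what such an extract computes, the sketch
below assumes the opcode takes an offset counted from the last lane
(1 for the last lane, 2 for the one before it). That operand convention
and the name extractFromEnd are assumptions made for this example, not
taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of extracting a lane relative to the end of a vector.
// Offset 1 selects the last lane, Offset 2 the penultimate lane.
int extractFromEnd(const std::vector<int> &Lanes, std::size_t Offset) {
  assert(Offset >= 1 && Offset <= Lanes.size() && "offset out of range");
  return Lanes[Lanes.size() - Offset];
}
```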

The majority of test changes are due to changes in the order in which the
extracts are now generated. As we are now using VPTransformState to
generate the extracts, we may be able to re-use existing extracts in the
loop body in some cases. For scalable vectors, in some cases we now have
to compute the runtime VF twice, as each extract is now independent, but
those should be trivial for later passes to clean up (and in line with
other places in the code that also liberally re-compute runtime VFs).

PR: https://github.com/llvm/llvm-project/pull/93395

[VPlan] Add scalar inferencing support for addrspace cast (#92107)

2024-05-15T13:03:21+00:00

Fixes https://github.com/llvm/llvm-project/issues/91434

PR: https://github.com/llvm/llvm-project/pull/92107

[VPlan] Add scalar inferencing support for Not and Or insns (#89160)

2024-04-23T14:48:43+00:00

Fixes #87394.

PR: https://github.com/llvm/llvm-project/pull/89160

[VPlan] Introduce recipes for VP loads and stores. (#87816)

2024-04-19T08:44:23+00:00

Introduce new subclasses of VPWidenMemoryRecipe for VP
(vector-predicated) loads and stores to address multiple TODOs from
https://github.com/llvm/llvm-project/pull/76172

Note that the introduction of the new recipes also improves code-gen for
VP gather/scatters by removing the redundant header mask. With the new
approach, it is not sufficient to look at users of the widened canonical
IV to find all uses of the header mask.

In some cases, a widened IV is used instead of separately widening the
canonical IV. To handle that, first collect all VPValues representing header
masks (by looking at users of both the canonical IV and widened inductions
that are canonical) and then check all users (recursively) of those header
masks.
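
The recursive user walk can be sketched with a worklist over a toy
def-use graph. This is only an illustration: ToyUsers and
collectTransitiveUsers are hypothetical names, and values are identified
by strings rather than VPValues.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy def-use graph: maps a value name to the names of its direct users.
using ToyUsers = std::map<std::string, std::vector<std::string>>;

// Starting from the collected header masks, visit every direct and
// indirect user exactly once and return the set of users found.
std::set<std::string>
collectTransitiveUsers(const ToyUsers &Users,
                       const std::vector<std::string> &HeaderMasks) {
  std::set<std::string> Visited;
  std::vector<std::string> Worklist(HeaderMasks.begin(), HeaderMasks.end());
  while (!Worklist.empty()) {
    std::string Cur = Worklist.back();
    Worklist.pop_back();
    auto It = Users.find(Cur);
    if (It == Users.end())
      continue; // value has no users
    for (const std::string &U : It->second)
      if (Visited.insert(U).second) // only enqueue users not seen before
        Worklist.push_back(U);
  }
  return Visited;
}
```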

Depends on https://github.com/llvm/llvm-project/pull/87411.

PR: https://github.com/llvm/llvm-project/pull/87816

[VPlan] Split VPWidenMemoryInstructionRecipe (NFCI). (#87411)

2024-04-17T10:00:58+00:00

This patch introduces a new VPWidenMemoryRecipe base class and distinct
sub-classes to model loads and stores.

This is a first step in an effort to simplify and modularize code
generation for widened loads and stores and to enable adding further,
more specialized memory recipes.

PR: https://github.com/llvm/llvm-project/pull/87411

[LV, VP]VP intrinsics support for the Loop Vectorizer + adding new tail-folding mode using EVL. (#76172)

2024-04-04T22:30:17+00:00

This patch introduces generating VP intrinsics in the Loop Vectorizer.

Currently the Loop Vectorizer supports vector predication in a very
limited capacity via tail-folding and masked load/store/gather/scatter
intrinsics. However, this does not let architectures with active vector
length predication support take advantage of their capabilities.
Architectures with general masked predication support also can only take
advantage of predication on memory operations. By having a way for the
Loop Vectorizer to generate Vector Predication intrinsics, which (will)
provide a target-independent way to model predicated vector
instructions, these architectures can make better use of their
predication capabilities.

Our first approach (implemented in this patch) builds on top of the
existing tail-folding mechanism in the LV (it just adds a new tail-folding
mode using EVL), but instead of generating masked intrinsics for memory
operations it generates VP intrinsics for load/store instructions. The
patch adds a new transform to VPlanTransforms that replaces the wide
header predicate compare with EVL, and updates codegen for loads/stores
to use VP load/store with EVL.

Another important part of this approach is how the Explicit Vector Length
(EVL), the vector length parameter defined by VP intrinsics, is computed.
We use an experimental intrinsic `get_vector_length`, which can be lowered
to architecture-specific instruction(s) to compute EVL.

The patch also adds a new recipe to emit instructions for computing EVL.
Using VPlan in this way will eventually help build and compare VPlans
corresponding to different strategies and alternatives.
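
The shape of an EVL-driven loop can be sketched in scalar C++. This is a
simulation under stated assumptions, not generated code:
toyGetVectorLength is a hypothetical stand-in for `get_vector_length`
and is assumed here to return min(remaining elements, VF); a real target
may choose a different value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed behavior of the EVL computation: process as many elements as
// remain, capped at the vectorization factor VF.
std::size_t toyGetVectorLength(std::size_t Remaining, std::size_t VF) {
  return std::min(Remaining, VF);
}

// Computes C[i] = A[i] + B[i] with EVL-style tail folding. No scalar
// remainder loop is needed: the last iteration simply gets a shorter EVL.
// Returns the EVL used by each vector iteration.
std::vector<std::size_t> addWithEVL(const std::vector<int> &A,
                                    const std::vector<int> &B,
                                    std::vector<int> &C, std::size_t VF) {
  assert(VF > 0 && "VF must be positive");
  assert(A.size() == B.size() && A.size() == C.size());
  std::vector<std::size_t> EVLs;
  const std::size_t N = A.size();
  for (std::size_t IV = 0; IV < N;) {
    std::size_t EVL = toyGetVectorLength(N - IV, VF);
    // Stands in for a VP load, add and store over the first EVL lanes.
    for (std::size_t Lane = 0; Lane < EVL; ++Lane)
      C[IV + Lane] = A[IV + Lane] + B[IV + Lane];
    EVLs.push_back(EVL);
    IV += EVL; // the induction advances by EVL, not by VF
  }
  return EVLs;
}
```

With 10 elements and VF = 4, this sketch runs three iterations with EVLs
of 4, 4 and 2.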

Differential Revision: https://reviews.llvm.org/D99750

[VPlan] Explicitly handle scalar pointer inductions. (#83068)

2024-03-26T15:01:57+00:00

Add a new PtrAdd opcode to VPInstruction that corresponds to
IRBuilder::CreatePtrAdd, which creates a GEP with source element type
i8.

This is then used to model scalarizing VPWidenPointerInductionRecipe by
introducing scalar-steps to model the index increment followed by a
PtrAdd.

Note that PtrAdd needs to be able to generate code for only the first
lane or for all lanes. This may warrant introducing a separate recipe
for scalarizing that can be created without relying on the underlying
IR.
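
A small sketch of the arithmetic involved, assuming byte-sized steps:
a GEP with source element type i8 is plain byte-offset addition, and
each lane's offset comes from a scalar step. toyPtrAdd and
scalarPointerInduction are hypothetical names used only for this
illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Models a GEP with source element type i8: advance the pointer by a
// number of bytes.
const std::uint8_t *toyPtrAdd(const std::uint8_t *Base,
                              std::ptrdiff_t ByteOffset) {
  return Base + ByteOffset;
}

// Scalarized pointer induction for one vector iteration: for each lane,
// compute the scalar step (StartIndex + Lane) * StepBytes, then apply it
// to the start pointer with a PtrAdd.
std::vector<const std::uint8_t *>
scalarPointerInduction(const std::uint8_t *Start, std::ptrdiff_t StartIndex,
                       std::ptrdiff_t StepBytes, std::size_t VF) {
  std::vector<const std::uint8_t *> Ptrs;
  for (std::size_t Lane = 0; Lane < VF; ++Lane) {
    std::ptrdiff_t ScalarStep =
        (StartIndex + static_cast<std::ptrdiff_t>(Lane)) * StepBytes;
    Ptrs.push_back(toyPtrAdd(Start, ScalarStep));
  }
  return Ptrs;
}
```

Requesting only the first lane corresponds to calling this with VF = 1.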

Depends on https://github.com/llvm/llvm-project/pull/80271

PR: https://github.com/llvm/llvm-project/pull/83068

[VPlan] Support live-ins without underlying IR in type analysis. (#80723)

2024-02-21T19:37:15+00:00

A VPlan contains multiple live-ins without underlying IR, like VFxUF or
VectorTripCount. Trying to infer the scalar type of those causes a crash
at the moment.

Update VPTypeAnalysis to take a VPlan in its constructor and assign
types to those live-ins up front. All those live-ins share the type of
the canonical IV.
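
The idea of seeding types in the constructor can be sketched as follows.
ToyPlan and ToyTypeAnalysis are hypothetical stand-ins, with values and
types represented as strings; the real class and its members may differ.

```cpp
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Toy plan: the canonical IV type plus the names of live-ins that have
// no underlying IR value to read a type from.
struct ToyPlan {
  std::string CanonicalIVType;
  std::vector<std::string> LiveInsWithoutIR;
};

class ToyTypeAnalysis {
  std::unordered_map<std::string, std::string> CachedTypes;

public:
  // Assign the canonical IV type to all such live-ins up front, so later
  // queries never need to inspect underlying IR for them.
  explicit ToyTypeAnalysis(const ToyPlan &Plan) {
    for (const std::string &Name : Plan.LiveInsWithoutIR)
      CachedTypes[Name] = Plan.CanonicalIVType;
  }

  // Returns the cached type, or nullopt if the value is unknown here.
  std::optional<std::string> inferScalarType(const std::string &Name) const {
    auto It = CachedTypes.find(Name);
    if (It == CachedTypes.end())
      return std::nullopt;
    return It->second;
  }
};
```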

PR: https://github.com/llvm/llvm-project/pull/80723