| Age | Commit message (Collapse) | Author |
|
delete ops (#166766)
Reland https://github.com/llvm/llvm-project/pull/165725, fix the Failed
test by removing successor operands before delete operations. Following
the deletion of cond.branch, its successor operands will subsequently be
removed.
|
|
(#166746)
Reverts llvm/llvm-project#165725. See
https://lab.llvm.org/buildbot/#/builders/169/builds/16768,
|
|
Fix https://github.com/llvm/llvm-project/issues/163051. Some Ops which
have multiple blocks, before deleting the ops, first remove the dead
parameters within its blocks.
|
|
This is still somehow a WIP, we have some issues with this interface
that are not trivial to solve. This patch tries to make the concepts of
RegionBranchPoint and RegionSuccessor more robust and aligned with their
definition:
- A `RegionBranchPoint` is either the parent (`RegionBranchOpInterface`)
op or a `RegionBranchTerminatorOpInterface` operation in a nested
region.
- A `RegionSuccessor` is either one of the nested region or the parent
`RegionBranchOpInterface`
Some new methods with reasonnable default implementation are added to
help resolving the flow of values across the RegionBranchOpInterface.
It is still not trivial in the current state to walk the def-use chain
backward with this interface. For example when you have the 3rd block
argument in the entry block of a for-loop, finding the matching operands
requires to know about the hidden loop iterator block argument and where
the iterargs start. The API is designed around forward-tracking of the
chain unfortunately.
Try to reland #161575 ; I suspect a buildbot incremental build issue.
|
|
Reverts llvm/llvm-project#161575
Broke Windows on ARM buildbot build, needs investigations.
|
|
This is still somehow a WIP, we have some issues with this interface
that are not trivial to solve. This patch tries to make the concepts of
RegionBranchPoint and RegionSuccessor more robust and aligned with their
definition:
- A `RegionBranchPoint` is either the parent (`RegionBranchOpInterface`)
op or a `RegionBranchTerminatorOpInterface` operation in a nested
region.
- A `RegionSuccessor` is either one of the nested region or the parent
`RegionBranchOpInterface`
Some new methods with reasonnable default implementation are added to
help resolving the flow of values across the RegionBranchOpInterface.
It is still not trivial in the current state to walk the def-use chain
backward with this interface. For example when you have the 3rd block
argument in the entry block of a for-loop, finding the matching operands
requires to know about the hidden loop iterator block argument and where
the iterargs start. The API is designed around forward-tracking of the
chain unfortunately.
|
|
arguments (#160755)
In #153973 I added the correctly handling of block arguments,
unfortunately this was gated on operation that also have results. This
wasn't intentional and this excluded operations like function from being
correctly processed.
|
|
## Problem
`RemoveDeadValues` can legally drop dead function arguments on private
`func.func` callees. But call-sites to such functions aren't fixed if
the call operation keeps its call arguments in a **segmented operand
group** (i.ie, uses `AttrSizedOperandSegments`), unless the call op
implements `getArgOperandsMutable` and the RDV pass actually uses it.
## Fix
When RDV decides to drop callee function args, it should, for each
call-site that implements `CallOpInterface`, **shrink the call's
argument segment** via `getArgOperandsMutable()` using the same dead-arg
indices. This keeps both the flat operand list and the
`operand_segment_sizes` attribute in sync (that's what
`MutableOperandRange` does when bound to the segment).
## Note
This change is a no-op for:
* call ops without segment operands (they still get their flat operands
erased via the generic path)
* call ops whose calle args weren't dropped (public, external,
non-`func-func`, unresolved symbol, etc)
* `llvm.call`/`llvm.invoke` (RDV doesn't drop `llvm.func` args
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
|
|
last block is not the exit. (#156123)
'processFuncOp' queries the number of returned values of a function
using the terminator of the last block's getNumOperands(). It presumes
the last block is the exit. It is not always the case.
This patch fixes the bug by querying from FunctionInterfaceOp directly.
|
|
This patch is forcing all values to be initialized by the
LivenessAnalysis, even in dead blocks. The dataflow framework will skip
visiting values when its already knows that a block is dynamically
unreachable, so this requires specific handling.
Downstream code could consider that the absence of liveness is the same
a "dead".
However as the code is mutated, new value can be introduced, and a
transformation like "RemoveDeadValue" must conservatively consider that
the absence of liveness information meant that we weren't sure if a
value was dead (it could be a newly introduced value.
Fixes #153906
|
|
|
|
Operation streaming (#150636)
|
|
Change local variants to use new central one.
|
|
This patch fixes a bug in the RemoveDeadValues pass where unused
function arguments were not removed from the function signature in an
edge case where the function returns void.
A corresponding test was added to the MLIR LIT test suite to cover this
case.
|
|
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
|
|
Simple IR patterns will trigger assertion error:
```
func.func @test_zero_operands(%I: memref<10xindex>, %I2: memref<10xf32>) {
%v0 = arith.constant 0 : index
%result = memref.alloca_scope -> index {
%c = arith.addi %v0, %v0 : index
memref.store %c, %I[%v0] : memref<10xindex>
memref.alloca_scope.return %c: index
}
func.return
}
```
with error: `mlir/include/mlir/IR/Operation.h:988:
mlir::detail::OperandStorage& mlir::Operation::getOperandStorage():
Assertion `hasOperandStorage && "expected operation to have operand
storage"' failed.`
This PR will fix this issue.
---------
Co-authored-by: Andrzej Warzyński <andrzej.warzynski@gmail.com>
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
|
|
(#144695)
Debugging issues with this pass is quite difficult at the moment, this
should help.
|
|
|
|
`FunctionOpInterface` assumed the fact that the function type (attribute
of the operation) can be cloned with arbirary lists of function
arguments and results to support argument and result list mutation. This
is not always correct, in particular, LLVM dialect functions require
exactly one result making it impossible to erase the result.
Allow function type cloning to fail and propagate this failure through
various APIs that use it. The common assumption is that existing IR has
not been modified.
Fixes #131142.
Reland a8c7ecdcbc3e89b493b495c6831cc93671c3b844 / #136300.
|
|
This reverts commit 20a104a7d6423784dab04371a5ca728cc27a15a9.
Buildbot failure:
https://lab.llvm.org/buildbot/#/builders/157/builds/25688
I've reproduced the build failure.
|
|
`FunctionOpInterface` assumed the fact that the function type (attribute
of the operation) can be cloned with arbirary lists of function
arguments and results to support argument and result list mutation. This
is not always correct, in particular, LLVM dialect functions require
exactly one result making it impossible to erase the result.
Allow function type cloning to fail and propagate this failure through
various APIs that use it. The common assumption is that existing IR has
not been modified.
Fixes #131142.
|
|
Fixes: #131765
|
|
This PR adds process for RegionBranchOp with empty region, such as
'else' region of `scf.if`. Fixes #123246.
|
|
(#121079)
This is a follow-up for https://github.com/llvm/llvm-project/pull/119110
and a fix for https://github.com/llvm/llvm-project/issues/118450
RemoveDeadValues used to delete Values and analyzing the IR at the same
time, because of that, `isMemoryEffectFree` got invalid IR with
half-deleted linalg.generic operation. This PR separates analysis and
cleanup to prevent such situation.
Thank you!
---------
Co-authored-by: Renat Idrisov <parsifal-47@users.noreply.github.com>
Co-authored-by: Andrzej Warzyński <andrzej.warzynski@gmail.com>
|
|
Fixing RemoveDeadValues to properly remove arguments from
BranchOpInterface operations.
This is a follow-up for:
https://github.com/llvm/llvm-project/pull/117405
cc: @joker-eph @codemzs
---------
Co-authored-by: Renat Idrisov <parsifal-47@users.noreply.github.com>
|
|
`FunctionOpInterface` (#116573)
This PR fixes a bug in `RemoveDeadValues` where the
`FunctionOpInterface` does not have the `isDeclaration` method. As a
result, we should use the `isExternal` method instead. Fixes #116347.
|
|
(#117405)
This change removes the restriction on `SymbolUserOpInterface` operators
so they can be used with operators that implement `SymbolOpInterface`,
example:
`memref.global` implements `SymbolOpInterface` so it can be used with
`memref.get_global` which implements `SymbolUserOpInterface`
```
// Define a global constant array
memref.global "private" constant @global_array : memref<10xi32> = dense<[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]> : tensor<10xi32>
// Access this global constant within a function
func @use_global() {
%0 = memref.get_global @global_array : memref<10xi32>
}
```
Reference: https://github.com/llvm/llvm-project/pull/116519 and
https://discourse.llvm.org/t/question-on-criteria-for-acceptable-ir-in-removedeadvaluespass/83131
---------
Co-authored-by: Zeeshan Siddiqui <mzs@ntdev.microsoft.com>
|
|
values removed (#116519)
This change is related to discussion:
https://discourse.llvm.org/t/question-on-criteria-for-acceptable-ir-in-removedeadvaluespass/83131
I do not know the original reason to disallow the optimization on
modules with global private constant. Please let me know what am I
missing, I will be happy to make it better. Thank you!
CC: @Wheest
---------
Co-authored-by: Renat Idrisov <parsifal-47@users.noreply.github.com>
|
|
This PR adds `signalPassFailure` in RemoveDeadValues to ensure that a
pipeline would stop here.
Fixes #111757.
|
|
(#109990)
Fixes #107870.
We can allow the enclosing Module operation to have a symbol.
The check was likely originally not considering this case and intended
to catch symbols inside the region, not accounting that the walk would
visit the enclosing operation.
|
|
This patch skips `RemoveDeadValues` if funcOp is declaration, which
fixes a crash.
Fixes #107546.
|
|
This PR adds a check for IsTerminator trait to prevent deletion of ops
like gpu.terminator as a "simple op" by RemoveDeadValues pass.
|
|
CallableOpInterface
Functions are always callable operations and thus every operation
implementing the `FunctionOpInterface` also implements the
`CallableOpInterface`. The only exception was the FuncOp in the toy
example. To make implementation of the `FunctionOpInterface` easier,
this commit lets `FunctionOpInterface` inherit from
`CallableOpInterface` and merges some of their methods. More precisely,
the `CallableOpInterface` has methods to get the argument and result
attributes and a method to get the result types of the callable region.
These methods are always implemented the same way as their analogues in
`FunctionOpInterface` and thus this commit moves all the argument and
result attribute handling methods to the callable interface as well as
the methods to get the argument and result types. The
`FuntionOpInterface` then does not have to declare them as well, but
just inherits them from the `CallableOpInterface`.
Adding the inheritance relation also required to move the
`FunctionOpInterface` from the IR directory to the Interfaces directory
since IR should not depend on Interfaces.
Reviewed By: jpienaar, springerm
Differential Revision: https://reviews.llvm.org/D157988
|
|
`RegionBranchOpInterface`"
This reverts commit b26bb30b467b996c9786e3bd426c07684d84d406.
|
|
`RegionBranchOpInterface`"
This reverts commit 024f562da67180b7be1663048c960b26c2cc16f8.
Forgot to update flang
|
|
The current implementation is not very ergonomic or descriptive: It uses `std::optional<unsigned>` where `std::nullopt` represents the parent op and `unsigned` is the region number.
This doesn't give us any useful methods specific to region control flow and makes the code fragile to changes due to now taking the region number into account.
This patch introduces a new type called `RegionBranchPoint`, replacing all uses of `std::optional<unsigned>` in the interface. It can be implicitly constructed from a region or a `RegionSuccessor`, can be compared with a region to check whether the branch point is branching from the parent, adds `isParent` to check whether we are coming from a parent op and adds `RegionSuccessor::parent` as a descriptive way to indicate branching from the parent.
Differential Revision: https://reviews.llvm.org/D159116
|
|
This commit fixes an error in the `RemoveDeadValues` pass that is
associated with its incorrect usage of the `cloneInto()` function.
The `setOperands()` function that is used by the `cloneInto()` function
requires all operands to not be null. But, that is not possible in this
pass because we drop uses of dead values, thus making them null. It is
only at the end of the pass that we are assured that such null values
won't exist but during the execution of the pass, there could be null
values.
To fix this, we replace the usage of the `cloneInto()` function to copy
a region with `moveBlock()` to move each block of the region one by one.
This function does not require the presence of non-null values and is
thus the right choice here. This implementation is also more opttimized
because we are moving things instead of copying them. The goal was
always moving.
Signed-off-by: Srishti Srivastava <srishtisrivastava.ai@gmail.com>
Reviewed By: srishti-pm
Differential Revision: https://reviews.llvm.org/D158941
|
|
Large deep learning models rely on heavy computations. However, not
every computation is necessary. And, even when a computation is
necessary, it helps if the values needed for the computation are
available in registers (which have low-latency) rather than being in
memory (which has high-latency).
Compilers can use liveness analysis to:-
(1) Remove extraneous computations from a program before it executes on
hardware, and,
(2) Optimize register allocation.
Both these tasks help achieve one very important goal: reducing runtime.
Recently, liveness analysis was added to MLIR. Thus, this commit uses
the recently added liveness analysis utility to try to accomplish task
(1).
It adds a pass called `remove-dead-values` whose goal is
optimization (reducing runtime) by removing unnecessary instructions.
Unlike other passes that rely on local information gathered from
patterns to accomplish optimization, this pass uses a full analysis of
the IR, specifically, liveness analysis, and is thus more powerful.
Currently, this pass performs the following optimizations:
(A) Removes function arguments that are not live,
(B) Removes function return values that are not live across all callers of
the function,
(C) Removes unneccesary operands, results, region arguments, region
terminator operands of region branch ops, and,
(D) Removes simple and region branch ops that have all non-live results and
don't affect memory in any way,
iff
the IR doesn't have any non-function symbol ops, non-call symbol user ops
and branch ops.
Here, a "simple op" refers to an op that isn't a symbol op, symbol-user op,
region branch op, branch op, region branch terminator op, or return-like.
It is noteworthy that we do not refer to non-live values as "dead" in this
file to avoid confusing it with dead code analysis's "dead", which refers to
unreachable code (code that never executes on hardware) while "non-live"
refers to code that executes on hardware but is unnecessary. Thus, while the
removal of dead code helps little in reducing runtime, removing non-live
values should theoretically have significant impact (depending on the amount
removed).
It is also important to note that unlike other passes (like `canonicalize`)
that apply op-specific optimizations through patterns, this pass uses
different interfaces to handle various types of ops and tries to cover all
existing ops through these interfaces.
It is because of its reliance on (a) liveness analysis and (b) interfaces
that makes it so powerful that it can optimize ops that don't have a
canonicalizer and even when an op does have a canonicalizer, it can perform
more aggressive optimizations, as observed in the test files associated with
this pass.
Example of optimization (A):-
```
int add_2_to_y(int x, int y) {
return 2 + y
}
print(add_2_to_y(3, 4))
print(add_2_to_y(5, 6))
```
becomes
```
int add_2_to_y(int y) {
return 2 + y
}
print(add_2_to_y(4))
print(add_2_to_y(6))
```
Example of optimization (B):-
```
int, int get_incremented_values(int y) {
store y somewhere in memory
return y + 1, y + 2
}
y1, y2 = get_incremented_values(4)
y3, y4 = get_incremented_values(6)
print(y2)
```
becomes
```
int get_incremented_values(int y) {
store y somewhere in memory
return y + 2
}
y2 = get_incremented_values(4)
y4 = get_incremented_values(6)
print(y2)
```
Example of optimization (C):-
Assume only `%result1` is live here. Then,
```
%result1, %result2, %result3 = scf.while (%arg1 = %operand1, %arg2 = %operand2) {
%terminator_operand2 = add %arg2, %arg2
%terminator_operand3 = mul %arg2, %arg2
%terminator_operand4 = add %arg1, %arg1
scf.condition(%terminator_operand1) %terminator_operand2, %terminator_operand3, %terminator_operand4
} do {
^bb0(%arg3, %arg4, %arg5):
%terminator_operand6 = add %arg4, %arg4
%terminator_operand5 = add %arg5, %arg5
scf.yield %terminator_operand5, %terminator_operand6
}
```
becomes
```
%result1, %result2 = scf.while (%arg2 = %operand2) {
%terminator_operand2 = add %arg2, %arg2
%terminator_operand3 = mul %arg2, %arg2
scf.condition(%terminator_operand1) %terminator_operand2, %terminator_operand3
} do {
^bb0(%arg3, %arg4):
%terminator_operand6 = add %arg4, %arg4
scf.yield %terminator_operand6
}
```
It is interesting to see that `%result2` won't be removed even though it is
not live because `%terminator_operand3` forwards to it and cannot be
removed. And, that is because it also forwards to `%arg4`, which is live.
Example of optimization (D):-
```
int square_and_double_of_y(int y) {
square = y ^ 2
double = y * 2
return square, double
}
sq, do = square_and_double_of_y(5)
print(do)
```
becomes
```
int square_and_double_of_y(int y) {
double = y * 2
return double
}
do = square_and_double_of_y(5)
print(do)
```
Signed-off-by: Srishti Srivastava <srishtisrivastava.ai@gmail.com>
Reviewed By: matthiaskramm, Mogball, jcai19
Differential Revision: https://reviews.llvm.org/D157049
|