| Age | Commit message | Author |
|
(NFC) (#165658)
Simplify the implementation of `getCalledFunction` using
`resolveCallableInTable`.
|
|
As part of 2646c36a864aa6a62bc1280e9a8cd2bcd2695349,
`OneShotModuleBufferize` no longer descends into nested symbol tables;
users who want that behavior are advised to use a pass pipeline or a
custom pass instead. This did not cover the use case of ops that are
not ModuleOps. The patch updates `OneShotModuleBufferize` to work on
any general op.
|
|
bufferization (#142099)
The current algorithm that searches for circular function calls scales quadratically because of the linear scan of the functions vector that is performed for each element of that vector. The PR replaces it with an O(V + E) version based on Kahn's algorithm for topological sorting, where V is the number of functions and E is the number of function calls.
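A minimal, self-contained sketch of Kahn's algorithm applied to a call graph is given below. It is not the code from the PR: the string-keyed `CallGraph`, the `CallOrder` struct and the `orderByCalls` name are assumptions made for illustration. Callees are emitted before their callers, and any function left over is part of a cycle or calls into one. With hash maps, the expected running time is O(V + E).

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical call graph: maps each function name to the functions it calls.
// Every callee must also appear as a key.
using CallGraph = std::unordered_map<std::string, std::vector<std::string>>;

struct CallOrder {
  std::vector<std::string> ordered;   // callees appear before their callers
  std::vector<std::string> remaining; // functions in, or depending on, a cycle
};

CallOrder orderByCalls(const CallGraph &calls) {
  // Number of outgoing call edges of each function that are not yet resolved.
  std::unordered_map<std::string, std::size_t> pendingCallees;
  // Reverse edges: for each function, the functions that call it.
  std::unordered_map<std::string, std::vector<std::string>> callers;
  for (const auto &[caller, callees] : calls) {
    pendingCallees[caller] = callees.size();
    for (const std::string &callee : callees) {
      assert(calls.count(callee) && "every callee must be a key of the graph");
      callers[callee].push_back(caller);
    }
  }

  // Start with the functions that call nothing.
  std::deque<std::string> ready;
  for (const auto &[name, count] : pendingCallees)
    if (count == 0)
      ready.push_back(name);

  CallOrder result;
  while (!ready.empty()) {
    std::string name = ready.front();
    ready.pop_front();
    result.ordered.push_back(name);
    // Each call edge is visited exactly once here.
    for (const std::string &caller : callers[name])
      if (--pendingCallees[caller] == 0)
        ready.push_back(caller);
  }

  // Anything with unresolved callees could not be ordered.
  for (const auto &[name, count] : pendingCallees)
    if (count != 0)
      result.remaining.push_back(name);
  return result;
}
```

For a diamond-shaped graph (`main` calls `a` and `b`, both of which call `c`), `ordered` starts with `c` and ends with `main`, while two mutually recursive functions end up in `remaining`.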
|
|
Address TODO regarding the recomputation of symbol tables. The signature of the `getFuncOpsOrderedByCalls` function is modified to receive the collection of cached symbol tables.
|
|
interface methods (#141466)
The PR continues the work started in #141019 by adding the `BufferizationState` class also to the `getBufferType` and `resolveConflicts` interface methods, together with the additional support functions that are used throughout the bufferization infrastructure.
|
|
Follow-up on #138143, which was reverted due to a missing update to a method signature (more specifically, the bufferization interface for `tensor::ConcatOp`) that was not caught before merging. The old PR description is reproduced in the next lines.
This PR is a follow-up on https://github.com/llvm/llvm-project/pull/138125, and adds a bufferization state class providing information about the IR. The information currently consists of a cached list of symbol tables, which aims to solve the quadratic scaling of the bufferization task with respect to the number of symbols. The PR breaks API compatibility: the bufferize method of the BufferizableOpInterface has been enriched with a reference to a BufferizationState object.
The bufferization state must be kept in a valid state by the interface implementations. For example, if an operation with the Symbol trait is inserted or replaced, its parent SymbolTable must be updated accordingly (see, for example, the bufferization of arith::ConstantOp, where the symbol table of the module gets the new global symbol inserted). Similarly, the invalidation of a symbol table must be performed if an operation with the SymbolTable trait is removed (this can be performed using the invalidateSymbolTable method, introduced in https://github.com/llvm/llvm-project/pull/138014).
|
|
(#141012)
Reverts llvm/llvm-project#138143
The PR for the BufferizationState is temporarily reverted due to API incompatibilities that were initially missed during the update and were not caught by PR checks.
|
|
This PR is a follow-up on #138125, and adds a bufferization state class providing information about the IR. The information currently consists of a cached list of symbol tables, which aims to solve the quadratic scaling of the bufferization task with respect to the number of symbols. The PR breaks API compatibility: the `bufferize` method of the `BufferizableOpInterface` has been enriched with a reference to a `BufferizationState` object.
The bufferization state must be kept in a valid state by the interface implementations. For example, if an operation with the `Symbol` trait is inserted or replaced, its parent `SymbolTable` must be updated accordingly (see, for example, the bufferization of `arith::ConstantOp`, where the symbol table of the module gets the new global symbol inserted). Similarly, the invalidation of a symbol table must be performed if an operation with the `SymbolTable` trait is removed (this can be performed using the `invalidateSymbolTable` method, introduced in #138014).
|
|
During bufferization, the callee of each `func::CallOp` / `CallableOpInterface` operation is retrieved by means of a symbol table that is temporarily built for the lookup purpose. The creation of the symbol table requires a linear scan of the operation body (e.g., a linear scan of the `ModuleOp` body). Considering that functions are typically called at least once, this leads to a scaling behavior that is quadratic with respect to the number of symbols. The problem is described in the following Discourse topic: https://discourse.llvm.org/t/quadratic-scaling-of-bufferization/86122/
This patch aims to partially address this scaling issue by leveraging the `SymbolTableCollection` class, whose instance is added to the `FuncAnalysisState` extension. Later modifications are also expected to address the problem in other methods required by `BufferizableOpInterface` (e.g., `bufferize` and `getBufferType`), which suffer from the same problem but do not provide access to any bufferization state.
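The effect of caching can be illustrated with a toy model. The sketch below does not use the MLIR API: `Module`, `SymbolTableCache`, `lookup` and `invalidate` are hypothetical names, and a "symbol table" is just a hash map from function name to index. The point is only that the linear scan happens once per module instead of once per lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy stand-in for an op that holds symbols: a list of function names.
struct Module {
  std::vector<std::string> functions;
};

// Hypothetical cache: the name -> index table of a module is built on first
// use and reused for every later lookup.
class SymbolTableCache {
  using Table = std::unordered_map<std::string, std::size_t>;

public:
  // Returns the index of `name` in `mod.functions`, or -1 if it is absent.
  long lookup(const Module &mod, const std::string &name) {
    auto it = tables.find(&mod);
    if (it == tables.end()) {
      // The linear scan happens only here, once per module.
      ++tablesBuilt;
      Table table;
      for (std::size_t i = 0; i < mod.functions.size(); ++i)
        table.emplace(mod.functions[i], i);
      it = tables.emplace(&mod, std::move(table)).first;
    }
    auto entry = it->second.find(name);
    return entry == it->second.end() ? -1 : static_cast<long>(entry->second);
  }

  // Drops the cached table; needed whenever the module's symbols change.
  void invalidate(const Module &mod) { tables.erase(&mod); }

  std::size_t tablesBuilt = 0; // counts how many scans were performed

private:
  std::unordered_map<const Module *, Table> tables;
};
```

Resolving the callee of N calls through one such cache performs a single scan, whereas building a fresh table for each call performs N scans. The cache is only correct if it is invalidated or updated when symbols are added or removed.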
|
|
bufferize nested symbol tables (#127726)
The existing OneShotModuleBufferize will analyze and bufferize
operations which are in nested symbol tables (e.g. nested
`builtin.module`, `gpu.module`, or similar operations). This
behavior is untested and likely unintentional given other
limitations of OneShotModuleBufferize (`func.call` can't call
into nested symbol tables). This change reverses the existing
behavior so that the operations considered by the analysis and
bufferization exclude any operations in nested symbol table
scopes. Users who desire to bufferize nested modules can still do
so by applying the transformation in a pass pipeline or in a
custom pass. This further enables controlling the order in which
modules are bufferized as well as allowing use of different
options for different kinds of modules.
|
|
Delete `equivalenceAnalysis`, which has been incorporated into the
`getAliasingValues` API. Also add an additional test case to ensure that
equivalence is properly propagated across function boundaries.
|
|
Multiple `func.return` ops inside of a `func.func` op are now supported
during bufferization. This PR extends the code base in 3 places:
- When inferring function return types, `memref.cast` ops are folded
away only if all `func.return` ops have matching buffer types. (E.g., we
don't fold if two `return` ops have operands with different layout
maps.)
- The alias sets of all `func.return` ops are merged. That's because
aliasing is a "may be" property.
- The equivalence sets of all `func.return` ops are taken only if they
match. If different `func.return` ops have different equivalence sets
for their operands, the equivalence information is dropped. That's
because equivalence is a "must be" property.
This commit is in preparation of removing the deprecated
`func-bufferize` pass. That pass can bufferize functions with multiple
`return` ops.
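The "may be" versus "must be" distinction can be sketched as follows. This is an illustrative model, not the code of the commit: `ReturnInfo` and `mergeReturns` are hypothetical names, and each `return` op is summarized for a single function result by the set of argument indices it may alias and by the argument it is equivalent to, if any.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <set>
#include <vector>

// Summary of one `return` op for one function result (hypothetical model).
struct ReturnInfo {
  std::set<int> mayAlias;           // argument indices the value may alias
  std::optional<int> equivalentArg; // argument the value is equivalent to
};

// Merges the summaries of all `return` ops of a function for one result.
ReturnInfo mergeReturns(const std::vector<ReturnInfo> &returns) {
  assert(!returns.empty() && "expected at least one return op");
  ReturnInfo merged = returns.front();
  for (std::size_t i = 1; i < returns.size(); ++i) {
    // Aliasing is a "may be" property: take the union.
    merged.mayAlias.insert(returns[i].mayAlias.begin(),
                           returns[i].mayAlias.end());
    // Equivalence is a "must be" property: keep it only if all returns agree.
    if (merged.equivalentArg != returns[i].equivalentArg)
      merged.equivalentArg.reset();
  }
  return merged;
}
```

If two `return` ops alias arguments {0, 1} and {1, 2}, the merged alias set is {0, 1, 2}. If they are equivalent to different arguments, the merged result carries no equivalence information.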
|
|
This commit adds support for recursive function calls to One-Shot
Bufferize.
The analysis does not support recursive function calls. The function
body itself can be analyzed, but we cannot make any assumptions about
the aliasing relation between function result and function arguments.
Similarly, when looking at a `call` op, we do not know whether the
operands will bufferize to a memory read/write. In the absence of such
information, we have to conservatively assume that they do.
This commit is in preparation of removing the deprecated
`func-bufferize` pass. That pass can bufferize recursive functions.
|
|
(#113124)
This reverts commit 2026501cf107fcb3cbd51026ba25fda3af823941.
Failing bot:
* https://lab.llvm.org/staging/#/builders/125/builds/389
|
|
**Description:**
This PR replaces a part of `FuncOp` and `CallOp` with
`FunctionOpInterface` and `CallOpInterface` in `OneShotModuleBufferize`.
Also fix the error from an integration test in a previous PR
attempt. (https://github.com/llvm/llvm-project/pull/107295)
The below fixes skip `CallOpInterface` so that the assertions are not
triggered.
https://github.com/llvm/llvm-project/blob/8d780007625108a7f34e40efb8604b858e04c60c/mlir/lib/Dialect/Bufferization/Transforms/OneShotModuleBufferize.cpp#L254-L259
https://github.com/llvm/llvm-project/blob/8d780007625108a7f34e40efb8604b858e04c60c/mlir/lib/Dialect/Bufferization/Transforms/OneShotModuleBufferize.cpp#L311-L315
**Related Discord Discussion:**
[Link](https://discord.com/channels/636084430946959380/642426447167881246/1280556809911799900)
---------
Co-authored-by: erick-xanadu <110487834+erick-xanadu@users.noreply.github.com>
|
|
Reverts llvm/llvm-project#107295
This commit breaks an integration test:
```
build/bin/mlir-opt mlir/test/Integration/Dialect/Complex/CPU/correctness.mlir -one-shot-bufferize="bufferize-function-boundaries"
```
|
|
**Description:**
`OneShotModuleBufferize` deals with the bufferization of `FuncOp`,
`CallOp` and `ReturnOp` but they are hard-coded. Any custom
function-like operations will not be handled. The PR replaces a part of
`FuncOp` and `CallOp` with `FunctionOpInterface` and `CallOpInterface`
in `OneShotModuleBufferize` so that custom function ops and call ops can
be bufferized.
**Related Discord Discussion:**
[Link](https://discord.com/channels/636084430946959380/642426447167881246/1280556809911799900)
---------
Co-authored-by: erick-xanadu <110487834+erick-xanadu@users.noreply.github.com>
|
|
For more context on isa predicates, see:
https://github.com/llvm/llvm-project/pull/83753.
|
|
There is currently no lowering out of `ml_program` in the LLVM
repository. This change adds a lowering to `memref` so that it can be
lowered all the way to LLVM. This lowering was taken from the [reference
backend in
torch-mlir](https://github.com/llvm/torch-mlir/commit/f41695360019bde71d52ca7548944d5488779e12
).
I had tried implementing the `BufferizableOpInterface` for `ml_program`
instead of adding a new pass but that did not work because
`OneShotBufferize` does not visit module-level ops like
`ml_program.global`.
|
|
Add a new interface method to `BufferizableOpInterface`:
`hasTensorSemantics`. This method returns "true" if the op has tensor
semantics and should be bufferized.
Until now, we assumed that an op has tensor semantics if it has tensor
operands and/or tensor op results. However, there are ops like
`ml_program.global` that do not have any results/operands but must still
be bufferized (#75103). The new interface method can return "true" for
such ops.
This change also decouples `bufferization::bufferizeOp` a bit from the
func dialect.
|
|
Remove the `opFilter` and `copyBeforeWrite` function arguments. These
options can already be configured in the `options` object.
|
|
Cyclic function call graphs are generally not supported by One-Shot
Bufferize. However, they can be allowed when a function does not have
tensor arguments or results. This is because it is then no longer
necessary that the callee will be bufferized before the caller.
|
|
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call as has been decided as more consistent.
Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.
Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443
Implementation:
This patch updates all remaining uses of the deprecated functionality in
mlir/. This was done with clang-tidy as described below and further
modifications to GPUBase.td and OpenMPOpsInterfaces.td.
Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
additional check:
main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
them to a pure state.
```
ninja -C $BUILD_DIR clang-tidy
run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
-header-filter=mlir/ mlir/* -fix
rm -rf $BUILD_DIR/tools/mlir/**/*.inc
```
Differential Revision: https://reviews.llvm.org/D151542
|
|
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call as has been decided as more consistent.
Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.
Caveats include:
- This clang-tidy script probably has more problems.
- This only touches C++ code, so nothing that is being generated.
Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443
Implementation:
This first patch was created with the following steps. The intention is
to only do automated changes at first, so I waste less time if it's
reverted, and so the first mass change is more clear as an example to
other teams that will need to follow similar steps.
Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
additional check:
https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
them to a pure state.
4. Some changes have been deleted for the following reasons:
- Some files had a variable also named cast
- Some files had not included a header file that defines the cast
functions
- Some files are definitions of the classes that have the casting
methods, so the code still refers to the method instead of the
function without adding a prefix or removing the method declaration
at the same time.
```
ninja -C $BUILD_DIR clang-tidy
run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
-header-filter=mlir/ mlir/* -fix
rm -rf $BUILD_DIR/tools/mlir/**/*.inc
git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\
mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\
mlir/lib/**/IR/\
mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\
mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\
mlir/test/lib/Dialect/Test/TestTypes.cpp\
mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\
mlir/test/lib/Dialect/Test/TestAttributes.cpp\
mlir/unittests/TableGen/EnumsGenTest.cpp\
mlir/test/python/lib/PythonTestCAPI.cpp\
mlir/include/mlir/IR/
```
Differential Revision: https://reviews.llvm.org/D150123
|
|
Having to choose from only static or dynamic layout for all functions is limiting.
Differential Revision: https://reviews.llvm.org/D148074
|
|
The current bufferization on function boundaries works on `func.func`
and any call op implementing `CallOpInterface`. Then, an error is thrown
if there is a `CallOpInterface` op that is not `func.call`. This is
unnecessary and breaks the pass whenever such an op occurs (such as
`llvm.call`). This PR simply restricts the handling of call ops to
`func.call`.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D143724
|
|
There is no longer a need to keep the two separate. This is in preparation of reusing the same AnalysisState for tensor.empty elimination and One-Shot Bufferize (to address performance bottlenecks).
Differential Revision: https://reviews.llvm.org/D143313
|
|
This change is needed in order to set the flag when running the pass not via the command line.
It also allows simplifying the signature of some functions.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D143416
|
|
OneShotModuleBufferize fails if the input IR cannot be analyzed.
One can set CopyBeforeWrite=true in order to skip analysis.
In that case, a buffer copy is inserted on every write.
This leads to many copies, also in FuncOps that could be analyzed.
This change aims to copy buffers only when it is a must.
When running OneShotModuleBufferize with CopyBeforeWrite=false,
FuncOps whose names are specified in noAnalysisFuncFilter will not be
analyzed. Ops in these FuncOps will not be analyzed either.
They will be bufferized with CopyBeforeWrite=true,
while the other ops will be bufferized with CopyBeforeWrite=false.
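The resulting per-function behavior can be summarized with a small sketch. `FuncPlan` and `planFor` are hypothetical names used only for illustration; they are not part of the bufferization options API.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical per-function decision derived from the options described above.
struct FuncPlan {
  bool runAnalysis;     // whether the function body is analyzed
  bool copyBeforeWrite; // whether a copy is inserted on every write
};

FuncPlan planFor(const std::string &funcName, bool copyBeforeWrite,
                 const std::vector<std::string> &noAnalysisFuncFilter) {
  // With copy-before-write enabled globally, the analysis is skipped everywhere.
  if (copyBeforeWrite)
    return {false, true};
  // Functions named in the filter are not analyzed and copy on every write.
  bool filtered = std::find(noAnalysisFuncFilter.begin(),
                            noAnalysisFuncFilter.end(),
                            funcName) != noAnalysisFuncFilter.end();
  if (filtered)
    return {false, true};
  // All other functions are analyzed and copy only when required.
  return {true, false};
}
```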
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D142631
|
|
The analysis previously kept track of both OpOperand -> OpResult and OpResult -> OpOperand aliasing mappings. Only one mapping is needed; the other one can be inferred.
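Why one direction is enough can be seen in a toy model where operands and results are plain integer indices. `AliasMap` and `invert` are hypothetical names; the actual analysis works on `OpOperand` and `OpResult` objects.

```cpp
#include <cassert>
#include <map>
#include <set>

// Hypothetical aliasing relation: key -> set of values it aliases with.
using AliasMap = std::map<int, std::set<int>>;

// Derives the opposite direction of an aliasing relation, e.g.
// result -> operands from operand -> results.
AliasMap invert(const AliasMap &forward) {
  AliasMap inverse;
  for (const auto &[from, tos] : forward)
    for (int to : tos)
      inverse[to].insert(from);
  return inverse;
}
```

Storing both directions duplicates information and requires keeping the two maps in sync; deriving one from the other removes that burden at the cost of recomputation.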
Differential Revision: https://reviews.llvm.org/D142128
|
|
Print statistics about the number of alloc/deallocs and in-place/out-of-place bufferization.
Differential Revision: https://reviews.llvm.org/D139538
|
|
External functions have no body, so they cannot be analyzed. Assume conservatively that each tensor bbArg may be aliasing with each tensor result. Furthermore, assume that each function arg is read and written-to after bufferization. This default behavior can be controlled with `bufferization.access` (similar to `bufferization.memory_layout`) in test cases.
Also fix a bug in the dialect attribute verifier, which did not run for region argument attributes.
Differential Revision: https://reviews.llvm.org/D139517
|
|
SparsificationAndBufferizationPass
Reviewed By: aartbik, springerm
Differential Revision: https://reviews.llvm.org/D139218
|
|
Differential Revision: https://reviews.llvm.org/D135056
|
|
`DialectAnalysisState` is now `OneShotAnalysisState::Extension`.
This state extension mechanism is needed only for One-Shot Analysis, so it is moved from `BufferizableOpInterface.h` to `OneShotAnalysis.h`.
Extensions are now identified via TypeIDs instead of StringRefs. The API of state extensions is cleaned up and follows the same pattern as other extension mechanisms in MLIR (e.g., `transform::TransformState::Extension`).
Also delete some dead code.
Differential Revision: https://reviews.llvm.org/D135051
|
|
Expose `function-boundary-type-conversion` in `OneShotBufferizeOp`. To
reuse options between passes and transform operations, create a
`BufferizationEnums.td`.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D137833
|
|
Differential Revision: https://reviews.llvm.org/D137830
|
|
If this flag is set, the analysis is skipped and buffers are copied before every write.
Differential Revision: https://reviews.llvm.org/D133288
|
|
bufferization.writable is used in most cases instead. All remaining test cases are updated. Some code that is no longer needed is deleted.
Differential Revision: https://reviews.llvm.org/D129739
|
|
Another mechanical sweep to keep diff small for flip to _Prefixed.
|
|
|
|
With the recent refactorings, this class is no longer needed. We can use BufferizationOptions in all places where BufferizationState was used.
Differential Revision: https://reviews.llvm.org/D127653
|
|
This change updates the bufferization so that it utilizes the new TensorCopyInsertion pass. One-Shot Bufferize no longer calls the One-Shot Analysis. Instead, it relies on the TensorCopyInsertion pass to make the entire IR fully inplaceable. The `bufferize` implementations of all ops are simplified; they no longer have to account for out-of-place bufferization decisions. These were already materialized in the IR in the form of `bufferization.alloc_tensor` ops during the TensorCopyInsertion pass.
Differential Revision: https://reviews.llvm.org/D127652
|
|
bufferization
This simplifies the bufferization itself and is in preparation of connecting with the sparse compiler.
Differential Revision: https://reviews.llvm.org/D126814
|
|
CallOp results are not equivalent to an OpOperand if the OpOperand bufferizes out-of-place.
Differential Revision: https://reviews.llvm.org/D126813
|
|
Bufferize
Users should explicitly run `-buffer-results-to-out-params` instead.
The purpose of this change is to remove `finalizeBuffers`, which made it difficult to extend the bufferization to custom buffer types.
Differential Revision: https://reviews.llvm.org/D126253
|
|
OneShotModuleBufferize.cpp (NFC)
|
|
Analysis and bufferization can now be run separately.
Differential Revision: https://reviews.llvm.org/D126572
|
|
Now that analysis and bufferization are better separated, post-analysis steps are no longer needed. Users can directly interleave analysis and bufferization as needed.
Differential Revision: https://reviews.llvm.org/D126571
|
|
Differential Revision: https://reviews.llvm.org/D126182
|