summaryrefslogtreecommitdiff
path: root/mlir/lib/Dialect/Async/Transforms/AsyncParallelFor.cpp
AgeCommit message (Collapse)Author
2025-07-24[mlir][NFC] update `mlir/Dialect` create APIs (15/n) (#149921)Maksim Levental
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-23[mlir] Remove unused includes (NFC) (#150266)Kazu Hirata
These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.
2025-06-23switch type and value ordering for arith `Constant[XX]Op` (#144636)Skrai Pardus
This change standardizes the order of the parameters for `Constant[XXX] Ops` to match with all other `Op` `build()` constructors. In all instances of generated code for the MLIR dialects's Ops (that is the TableGen using the .td files to create the .h.inc/.cpp.inc files), the desired result type is always specified before the value. Examples: ``` // ArithOps.h.inc class ConstantOp : public ::mlir::Op<ConstantOp, ::mlir::OpTrait::ZeroRegions, ::mlir::OpTrait::OneResult, ::mlir::OpTrait::OneTypedResult<::mlir::Type>::Impl, ::mlir::OpTrait::ZeroSuccessors, ::mlir::OpTrait::ZeroOperands, ::mlir::OpTrait::OpInvariants, ::mlir::BytecodeOpInterface::Trait, ::mlir::OpTrait::ConstantLike, ::mlir::ConditionallySpeculatable::Trait, ::mlir::OpTrait::AlwaysSpeculatableImplTrait, ::mlir::MemoryEffectOpInterface::Trait, ::mlir::OpAsmOpInterface::Trait, ::mlir::InferIntRangeInterface::Trait, ::mlir::InferTypeOpInterface::Trait> { public: .... static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Type result, ::mlir::TypedAttr value); static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypedAttr value); static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::TypedAttr value); static void build(::mlir::OpBuilder &, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {}); static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {}); ... ``` ``` // ArithOps.h.inc class SubIOp : public ::mlir::Op<SubIOp, ::mlir::OpTrait::ZeroRegions, ::mlir::OpTrait::OneResult, ::mlir::OpTrait::OneTypedResult<::mlir::Type>::Impl, ::mlir::OpTrait::ZeroSuccessors, ::mlir::OpTrait::NOperands<2>::Impl, ::mlir::OpTrait::OpInvariants, ::mlir::BytecodeOpInterface::Trait, ::mlir::ConditionallySpeculatable::Trait, ::mlir::OpTrait::AlwaysSpeculatableImplTrait, ::mlir::MemoryEffectOpInterface::Trait, ::mlir::InferIntRangeInterface::Trait, ::mlir::arith::ArithIntegerOverflowFlagsInterface::Trait, ::mlir::OpTrait::SameOperandsAndResultType, ::mlir::VectorUnrollOpInterface::Trait, ::mlir::OpTrait::Elementwise, ::mlir::OpTrait::Scalarizable, ::mlir::OpTrait::Vectorizable, ::mlir::OpTrait::Tensorizable, ::mlir::InferTypeOpInterface::Trait> { public: ... static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Type result, ::mlir::Value lhs, ::mlir::Value rhs, ::mlir::arith::IntegerOverflowFlagsAttr overflowFlags); static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Value lhs, ::mlir::Value rhs, ::mlir::arith::IntegerOverflowFlagsAttr overflowFlags); static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::Value lhs, ::mlir::Value rhs, ::mlir::arith::IntegerOverflowFlagsAttr overflowFlags); static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Type result, ::mlir::Value lhs, ::mlir::Value rhs, ::mlir::arith::IntegerOverflowFlags overflowFlags = ::mlir::arith::IntegerOverflowFlags::none); static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Value lhs, ::mlir::Value rhs, ::mlir::arith::IntegerOverflowFlags overflowFlags = ::mlir::arith::IntegerOverflowFlags::none); static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::Value lhs, ::mlir::Value rhs, ::mlir::arith::IntegerOverflowFlags overflowFlags = ::mlir::arith::IntegerOverflowFlags::none); static void build(::mlir::OpBuilder &, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {}); static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {}); ... ``` In comparison, in the distinct case of `ConstantIntOp` and `ConstantFloatOp`, the ordering of the result type and the value is switched. Thus, this PR corrects the ordering of the aforementioned `Constant[XXX]Ops` to match with other constructors.
2025-04-28[MLIR][NFC] Retire let constructor for Async (#137461)lorenzo chelini
let constructor is legacy (do not use in tree!) since the tableGen backend emits most of the glue logic to build a pass. Note: The following constructor has been retired: ```cpp std::unique_ptr<Pass> createAsyncParallelForPass(bool asyncDispatch, int32_t numWorkerThreads, int32_t minTaskSize); ``` To update your codebase, replace it with the new options-based API: ```cpp AsyncParallelForPassOptions options{/*asyncDispatch=*/, /*numWorkerThreads=*/, /*minTaskSize=*/}; createAsyncParallelForPass(options); ```
2024-12-20[mlir] Enable decoupling two kinds of greedy behavior. (#104649)Jacques Pienaar
The greedy rewriter is used in many different flows and it has a lot of convenience (work list management, debugging actions, tracing, etc). But it combines two kinds of greedy behavior 1) how ops are matched, 2) folding wherever it can. These are independent forms of greedy and leads to inefficiency. E.g., cases where one need to create different phases in lowering and is required to applying patterns in specific order split across different passes. Using the driver one ends up needlessly retrying folding/having multiple rounds of folding attempts, where one final run would have sufficed. Of course folks can locally avoid this behavior by just building their own, but this is also a common requested feature that folks keep on working around locally in suboptimal ways. For downstream users, there should be no behavioral change. Updating from the deprecated should just be a find and replace (e.g., `find ./ -type f -exec sed -i 's|applyPatternsAndFoldGreedily|applyPatternsGreedily|g' {} \;` variety) as the API arguments hasn't changed between the two.
2024-10-18eliminating g++ warnings (#105520)Frank Schlimbach
Eliminating g++ warnings. Mostly declaring "[[maybe_unused]]", adding return statements where missing and fixing casts. @rengolin --------- Co-authored-by: Benjamin Maxwell <macdue@dueutil.tech> Co-authored-by: Renato Golin <rengolin@systemcall.eu>
2024-01-25[mlir][IR] Add rewriter API for moving operations (#78988)Matthias Springer
The pattern rewriter documentation states that "*all* IR mutations [...] are required to be performed via the `PatternRewriter`." This commit adds two functions that were missing from the rewriter API: `moveOpBefore` and `moveOpAfter`. After an operation was moved, the `notifyOperationInserted` callback is triggered. This allows listeners such as the greedy pattern rewrite driver to react to IR changes. This commit narrows the discrepancy between the kind of IR modification that can be performed and the kind of IR modifications that can be listened to.
2023-12-20[mlir][SCF] `scf.parallel`: Make reductions part of the terminator (#75314)Matthias Springer
This commit makes reductions part of the terminator. Instead of `scf.yield`, `scf.reduce` now terminates the body of `scf.parallel` ops. `scf.reduce` may contain an arbitrary number of reductions, with one region per reduction. Example: ```mlir %init = arith.constant 0.0 : f32 %r:2 = scf.parallel (%iv) = (%lb) to (%ub) step (%step) init (%init, %init) -> f32, f32 { %elem_to_reduce1 = load %buffer1[%iv] : memref<100xf32> %elem_to_reduce2 = load %buffer2[%iv] : memref<100xf32> scf.reduce(%elem_to_reduce1, %elem_to_reduce2 : f32, f32) { ^bb0(%lhs : f32, %rhs: f32): %res = arith.addf %lhs, %rhs : f32 scf.reduce.return %res : f32 }, { ^bb0(%lhs : f32, %rhs: f32): %res = arith.mulf %lhs, %rhs : f32 scf.reduce.return %res : f32 } } ``` `scf.reduce` operations can no longer be interleaved with other ops in the body of `scf.parallel`. This simplifies the op and makes it possible to assign the `RecursiveMemoryEffects` trait to `scf.reduce`. (This was not possible before because the op was not a terminator, causing the op to be DCE'd.)
2023-09-19[mlir][Interfaces] `LoopLikeOpInterface`: Support ops with multiple regions ↵Matthias Springer
(#66754) This commit implements `LoopLikeOpInterface` on `scf.while`. This enables LICM (and potentially other transforms) on `scf.while`. `LoopLikeOpInterface::getLoopBody()` is renamed to `getLoopRegions` and can now return multiple regions. Also fix a bug in the default implementation of `LoopLikeOpInterface::isDefinedOutsideOfLoop()`, which returned "false" for some values that are defined outside of the loop (in a nested op, in such a way that the value does not dominate the loop). This interface is currently only used for LICM and there is no way to trigger this bug, so no test is added.
2023-08-31[mlir] Move FunctionInterfaces to Interfaces directory and inherit from ↵Martin Erhart
CallableOpInterface Functions are always callable operations and thus every operation implementing the `FunctionOpInterface` also implements the `CallableOpInterface`. The only exception was the FuncOp in the toy example. To make implementation of the `FunctionOpInterface` easier, this commit lets `FunctionOpInterface` inherit from `CallableOpInterface` and merges some of their methods. More precisely, the `CallableOpInterface` has methods to get the argument and result attributes and a method to get the result types of the callable region. These methods are always implemented the same way as their analogues in `FunctionOpInterface` and thus this commit moves all the argument and result attribute handling methods to the callable interface as well as the methods to get the argument and result types. The `FuntionOpInterface` then does not have to declare them as well, but just inherits them from the `CallableOpInterface`. Adding the inheritance relation also required to move the `FunctionOpInterface` from the IR directory to the Interfaces directory since IR should not depend on Interfaces. Reviewed By: jpienaar, springerm Differential Revision: https://reviews.llvm.org/D157988
2023-01-20[MLIR] Remove scf.if builder with explicit result types and callbacksFrederik Gossen
Instead, use the builder and infer the return type based on the inner `yield` ops. Also, fix uses that do not create the terminator as required for the callback builders. Differential Revision: https://reviews.llvm.org/D142056
2023-01-12[mlir] Add operations to BlockAndValueMapping and rename it to IRMappingJeff Niu
The patch adds operations to `BlockAndValueMapping` and renames it to `IRMapping`. When operations are cloned, old operations are mapped to the cloned operations. This allows mapping from an operation to a cloned operation. Example: ``` Operation *opWithRegion = ... Operation *opInsideRegion = &opWithRegion->front().front(); IRMapping map Operation *newOpWithRegion = opWithRegion->clone(map); Operation *newOpInsideRegion = map.lookupOrNull(opInsideRegion); ``` Migration instructions: All includes to `mlir/IR/BlockAndValueMapping.h` should be replaced with `mlir/IR/IRMapping.h`. All uses of `BlockAndValueMapping` need to be renamed to `IRMapping`. Reviewed By: rriddle, mehdi_amini Differential Revision: https://reviews.llvm.org/D139665
2022-09-30[mlir:Async][NFC] Update Async API to use prefixed accessorsRiver Riddle
This doesn't flip the switch for prefix generation yet, that'll be done in a followup.
2022-09-29[mlir][arith] Change dialect name from Arithmetic to ArithJakub Kuderski
Suggested by @lattner in https://discourse.llvm.org/t/rfc-define-precise-arith-semantics/65507/22. Tested with: `ninja check-mlir check-mlir-integration check-mlir-mlir-spirv-cpu-runner check-mlir-mlir-vulkan-runner check-mlir-examples` and `bazel build --config=generic_clang @llvm-project//mlir:all`. Reviewed By: lattner, Mogball, rriddle, jpienaar, mehdi_amini Differential Revision: https://reviews.llvm.org/D134762
2022-08-31[MLIR] Update pass declarations to new autogenerated filesMichele Scuttari
The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure. Reviewed By: mehdi_amini, rriddle Differential Review: https://reviews.llvm.org/D132838
2022-08-30Revert "[MLIR] Update pass declarations to new autogenerated files"Michele Scuttari
This reverts commit 2be8af8f0e0780901213b6fd3013a5268ddc3359.
2022-08-30[MLIR] Update pass declarations to new autogenerated filesMichele Scuttari
The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure. Reviewed By: mehdi_amini, rriddle Differential Review: https://reviews.llvm.org/D132838
2022-06-20[mlir] move SCF headers to SCF/{IR,Transforms} respectivelyAlex Zinenko
This aligns the SCF dialect file layout with the majority of the dialects. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D128049
2022-04-18[mlir:NFC] Remove the forward declaration of FuncOp in the mlir namespaceRiver Riddle
FuncOp has been moved to the `func` namespace for a little over a month, the using directive can be dropped now.
2022-03-16[mlir:FunctionOpInterface] Rename the "type" attribute to "function_type"River Riddle
This removes any potential confusion with the `getType` accessors which correspond to SSA results of an operation, and makes it clear what the intent is (i.e. to represent the type of the function). Differential Revision: https://reviews.llvm.org/D121762
2022-03-08[mlir][NFC] Update the Builtin dialect to use "Both" accessorsRiver Riddle
Differential Revision: https://reviews.llvm.org/D121189
2022-03-01[mlir] Rename the Standard dialect to the Func dialectRiver Riddle
The last remaining operations in the standard dialect all revolve around FuncOp/function related constructs. This patch simply handles the initial renaming (which by itself is already huge), but there are a large number of cleanups unlocked/necessary afterwards: * Removing a bunch of unnecessary dependencies on Func * Cleaning up the From/ToStandard conversion passes * Preparing for the move of FuncOp to the Func dialect See the discussion at https://discourse.llvm.org/t/standard-dialect-the-final-chapter/6061 Differential Revision: https://reviews.llvm.org/D120624
2022-02-23[mlir] Async: update condition for dispatching block-aligned compute functionEugene Zhulenev
+ compare block size with the unrollable inner dimension + reduce nesting in the code and simplify a bit IR building Reviewed By: cota Differential Revision: https://reviews.llvm.org/D120075
2022-02-16[mlir] NFC Async: always use 'b' for the current builderEugene Zhulenev
Currently some of the nested IR building inconsistently uses `nb` and `b`, it's very easy to call wrong builder outside of the current scope, so for simplicity all builders are always called `b`, and in nested IR building regions they just shadow the "parent" builder. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D120003
2022-02-16[mlir] Async: create async.group inside the scf.if branchEugene Zhulenev
Reviewed By: cota Differential Revision: https://reviews.llvm.org/D119959
2022-02-07[mlir][NFC] Remove a few op builders that simply swap parameter orderRiver Riddle
Differential Revision: https://reviews.llvm.org/D119093
2022-02-02[mlir:Standard] Remove support for creating a `unit` ConstantOpRiver Riddle
This is completely unused upstream, and does not really have well defined semantics on what this is supposed to do/how this fits into the ecosystem. Given that, as part of splitting up the standard dialect it's best to just remove this behavior, instead of try to awkwardly fit it somewhere upstream. Downstream users are encouraged to define their own operations that clearly can define the semantics of this. This also uncovered several lingering uses of ConstantOp that weren't updated to use arith::ConstantOp, and worked during conversions because the constant was removed/converted into something else before verification. See https://llvm.discourse.group/t/standard-dialect-the-final-chapter/ for more discussion. Differential Revision: https://reviews.llvm.org/D118654
2022-02-02[mlir] Move SelectOp from Standard to ArithmeticRiver Riddle
This is part of splitting up the standard dialect. See https://llvm.discourse.group/t/standard-dialect-the-final-chapter/ for discussion. Differential Revision: https://reviews.llvm.org/D118648
2022-01-31[async] Get the number of worker threads from the runtime.bakhtiyar
Reviewed By: ezhulenev Differential Revision: https://reviews.llvm.org/D117751
2022-01-19[mlir] Make locations required when adding/creating block argumentsRiver Riddle
BlockArguments gained the ability to have locations attached a while ago, but they have always been optional. This goes against the core tenant of MLIR where location information is a requirement, so this commit updates the API to require locations. Fixes #53279 Differential Revision: https://reviews.llvm.org/D117633
2022-01-02Apply clang-tidy fixes for performance-unnecessary-value-param to MLIR (NFC)Mehdi Amini
Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D116250
2021-12-20[mlir] Switching accessors to prefixed form (NFC)Jacques Pienaar
Makes eventual prefixing flag flip smaller change.
2021-12-19Make AsyncParallelForRewrite parameterizable with a cost model which drives ↵bakhtiyar
deciding the parallelization granularity. Reviewed By: ezhulenev, mehdi_amini Differential Revision: https://reviews.llvm.org/D115423
2021-12-09[mlir] AsyncParallelFor: align block size to be a multiple of inner loops ↵Eugene Zhulenev
iterations Depends On D115263 By aligning block size to inner loop iterations parallel_compute_fn LLVM can later unroll and vectorize some of the inner loops with small number of trip counts. Up to 2x speedup in multiple benchmarks. Reviewed By: bkramer Differential Revision: https://reviews.llvm.org/D115436
2021-12-09[mlir] AsyncParallelFor: sink constants into the parallel compute functionEugene Zhulenev
With complex recursive structure of async dispatch function LLVM can't always propagate constants to the parallel_compute_fn and it often prevents optimizations like loop unrolling and vectorization. We help LLVM by pushing known constants into the parallel_compute_fn explicitly. Reviewed By: bkramer Differential Revision: https://reviews.llvm.org/D115263
2021-12-06[mlir] Improve async parallel for tests + fix typosEugene Zhulenev
Do load and store to verify that we process each element of the iteration space once. Reviewed By: cota Differential Revision: https://reviews.llvm.org/D115152
2021-11-24Promote readability by factoring out creation of min/max operation. Remove ↵bakhtiyar
unnecessary divisions. Reviewed By: ezhulenev Differential Revision: https://reviews.llvm.org/D110680
2021-10-13[MLIR] Replace std ops with arith dialect opsMogball
Precursor: https://reviews.llvm.org/D110200 Removed redundant ops from the standard dialect that were moved to the `arith` or `math` dialects. Renamed all instances of operations in the codebase and in tests. Reviewed By: rriddle, jpienaar Differential Revision: https://reviews.llvm.org/D110797
2021-09-28Remove unnecessary async group creates and awaits.bakhtiyar
Reviewed By: ezhulenev Differential Revision: https://reviews.llvm.org/D110605
2021-09-28Rename target block size to min task size for clarity.bakhtiyar
Reviewed By: ezhulenev Differential Revision: https://reviews.llvm.org/D110604
2021-08-02[mlir] Async: clone constants into async.execute functions and parallel ↵Eugene Zhulenev
compute functions Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D107007
2021-07-23[mlir] Async: special handling for parallel loops with zero iterationsEugene Zhulenev
Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D106590
2021-07-01[mlir][async] Remove unused variable. NFC.Benjamin Kramer
2021-06-29[mlir:Async] Change async-parallel-for block size/count calculationEugene Zhulenev
Depends On D105037 Avoid creating too many tasks when the number of workers is large. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D105126
2021-06-29[mlir:Async] Remove async operations if it is statically known that the ↵Eugene Zhulenev
parallel operation has a single compute block Depends On D104850 Add a test that verifies that canonicalization removes all async overheads if it is statically known that the scf.parallel operation will be computed using a single block. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D104891
2021-06-25[mlir:Async] Submit accidentally omitted changesEugene Zhulenev
Accidentally pushed old branches that did not include all the changes discussed in the PRs. https://reviews.llvm.org/rGd43b23608ad664f02f56e965ca78916bde220950 https://reviews.llvm.org/rG86ad0af87054c3cccd68d32e103a6f1f6c6194c7 Differential Revision: https://reviews.llvm.org/D104943
2021-06-25[mlir:Async] Implement recursive async work splitting for scf.parallel ↵Eugene Zhulenev
operation (async-parallel-for pass) Depends On D104780 Recursive work splitting instead of sequential async tasks submission gives ~20%-30% speedup in microbenchmarks. Algorithm outline: 1. Collapse scf.parallel dimensions into a single dimension 2. Compute the block size for the parallel operations from the 1d problem size 3. Launch parallel tasks 4. Each parallel task reconstructs its own bounds in the original multi-dimensional iteration space 5. Each parallel task computes the original parallel operation body using scf.for loop nest Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D104850
2021-06-25[mlir:Async] Add the size parameter to the async.groupEugene Zhulenev
Specify the `!async.group` size (the number of tokens that will be added to it) at construction time. `async.await_all` operation can potentially race with `async.execute` operations that keep updating the group, for this reason it is required to know upfront how many tokens will be added to the group. Reviewed By: ftynse, herhut Differential Revision: https://reviews.llvm.org/D104780
2021-04-13[mlir] Convert async dialect passes from function passes to op agnostic passesEugene Zhulenev
Differential Revision: https://reviews.llvm.org/D100401
2021-03-22[PatternMatch] Big mechanical rename OwningRewritePatternList -> ↵Chris Lattner
RewritePatternSet and insert -> add. NFC This doesn't change APIs, this just cleans up the many in-tree uses of these names to use the new preferred names. We'll keep the old names around for a couple weeks to help transitions. Differential Revision: https://reviews.llvm.org/D99127