| Age | Commit message (Collapse) | Author |
|
Identified with bugprone-unused-local-non-trivial-variable.
|
|
See
https://discourse.llvm.org/t/psa-opty-create-now-with-100-more-tab-complete/87339.
I plan to mark these as deprecated in
https://github.com/llvm/llvm-project/pull/164649.
|
|
(#164043)
This PR shifts from using the LLVM OpenMP enumerator bit flags to an
OpenMP dialect-specific enumerator. This lets the dialect represent map
types that are of no interest to the LLVM backend and runtime.
Primarily, these are things like
ref_ptr/ref_ptee/ref_ptr_ptee/attach_none/attach_always/attach_auto, which
are of interest to the compiler for certain transformations (primarily
in the FIR transformation passes dealing with mapping) but which the
runtime has no need to know about. It also means that if another OpenMP
implementation comes along, it will not need to stick to the bit flag
system LLVM chose or do legwork to work around it.
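The separation described above can be sketched roughly as follows: a dialect-level enumerator carries both runtime-relevant and compiler-only bits, and a lowering step drops the compiler-only bits when producing runtime flags. This is a minimal illustration only; `DialectMapFlags`, `RuntimeMapFlags`, `toRuntimeFlags`, and all bit values are hypothetical and do not reflect the actual OpenMP dialect or LLVM definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical dialect-level map flags. The low bits have a runtime
// counterpart; the high bits matter only to compiler transformations.
enum class DialectMapFlags : uint32_t {
  None = 0,
  To = 1u << 0,
  From = 1u << 1,
  // Compiler-only information (illustrative subset).
  RefPtr = 1u << 8,
  AttachAlways = 1u << 9,
};

constexpr DialectMapFlags operator|(DialectMapFlags a, DialectMapFlags b) {
  return static_cast<DialectMapFlags>(static_cast<uint32_t>(a) |
                                      static_cast<uint32_t>(b));
}

constexpr bool has(DialectMapFlags set, DialectMapFlags flag) {
  return (static_cast<uint32_t>(set) & static_cast<uint32_t>(flag)) != 0;
}

// Hypothetical runtime-facing bit flags (values chosen for this example).
enum RuntimeMapFlags : uint64_t {
  RT_NONE = 0x0,
  RT_TO = 0x1,
  RT_FROM = 0x2,
};

// Translate dialect flags to runtime flags. Compiler-only bits are dropped
// here because the runtime has no use for them.
inline uint64_t toRuntimeFlags(DialectMapFlags flags) {
  uint64_t result = RT_NONE;
  if (has(flags, DialectMapFlags::To))
    result |= RT_TO;
  if (has(flags, DialectMapFlags::From))
    result |= RT_FROM;
  // Only runtime-known bits may survive the translation.
  assert((result & ~static_cast<uint64_t>(RT_TO | RT_FROM)) == 0);
  return result;
}
```

With this shape, a different backend could supply its own translation function without the dialect enumerator changing.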
|
|
Extends `do concurrent` to OpenMP device mapping by adding support for
mapping `reduce` specifiers to omp `reduction` clauses. The changes
attach two `reduction` clauses to the mapped OpenMP construct: one on the
`teams` part of the construct and one on the `wsloop` part.
PR stack:
- https://github.com/llvm/llvm-project/pull/155754
- https://github.com/llvm/llvm-project/pull/155987
- https://github.com/llvm/llvm-project/pull/155992
- https://github.com/llvm/llvm-project/pull/155993
- https://github.com/llvm/llvm-project/pull/157638
- https://github.com/llvm/llvm-project/pull/156610 ◀️
- https://github.com/llvm/llvm-project/pull/156837
|
|
Extends support for mapping `do concurrent` on the device by adding
support for `local` specifiers. The changes in this PR map the local
variable to the `omp.target` op and use the mapped value as the
`private` clause operand in the nested `omp.parallel` op.
PR stack:
- https://github.com/llvm/llvm-project/pull/155754
- https://github.com/llvm/llvm-project/pull/155987
- https://github.com/llvm/llvm-project/pull/155992
- https://github.com/llvm/llvm-project/pull/155993
- https://github.com/llvm/llvm-project/pull/157638 ◀️
- https://github.com/llvm/llvm-project/pull/156610
- https://github.com/llvm/llvm-project/pull/156837
|
|
Upstreams further parts of the `do concurrent` to OpenMP conversion pass
from AMD's fork. This PR extends the pass by adding support for mapping
to the device.
PR stack:
- https://github.com/llvm/llvm-project/pull/155754
- https://github.com/llvm/llvm-project/pull/155987 ◀️
- https://github.com/llvm/llvm-project/pull/155992
- https://github.com/llvm/llvm-project/pull/155993
- https://github.com/llvm/llvm-project/pull/157638
- https://github.com/llvm/llvm-project/pull/156610
- https://github.com/llvm/llvm-project/pull/156837
|
|
`ConversionPatternRewriter::replaceAllUsesWith` (#155244)
This commit generalizes `replaceUsesOfBlockArgument` to
`replaceAllUsesWith`. In rollback mode, the same restrictions keep
applying: a value cannot be replaced multiple times and a call to
`replaceAllUsesWith` will replace all current and future uses of the
`from` value.
`replaceAllUsesWith` is now fully supported and its behavior is
consistent with the remaining dialect conversion API. Before this
commit, `replaceAllUsesWith` was immediately reflected in the IR when
running in rollback mode. After this commit, `replaceAllUsesWith`
changes are materialized in a delayed fashion, at the end of the dialect
conversion. This is consistent with the `replaceUsesOfBlockArgument` and
`replaceOp` APIs.
`replaceAllUsesExcept` etc. are still not supported and will be
deactivated on the `ConversionPatternRewriter` (when running in rollback
mode) in a follow-up commit.
Note for LLVM integration: Replace `replaceUsesOfBlockArgument` with
`replaceAllUsesWith`. If you are seeing failures, you may have patterns
that use `replaceAllUsesWith` incorrectly (e.g., being called multiple
times on the same value) or bypass the rewriter API entirely. E.g., such
failures were mitigated in Flang by switching to the walk-patterns
driver (#156171).
You can temporarily reactivate the old behavior by calling
`RewriterBase::replaceAllUsesWith`. However, note that that behavior is
faulty in a dialect conversion. E.g., the base
`RewriterBase::replaceAllUsesWith` implementation does not see uses of
the `from` value that have not materialized yet and will, therefore, not
replace them.
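The difference between immediate and delayed replacement can be illustrated with a toy model. The sketch below is not MLIR code and does not use the MLIR API: `Op`, `replaceUsesNow`, and `DelayedRewriter` are hypothetical stand-ins, with values modeled as plain integers. It only shows why a recorded mapping that is applied at the end also catches uses created after the call, while an immediate replacement misses them.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <vector>

// Toy IR: a value is an integer id, an op is a list of operand ids.
using Value = int;
struct Op {
  std::vector<Value> operands;
};

// Immediate replacement: only rewrites uses that exist right now.
inline void replaceUsesNow(std::vector<Op> &ops, Value from, Value to) {
  for (Op &op : ops)
    for (Value &operand : op.operands)
      if (operand == from)
        operand = to;
}

// Delayed replacement: records the mapping and applies it once at the end,
// so uses created after the call are rewritten as well.
class DelayedRewriter {
public:
  void replaceAllUsesWith(Value from, Value to) {
    assert(from != to && "replacing a value with itself is meaningless");
    // Mirrors the restriction that a value cannot be replaced twice.
    if (!mapping.emplace(from, to).second)
      throw std::logic_error("value replaced multiple times");
  }

  // Materialize all recorded replacements (single lookup, no chaining).
  void finalize(std::vector<Op> &ops) const {
    for (Op &op : ops)
      for (Value &operand : op.operands) {
        auto it = mapping.find(operand);
        if (it != mapping.end())
          operand = it->second;
      }
  }

private:
  std::map<Value, Value> mapping;
};
```

In this model, an op that is added between `replaceAllUsesWith` and `finalize` still has its operand rewritten, whereas the same op added after `replaceUsesNow` keeps the stale operand.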
|
|
OpenMP (#155355)
Fixes #155273
This PR introduces 2 changes:
1. The `do concurrent` to OpenMP pass is now a module pass rather than a
function pass.
2. Reduction ops are looked up in the parent module before being
created.
The benefit of using a module pass is that the same reduction operation
can be used across multiple functions if the reduction type matches.
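The lookup-before-create behavior in change 2 can be sketched as a get-or-create on a module-level table. This is a toy model under stated assumptions: `Module`, `ReductionDecl`, `getOrCreateReduction`, and the naming scheme in `reductionSymbol` are hypothetical and are not taken from the pass.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for a reduction declaration stored at module scope.
struct ReductionDecl {
  std::string symbolName;
  std::string elementType;
};

// Toy stand-in for a module with a symbol table of reduction declarations.
class Module {
public:
  // Return the existing declaration for `symbolName` if present; otherwise
  // create it. Functions in the same module then share one declaration.
  ReductionDecl &getOrCreateReduction(const std::string &symbolName,
                                      const std::string &elementType) {
    auto it = reductions.find(symbolName);
    if (it != reductions.end()) {
      assert(it->second.elementType == elementType &&
             "symbol reused with a different type");
      return it->second;
    }
    ++numCreated;
    return reductions
        .emplace(symbolName, ReductionDecl{symbolName, elementType})
        .first->second;
  }

  int numCreated = 0; // how many declarations were actually created

private:
  std::map<std::string, ReductionDecl> reductions;
};

// Hypothetical naming scheme: one symbol per (operation, type) pair.
inline std::string reductionSymbol(const std::string &op,
                                   const std::string &type) {
  return op + "_reduction_" + type;
}
```

Two functions that both need an `add` reduction over the same type would receive the same declaration, which is only possible when the lookup happens at module scope rather than per function.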
|
|
See https://github.com/llvm/llvm-project/pull/147168 for more info.
|
|
Now that we have changes introduced by #145837, mapping reductions from
`do concurrent` to OpenMP is almost trivial. This PR adds such mapping.
PR stack:
- https://github.com/llvm/llvm-project/pull/145837
- https://github.com/llvm/llvm-project/pull/146025
- https://github.com/llvm/llvm-project/pull/146028
- https://github.com/llvm/llvm-project/pull/146033 (this one)
|
|
Re-organizes the op definition a little bit and removes a method that
does not add much value to the API.
PR stack:
- https://github.com/llvm/llvm-project/pull/145837
- https://github.com/llvm/llvm-project/pull/146025
- https://github.com/llvm/llvm-project/pull/146028 (this one)
- https://github.com/llvm/llvm-project/pull/146033
|
|
regions) (#142795)
Extends support for locality specifiers to OpenMP translation by adding
support for translating localizers that have `init` and `dealloc` regions.
|
|
Starts the effort to map `do concurrent` locality specifiers to OpenMP
clauses. This PR adds support for basic specifiers (no `init` or `copy`
regions yet).
|
|
`fir.do_concurrent` op (#138489)
This PR updates the `do concurrent` to OpenMP mapping pass to use the
`fir.do_concurrent` ops that were recently added upstream
instead of handling nests of `fir.do_loop ... unordered` ops.
Parent PR: https://github.com/llvm/llvm-project/pull/137928.
|
|
This patch fixes:
flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp:184:18: error:
unused variable 'loc' [-Werror,-Wunused-variable]
|
|
Extends `do concurrent` mapping to handle "loop-local values". A
loop-local value is one that is used exclusively inside the loop but
allocated outside of it. This usually corresponds to temporary values
that are used inside the loop body, for example to initialize other
variables. After collecting these values, the pass localizes them to the
loop nest by moving their allocations.
PR stack:
- https://github.com/llvm/llvm-project/pull/126026
- https://github.com/llvm/llvm-project/pull/127595
- https://github.com/llvm/llvm-project/pull/127633
- https://github.com/llvm/llvm-project/pull/127634
- https://github.com/llvm/llvm-project/pull/127635 (this PR)
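The loop-local criterion described in this commit can be sketched as a simple predicate. The model below is hypothetical and far simpler than the pass: `ValueInfo`, `isLoopLocal`, and `localize` are illustrative names, uses are reduced to inside/outside flags, and requiring at least one use is an assumption of this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy description of a value: where it is allocated and where it is used.
struct ValueInfo {
  std::string name;
  bool allocatedInsideLoop;
  std::vector<bool> useIsInsideLoop; // one entry per use
};

// A value is loop-local when it is allocated outside the loop, has at least
// one use, and every use sits inside the loop.
inline bool isLoopLocal(const ValueInfo &value) {
  if (value.allocatedInsideLoop || value.useIsInsideLoop.empty())
    return false;
  for (bool inside : value.useIsInsideLoop)
    if (!inside)
      return false;
  return true;
}

// "Localize" loop-local values by marking their allocation as moved into
// the loop. Returns the names of the values that were moved.
inline std::vector<std::string> localize(std::vector<ValueInfo> &values) {
  std::vector<std::string> moved;
  for (ValueInfo &value : values) {
    if (!isLoopLocal(value))
      continue;
    value.allocatedInsideLoop = true;
    // Once moved, the value no longer counts as allocated outside.
    assert(!isLoopLocal(value));
    moved.push_back(value.name);
  }
  return moved;
}
```

A value with even one use outside the loop is left where it is, since moving its allocation would change what the outside use observes.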
|
|
Adds support for converting multi-range loops to OpenMP (on the host
only for now). The changes here "prepare" a loop nest for collapsing by
sinking iteration variables to the innermost `fir.do_loop` op in the
nest.
PR stack:
- https://github.com/llvm/llvm-project/pull/126026
- https://github.com/llvm/llvm-project/pull/127595
- https://github.com/llvm/llvm-project/pull/127633
- https://github.com/llvm/llvm-project/pull/127634 (this PR)
- https://github.com/llvm/llvm-project/pull/127635
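To illustrate what collapsing a multi-range loop means, the sketch below runs a rectangular nest as one flat loop and recovers the per-range indices inside the body. It is a generic illustration of loop collapsing, not the pass's implementation; `delinearize` and `visitCollapsed` are hypothetical helpers, and zero-based indices with the last range varying fastest are assumptions of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Recover per-dimension indices from a flat index over a collapsed
// rectangular iteration space. The last dimension varies fastest.
inline std::vector<std::size_t>
delinearize(std::size_t flat, const std::vector<std::size_t> &extents) {
  std::vector<std::size_t> indices(extents.size());
  for (std::size_t d = extents.size(); d-- > 0;) {
    assert(extents[d] > 0 && "extents must be positive");
    indices[d] = flat % extents[d];
    flat /= extents[d];
  }
  return indices;
}

// Visit every point of the nest through a single flat loop, the way a
// collapsed loop would. Returns the visited index tuples in order.
inline std::vector<std::vector<std::size_t>>
visitCollapsed(const std::vector<std::size_t> &extents) {
  std::size_t total = 1;
  for (std::size_t extent : extents)
    total *= extent;
  std::vector<std::vector<std::size_t>> visited;
  for (std::size_t flat = 0; flat < total; ++flat)
    visited.push_back(delinearize(flat, extents));
  return visited;
}
```

All index computation happens inside the single loop body, which loosely parallels why the iteration variables need to be available in the innermost loop before a nest can be collapsed.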
|
|
(#127633)
Upstreams one more part of the ROCm `do concurrent` to OpenMP mapping
pass. This PR adds support for converting simple loops to the equivalent
OpenMP constructs on the host: `omp parallel do`. Towards that end, we
have to collect more information about loop nests, for which we add new
utils in the `looputils` namespace.
PR stack:
- https://github.com/llvm/llvm-project/pull/126026
- https://github.com/llvm/llvm-project/pull/127595
- https://github.com/llvm/llvm-project/pull/127633 (this PR)
- https://github.com/llvm/llvm-project/pull/127634
- https://github.com/llvm/llvm-project/pull/127635
|
|
Upstreams the next part of the do concurrent to OpenMP mapping pass (from
AMD's ROCm implementation). See
https://github.com/llvm/llvm-project/pull/126026 for more context.
This PR adds loop nest detection logic. This enables us to discover
multi-range do concurrent loops and then map them as "collapsed" loop
nests to OpenMP.
This is a follow-up to
https://github.com/llvm/llvm-project/pull/126026; only the latest commit
is relevant.
This is a replacement for
https://github.com/llvm/llvm-project/pull/127478 using a
`/user/<username>/<branchname>` branch.
PR stack:
- https://github.com/llvm/llvm-project/pull/126026
- https://github.com/llvm/llvm-project/pull/127595 (this PR)
- https://github.com/llvm/llvm-project/pull/127633
- https://github.com/llvm/llvm-project/pull/127634
- https://github.com/llvm/llvm-project/pull/127635
|
|
This PR starts the effort to upstream AMD's internal implementation of
`do concurrent` to OpenMP mapping. This replaces #77285 since we
extended this WIP quite a bit on our fork over the past year.
An important part of this PR is a document that describes the current
status downstream, the upstreaming status, and next steps to make this
pass much more useful.
In addition to this document, this PR also contains the skeleton of the
pass (no useful transformations are done yet) and some testing for the
added command line options.
This looks like a huge PR but a lot of the added stuff is documentation.
It is also worth noting that the downstream pass has been validated on
https://github.com/BerkeleyLab/fiats. For the CPU mapping, this achieved
performance speed-ups that match pure OpenMP; for GPU mapping, we are
still working on extending our support for implicit memory mapping and
locality specifiers.
PR stack:
- https://github.com/llvm/llvm-project/pull/126026 (this PR)
- https://github.com/llvm/llvm-project/pull/127595
- https://github.com/llvm/llvm-project/pull/127633
- https://github.com/llvm/llvm-project/pull/127634
- https://github.com/llvm/llvm-project/pull/127635
|