<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/openmp/libomptarget/DeviceRTL/src/Parallelism.cpp, branch main</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[Offload] Move `/openmp/libomptarget` to `/offload` (#75125)</title>
<updated>2024-04-22T16:51:33+00:00</updated>
<author>
<name>Johannes Doerfert</name>
<email>johannes@jdoerfert.de</email>
</author>
<published>2024-04-22T16:51:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=330d8983d25d08580fc1642fea48b2473f47a9da'/>
<id>330d8983d25d08580fc1642fea48b2473f47a9da</id>
<content type='text'>
In a nutshell, this moves our libomptarget code to populate the offload
subproject.

With this commit, users need to enable the new LLVM/Offload subproject
as a runtime in their cmake configuration.
No further changes are expected for downstream code.

Tests and other components still depend on OpenMP and have also not been
renamed. The results below are for a build in which OpenMP and Offload
are enabled runtimes. In addition to the pure `git mv`, we needed to
adjust some CMake files. Nothing is intended to change semantics.

```
ninja check-offload
```
Works with the X86 and AMDGPU offload tests

```
ninja check-openmp
```
Still works but doesn't build offload tests anymore.

```
ls install/lib
```
Shows all expected libraries, incl.
- `libomptarget.devicertl.a`
- `libomptarget-nvptx-sm_90.bc`
- `libomptarget.rtl.amdgpu.so` -&gt; `libomptarget.rtl.amdgpu.so.18git`
- `libomptarget.so` -&gt; `libomptarget.so.18git`

Fixes: https://github.com/llvm/llvm-project/issues/75124

---------

Co-authored-by: Saiyedul Islam &lt;Saiyedul.Islam@amd.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In a nutshell, this moves our libomptarget code to populate the offload
subproject.

With this commit, users need to enable the new LLVM/Offload subproject
as a runtime in their cmake configuration.
No further changes are expected for downstream code.

Tests and other components still depend on OpenMP and have also not been
renamed. The results below are for a build in which OpenMP and Offload
are enabled runtimes. In addition to the pure `git mv`, we needed to
adjust some CMake files. Nothing is intended to change semantics.

```
ninja check-offload
```
Works with the X86 and AMDGPU offload tests

```
ninja check-openmp
```
Still works but doesn't build offload tests anymore.

```
ls install/lib
```
Shows all expected libraries, incl.
- `libomptarget.devicertl.a`
- `libomptarget-nvptx-sm_90.bc`
- `libomptarget.rtl.amdgpu.so` -&gt; `libomptarget.rtl.amdgpu.so.18git`
- `libomptarget.so` -&gt; `libomptarget.so.18git`

Fixes: https://github.com/llvm/llvm-project/issues/75124

---------

Co-authored-by: Saiyedul Islam &lt;Saiyedul.Islam@amd.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>[libomptarget][NFC] Outline parallel SPMD function (#78642)</title>
<updated>2024-01-29T16:41:35+00:00</updated>
<author>
<name>Gheorghe-Teodor Bercea</name>
<email>doru.bercea@amd.com</email>
</author>
<published>2024-01-29T16:41:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=430c1fd50d774dc30a9628bcf60ce243f74ff376'/>
<id>430c1fd50d774dc30a9628bcf60ce243f74ff376</id>
<content type='text'>
This patch outlines the SPMD code path into a separate function that can
be called directly.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch outlines the SPMD code path into a separate function that can
be called directly.</pre>
</div>
</content>
</entry>
<entry>
<title>[Libomptarget][NFC] Format in-line comments consistently (#77530)</title>
<updated>2024-01-10T16:10:08+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2024-01-10T16:10:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=d03b8c3a048262eae1b13be829e20971e1714ade'/>
<id>d03b8c3a048262eae1b13be829e20971e1714ade</id>
<content type='text'>
Summary:
The LLVM style uses /*Foo=*/ when indicating the name of a constant. See
https://llvm.org/docs/CodingStandards.html#comment-formatting. This is
useful for consistency, as well as because `clang-format` understands
this syntax and formats it more cleanly. Do a bulk update of this
syntax.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
The LLVM style uses /*Foo=*/ when indicating the name of a constant. See
https://llvm.org/docs/CodingStandards.html#comment-formatting. This is
useful for consistency, as well as because `clang-format` understands
this syntax and formats it more cleanly. Do a bulk update of this
syntax.</pre>
</div>
</content>
</entry>
<entry>
<title>Attributes (#69358)</title>
<updated>2023-10-18T16:52:43+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>35342157+jhuber6@users.noreply.github.com</email>
</author>
<published>2023-10-18T16:52:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=b69081e3241be6c310ca98ecdad53643dd804e25'/>
<id>b69081e3241be6c310ca98ecdad53643dd804e25</id>
<content type='text'>
- [Libomptarget] Make the references to 'malloc' and 'free' weak.
- [Libomptarget][NFC] Use C++ style attributes instead</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- [Libomptarget] Make the references to 'malloc' and 'free' weak.
- [Libomptarget][NFC] Use C++ style attributes instead</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP] Force the parallel abstraction to be inlined</title>
<updated>2023-08-23T18:48:18+00:00</updated>
<author>
<name>Johannes Doerfert</name>
<email>johannes@jdoerfert.de</email>
</author>
<published>2023-08-21T20:58:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=382b97554dd3e625a99bdaf3672f7f43ee6aef33'/>
<id>382b97554dd3e625a99bdaf3672f7f43ee6aef33</id>
<content type='text'>
This is good for performance and compile time and the indirection (+
switch statements) is nothing that needs to be preserved.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is good for performance and compile time and the indirection (+
switch statements) is nothing that needs to be preserved.
</pre>
</div>
</content>
</entry>
<entry>
<title>[Libomptarget] Remove debug RAII from libomptarget</title>
<updated>2023-08-03T14:37:47+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>jhuber6@vols.utk.edu</email>
</author>
<published>2023-08-03T13:48:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=46642cc83dc575962e1a6eb557714319c65ca5b8'/>
<id>46642cc83dc575962e1a6eb557714319c65ca5b8</id>
<content type='text'>
This feature was supposed to allow you to trace execution inside of
Libomptarget. However, this never really worked properly. The printing
was always reoganized, only worked for single  threads, and pretty much
only told you a handful of things about a runtime library that's an
implementation detail to all users. Despite this, it contributed about
40% of the total filesize of the deviceRTL. This patch simply removes
this functionalit which I think was past due.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D157001
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This feature was supposed to allow you to trace execution inside of
Libomptarget. However, this never really worked properly. The printing
was always reoganized, only worked for single  threads, and pretty much
only told you a handful of things about a runtime library that's an
implementation detail to all users. Despite this, it contributed about
40% of the total filesize of the deviceRTL. This patch simply removes
this functionalit which I think was past due.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D157001
</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP][NFC] Reorganize the ompx::mapping layer in the GPU runtime</title>
<updated>2023-07-31T20:44:51+00:00</updated>
<author>
<name>Johannes Doerfert</name>
<email>johannes@jdoerfert.de</email>
</author>
<published>2023-07-26T22:20:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=1f3a28d4e54649d1453eb951f570a8c1958d4a5c'/>
<id>1f3a28d4e54649d1453eb951f570a8c1958d4a5c</id>
<content type='text'>
This change makes the naming more consistent, I hope.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This change makes the naming more consistent, I hope.
</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP][NFCI] Avoid storing non-constant values in ICV</title>
<updated>2023-07-18T23:50:50+00:00</updated>
<author>
<name>Johannes Doerfert</name>
<email>johannes@jdoerfert.de</email>
</author>
<published>2023-07-18T23:33:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=f914208c4388780e24ed3a1ab2e170c53129b2a1'/>
<id>f914208c4388780e24ed3a1ab2e170c53129b2a1</id>
<content type='text'>
If we store a constant in an ICV it is easier for the optimizer to
propagate it. Since we often use the full block for the thread limit and
the parallel team size, we can instead replace that dynamic value with a
constant that otherwise cannot occur, here 0.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we store a constant in an ICV it is easier for the optimizer to
propagate it. Since we often use the full block for the thread limit and
the parallel team size, we can instead replace that dynamic value with a
constant that otherwise cannot occur, here 0.
</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP][NFCI] Split assertion message from assertion expression</title>
<updated>2023-07-18T23:50:50+00:00</updated>
<author>
<name>Johannes Doerfert</name>
<email>johannes@jdoerfert.de</email>
</author>
<published>2023-07-18T23:05:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=88a68de14cf66d067be54b8127c17b51192f2aa8'/>
<id>88a68de14cf66d067be54b8127c17b51192f2aa8</id>
<content type='text'>
We ended up with `llvm.assume(icmp ne ptr as(4) null, as(4) @str)`
because the string in address space 4 was not known to be non-null.
There is no need to create these assumes.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We ended up with `llvm.assume(icmp ne ptr as(4) null, as(4) @str)`
because the string in address space 4 was not known to be non-null.
There is no need to create these assumes.
</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP] Ensure memory fences are created with barriers for AMDGPUs</title>
<updated>2023-04-17T22:27:17+00:00</updated>
<author>
<name>Johannes Doerfert</name>
<email>johannes@jdoerfert.de</email>
</author>
<published>2023-03-21T01:55:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=67fed132f39c81e8006c4463ab1f173fea5e4e4b'/>
<id>67fed132f39c81e8006c4463ab1f173fea5e4e4b</id>
<content type='text'>
It turns out that the __builtin_amdgcn_s_barrier() alone does not emit
a fence. We somehow got away with this and assumed it would work as it
(hopefully) is correct on the NVIDIA path where we just emit a
__syncthreads. After talking to @arsenm we now (mostly) align with the
OpenCL barrier implementation [1] and emit explicit fences for AMDGPUs.

It seems this was the underlying cause for #59759, but I am not 100%
certain. There is a chance this simply hides the problem.

Fixes: https://github.com/llvm/llvm-project/issues/59759

[1] https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/07b347366eb2c6ebc3414af323c623cbbbafc854/opencl/src/workgroup/wgbarrier.cl#L21
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It turns out that the __builtin_amdgcn_s_barrier() alone does not emit
a fence. We somehow got away with this and assumed it would work as it
(hopefully) is correct on the NVIDIA path where we just emit a
__syncthreads. After talking to @arsenm we now (mostly) align with the
OpenCL barrier implementation [1] and emit explicit fences for AMDGPUs.

It seems this was the underlying cause for #59759, but I am not 100%
certain. There is a chance this simply hides the problem.

Fixes: https://github.com/llvm/llvm-project/issues/59759

[1] https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/07b347366eb2c6ebc3414af323c623cbbbafc854/opencl/src/workgroup/wgbarrier.cl#L21
</pre>
</div>
</content>
</entry>
</feed>
