<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/openmp/libomptarget/DeviceRTL/src/Synchronization.cpp, branch main</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[Offload] Move `/openmp/libomptarget` to `/offload` (#75125)</title>
<updated>2024-04-22T16:51:33+00:00</updated>
<author>
<name>Johannes Doerfert</name>
<email>johannes@jdoerfert.de</email>
</author>
<published>2024-04-22T16:51:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=330d8983d25d08580fc1642fea48b2473f47a9da'/>
<id>330d8983d25d08580fc1642fea48b2473f47a9da</id>
<content type='text'>
In a nutshell, this moves our libomptarget code to populate the offload
subproject.

With this commit, users need to enable the new LLVM/Offload subproject
as a runtime in their cmake configuration.
No further changes are expected for downstream code.

Tests and other components still depend on OpenMP and have also not been
renamed. The results below are for a build in which OpenMP and Offload
are enabled runtimes. In addition to the pure `git mv`, we needed to
adjust some CMake files. Nothing is intended to change semantics.

```
ninja check-offload
```
Works with the X86 and AMDGPU offload tests

```
ninja check-openmp
```
Still works but doesn't build offload tests anymore.

```
ls install/lib
```
Shows all expected libraries, incl.
- `libomptarget.devicertl.a`
- `libomptarget-nvptx-sm_90.bc`
- `libomptarget.rtl.amdgpu.so` -&gt; `libomptarget.rtl.amdgpu.so.18git`
- `libomptarget.so` -&gt; `libomptarget.so.18git`

Fixes: https://github.com/llvm/llvm-project/issues/75124

---------

Co-authored-by: Saiyedul Islam &lt;Saiyedul.Islam@amd.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In a nutshell, this moves our libomptarget code to populate the offload
subproject.

With this commit, users need to enable the new LLVM/Offload subproject
as a runtime in their cmake configuration.
No further changes are expected for downstream code.

Tests and other components still depend on OpenMP and have also not been
renamed. The results below are for a build in which OpenMP and Offload
are enabled runtimes. In addition to the pure `git mv`, we needed to
adjust some CMake files. Nothing is intended to change semantics.

```
ninja check-offload
```
Works with the X86 and AMDGPU offload tests

```
ninja check-openmp
```
Still works but doesn't build offload tests anymore.

```
ls install/lib
```
Shows all expected libraries, incl.
- `libomptarget.devicertl.a`
- `libomptarget-nvptx-sm_90.bc`
- `libomptarget.rtl.amdgpu.so` -&gt; `libomptarget.rtl.amdgpu.so.18git`
- `libomptarget.so` -&gt; `libomptarget.so.18git`

Fixes: https://github.com/llvm/llvm-project/issues/75124

---------

Co-authored-by: Saiyedul Islam &lt;Saiyedul.Islam@amd.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>[Libomptarget] Remove remaining inline assembly from the device RTL (#79922)</title>
<updated>2024-01-30T14:08:51+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2024-01-30T14:08:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=6aed6cc40ec0006bb43f1ec4b2ec87702392ad6e'/>
<id>6aed6cc40ec0006bb43f1ec4b2ec87702392ad6e</id>
<content type='text'>
Summary:
Recent patches have added some missing intrinsic functions for NVPTX. This
patch gets rid of all the remaining uses of inline assembly. The one
change that wasn't directly replaced with a built-in was the `pack` and
`unpack` implementations. However, the generic C implementation produces
equivalent SASS when run through PTXAS.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
Recent patches have added some missing intrinsic functions for NVPTX. This
patch gets rid of all the remaining uses of inline assembly. The one
change that wasn't directly replaced with a built-in was the `pack` and
`unpack` implementations. However, the generic C implementation produces
equivalent SASS when run through PTXAS.</pre>
</div>
</content>
</entry>
<entry>
<title>[Libomptarget] Use scoped atomics in the device runtime (#75834)</title>
<updated>2023-12-19T20:30:34+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2023-12-19T20:30:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=219355d4c0d2b6e2c0d5e022f8b7a78c1e9ce53f'/>
<id>219355d4c0d2b6e2c0d5e022f8b7a78c1e9ce53f</id>
<content type='text'>
Summary:
A recent patch allowed us to easily replace GNU atomics with scoped
variants that make use of the backend's handling for more permissive
scopes. The default is full "system" scope, which means the atomic
operation must be consistent with operations that may happen on the
host's memory. This is generally only required for processes that are
communicating with something via global fine-grained memory. This patch
uses these atomics to make everything device scoped, as nothing in the
OpenMP runtime should depend on the host.

This is only provided as a very new clang extension but the DeviceRTL is
only compiled with clang so it is always available.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
A recent patch allowed us to easily replace GNU atomics with scoped
variants that make use of the backend's handling for more permissive
scopes. The default is full "system" scope, which means the atomic
operation must be consistent with operations that may happen on the
host's memory. This is generally only required for processes that are
communicating with something via global fine-grained memory. This patch
uses these atomics to make everything device scoped, as nothing in the
OpenMP runtime should depend on the host.

This is only provided as a very new clang extension but the DeviceRTL is
only compiled with clang so it is always available.</pre>
</div>
</content>
</entry>
<entry>
<title>[Libomptarget] Add a wavefront sync builtin for the AMDGPU implementation (#70228)</title>
<updated>2023-10-25T19:27:14+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>35342157+jhuber6@users.noreply.github.com</email>
</author>
<published>2023-10-25T19:27:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=17b5445996c993057824b7142905b48ed67292b3'/>
<id>17b5445996c993057824b7142905b48ed67292b3</id>
<content type='text'>
Summary:
While this is technically a no-op for AMDGPU hardware, in cases where
the user would see fit to add an explicit wavefront sync on Nvidia
hardware, we should also inform the LLVM optimizer that this control
flow is convergent so we do not reorder blocks.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
While this is technically a no-op for AMDGPU hardware, in cases where
the user would see fit to add an explicit wavefront sync on Nvidia
hardware, we should also inform the LLVM optimizer that this control
flow is convergent so we do not reorder blocks.</pre>
</div>
</content>
</entry>
<entry>
<title>Attributes (#69358)</title>
<updated>2023-10-18T16:52:43+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>35342157+jhuber6@users.noreply.github.com</email>
</author>
<published>2023-10-18T16:52:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=b69081e3241be6c310ca98ecdad53643dd804e25'/>
<id>b69081e3241be6c310ca98ecdad53643dd804e25</id>
<content type='text'>
- [Libomptarget] Make the references to 'malloc' and 'free' weak.
- [Libomptarget][NFC] Use C++ style attributes instead</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- [Libomptarget] Make the references to 'malloc' and 'free' weak.
- [Libomptarget][NFC] Use C++ style attributes instead</pre>
</div>
</content>
</entry>
<entry>
<title>[Libomptarget] Remove debug RAII from libomptarget</title>
<updated>2023-08-03T14:37:47+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>jhuber6@vols.utk.edu</email>
</author>
<published>2023-08-03T13:48:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=46642cc83dc575962e1a6eb557714319c65ca5b8'/>
<id>46642cc83dc575962e1a6eb557714319c65ca5b8</id>
<content type='text'>
This feature was supposed to allow you to trace execution inside of
Libomptarget. However, this never really worked properly. The printing
was always reorganized, only worked for single threads, and pretty much
only told you a handful of things about a runtime library that's an
implementation detail to all users. Despite this, it contributed about
40% of the total filesize of the deviceRTL. This patch simply removes
this functionality, which I think was past due.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D157001
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This feature was supposed to allow you to trace execution inside of
Libomptarget. However, this never really worked properly. The printing
was always reorganized, only worked for single threads, and pretty much
only told you a handful of things about a runtime library that's an
implementation detail to all users. Despite this, it contributed about
40% of the total filesize of the deviceRTL. This patch simply removes
this functionality, which I think was past due.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D157001
</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP] Add ompx wrappers for __syncthreads</title>
<updated>2023-07-31T20:44:51+00:00</updated>
<author>
<name>Johannes Doerfert</name>
<email>johannes@jdoerfert.de</email>
</author>
<published>2023-07-31T17:54:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=deb0ea3e479ad1cc840d6d4c3dca852250f041b7'/>
<id>deb0ea3e479ad1cc840d6d4c3dca852250f041b7</id>
<content type='text'>
Differential Revision: https://reviews.llvm.org/D156729
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Differential Revision: https://reviews.llvm.org/D156729
</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP][NFC] Reorganize the ompx::mapping layer in the GPU runtime</title>
<updated>2023-07-31T20:44:51+00:00</updated>
<author>
<name>Johannes Doerfert</name>
<email>johannes@jdoerfert.de</email>
</author>
<published>2023-07-26T22:20:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=1f3a28d4e54649d1453eb951f570a8c1958d4a5c'/>
<id>1f3a28d4e54649d1453eb951f570a8c1958d4a5c</id>
<content type='text'>
This change makes the naming more consistent, I hope.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This change makes the naming more consistent, I hope.
</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP][NFCI] Split assertion message from assertion expression</title>
<updated>2023-07-18T23:50:50+00:00</updated>
<author>
<name>Johannes Doerfert</name>
<email>johannes@jdoerfert.de</email>
</author>
<published>2023-07-18T23:05:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=88a68de14cf66d067be54b8127c17b51192f2aa8'/>
<id>88a68de14cf66d067be54b8127c17b51192f2aa8</id>
<content type='text'>
We ended up with `llvm.assume(icmp ne ptr as(4) null, as(4) @str)`
because the string in address space 4 was not known to be non-null.
There is no need to create these assumes.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We ended up with `llvm.assume(icmp ne ptr as(4) null, as(4) @str)`
because the string in address space 4 was not known to be non-null.
There is no need to create these assumes.
</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP] Added memory scope to atomic::inc API and used the device scope in reduction.</title>
<updated>2023-06-30T19:05:01+00:00</updated>
<author>
<name>Dhruva Chakrabarti</name>
<email>Dhruva.Chakrabarti@amd.com</email>
</author>
<published>2023-06-30T18:02:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=6a1d1f7eefe81faa1f7c6c47e8b9da0bfeb8c2e8'/>
<id>6a1d1f7eefe81faa1f7c6c47e8b9da0bfeb8c2e8</id>
<content type='text'>
With https://reviews.llvm.org/D137524, memory scope and ordering
attributes are being used to generate the required instructions for
atomic inc/dec on AMDGPU. This patch adds the memory scope attribute to
the atomic::inc API and uses the device scope in reduction. Without
the device scope in atomic_inc, the default system scope leads to
unnecessary L2 write-backs/invalidates.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D154172
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With https://reviews.llvm.org/D137524, memory scope and ordering
attributes are being used to generate the required instructions for
atomic inc/dec on AMDGPU. This patch adds the memory scope attribute to
the atomic::inc API and uses the device scope in reduction. Without
the device scope in atomic_inc, the default system scope leads to
unnecessary L2 write-backs/invalidates.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D154172
</pre>
</div>
</content>
</entry>
</feed>
