<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/offload/DeviceRTL/src/Workshare.cpp, branch users/nico/python-2</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[OpenMP] Fix num_iters in __kmpc_*_loop DeviceRTL functions (#133435)</title>
<updated>2025-04-01T09:29:08+00:00</updated>
<author>
<name>Sergio Afonso</name>
<email>safonsof@amd.com</email>
</author>
<published>2025-04-01T09:29:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=66fca0674d83254c70af4a6289496b8acc4377df'/>
<id>66fca0674d83254c70af4a6289496b8acc4377df</id>
<content type='text'>
This patch removes the addition of 1 to the number of iterations when
calling the following DeviceRTL functions:
- `__kmpc_distribute_for_static_loop*`
- `__kmpc_distribute_static_loop*`
- `__kmpc_for_static_loop*`

Calls to these functions are currently only produced by the OMPIRBuilder
from flang, which already passes the correct number of iterations to
these functions. By adding 1 to the received `num_iters` variable,
worksharing can produce incorrect results. This impacts flang OpenMP
offloading of `do`, `distribute` and `distribute parallel do`
constructs.

Expecting the application to pass `tripcount - 1` as the argument seems
unexpected as well, so rather than updating flang I think it makes more
sense to update the runtime.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch removes the addition of 1 to the number of iterations when
calling the following DeviceRTL functions:
- `__kmpc_distribute_for_static_loop*`
- `__kmpc_distribute_static_loop*`
- `__kmpc_for_static_loop*`

Calls to these functions are currently only produced by the OMPIRBuilder
from flang, which already passes the correct number of iterations to
these functions. By adding 1 to the received `num_iters` variable,
worksharing can produce incorrect results. This impacts flang OpenMP
offloading of `do`, `distribute` and `distribute parallel do`
constructs.

Expecting the application to pass `tripcount - 1` as the argument seems
unexpected as well, so rather than updating flang I think it makes more
sense to update the runtime.</pre>
</div>
</content>
</entry>
<entry>
<title>[offload] Remove bad assert in StaticLoopChunker::Distribute (#132705)</title>
<updated>2025-03-28T09:53:00+00:00</updated>
<author>
<name>macurtis-amd</name>
<email>macurtis@amd.com</email>
</author>
<published>2025-03-28T09:53:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=21a8c63cdc36abbf38b4402c9eb34a26598b8476'/>
<id>21a8c63cdc36abbf38b4402c9eb34a26598b8476</id>
<content type='text'>
When building with asserts enabled, this can actually cause strange
miscompilations because an incorrect llvm.assume is generated at the
point of the assertion.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When building with asserts enabled, this can actually cause strange
miscompilations because an incorrect llvm.assume is generated at the
point of the assertion.</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP] Replace use of target address space with &lt;gpuintrin.h&gt; local (#126119)</title>
<updated>2025-02-09T16:25:25+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2025-02-09T16:25:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=ed9107f2d71804f6bedff6cd05b1f1a4750eb112'/>
<id>ed9107f2d71804f6bedff6cd05b1f1a4750eb112</id>
<content type='text'>
Summary:
This definition is more portable since it defines the correct value for
the target. I got rid of the helper mostly because I think it's easy
enough to use now that it's a type and being explicit about what's
`undef` or `poison` is good.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
This definition is more portable since it defines the correct value for
the target. I got rid of the helper mostly because I think it's easy
enough to use now that it's a type and being explicit about what's
`undef` or `poison` is good.</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP] Port the OpenMP device runtime to direct C++ compilation (#123673)</title>
<updated>2025-02-05T14:18:52+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2025-02-05T14:18:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=bb7ab2557c485e004e619570cca7e2204b98a71b'/>
<id>bb7ab2557c485e004e619570cca7e2204b98a71b</id>
<content type='text'>
Summary:
This removes the use of OpenMP offloading to build the device runtime.
The main benefit here is that we no longer need to rely on offloading
semantics to build a device only runtime. Things like variants are now
no longer needed and can just be simple if-defs. In the future, I will
remove most of the special handling here and fold it into calls to the
`&lt;gpuintrin.h&gt;` functions instead. Additionally I will rework the
compilation to make this a separate runtime.

The current plan is to have this, but make including OpenMP and
offloading either automatically add it, or print a warning if it's
missing. This will allow us to use a normal CMake workflow and delete
all the weird 'lets pull the clang binary out of the build' business.
```
-DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES=offload
-DLLVM_RUNTIME_TARGETS=amdgcn-amd-amdhsa
```

After that, linking the OpenMP device runtime will be `-Xoffload-linker
-lomp`. I.e. no more fat binary business.

Only look at the most recent commit since this includes the two
dependencies
(fix to AMDGPUEmitPrintfBinding and the PointerToMember bug).</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
This removes the use of OpenMP offloading to build the device runtime.
The main benefit here is that we no longer need to rely on offloading
semantics to build a device only runtime. Things like variants are now
no longer needed and can just be simple if-defs. In the future, I will
remove most of the special handling here and fold it into calls to the
`&lt;gpuintrin.h&gt;` functions instead. Additionally I will rework the
compilation to make this a separate runtime.

The current plan is to have this, but make including OpenMP and
offloading either automatically add it, or print a warning if it's
missing. This will allow us to use a normal CMake workflow and delete
all the weird 'lets pull the clang binary out of the build' business.
```
-DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES=offload
-DLLVM_RUNTIME_TARGETS=amdgcn-amd-amdhsa
```

After that, linking the OpenMP device runtime will be `-Xoffload-linker
-lomp`. I.e. no more fat binary business.

Only look at the most recent commit since this includes the two
dependencies
(fix to AMDGPUEmitPrintfBinding and the PointerToMember bug).</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload][NFC] Fix typos discovered by codespell (#125119)</title>
<updated>2025-01-31T15:35:29+00:00</updated>
<author>
<name>Christian Clauss</name>
<email>cclauss@me.com</email>
</author>
<published>2025-01-31T15:35:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=1f56bb3137827d66093b66aa3a6447fdaba61783'/>
<id>1f56bb3137827d66093b66aa3a6447fdaba61783</id>
<content type='text'>
https://github.com/codespell-project/codespell

% `codespell
--ignore-words-list=archtype,hsa,identty,inout,iself,nd,te,ths,vertexes
--write-changes`</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
https://github.com/codespell-project/codespell

% `codespell
--ignore-words-list=archtype,hsa,identty,inout,iself,nd,te,ths,vertexes
--write-changes`</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload][NFC] Reorganize `utils::` and make Device/Host/Shared clearer (#100280)</title>
<updated>2024-09-05T20:36:26+00:00</updated>
<author>
<name>Johannes Doerfert</name>
<email>johannes@jdoerfert.de</email>
</author>
<published>2024-09-05T20:36:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=08533a3ee8f3a09a59cf6ac3be59198b26b7f739'/>
<id>08533a3ee8f3a09a59cf6ac3be59198b26b7f739</id>
<content type='text'>
We had three `utils::` namespaces, all with different "meaning" (host,
device, hsa_utils). We should, when we can, keep "include/Shared"
accessible from host and device, thus RefCountTy has been moved to a
separate header. `hsa_utils` was introduced to make `utils::` less
overloaded. And common functionality was de-duplicated, e.g.,
`utils::advance` and `utils::advanceVoidPtr` -&gt; `utils:advancePtr`. Type
punning now checks for the size of the result to make sure it matches
the source type.

No functional change was intended.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We had three `utils::` namespaces, all with different "meaning" (host,
device, hsa_utils). We should, when we can, keep "include/Shared"
accessible from host and device, thus RefCountTy has been moved to a
separate header. `hsa_utils` was introduced to make `utils::` less
overloaded. And common functionality was de-duplicated, e.g.,
`utils::advance` and `utils::advanceVoidPtr` -&gt; `utils:advancePtr`. Type
punning now checks for the size of the result to make sure it matches
the source type.

No functional change was intended.</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP][offload] Fix dynamic schedule tracking (#97065)</title>
<updated>2024-07-01T14:23:11+00:00</updated>
<author>
<name>Gheorghe-Teodor Bercea</name>
<email>doru.bercea@amd.com</email>
</author>
<published>2024-07-01T14:23:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=1a478a69bc4f2781c14fd045c202838786fb9062'/>
<id>1a478a69bc4f2781c14fd045c202838786fb9062</id>
<content type='text'>
This patch fixes the dynamic schedule tracking.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch fixes the dynamic schedule tracking.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Move `/openmp/libomptarget` to `/offload` (#75125)</title>
<updated>2024-04-22T16:51:33+00:00</updated>
<author>
<name>Johannes Doerfert</name>
<email>johannes@jdoerfert.de</email>
</author>
<published>2024-04-22T16:51:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=330d8983d25d08580fc1642fea48b2473f47a9da'/>
<id>330d8983d25d08580fc1642fea48b2473f47a9da</id>
<content type='text'>
In a nutshell, this moves our libomptarget code to populate the offload
subproject.

With this commit, users need to enable the new LLVM/Offload subproject
as a runtime in their cmake configuration.
No further changes are expected for downstream code.

Tests and other components still depend on OpenMP and have also not been
renamed. The results below are for a build in which OpenMP and Offload
are enabled runtimes. In addition to the pure `git mv`, we needed to
adjust some CMake files. Nothing is intended to change semantics.

```
ninja check-offload
```
Works with the X86 and AMDGPU offload tests

```
ninja check-openmp
```
Still works but doesn't build offload tests anymore.

```
ls install/lib
```
Shows all expected libraries, incl.
- `libomptarget.devicertl.a`
- `libomptarget-nvptx-sm_90.bc`
- `libomptarget.rtl.amdgpu.so` -&gt; `libomptarget.rtl.amdgpu.so.18git`
- `libomptarget.so` -&gt; `libomptarget.so.18git`

Fixes: https://github.com/llvm/llvm-project/issues/75124

---------

Co-authored-by: Saiyedul Islam &lt;Saiyedul.Islam@amd.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In a nutshell, this moves our libomptarget code to populate the offload
subproject.

With this commit, users need to enable the new LLVM/Offload subproject
as a runtime in their cmake configuration.
No further changes are expected for downstream code.

Tests and other components still depend on OpenMP and have also not been
renamed. The results below are for a build in which OpenMP and Offload
are enabled runtimes. In addition to the pure `git mv`, we needed to
adjust some CMake files. Nothing is intended to change semantics.

```
ninja check-offload
```
Works with the X86 and AMDGPU offload tests

```
ninja check-openmp
```
Still works but doesn't build offload tests anymore.

```
ls install/lib
```
Shows all expected libraries, incl.
- `libomptarget.devicertl.a`
- `libomptarget-nvptx-sm_90.bc`
- `libomptarget.rtl.amdgpu.so` -&gt; `libomptarget.rtl.amdgpu.so.18git`
- `libomptarget.so` -&gt; `libomptarget.so.18git`

Fixes: https://github.com/llvm/llvm-project/issues/75124

---------

Co-authored-by: Saiyedul Islam &lt;Saiyedul.Islam@amd.com&gt;</pre>
</div>
</content>
</entry>
</feed>
