<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/offload/DeviceRTL, branch users/nico/python-2</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[Offload] Fix cmake warning (#145488)</title>
<updated>2025-06-24T12:42:03+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-06-24T12:42:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=478583214410305fbaaafd78f5c4d0899b260522'/>
<id>478583214410305fbaaafd78f5c4d0899b260522</id>
<content type='text'>
Cmake was unhappy that there was no space between arguments, now it
is.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Cmake was unhappy that there was no space between arguments, now it
is.</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP][GPU][FIX] Enable generic barriers in single threaded contexts (#140786)</title>
<updated>2025-05-21T02:33:54+00:00</updated>
<author>
<name>Johannes Doerfert</name>
<email>johannes@jdoerfert.de</email>
</author>
<published>2025-05-21T02:33:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=57a90edacdf4ef14c6a95531681e8218cd23c4ab'/>
<id>57a90edacdf4ef14c6a95531681e8218cd23c4ab</id>
<content type='text'>
The generic GPU barrier implementation checked if it was the main thread
in generic mode to identify single threaded regions. This doesn't work
since inside of a non-active (=sequential) parallel, that thread becomes
the main thread of a team, and is not the main thread in generic mode.
At least that is the implementation of the APIs today.

To identify single threaded regions we now check the team size
explicitly.

This exposed three other issues; one is, for now, expected and not a
bug, the second one is a bug and has a FIXME in the
single_threaded_for_barrier_hang_1.c file, and the final one is also
benign as described in the end.

The non-bug issue comes up if we ever initialize a thread state.
Afterwards we will never run any region in parallel. This is a little
conservative, but I guess thread states are really bad for performance
anyway.

The bug comes up if we optimize single_threaded_for_barrier_hang_1 and
execute it in Generic-SPMD mode. For some reason we loose all the
updates to b. This looks very much like a compiler bug, but could also
be another logic issue in the runtime. Needs to be investigated.

Issue number 3 comes up if we have nested parallels inside of a target
region. The clang SPMD-check logic gets confused, determines SPMD (which
is fine) but picks an unreasonable thread count. This is all benign, I
think, just weird:

```
  #pragma omp target teams
  #pragma omp parallel num_threads(64)
  #pragma omp parallel num_threads(10)
  {}
```
Was launched with 10 threads, not 64.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The generic GPU barrier implementation checked if it was the main thread
in generic mode to identify single threaded regions. This doesn't work
since inside of a non-active (=sequential) parallel, that thread becomes
the main thread of a team, and is not the main thread in generic mode.
At least that is the implementation of the APIs today.

To identify single threaded regions we now check the team size
explicitly.

This exposed three other issues; one is, for now, expected and not a
bug, the second one is a bug and has a FIXME in the
single_threaded_for_barrier_hang_1.c file, and the final one is also
benign as described in the end.

The non-bug issue comes up if we ever initialize a thread state.
Afterwards we will never run any region in parallel. This is a little
conservative, but I guess thread states are really bad for performance
anyway.

The bug comes up if we optimize single_threaded_for_barrier_hang_1 and
execute it in Generic-SPMD mode. For some reason we loose all the
updates to b. This looks very much like a compiler bug, but could also
be another logic issue in the runtime. Needs to be investigated.

Issue number 3 comes up if we have nested parallels inside of a target
region. The clang SPMD-check logic gets confused, determines SPMD (which
is fine) but picks an unreasonable thread count. This is all benign, I
think, just weird:

```
  #pragma omp target teams
  #pragma omp parallel num_threads(64)
  #pragma omp parallel num_threads(10)
  {}
```
Was launched with 10 threads, not 64.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Fix PowerPC builds that pass -mcpu (#138327)</title>
<updated>2025-05-06T19:14:16+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2025-05-06T19:14:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=dbe070eb3e09b04f312b238c40e016b97dacbbd2'/>
<id>dbe070eb3e09b04f312b238c40e016b97dacbbd2</id>
<content type='text'>
Summary:
Another hacky fix done until
https://github.com/llvm/llvm-project/pull/136729 lands. This time for
`-mcpu`.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
Another hacky fix done until
https://github.com/llvm/llvm-project/pull/136729 lands. This time for
`-mcpu`.</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP] Add pre sm_70 load hack back in (#138589)</title>
<updated>2025-05-05T21:33:41+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2025-05-05T21:33:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=dfcb8cb2a92c9f72ddde5ea08dadf2f640197d32'/>
<id>dfcb8cb2a92c9f72ddde5ea08dadf2f640197d32</id>
<content type='text'>
Summary:
Different ordering modes aren't supported for an atomic load, so we just
do an add of zero as the same thing. It's less efficient, but it works.

Fixes https://github.com/llvm/llvm-project/issues/138560</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
Different ordering modes aren't supported for an atomic load, so we just
do an add of zero as the same thing. It's less efficient, but it works.

Fixes https://github.com/llvm/llvm-project/issues/138560</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Fix dependency issue #126143 in CMake</title>
<updated>2025-05-05T05:38:48+00:00</updated>
<author>
<name>Ye Luo</name>
<email>yeluo@anl.gov</email>
</author>
<published>2025-05-05T05:35:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=dcb43307ce05974a06006a890933b63a0c3925d1'/>
<id>dcb43307ce05974a06006a890933b63a0c3925d1</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Override linker for device build (#137246)</title>
<updated>2025-04-25T15:22:07+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2025-04-25T15:22:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=346792aafb483a53fb5e3274298d85bc2dde4a35'/>
<id>346792aafb483a53fb5e3274298d85bc2dde4a35</id>
<content type='text'>
Summary:
Override the default linker in case the user is passing it separately.
This requires `lld` but it always did. This will be fixed *properly*
when https://github.com/llvm/llvm-project/pull/136729 lands.

Fixes https://github.com/llvm/llvm-project/issues/136822</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
Override the default linker in case the user is passing it separately.
This requires `lld` but it always did. This will be fixed *properly*
when https://github.com/llvm/llvm-project/pull/136729 lands.

Fixes https://github.com/llvm/llvm-project/issues/136822</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP] Update the bitcode library install and search path (#136754)</title>
<updated>2025-04-23T13:20:15+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2025-04-23T13:20:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=6d0d50f0ac0cb108a06558cb178a68fb78cfa06f'/>
<id>6d0d50f0ac0cb108a06558cb178a68fb78cfa06f</id>
<content type='text'>
Summary:
This was accidentally kept in the old location when we moved to the
new `lib/&lt;triple&gt;/` location for the DeviceRTL. Move this to reduce the
delta with https://github.com/llvm/llvm-project/pull/136729.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
This was accidentally kept in the old location when we moved to the
new `lib/&lt;triple&gt;/` location for the DeviceRTL. Move this to reduce the
delta with https://github.com/llvm/llvm-project/pull/136729.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Fix handling of 'bare' mode when environment missing (#136794)</title>
<updated>2025-04-23T13:16:39+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2025-04-23T13:16:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=92bba68634ec48c738d45bc86b05b1390aa82f4b'/>
<id>92bba68634ec48c738d45bc86b05b1390aa82f4b</id>
<content type='text'>
Summary:
We treated the missing kernel environment as a unique mode, but it was
kind of this random bool that was doing the same thing and it explicitly
expects the kernel environment to be zero. It broke after the previous
change since it used to default to SPMD and didn't handle zero in any of
the other cases despite being used. This fixes that and queries for it
without needing to consume an error.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
We treated the missing kernel environment as a unique mode, but it was
kind of this random bool that was doing the same thing and it explicitly
expects the kernel environment to be zero. It broke after the previous
change since it used to default to SPMD and didn't handle zero in any of
the other cases despite being used. This fixes that and queries for it
without needing to consume an error.</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP] Remove dependency on LLVM include directory from DeviceRTL (#136359)</title>
<updated>2025-04-21T20:21:47+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2025-04-21T20:21:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=56bf0e720288ae6ba6e8635c7ec12f1e7d6f783b'/>
<id>56bf0e720288ae6ba6e8635c7ec12f1e7d6f783b</id>
<content type='text'>
Summary:
Currently we depend on a single LLVM include directory. This is actually
only required to define one enum, which is highly unlikely to change.
THis patch makes the `Environment.h` include directory more hermetic so
we no long depend on other libraries. In exchange, we get a simpler
dependency list for the price of hard-coding `1` somewhere. I think it's
a valid trade considering that this flag is highly unlikely to change at
this point.

@ronlieb AMD version
https://gist.github.com/jhuber6/3313e6f957be14dc79fe85e5126d2cb3</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
Currently we depend on a single LLVM include directory. This is actually
only required to define one enum, which is highly unlikely to change.
THis patch makes the `Environment.h` include directory more hermetic so
we no long depend on other libraries. In exchange, we get a simpler
dependency list for the price of hard-coding `1` somewhere. I think it's
a valid trade considering that this flag is highly unlikely to change at
this point.

@ronlieb AMD version
https://gist.github.com/jhuber6/3313e6f957be14dc79fe85e5126d2cb3</pre>
</div>
</content>
</entry>
<entry>
<title>[offload] Unset `-march` when building GPU libraries (#136442)</title>
<updated>2025-04-20T04:16:19+00:00</updated>
<author>
<name>Michał Górny</name>
<email>mgorny@gentoo.org</email>
</author>
<published>2025-04-20T04:16:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=ac8fc09688e10e983b99224b5dc5cbbeeedb1879'/>
<id>ac8fc09688e10e983b99224b5dc5cbbeeedb1879</id>
<content type='text'>
Unset `-march` when invoking the compiler and linker to build the GPU
libraries. These libraries use GPU targets rather than the CPU targets,
and an incidental `-march=native` causes Clang to be able to determine
the GPU used — which causes the build to fail when there is no GPU
available. Resetting `-march=` should suffice to revert to building
generic code for the time being.

See the discussion in:
https://github.com/llvm/llvm-project/pull/126143#issuecomment-2816718492</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Unset `-march` when invoking the compiler and linker to build the GPU
libraries. These libraries use GPU targets rather than the CPU targets,
and an incidental `-march=native` causes Clang to be able to determine
the GPU used — which causes the build to fail when there is no GPU
available. Resetting `-march=` should suffice to revert to building
generic code for the time being.

See the discussion in:
https://github.com/llvm/llvm-project/pull/126143#issuecomment-2816718492</pre>
</div>
</content>
</entry>
</feed>
