<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/offload/plugins-nextgen/cuda/src/rtl.cpp, branch users/boomanaiden154/main.ci-make-premerge-uploadwrite-comments</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[OpenMP] Fix tests relying on the heap size variable</title>
<updated>2025-11-06T19:00:26+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2025-11-06T18:37:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=aaddd8d38aa06f096cdf5ebe0d36c8e2b9a63265'/>
<id>aaddd8d38aa06f096cdf5ebe0d36c8e2b9a63265</id>
<content type='text'>
Summary:
I made that an unimplemented error, but forgot that it was used for this
environment variable.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
I made that an unimplemented error, but forgot that it was used for this
environment variable.
</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Remove handling for device memory pool (#163629)</title>
<updated>2025-11-06T16:15:18+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2025-11-06T16:15:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=670c453aeb1931fbecd0be31ea9cb1cca113c1af'/>
<id>670c453aeb1931fbecd0be31ea9cb1cca113c1af</id>
<content type='text'>
Summary:
This was a lot of code that was only used for upstream LLVM builds of
AMDGPU offloading. We have a generic and fast `malloc` in `libc` now so
just use that. Simplifies code, can be added back if we start providing
alternate forms but I don't think there's a single use-case that would
justify it yet.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
This was a lot of code that was only used for upstream LLVM builds of
AMDGPU offloading. We have a generic and fast `malloc` in `libc` now so
just use that. Simplifies code, can be added back if we start providing
alternate forms but I don't think there's a single use-case that would
justify it yet.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Add device UID (#164391)</title>
<updated>2025-11-04T19:15:47+00:00</updated>
<author>
<name>Robert Imschweiler</name>
<email>robert.imschweiler@amd.com</email>
</author>
<published>2025-11-04T19:15:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=dc94f2cbadfd192fe3d43bd00fd5a1d0ead5ab8d'/>
<id>dc94f2cbadfd192fe3d43bd00fd5a1d0ead5ab8d</id>
<content type='text'>
Introduced in OpenMP 6.0, the device UID shall be a unique identifier of
a device on a given system. (Not necessarily a UUID.) Since it is not
guaranteed that the (U)UIDs defined by the device vendor libraries, such
as HSA, do not overlap with those of other vendors, the device UIDs in
offload are always combined with the offload plugin name. In case the
vendor library does not specify any device UID for a given device, we
fall back to the offload-internal device ID.
The device UID can be retrieved using the `llvm-offload-device-info`
tool.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Introduced in OpenMP 6.0, the device UID shall be a unique identifier of
a device on a given system. (Not necessarily a UUID.) Since it is not
guaranteed that the (U)UIDs defined by the device vendor libraries, such
as HSA, do not overlap with those of other vendors, the device UIDs in
offload are always combined with the offload plugin name. In case the
vendor library does not specify any device UID for a given device, we
fall back to the offload-internal device ID.
The device UID can be retrieved using the `llvm-offload-device-info`
tool.</pre>
</div>
</content>
</entry>
<entry>
<title>[OFFLOAD] Remove unused init_device_info plugin interface (#162650)</title>
<updated>2025-10-09T13:38:24+00:00</updated>
<author>
<name>Alex Duran</name>
<email>alejandro.duran@intel.com</email>
</author>
<published>2025-10-09T13:38:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=45757b9284cf491072c8c477cd606df8a19061df'/>
<id>45757b9284cf491072c8c477cd606df8a19061df</id>
<content type='text'>
This was used for the old interop code. It's dead code after #143491</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This was used for the old interop code. It's dead code after #143491</pre>
</div>
</content>
</entry>
<entry>
<title>[OFFLOAD] Restore interop functionality (#161429)</title>
<updated>2025-10-02T19:48:31+00:00</updated>
<author>
<name>Alex Duran</name>
<email>alejandro.duran@intel.com</email>
</author>
<published>2025-10-02T19:48:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=902fe02e8722b8f51cb3897226c97fbff353c890'/>
<id>902fe02e8722b8f51cb3897226c97fbff353c890</id>
<content type='text'>
This implements two pieces to restore the interop functionality (that I
broke) when the 6.0 interfaces were added:

* A set of wrappers that support the old interfaces on top of the new
ones
* The same level of interop support for the CUDA amd AMD plugins</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This implements two pieces to restore the interop functionality (that I
broke) when the 6.0 interfaces were added:

* A set of wrappers that support the old interfaces on top of the new
ones
* The same level of interop support for the CUDA amd AMD plugins</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Use Error for allocating/deallocating in plugins (#160811)</title>
<updated>2025-09-26T18:50:00+00:00</updated>
<author>
<name>Kevin Sala Penades</name>
<email>salapenades1@llnl.gov</email>
</author>
<published>2025-09-26T18:50:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=01d761a7769772242fb583eec8f17f6d26d27b6a'/>
<id>01d761a7769772242fb583eec8f17f6d26d27b6a</id>
<content type='text'>
Co-authored-by: Joseph Huber &lt;huberjn@outlook.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Co-authored-by: Joseph Huber &lt;huberjn@outlook.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Remove non-blocking allocation type (#159851)</title>
<updated>2025-09-20T14:07:14+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2025-09-20T14:07:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=23efc67e194222a9c14da8e99f183f98cb126c8a'/>
<id>23efc67e194222a9c14da8e99f183f98cb126c8a</id>
<content type='text'>
Summary:
This was originally added in as a hack to work around CUDA's limitation
on allocation. The `libc` implementation now isn't even used for CUDA so
this code is never hit. Even if this case, this code never truly worked.

A true solution would be to use CUDA's virtual memory API instead to
allocate 2MiB slabs independenctly from the normal memory management
done in the stream.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
This was originally added in as a hack to work around CUDA's limitation
on allocation. The `libc` implementation now isn't even used for CUDA so
this code is never hit. Even if this case, this code never truly worked.

A true solution would be to use CUDA's virtual memory API instead to
allocate 2MiB slabs independenctly from the normal memory management
done in the stream.</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP][NFC] Clean up a bunch of warnings and clang-tidy messages (#159831)</title>
<updated>2025-09-19T19:09:33+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2025-09-19T19:09:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=580860e8b7341783e8e53114f26b9a9659a3a3e1'/>
<id>580860e8b7341783e8e53114f26b9a9659a3a3e1</id>
<content type='text'>
Summary:
I made the GPU flags accept more of the default LLVM warnings, which
triggered some new cases. Clean those up and fix some other ones while
I'm at it.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
I made the GPU flags accept more of the default LLVM warnings, which
triggered some new cases. Clean those up and fix some other ones while
I'm at it.</pre>
</div>
</content>
</entry>
<entry>
<title>[LLVM] Fix offload and update CUDA ABI for all SM values (#159354)</title>
<updated>2025-09-17T19:39:39+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2025-09-17T19:39:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=dffd7f3d9a3294d21205251b986e76ec841cc750'/>
<id>dffd7f3d9a3294d21205251b986e76ec841cc750</id>
<content type='text'>
Summary:
Turns out the new CUDA ABI now applies retroactively to all the other
SMs if you upgrade to CUDA 13.0. This patch changes the scheme, keeping
all the SM flags consistent but using an offset.

Fixes: https://github.com/llvm/llvm-project/issues/159088</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
Turns out the new CUDA ABI now applies retroactively to all the other
SMs if you upgrade to CUDA 13.0. This patch changes the scheme, keeping
all the SM flags consistent but using an offset.

Fixes: https://github.com/llvm/llvm-project/issues/159088</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Copy loaded images into managed storage (#158748)</title>
<updated>2025-09-16T13:57:28+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2025-09-16T13:57:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=e7101dac9cbdad08696a05b4b73ed76c20a6f2fc'/>
<id>e7101dac9cbdad08696a05b4b73ed76c20a6f2fc</id>
<content type='text'>
Summary:
Currently we have this `__tgt_device_image` indirection which just takes
a reference to some pointers. This was all find and good when the only
usage of this was from a section of GPU code that came from an ELF
constant section. However, we have expanded beyond that and now need to
worry about managing lifetimes. We have code that references the image
even after it was loaded internally. This patch changes the
implementation to instaed copy the memory buffer and manage it locally.

This PR reworks the JIT and other image handling to directly manage its
own memory. We now don't need to duplicate this behavior externally at
the Offload API level. Also we actually free these if the user unloads
them.

Upside, less likely to crash and burn. Downside, more latency when
loading an image.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
Currently we have this `__tgt_device_image` indirection which just takes
a reference to some pointers. This was all find and good when the only
usage of this was from a section of GPU code that came from an ELF
constant section. However, we have expanded beyond that and now need to
worry about managing lifetimes. We have code that references the image
even after it was loaded internally. This patch changes the
implementation to instaed copy the memory buffer and manage it locally.

This PR reworks the JIT and other image handling to directly manage its
own memory. We now don't need to duplicate this behavior externally at
the Offload API level. Also we actually free these if the user unloads
them.

Upside, less likely to crash and burn. Downside, more latency when
loading an image.</pre>
</div>
</content>
</entry>
</feed>
