llvm-project.git/offload/plugins-nextgen/cuda/src/rtl.cpp, branch users/boomanaiden154/main.ci-make-premerge-uploadwrite-comments

[OpenMP] Fix tests relying on the heap size variable

2025-11-06T19:00:26+00:00

Summary:
I made that an unimplemented error, but forgot that it was used for this
environment variable.

[Offload] Remove handling for device memory pool (#163629)

2025-11-06T16:15:18+00:00

Summary:
This was a lot of code that was only used for upstream LLVM builds of
AMDGPU offloading. We have a generic and fast `malloc` in `libc` now, so
just use that. This simplifies the code. It can be added back if we start
providing alternate forms, but I don't think there's a single use-case
that would justify it yet.

[Offload] Add device UID (#164391)

2025-11-04T19:15:47+00:00

Introduced in OpenMP 6.0, the device UID shall be a unique identifier of
a device on a given system. (Not necessarily a UUID.) Since it is not
guaranteed that the (U)UIDs defined by the device vendor libraries, such
as HSA, do not overlap with those of other vendors, the device UIDs in
offload are always combined with the offload plugin name. In case the
vendor library does not specify any device UID for a given device, we
fall back to the offload-internal device ID.
The device UID can be retrieved using the `llvm-offload-device-info`
tool.
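
A minimal sketch of the naming scheme described above, not the actual
Offload implementation. The helper name `makeDeviceUid`, its parameters,
and the `-` separator are assumptions made for illustration only:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical helper: prefixes the plugin name so that identifiers coming
// from different vendor libraries cannot collide with each other.
std::string makeDeviceUid(const std::string &PluginName,
                          const std::optional<std::string> &VendorUid,
                          int32_t InternalDeviceId) {
  assert(!PluginName.empty() && "plugin name must not be empty");

  std::string Suffix;
  if (VendorUid && !VendorUid->empty()) {
    // The vendor library reported a (U)UID, so use it as-is.
    Suffix = *VendorUid;
  } else {
    // No vendor UID available: fall back to the offload-internal device ID.
    Suffix = std::to_string(InternalDeviceId);
  }

  // The '-' separator is an assumption of this sketch.
  return PluginName + "-" + Suffix;
}
```

With this sketch, a device whose vendor library reports no UID would still
get a per-plugin unique name derived from its internal ID.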

[OFFLOAD] Remove unused init_device_info plugin interface (#162650)

2025-10-09T13:38:24+00:00

This was used for the old interop code. It's dead code after #143491.

[OFFLOAD] Restore interop functionality (#161429)

2025-10-02T19:48:31+00:00

This implements two pieces to restore the interop functionality (that I
broke) when the 6.0 interfaces were added:

* A set of wrappers that support the old interfaces on top of the new
ones
* The same level of interop support for the CUDA and AMD plugins

[Offload] Use Error for allocating/deallocating in plugins (#160811)

2025-09-26T18:50:00+00:00

Co-authored-by: Joseph Huber

[Offload] Remove non-blocking allocation type (#159851)

2025-09-20T14:07:14+00:00

Summary:
This was originally added as a hack to work around CUDA's limitation
on allocation. The `libc` implementation now isn't even used for CUDA, so
this code is never hit. Even in this case, this code never truly worked.

A true solution would be to use CUDA's virtual memory API instead to
allocate 2MiB slabs independently from the normal memory management
done in the stream.

[OpenMP][NFC] Clean up a bunch of warnings and clang-tidy messages (#159831)

2025-09-19T19:09:33+00:00

Summary:
I made the GPU flags accept more of the default LLVM warnings, which
triggered some new cases. Clean those up and fix some other ones while
I'm at it.

[LLVM] Fix offload and update CUDA ABI for all SM values (#159354)

2025-09-17T19:39:39+00:00

Summary:
Turns out the new CUDA ABI now applies retroactively to all the other
SMs if you upgrade to CUDA 13.0. This patch changes the scheme, keeping
all the SM flags consistent but using an offset.

Fixes: https://github.com/llvm/llvm-project/issues/159088

[Offload] Copy loaded images into managed storage (#158748)

2025-09-16T13:57:28+00:00

Summary:
Currently we have this `__tgt_device_image` indirection which just takes
a reference to some pointers. This was all fine and good when the only
usage of this was from a section of GPU code that came from an ELF
constant section. However, we have expanded beyond that and now need to
worry about managing lifetimes. We have code that references the image
even after it was loaded internally. This patch changes the
implementation to instead copy the memory buffer and manage it locally.

This PR reworks the JIT and other image handling to directly manage its
own memory. We now don't need to duplicate this behavior externally at
the Offload API level. Also we actually free these if the user unloads
them.

Upside, less likely to crash and burn. Downside, more latency when
loading an image.
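
The lifetime change can be illustrated with a small sketch. This is not
the code from the patch: `BorrowedImage` and `OwnedImage` are hypothetical
stand-ins that only show the idea of copying the bytes once at load time
instead of keeping pointers into memory owned by someone else:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical descriptor that only borrows a byte range it does not own.
struct BorrowedImage {
  const void *Start;
  const void *End;
};

// Hypothetical owning wrapper: copies the image bytes on construction, so
// later uses no longer depend on the lifetime of the original buffer.
class OwnedImage {
  std::vector<unsigned char> Storage;

public:
  explicit OwnedImage(const BorrowedImage &Image) {
    const auto *Begin = static_cast<const unsigned char *>(Image.Start);
    const auto *Finish = static_cast<const unsigned char *>(Image.End);
    assert(Begin <= Finish && "image range must not be reversed");
    // This copy is the extra load-time cost mentioned in the summary.
    Storage.assign(Begin, Finish);
  }

  const unsigned char *data() const { return Storage.data(); }
  std::size_t size() const { return Storage.size(); }
};
```

The copy is released when the `OwnedImage` is destroyed, which mirrors the
point about actually freeing images once the user unloads them.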