llvm-project.git/offload/plugins-nextgen/host, branch users/makslevental/ptr-dialectpython

[Offload] Remove handling for device memory pool (#163629)

2025-11-06T16:15:18+00:00

Summary:
This was a lot of code that was only used for upstream LLVM builds of
AMDGPU offloading. We have a generic and fast `malloc` in `libc` now, so
just use that. This simplifies the code. It can be added back if we start
providing alternate forms, but I don't think there's a single use case
that would justify it yet.

[OFFLOAD] Remove unused init_device_info plugin interface (#162650)

2025-10-09T13:38:24+00:00

This was used for the old interop code. It has been dead code since #143491.

[Offload] Use Error for allocating/deallocating in plugins (#160811)

2025-09-26T18:50:00+00:00

Co-authored-by: Joseph Huber

[Offload] Remove non-blocking allocation type (#159851)

2025-09-20T14:07:14+00:00

Summary:
This was originally added in as a hack to work around CUDA's limitation
on allocation. The `libc` implementation now isn't even used for CUDA so
this code is never hit. Even in this case, this code never truly worked.

A true solution would be to use CUDA's virtual memory API instead to
allocate 2MiB slabs independently from the normal memory management
done in the stream.

[Offload] Copy loaded images into managed storage (#158748)

2025-09-16T13:57:28+00:00

Summary:
Currently we have this `__tgt_device_image` indirection which just takes
a reference to some pointers. This was all fine and good when the only
usage of this was from a section of GPU code that came from an ELF
constant section. However, we have expanded beyond that and now need to
worry about managing lifetimes. We have code that references the image
even after it was loaded internally. This patch changes the
implementation to instead copy the memory buffer and manage it locally.

This PR reworks the JIT and other image handling to directly manage its
own memory. We now don't need to duplicate this behavior externally at
the Offload API level. Also we actually free these if the user unloads
them.

Upside, less likely to crash and burn. Downside, more latency when
loading an image.
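
As a rough illustration of the ownership change described above, the sketch below contrasts a non-owning reference with an owning copy. `ImageRef` and `OwnedImage` are hypothetical names chosen for this example; `ImageRef` is only a simplified stand-in for `__tgt_device_image`, and none of this is the actual plugin code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a non-owning image descriptor:
// it only points at bytes that someone else keeps alive.
struct ImageRef {
  const unsigned char *Start;
  const unsigned char *End;
};

// Hypothetical owning wrapper: copies the bytes once at load time so the
// image stays valid no matter what happens to the original buffer.
class OwnedImage {
  std::vector<unsigned char> Bytes;

public:
  explicit OwnedImage(const ImageRef &Ref) : Bytes(Ref.Start, Ref.End) {
    assert(Ref.Start <= Ref.End && "malformed image range");
  }

  const unsigned char *data() const { return Bytes.data(); }
  std::size_t size() const { return Bytes.size(); }
};
```

The copy is where the extra load latency comes from. In exchange, the storage is released when the `OwnedImage` is destroyed, which corresponds to the user unloading the image.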

[OpenMP] Move `__omp_rtl_data_environment` handling to OpenMP (#157182)

2025-09-08T14:58:38+00:00

Summary:
This operation is done every time we load a binary. This behavior should
be moved into OpenMP since it concerns an OpenMP-specific data struct.
This is a little messy, because ideally we should only be using public
APIs, but more can be extracted later.

[Offload] Implement olMemFill (#154102)

2025-08-22T13:31:16+00:00

Implement `olMemFill` to support filling device memory with
arbitrary-length patterns. AMDGPU support will be added in a follow-up PR.
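
To make "arbitrary length patterns" concrete, here is a minimal host-side sketch of the fill semantics. `fillWithPattern` is a hypothetical helper, not the `olMemFill` signature, and it operates on ordinary host memory. It also assumes that a trailing partial repetition is simply truncated, which the commit message does not specify.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not part of the Offload API): repeat a pattern of
// PatternSize bytes across Size bytes starting at Dst.
void fillWithPattern(void *Dst, std::size_t Size, const void *Pattern,
                     std::size_t PatternSize) {
  assert(PatternSize > 0 && "pattern must be non-empty");
  auto *Out = static_cast<unsigned char *>(Dst);
  std::size_t Offset = 0;
  while (Offset < Size) {
    // Assumption of this sketch: the last repetition is cut short if Size
    // is not a multiple of PatternSize.
    std::size_t Chunk = std::min(PatternSize, Size - Offset);
    std::memcpy(Out + Offset, Pattern, Chunk);
    Offset += Chunk;
  }
}
```

A one-byte pattern reduces to the familiar `memset` behavior; longer patterns cover cases such as filling a buffer with a repeated 3-byte or 12-byte value.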

[Offload] `OL_EVENT_INFO_IS_COMPLETE` (#153194)

2025-08-22T12:40:31+00:00

A simple info query for events that returns whether the event is
complete or not.

[Offload] Add olCalculateOptimalOccupancy (#142950)

2025-08-19T14:16:47+00:00

This is equivalent to `cuOccupancyMaxPotentialBlockSize`. It is currently
only implemented on CUDA; AMDGPU and Host return unsupported.

---------

Co-authored-by: Callum Fare

[Offload] Introduce dataFence plugin interface. (#153793)

2025-08-15T18:49:35+00:00

The purpose of this fence is to ensure that any `dataSubmit`s inserted
into a queue before a `dataFence` finish before any `dataSubmit`s
inserted after it begin.

This is a no-op for most queues, since they are in-order, and by design
any operations inserted into them occur in order.

But the interface is supposed to be functional for out-of-order queues.

The addition of the interface means that any operations that rely on
such ordering (like ATTACH map-type support in #149036) can invoke it,
without worrying about whether the underlying queue is in-order or
out-of-order.

Once a plugin supports out-of-order queues, the plugin can implement
this function, without requiring any change at the libomptarget level.
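
The ordering contract described above can be illustrated with a toy model. Everything in this sketch is an assumption made for illustration: `ToyQueue`, `InOrderQueue`, `OutOfOrderQueue`, `synchronize`, and `runWithFence` are hypothetical, and only the names `dataSubmit` and `dataFence` come from the commit message; the real plugin signatures differ. The out-of-order queue deliberately runs each batch in reverse to simulate reordering.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using Task = std::function<void()>;

// Hypothetical queue model, not the real plugin interface.
struct ToyQueue {
  virtual ~ToyQueue() = default;
  virtual void dataSubmit(Task T) = 0;
  virtual void dataFence() = 0;
  virtual void synchronize() = 0;
};

// In-order queue: operations already run in submission order, so the
// fence has nothing to do.
struct InOrderQueue : ToyQueue {
  std::vector<Task> Pending;
  void dataSubmit(Task T) override { Pending.push_back(std::move(T)); }
  void dataFence() override {} // no-op
  void synchronize() override {
    for (Task &T : Pending)
      T();
    Pending.clear();
  }
};

// Out-of-order queue: tasks within a batch may run in any order (reversed
// here to simulate that); a fence closes the current batch.
struct OutOfOrderQueue : ToyQueue {
  std::vector<std::vector<Task>> Batches = std::vector<std::vector<Task>>(1);
  void dataSubmit(Task T) override { Batches.back().push_back(std::move(T)); }
  void dataFence() override {
    if (!Batches.back().empty())
      Batches.emplace_back();
  }
  void synchronize() override {
    assert(!Batches.empty());
    for (auto &Batch : Batches)
      for (auto It = Batch.rbegin(); It != Batch.rend(); ++It)
        (*It)();
    Batches.clear();
    Batches.emplace_back();
  }
};

// Caller code is identical for both queue kinds.
std::string runWithFence(ToyQueue &Q) {
  std::string Log;
  Q.dataSubmit([&Log] { Log += 'a'; });
  Q.dataSubmit([&Log] { Log += 'b'; });
  Q.dataFence();
  Q.dataSubmit([&Log] { Log += 'c'; });
  Q.synchronize();
  return Log;
}
```

In this model the caller never checks which kind of queue it has: `a` and `b` may be reordered relative to each other on the out-of-order queue, but both always complete before `c` starts.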

---------

Co-authored-by: Alex Duran