llvm-project.git/offload/liboffload/API, branch main

[Offload] Add device info for shared memory (#167817)

2025-11-13T19:00:12+00:00

[Offload] Add device UID (#164391)

2025-11-04T19:15:47+00:00

Introduced in OpenMP 6.0, the device UID shall be a unique identifier of
a device on a given system. (Not necessarily a UUID.) Since it is not
guaranteed that the (U)UIDs defined by the device vendor libraries, such
as HSA, do not overlap with those of other vendors, the device UIDs in
offload are always combined with the offload plugin name. In case the
vendor library does not specify any device UID for a given device, we
fall back to the offload-internal device ID.
The device UID can be retrieved using the `llvm-offload-device-info`
tool.
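
For illustration only, the combination rule described above could look roughly like the sketch below. The helper name `makeDeviceUid`, the `-` separator, and the parameter types are assumptions made for this example and are not taken from the patch.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical helper (not the liboffload implementation): build a device UID
// that is unique across plugins on one system.
std::string makeDeviceUid(const std::string &PluginName,
                          const std::optional<std::string> &VendorUid,
                          std::int32_t DeviceId) {
  // Prefixing with the plugin name keeps identical vendor UIDs that come
  // from different vendor libraries from colliding.
  // If the vendor library reports no UID, fall back to the internal device ID.
  // The "-" separator is an assumption of this sketch.
  return PluginName + "-" + (VendorUid ? *VendorUid : std::to_string(DeviceId));
}
```

With these assumptions, a device whose vendor library reports no UID would be identified by the plugin name followed by its internal device ID.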

[Offload] Lazily initialize platforms in the Offloading API (#163272)

2025-10-14T14:35:53+00:00

Summary:
The Offloading library wraps around the underlying plugins. The problem
is that we currently initialize all plugins we find, even if they are
not needed for the program. This is very expensive for trivial uses, as
fully heterogeneous usage is quite rare. In practice this means that you
will always pay a 200 ms penalty for having CUDA installed.

This patch changes the behavior to provide accessors into the plugins
and devices that allow them to be initialized lazily. We use a
once_flag; this should properly take a fast-path check while still
blocking on concurrent use.
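
A minimal sketch of the once_flag pattern, assuming a generic wrapper rather than the actual accessors added by this patch. `LazyValue` and its members are hypothetical names.

```cpp
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical wrapper (not the liboffload code): the initializer runs on the
// first call to get() and never again.
template <typename T> class LazyValue {
public:
  explicit LazyValue(std::function<T()> Fn) : Init(std::move(Fn)) {}

  T &get() {
    // std::call_once runs the callable exactly once. Concurrent callers block
    // until initialization finishes; later calls only check the flag.
    std::call_once(Flag, [this] { Value = Init(); });
    return Value;
  }

private:
  std::once_flag Flag;
  std::function<T()> Init;
  T Value{};
};
```

In this sketch an expensive plugin setup would be passed as the initializer, so a program that never touches that plugin never pays for it.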

Making full use of this will require a way to filter platforms more
specifically. I'm still considering what this would look like as an API:
either an extra iterate function that takes a callback on the platform,
or a helper to find all the devices that can run a given image.
Maybe both?

Fixes: https://github.com/llvm/llvm-project/issues/159636

[Offload] Add olGetMemInfo with platform-less API (#159581)

2025-09-24T11:17:57+00:00

[Offload] Re-allocate overlapping memory (#159567)

2025-09-23T12:59:52+00:00

If olMemAlloc happens to return memory that overlaps an existing
allocation (possibly made by another device on another platform), that
allocation is now discarded and a new one is made.

A new `AllocBases` vector is now available, which is an ordered list
of allocation start addresses.
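
As a rough sketch of why an ordered list of start addresses helps, the overlap check can be done with a binary search. The `AllocTracker` type and its `Sizes` map are assumptions made for this example; only the `AllocBases` name comes from the patch, and the real bookkeeping may differ.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <unordered_map>
#include <vector>

// Hypothetical tracker (not the liboffload data structure).
struct AllocTracker {
  std::vector<std::uintptr_t> AllocBases; // kept sorted by start address
  std::unordered_map<std::uintptr_t, std::size_t> Sizes; // assumed side table

  bool overlaps(std::uintptr_t Base, std::size_t Size) const {
    // First tracked allocation that starts at or after Base.
    auto It = std::lower_bound(AllocBases.begin(), AllocBases.end(), Base);
    // The next allocation starts inside the new range.
    if (It != AllocBases.end() && *It < Base + Size)
      return true;
    // The previous allocation extends into the new range.
    if (It != AllocBases.begin()) {
      std::uintptr_t Prev = *std::prev(It);
      if (Prev + Sizes.at(Prev) > Base)
        return true;
    }
    return false;
  }

  // Record the range if it is free; return false if the caller should
  // discard this allocation and try again.
  bool insert(std::uintptr_t Base, std::size_t Size) {
    if (overlaps(Base, Size))
      return false;
    AllocBases.insert(
        std::upper_bound(AllocBases.begin(), AllocBases.end(), Base), Base);
    Sizes[Base] = Size;
    return true;
  }
};
```

Because tracked ranges never overlap each other, checking only the neighbours on either side of the new base is sufficient in this sketch.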

[Offload] Implement 'olIsValidBinary' in offload and clean up (#159658)

2025-09-19T17:15:57+00:00

Summary:
This exposes the 'isDeviceCompatible' routine for checking if a binary
*can* be loaded. This is useful if people don't want to consume errors
everywhere when figuring out which image to load on which device.

I don't know if this is a good name; I was thinking of something like
`olIsCompatible`. Let me know what you think.

Long term I'd like to be able to do something similar to what OpenMP
does, where we only initialize devices if we need them. That support
will be needed if we want this to be more generic.

[Offload] Add `OL_DEVICE_INFO_MAX_WORK_SIZE[_PER_DIMENSION]` (#155823)

2025-08-29T08:39:18+00:00

This is the total number of work items that the device supports (the
equivalent work group properties are for only a single work group).

[Offload] Add PRODUCT_NAME device info (#155632)

2025-08-28T14:16:17+00:00

On my system, this will be "Radeon RX 7900 GRE" rather than "gfx1100". For Nvidia, the product name and device name are identical.

[Offload] Fix definition of olMemFill (#154947)

2025-08-22T13:48:00+00:00

Fix a regression introduced by #154102: the way offload-tblgen handles
names has changed.

[Offload] Implement olMemFill (#154102)

2025-08-22T13:31:16+00:00

Implement olMemFill to support filling device memory with
arbitrary-length patterns. AMDGPU support will be added in a follow-up PR.
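
To illustrate what a pattern fill means, here is a host-side sketch of the semantics. It is not the olMemFill implementation or its signature; `fillPattern` is a hypothetical name, and the rule that the fill size must be a multiple of the pattern size is an assumption of this example.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical host-side model of a pattern fill (not the olMemFill API).
// Returns false if the sizes are inconsistent under this sketch's assumption
// that FillSize must be a whole number of patterns.
bool fillPattern(void *Dst, const void *Pattern, std::size_t PatternSize,
                 std::size_t FillSize) {
  if (PatternSize == 0 || FillSize % PatternSize != 0)
    return false;
  auto *Out = static_cast<unsigned char *>(Dst);
  // Copy the pattern back to back until the destination range is covered.
  for (std::size_t Off = 0; Off < FillSize; Off += PatternSize)
    std::memcpy(Out + Off, Pattern, PatternSize);
  return true;
}
```

A pattern of any byte length can be repeated this way, which is what distinguishes it from a memset-style fill that repeats a single byte.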