summaryrefslogtreecommitdiff
path: root/offload/libomptarget
AgeCommit message (Collapse)Author
2025-11-20[OFFLOAD] Add support for more fine grained debug messages control (#165416)Alex Duran
This PR introduces new debug macros that allow a more fined control of which debug message to output and introduce C++ stream style for debug messages. Changing existing messages (except a few that I changed for testing) will come in subsequent PRs. I also think that we should make debug enabling OpenMP agnostic but, for now, I prioritized maintaing the current libomptarget behavior for now, and we might need more changes further down the line as we we decouple libomptarget.
2025-11-18Revert "[OpenMP] Implement omp_get_uid_from_device() / ↵Robert Imschweiler
omp_get_device_from_uid()" (#168547) Reverts llvm/llvm-project#164392 due to fortran issues
2025-11-18[OpenMP] Implement omp_get_uid_from_device() / omp_get_device_from_uid() ↵Robert Imschweiler
(#164392) Use the implementation in libomptarget. If libomptarget is not available, always return the UID / device number of the host / the initial device.
2025-10-24[OFFLOAD] Remove weak from __kmpc_* calls and gather them in one header ↵Alex Duran
(#164613) Follow-up from #162652 --------- Co-authored-by: Michael Klemm <michael.klemm@amd.com>
2025-10-22[OpenMP] Adds omp_target_is_accessible routine (#138294)Nicole Aschenbrenner
Adds omp_target_is_accessible routine. Refactors common code from omp_target_is_present to work for both routines. --------- Co-authored-by: Shilei Tian <i@tianshilei.me>
2025-10-17[OFFLOAD] Interop fixes for Windows (#162652)Alex Duran
On Windows, for a reason I don't fully understand boolean bits get extra padding (even when asking for packed structures) in the structures that messes the offsets between the compiler and the runtime. Also, "weak" works differently on Windows than Linux (i.e., the "local" routine has preference) which causes it to crash as we don't really have an alternate implementation of __kmpc_omp_wait_deps. Given this, it doesn't make sense to mark it as "weak" for Linux either.
2025-10-12[Offload] Silence warning via maybe unused (NFC) (#163076)Jan Patrick Lehr
2025-10-02[OFFLOAD] Restore interop functionality (#161429)Alex Duran
This implements two pieces to restore the interop functionality (that I broke) when the 6.0 interfaces were added: * A set of wrappers that support the old interfaces on top of the new ones * The same level of interop support for the CUDA amd AMD plugins
2025-09-29[OpenMP][Offload] Support `PRIVATE | ATTACH` maps for ↵Abhinav Gaba
corresponding-pointer-initialization. (#160760) `PRIVATE | ATTACH` maps can be used to represent firstprivate pointers that should be initialized by doing doing the pointee's device address, if its lookup succeeds, or retain the original host pointee's address otherwise. With this, for a test like the following: ```f90 integer, pointer :: p(:) !$omp target map(p(1)) ... print*, p(1) !$omp end target ``` The codegen can look like: ```llvm ; maps for p: ; &p(1), &p(1), sizeof(p(1)), TO|FROM //(1) ; &ref_ptr(p), &p(1), sizeof(ref_ptr(p)), ATTACH //(2) ; &ref_ptr(p), &p(1), sizeof(ref_ptr(p)), PRIVATE|ATTACH|PARAM //(3) call... @__omp_outlined...(ptr %ref_ptr_of_p) ``` * `(1)` maps the pointee `p(1)`. * `(2)` attaches it to the (previously) mapped `ref_ptr(p)`, if present. It can be controlled via OpenMP 6.1's `attach(auto/always/never)` map-type modifiers. * `(3)` privatizes and initializes the local `ref_ptr(p)`, which gets passed in as the kernel argument `%ref_ptr_of_p`. Can be skipped if p is not referenced directly within the region. While similar mapping can be used for C/C++, it's more important/useful for Fortran as we can avoid creating another argument for passing the descriptor, and use that to initialize the private copy in the body of the kernel.
2025-09-19[Offload] Implement 'olIsValidBinary' in offload and clean up (#159658)Joseph Huber
Summary: This exposes the 'isDeviceCompatible' routine for checking if a binary *can* be loaded. This is useful if people don't want to consume errors everywhere when figuring out which image to put to what device. I don't know if this is a good name, I was thining like `olIsCompatible` or whatever. Let me know what you think. Long term I'd like to be able to do something similar to what OpenMP does where we can conditionally only initialize devices if we need them. That's going to be support needed if we want this to be more generic.
2025-09-15[Offload][OpenMP] Support shadow-pointer tracking for Fortran descriptors. ↵Abhinav Gaba
(#158370) This change adds support for saving full contents of attached Fortran descriptors, and not just their pointee address, in the shadow-pointer table. With this, we now support: * comparing full contents of descriptors to check whether a previous shadow-pointer entry is stale; * restoring the full contents of descriptors And with that, we can now use ATTACH map-types (added in #149036) for mapping Fortran pointer/allocatable arrays, and array-sections on them. e.g.: ```f90 integer, allocatable :: x(:) !$omp target enter data map(to: x(:)) ``` as: ``` void* addr_of_pointee = allocated(x) ? &x(1) : nullptr; int64_t sizeof_pointee = allocated(x) ? sizeof(x(:)) : 0 addr_of_pointee, addr_of_pointee, sizeof_pointee, TO addr_of_descriptor, addr_of_pointee, size_of_descriptor, ATTACH ```
2025-09-08[OpenMP] Move `__omp_rtl_data_environment' handling to OpenMP (#157182)Joseph Huber
Summary: This operation is done every time we load a binary, this behavior should be moved into OpenMP since it concerns an OpenMP specific data struct. This is a little messy, because ideally we should only be using public APIs, but more can be extracted later.
2025-08-28[Offload] Update LIBOMPTARGET_INFO text for `attach` map-type. (#155509)Abhinav Gaba
Also adds two debug dumps regarding pointer-attachment.
2025-08-17[Offload] Introduce ATTACH map-type support for pointer attachment. (#149036)Abhinav Gaba
This patch introduces libomptarget support for the ATTACH map-type, which can be used to implement OpenMP conditional compliant pointer attachment, based on whether the pointer/pointee is newly mapped on a given construct. For example, for the following: ```c int *p; #pragma omp target enter data map(p[1:10]) ``` The following maps can be emitted by clang: ``` (A) &p[0], &p[1], 10 * sizeof(p[1]), TO | FROM &p, &p[1], sizeof(p), ATTACH ``` Without this map-type, these two possible maps could be emitted by clang: ``` (B) &p[0], &p[1], 10 * sizeof(p[1]), TO | FROM (C) &p, &p[1], 10 * sizeof(p[1]), TO | FROM | PTR_AND_OBJ ```` (B) does not perform any pointer attachment, while (C) also maps the pointer p, which are both incorrect. In terms of implementation, maps with the ATTACH map-type are handled after all other maps have been processed, as it requires knowledge of which new allocations happened as part of the construct. As per OpenMP 5.0, an attachment should happen only when either the pointer or the pointee was newly mapped while handling the construct. Maps with ATTACH map-type-bit do not increase/decrease the ref-count. With OpenMP 6.1, `attach(always/never)` can be used to force/prevent attachment. For `attach(always)`, the compiler will insert the ALWAYS map-type, which would let libomptarget bypass the check about one of the pointer/pointee being new. With `attach(never)`, the ATTACH map will not be emitted at all. The size argument of the ATTACH map-type can specify values greater than `sizeof(void*)` which can be used to support pointer attachment on Fortran descriptors. Note that this also requires shadow-pointer tracking to also support them. That has not been implemented in this patch. This was worked upon in coordination with Ravi Narayanaswamy, who has since retired. Happy retirement, Ravi! --------- Co-authored-by: Alex Duran <alejandro.duran@intel.com>
2025-08-08[Offload] Make olLaunchKernel test thread safe (#149497)Ross Brunton
This sprinkles a few mutexes around the plugin interface so that the olLaunchKernel CTS test now passes when ran on multiple threads. Part of this also involved changing the interface for device synchronise so that it can optionally not free the underlying queue (which introduced a race condition in liboffload).
2025-08-06[OFFLOAD][OPENMP] 6.0 compatible interop interface (#143491)Alex Duran
The following patch introduces a new interop interface implementation with the following characteristics: * It supports the new 6.0 prefer_type specification * It supports both explicit objects (from interop constructs) and implicit objects (from variant calls). * Implements a per-thread reuse mechanism for implicit objects to reduce overheads. * It provides a plugin interface that allows selecting the supported interop types, and managing all the backend related interop operations (init, sync, ...). * It enables cooperation with the OpenMP runtime to allow progress on OpenMP synchronizations. * It cleanups some vendor/fr_id mismatchs from the current query routines. * It supports extension to define interop callbacks for library cleanup.
2025-06-29[OpenMP] Fix crash with duplicate mapping on target directive (#146136)Julian Brown
OpenMP allows duplicate mappings, i.e. in OpenMP 6.0, 7.9.6 "map Clause": Two list items of the map clauses on the same construct must not share original storage unless one of the following is true: they are the same list item [or other omitted reasons]" Duplicate mappings can arise as a result of user-defined mapper processing (which I think is a separate bug, and is not addressed here), but also in straightforward cases such as: #pragma omp target map(tofrom: s.mem[0:10]) map(tofrom: s.mem[0:10]) Both these cases cause crashes at runtime at present, due to an unfortunate interaction between reference counting behaviour and shadow pointer handling for blocks. This is what happens: 1. The member "s.mem" is copied to the target 2. A shadow pointer is created, modifying the pointer on the target 3. The member "s.mem" is copied to the target again 4. The previous shadow pointer metadata is still present, so the runtime doesn't modify the target pointer a second time. The fix is to disable step 3 if we've already done step 2 for a given block that has the "is new" flag set.
2025-06-11[OpenMP][Offload] Update the Logic for Configuring Auto Zero-Copy (#143638)Kewen12
Summary: Currently the Auto Zero-Copy is enabled by checking every initialized device to ensure that no dGPU is attached to an APU. However, an APU is designed to comprise a homogeneous set of GPUs, therefore, it should be sufficient to check any device for configuring Auto Zero-Copy. In this PR, it checks the first initialized device in the list. The changes in this PR are to clearly reflect the design and logic of enabling the feature for further improving the readibility.
2025-05-27[Offload][NFCI] Remove coupling to `omp` target for version scripting (#141637)Joseph Huber
Summary: This is a weird dependency on libomp just for testing if version scripts work. We shouldn't need to do this because LLVM already checks for this. I believe this should be available as well in standalone when we call `addLLVM` but I did not test that directly.
2025-05-20[Offload] Use new error code handling mechanism and lower-case messages ↵Ross Brunton
(#139275) [Offload] Use new error code handling mechanism This removes the old ErrorCode-less error method and requires every user to provide a concrete error code. All calls have been updated. In addition, for consistency with error messages elsewhere in LLVM, all messages have been made to start lower case.
2025-05-09[Offload] Do not load images from the same descriptor on the same device ↵Joseph Huber
(#139147) Summary: Right now we generally assume that we have one image per device. The binary descriptor represents a single 'compilation'. This means that each image is going to contain the same code built for different architectures when used through the OpenMP interface. This is problematic when we have cases where the same code will then be loaded multiple times (like wiht sm_80, sm_89 or the generic GFX ISAs). This patch is the quick and dirty slution, we just prevent this from happening at all. This means we use the first one we find, which might not be overly optimal, but it should be better than the alternative. Note that this does not affect shared library loads as it is per binary descriptor, not per device.
2025-03-28[OFFLOAD] Stricter enforcement of user offload disable (#133470)Alex
If user specifies offload is disabled (e.g., OMP_TARGET_OFFLOAD=disable), disable library almost completely. This reduces resources spent to a minimum and ensures all APIs behave as if the only available device is the host device. Currently some of the APIs behave as if there were devices avaible for offload even when under OMP_TARGET_OFFLOAD=disable. --------- Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-02-18[offload] Remove redundant checks in MappingInfoTy::lookupMapping (#127638)Krzysztof Parzyszek
Also add some clarifying comments.
2025-02-10[Offload][NFC] Rename `src/` -> `libomptarget/` (#126573)Joseph Huber
Summary: The name `src` is confusing when combined with the plugins and the newly added `liboffload`.