path: root/offload/liboffload/src/OffloadImpl.cpp
Age | Commit message | Author
2025-07-02 | [Offload] Add `MAX_WORK_GROUP_SIZE` device info query (#143718) | Ross Brunton
This adds a new device info query for the maximum workgroup/block size for each dimension.
2025-06-30 | [Offload] Refactor device/platform info queries (#146345) | Ross Brunton
This makes several small changes to how the platform and device info queries are handled:
* ReturnHelper has been replaced with InfoWriter which is more explicit in how it is invoked.
* InfoWriter consumes `llvm::Expected` rather than values directly, and will early exit if it returns an error.
* As a result of the above, `GetInfoString` now correctly returns errors rather than empty strings.
* The host device now has its own dedicated "getInfo" function rather than being checked in multiple places.
2025-06-30 | [Offload] Implement `olShutDown` (#144055) | Ross Brunton
`olShutDown` was not properly calling deinit on the platforms, resulting in random segfaults on AMD devices. As part of this, `olInit` and `olShutDown` now alloc and free the offload context rather than it being static. This allows `olShutDown` to be called within a destructor of a static object (like the tests do) without having to worry about destructor ordering.
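The ownership change described above can be sketched as follows. This is a minimal illustration, not the liboffload code: `DemoContext`, `demoInit`, `demoShutDown` and `demoIsInitialized` are hypothetical names, and the platform count is a placeholder.

```cpp
#include <cassert>

// Hypothetical global state; the real context holds platforms, devices, etc.
struct DemoContext {
  int NumPlatforms = 0;
};

// A raw pointer has no destructor, so nothing runs for it at static teardown.
// That is what makes it safe to call the shutdown function from the
// destructor of some other static object, whatever the destruction order.
static DemoContext *Context = nullptr;

bool demoInit() {
  if (Context)
    return false; // already initialised in this sketch
  Context = new DemoContext();
  Context->NumPlatforms = 2; // placeholder: pretend two platforms were found
  return true;
}

bool demoShutDown() {
  if (!Context)
    return false; // nothing to tear down
  delete Context; // a real implementation would deinit each platform first
  Context = nullptr;
  return true;
}

bool demoIsInitialized() { return Context != nullptr; }
```

A function-local or namespace-scope `static DemoContext` object would instead be destroyed at an unspecified point relative to other static objects, which is the ordering problem the commit message mentions.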
2025-06-27 | [Offload] Store device info tree in device handle (#145913) | Ross Brunton
Rather than creating a new device info tree for each call to `olGetDeviceInfo`, we instead do it on device initialisation. As well as improving performance, this fixes a few lifetime issues with returned strings. This does unfortunately mean that device information is immutable, but hopefully that shouldn't be a problem for any queries we want to implement. This also meant allowing offload initialization to fail, which it can now do.
2025-06-25 | [Offload] Add an `unloadBinary` interface to PluginInterface (#143873) | Ross Brunton
This allows removal of a specific Image from a Device, rather than requiring all image data to outlive the device they were created for. This is required for `ol_program_handle_t`s, which now specify the lifetime of the buffer used to create the program.
2025-06-24 | [Offload] Properly report errors when jit compiling (#145498) | Ross Brunton
Previously, if a binary failed to load due to failures when jit compiling, the function would return success with nullptr. Now it returns a new plugin error, `COMPILE_FAILURE`.
2025-06-20 | [Offload] Add type information to device info nodes (#144535) | Ross Brunton
Rather than being "stringly typed", store values as a std::variant that can hold various types. This means that liboffload doesn't have to do any string parsing for integer/bool device info keys.
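As a rough illustration of the idea, a typed info node might look like the sketch below. `DemoInfoNode` and `demoGetUInt` are hypothetical names, and the set of alternatives in the variant is an assumption rather than the list used by liboffload.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <variant>

// Hypothetical stand-in for a device info node.
struct DemoInfoNode {
  std::string Key;
  // The value keeps its real type instead of being flattened to a string.
  std::variant<std::monostate, std::uint64_t, bool, std::string> Value;
};

// Reads an integer property directly, with no string parsing involved.
// Returns false if the node does not hold an integer.
bool demoGetUInt(const DemoInfoNode &Node, std::uint64_t &Out) {
  if (const auto *V = std::get_if<std::uint64_t>(&Node.Value)) {
    Out = *V;
    return true;
  }
  return false;
}
```

A caller that asks for an integer key on a string-valued node gets a clean failure instead of a parse error, which is the main practical benefit over a "stringly typed" store.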
2025-06-20 | [Offload] Check for initialization (#144370) | Ross Brunton
All entry points (except olInit) now check that offload has been initialized. If not, a new `OL_ERRC_UNINITIALIZED` error is returned.
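The guard pattern can be sketched like this. Everything here is hypothetical (`DemoErrc`, `demoGuardedInit`, `demoGetPlatformCount`); only the idea of returning a dedicated "uninitialized" error comes from the commit message.

```cpp
#include <cassert>

// Hypothetical error codes for the sketch.
enum class DemoErrc { Success, Uninitialized };

static bool DemoInitialized = false;

// The init entry point is the only one that skips the guard.
DemoErrc demoGuardedInit() {
  DemoInitialized = true;
  return DemoErrc::Success;
}

// Every other entry point starts with the same check.
DemoErrc demoGetPlatformCount(unsigned &Count) {
  if (!DemoInitialized)
    return DemoErrc::Uninitialized;
  Count = 1; // placeholder value for the sketch
  return DemoErrc::Success;
}
```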
2025-06-19 | [Offload] Move (most) global state to an `OffloadContext` struct (#144494) | Ross Brunton
Rather than having a number of static local variables, we now use a single `OffloadContext` struct to store global state. This is initialised by `olInit`, but is never deleted (de-initialization of Offload isn't yet implemented). The error reporting mechanism has not been moved to the struct, since that's going to cause issues with teardown (error messages must outlive liboffload).
2025-06-13 | [Offload] Replace device info queue with a tree (#144050) | Ross Brunton
Previously, device info was returned as a queue with each element having a "Level" field indicating its nesting level. This replaces this queue with a more traditional tree-like structure. This should not result in a change to the output of `llvm-offload-device-info`.
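To make the difference between the two shapes concrete, here is a small sketch that rebuilds a tree from a level-annotated flat list. `DemoFlatEntry`, `DemoTreeNode` and `demoBuildTree` are hypothetical names; the plugin code does not necessarily perform this conversion, since the commit replaces the queue outright.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Old shape (hypothetical): a flat list where Level encodes nesting depth.
struct DemoFlatEntry {
  std::string Key;
  std::size_t Level;
};

// New shape (hypothetical): each node owns its children.
struct DemoTreeNode {
  std::string Key;
  std::vector<DemoTreeNode> Children;
};

// Rebuilds the tree from the flat list. Assumes the first entry has Level 0
// and that Level never increases by more than one between entries.
DemoTreeNode demoBuildTree(const std::vector<DemoFlatEntry> &Flat) {
  DemoTreeNode Root{"root", {}};
  for (const DemoFlatEntry &E : Flat) {
    DemoTreeNode *Parent = &Root;
    // Descend Level times, always into the most recently added child.
    for (std::size_t I = 0; I < E.Level; ++I)
      Parent = &Parent->Children.back();
    Parent->Children.push_back(DemoTreeNode{E.Key, {}});
  }
  return Root;
}
```

With the tree form, a printer can recurse over `Children` and derive indentation from the recursion depth, so the rendered output can stay the same as with the explicit `Level` field.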
2025-06-12 | [Offload] Add `ol_dimensions_t` and convert ranges from size_t -> uint32_t (#143901) | Ross Brunton
This is a three-element x, y, z vector of `uint32_t` that can be used anywhere a 3D vector is required. This ensures that all vectors across liboffload are the same and don't require any resizing/reordering dances.
2025-06-06 | [Offload] Make olMemcpy src parameter const (#143161) | Callum Fare
2025-05-29 | [Offload] Fix Error checking (#141939) | Ross Brunton
All errors must be checked - this includes the local variable we were using to increase the lifetime of `Res`. As we were not explicitly checking it, it resulted in an `abort` in debug builds.
2025-05-28 | [Offload] Add specifier for the host type (#141635) | Joseph Huber
Summary: We use this special type to indicate a host value. This will be refined later, but for now it's used as a stand-in device for transfers and queues. It needs a special kind because it is not a device target like the others, so we need to differentiate it from the CPU and GPU types. Fixes: https://github.com/llvm/llvm-project/issues/141436
2025-05-27 | [Offload] Fix segfault when looking for host device name (#141632) | Joseph Huber
Summary: This is done using the generic device info pointer, but no such thing exists for the host device, leading to a segfault. This patch fixes that for now, but in the future we should probably be more careful in general about handling the possibility that the handle is null everywhere. Fixes: https://github.com/llvm/llvm-project/issues/141434
2025-05-27 | [Offload] Use llvm::Error throughout liboffload internals (#140879) | Ross Brunton
This removes the `ol_impl_result_t` helper class, replacing it with `llvm::Error`. In addition, some internal functions that returned `ol_errc_t` now return `llvm::Error` (with a fancy message).
2025-05-20 | [Offload] Use new error code handling mechanism and lower-case messages (#139275) | Ross Brunton
[Offload] Use new error code handling mechanism
This removes the old ErrorCode-less error method and requires every user to provide a concrete error code. All calls have been updated. In addition, for consistency with error messages elsewhere in LLVM, all messages have been made to start lower case.
2025-05-02 | [Offload] Ensure all `llvm::Error`s are handled (#137339) | Ross Brunton
`llvm::Error`s containing errors must be explicitly handled or an assert will be raised. With this change, `ol_impl_result_t` can accept and consume an `llvm::Error` for errors raised by PluginInterface that have multiple causes and other places now call `llvm::consumeError`. Note that there is currently no facility for PluginInterface to communicate exact error codes, but the constructor is designed in such a way that it can be easily added later. This MR is to convert a crash into an error code. A new test was added, however due to the aforementioned issue with error codes, it does not pass and instead is marked as a skip.
2025-04-30 | [Offload] Adding missing Offload unit tests for event entry points (#137315) | Callum Fare
A couple of liboffload entry points were missed out from the tests, and unsurprisingly a crash in one of them made it in. Add the tests and fix the unchecked error in `olDestroyEvent`.
2025-04-29 | [Offload] Add check-offload-unit for liboffload unittests (#137312) | Callum Fare
Adds a `check-offload-unit` target for running the liboffload unit test suite. This unit test binary runs the tests for every available device. This can optionally be filtered to devices from a single platform, but the check target runs on everything. The target is not part of `check-offload` and does not get propagated to the top level build. I'm not sure if either of these things are desirable, but I'm happy to look into it if we want. Also remove the `offload/unittests/Plugins` test as it's dead code and doesn't build.
2025-04-22 | [Offload] Implement the remaining initial Offload API (#122106) | Callum Fare
Implement the complete initial version of the Offload API, to the extent that is usable for simple offloading programs. Tested with a basic SYCL program.

As far as possible, these are simple wrappers over existing functionality in the plugins.
* Allocating and freeing memory (host, device, shared).
* Creating a program
* Creating a queue (wrapper over asynchronous stream resource)
* Enqueuing memcpy operations
* Enqueuing kernel executions
* Waiting on (optional) output events from the enqueue operations
* Waiting on a queue to finish

Objects created with the API have reference counting semantics to handle their lifetime. They are created with an initial reference count of 1, which can be incremented and decremented with retain and release functions. They are freed when their reference count reaches 0. Platform and device objects are not reference counted, as they are expected to persist as long as the library is in use, and it's not meaningful for users to create or destroy them.

Tests have been added to `offload.unittests`, including device code for testing program and kernel related functionality.

The API should still be considered unstable and it's very likely we will need to change the existing entry points.
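The reference counting rules described above can be sketched in a few lines. This is an illustration under assumptions: `DemoObject`, `demoCreate`, `demoRetain` and `demoRelease` are hypothetical names, and the `DemoLiveObjects` counter exists only so the sketch can be checked.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical API object; real handles are opaque to the user.
struct DemoObject {
  std::atomic<std::uint32_t> RefCount{1}; // objects start with one reference
};

// Bookkeeping for the sketch only, to observe when objects are freed.
static int DemoLiveObjects = 0;

DemoObject *demoCreate() {
  ++DemoLiveObjects;
  return new DemoObject();
}

void demoRetain(DemoObject *Obj) { Obj->RefCount.fetch_add(1); }

// Returns true when this call dropped the last reference and freed the object.
bool demoRelease(DemoObject *Obj) {
  assert(Obj->RefCount.load() > 0 && "release on a dead object");
  // fetch_sub returns the previous value, so 1 means the count is now 0.
  if (Obj->RefCount.fetch_sub(1) != 1)
    return false;
  delete Obj;
  --DemoLiveObjects;
  return true;
}
```

Each retain must be balanced by a release, plus one release for the reference handed out at creation. Platform and device handles would have no such functions, matching the note that they are not reference counted.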
2024-12-05 | Reland #118503: [Offload] Introduce offload-tblgen and initial new API implementation (#118614) | Callum Fare
Reland #118503. Added a fix for builds with `-DBUILD_SHARED_LIBS=ON` (see last commit). Otherwise the changes are identical.

---

### New API

Previous discussions at the LLVM/Offload meeting have brought up the need for a new API for exposing the functionality of the plugins. This change introduces a very small subset of a new API, which is primarily for testing the offload tooling and demonstrating how a new API can fit into the existing code base without being too disruptive. Exact designs for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to implement device discovery for Unified Runtime and SYCL. This means that the `urinfo` and `sycl-ls` tools can be used on top of Offload. A (rough) implementation of a Unified Runtime adapter (aka plugin) for Offload is available [here](https://github.com/callumfare/unified-runtime/tree/offload_adapter). Our intention is to maintain this and use it to implement and test Offload API changes with SYCL.

### Demoing the new API

```sh
# From the runtime build directory
$ ninja LibomptUnitTests
$ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests
```

### Open questions and future work

* Only some of the available device info is exposed, and not all the possible device queries needed for SYCL are implemented by the plugins. A sensible next step would be to refactor and extend the existing device info queries in the plugins. The existing info queries are all strings, but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new API directly, and the higher level code on top of it could be made generic, but this is more of a long-term possibility.
2024-12-03 | Revert "Reland of #108413: [Offload] Introduce offload-tblgen and initial new API implementation" (#118541) | Jan Patrick Lehr
Reverts llvm/llvm-project#118503
Broke bot https://lab.llvm.org/staging/#/builders/131/builds/9701/steps/5/logs/stdio
2024-12-03 | Reland of #108413: [Offload] Introduce offload-tblgen and initial new API implementation (#118503) | Callum Fare
This is another attempt to reland the changes from #108413. The previous two attempts introduced regressions and were reverted. This PR has been more thoroughly tested with various configurations so shouldn't cause any problems this time. If anyone is aware of any likely remaining problems then please let me know. The changes are identical other than the fixes contained in the last 5 commits.
2024-11-28 | Revert "Reland #2 - [Offload] Introduce offload-tblgen and initial new API implementation (#108413. #117704)" (#117995) | Jan Patrick Lehr
Reverts llvm/llvm-project#117894
Buildbot failures in OpenMP/Offload bots. https://lab.llvm.org/buildbot/#/builders/30/builds/11193
2024-11-28 | Reland #2 - [Offload] Introduce offload-tblgen and initial new API implementation (#108413. #117704) (#117894) | Callum Fare
Relands #117704, which relanded changes from #108413 - this was reverted due to build issues. The new offload library did not build with `LIBOMPTARGET_OMPT_SUPPORT` enabled, which was not picked up by pre-merge testing. The last commit contains the fix; everything else is otherwise identical to the approved PR.
2024-11-27 | Revert "Reland - [Offload] Introduce offload-tblgen and initial new API implementation (#108413) (#117704)" | Fraser Cormack
This reverts commit c979ec05642f292737d250c6682d85ed49bc7b6e. This showed failures in the post-merge CI.
2024-11-27 | Reland - [Offload] Introduce offload-tblgen and initial new API implementation (#108413) (#117704) | Callum Fare
Relands changes from #108413 - this was reverted due to build issues. The problem was just that the `offload-tblgen` tool was behind recent changes to tablegen that ensure `const` records. This has been fixed and the PR is otherwise identical.
2024-11-25 | Revert "[Offload] Introduce offload-tblgen and initial new API implementation (#108413)" | Joseph Huber
This reverts commit 8a2311c4bf9993230e37dc20b57973dc917f2338.
2024-11-25 | [Offload] Introduce offload-tblgen and initial new API implementation (#108413) | Callum Fare
Introduce `offload-tblgen` and an initial implementation of a subset of the new API. The tablegen files are intended to be the single source of truth for the new API, with the header files, documentation, and other bits of source all automatically generated.

**TODO** (based on review feedback so far):
- [x] Check in the generated headers
- [x] Add an `offload-generate` target to trigger the generation rather than building them every time
- [x] Decide how error handling should work
- [x] Finish up new error handling implementation
- [x] Decide naming convention
- [x] Add testing for the new API
- [x] Add tablegen specific testing
- [x] clang-tidy and use llvm:: types when possible
- [x] Add optional code location arguments
- [x] Avoid multiple returns from one function

### offload-tblgen

See the included [README](https://github.com/callumfare/llvm-project/blob/d80db06491d85444bb6f7e59d8068a22cef3a6b4/offload/new-api/API/README.md) for more information on how the API definition and generation works. I'm happy to answer any questions about it and plan to walk through it in a future LLVM Offload call. It should be noted that struct definitions have not been fully implemented/tested as they aren't used by the initial API definitions, but finishing that off in the future shouldn't be too much work.

The tablegen tooling has been designed to be easily extended with new backends, using the classes in `RecordTypes.hpp` to abstract over the tablegen records.

### New API

Previous discussions at the LLVM/Offload meeting have brought up the need for a new API for exposing the functionality of the plugins. This change introduces a very small subset of a new API, which is primarily for testing the offload tooling and demonstrating how a new API can fit into the existing code base without being too disruptive. Exact designs for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to implement device discovery for Unified Runtime and SYCL. This means that the `urinfo` and `sycl-ls` tools can be used on top of Offload. A (rough) implementation of a Unified Runtime adapter (aka plugin) for Offload is available [here](https://github.com/callumfare/unified-runtime/tree/offload_adapter). Our intention is to maintain this and use it to implement and test Offload API changes with SYCL.

### Demoing the new API

```sh
$ git clone -b offload_adapter https://github.com/callumfare/unified-runtime.git
$ cd unified-runtime
$ mkdir build
$ cd build
$ cmake .. -GNinja -DUR_BUILD_ADAPTER_OFFLOAD=ON \
    -DUR_OFFLOAD_INSTALL_DIR=<offload build dir containing liboffload_new.so> \
    -DUR_OFFLOAD_INCLUDE_DIR=<offload build dir containing 'offload' headers directory>
$ ninja urinfo

export LD_LIBRARY_PATH=<offload build dir containing offload plugin libraries>
$ UR_ADAPTERS_FORCE_LOAD=$PWD/lib/libur_adapter_offload.so ./bin/urinfo
[cuda:gpu][cuda:0] CUDA, NVIDIA GeForce GT 1030 [12030]

# Demo with tracing
$ OFFLOAD_TRACE=1 UR_ADAPTERS_FORCE_LOAD=$PWD/lib/libur_adapter_offload.so ./bin/urinfo
---> offloadPlatformGet(.NumEntries = 0, .phPlatforms = {}, .pNumPlatforms = 0x7ffd05e4d6e0 (2))-> OFFLOAD_RESULT_SUCCESS
---> offloadPlatformGet(.NumEntries = 2, .phPlatforms = {0x564bf4040220, 0x564bf4040240}, .pNumPlatforms = nullptr)-> OFFLOAD_RESULT_SUCCESS
...
```

### Open questions and future work

* The new API is implemented in a separate library (`liboffload_new.so`). It could just as easily be part of the existing `libomptarget` library - I have no strong feelings on which is better.
* Only some of the available device info is exposed, and not all the possible device queries needed for SYCL are implemented by the plugins. A sensible next step would be to refactor and extend the existing device info queries in the plugins. The existing info queries are all strings, but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new API directly, and the higher level code on top of it could be made generic, but this is more of a long-term possibility.