<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/offload/unittests/OffloadAPI/device/olGetDeviceInfoSize.cpp, branch users/mingmingl-llvm/samplefdo-profile-format</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[Offload] Add `OL_DEVICE_INFO_MAX_WORK_SIZE[_PER_DIMENSION]` (#155823)</title>
<updated>2025-08-29T08:39:18+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-08-29T08:39:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=ffb756dff2d2a7a7131d2edaa4437c03745c532d'/>
<id>ffb756dff2d2a7a7131d2edaa4437c03745c532d</id>
<content type='text'>
This is the total number of work items that the device supports (the
equivalent work group properties are for only a single work group).</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is the total number of work items that the device supports (the
equivalent work group properties are for only a single work group).</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Add PRODUCT_NAME device info (#155632)</title>
<updated>2025-08-28T14:16:17+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-08-28T14:16:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=41fed2d048ff67ff80c186992f98644764f26bac'/>
<id>41fed2d048ff67ff80c186992f98644764f26bac</id>
<content type='text'>
On my system, this will be "Radeon RX 7900 GRE" rather than "gfx1100". For Nvidia, the product name and device name are identical.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On my system, this will be "Radeon RX 7900 GRE" rather than "gfx1100". For Nvidia, the product name and device name are identical.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Define additional device info properties (#152533)</title>
<updated>2025-08-19T12:02:01+00:00</updated>
<author>
<name>Rafal Bielski</name>
<email>rafal.bielski@codeplay.com</email>
</author>
<published>2025-08-19T12:02:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=9c9d9e4cb6dfd8a3cada7fb6c8b4dc2b77b5514c'/>
<id>9c9d9e4cb6dfd8a3cada7fb6c8b4dc2b77b5514c</id>
<content type='text'>
Add the following properties in Offload device info:
* VENDOR_ID
* NUM_COMPUTE_UNITS
* [SINGLE|DOUBLE|HALF]_FP_CONFIG
* NATIVE_VECTOR_WIDTH_[CHAR|SHORT|INT|LONG|FLOAT|DOUBLE|HALF]
* MAX_CLOCK_FREQUENCY
* MEMORY_CLOCK_RATE
* ADDRESS_BITS
* MAX_MEM_ALLOC_SIZE
* GLOBAL_MEM_SIZE

Add a bitfield option to enumerators, allowing the values to be
bit-shifted instead of incremented. Generate the per-type enums using
`foreach` to reduce code duplication.

Use macros in unit test definitions to reduce code duplication.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add the following properties in Offload device info:
* VENDOR_ID
* NUM_COMPUTE_UNITS
* [SINGLE|DOUBLE|HALF]_FP_CONFIG
* NATIVE_VECTOR_WIDTH_[CHAR|SHORT|INT|LONG|FLOAT|DOUBLE|HALF]
* MAX_CLOCK_FREQUENCY
* MEMORY_CLOCK_RATE
* ADDRESS_BITS
* MAX_MEM_ALLOC_SIZE
* GLOBAL_MEM_SIZE

Add a bitfield option to enumerators, allowing the values to be
bit-shifted instead of incremented. Generate the per-type enums using
`foreach` to reduce code duplication.

Use macros in unit test definitions to reduce code duplication.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Rework `MAX_WORK_GROUP_SIZE` (#151926)</title>
<updated>2025-08-04T14:21:24+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-08-04T14:21:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=d03692a00ee2404aa5c98a5f6c447f66d77a4663'/>
<id>d03692a00ee2404aa5c98a5f6c447f66d77a4663</id>
<content type='text'>
`MAX_WORK_GROUP_SIZE` now represents the maximum total number of work
groups the device can allocate, rather than the maximum per dimension.
`MAX_WORK_GROUP_SIZE_PER_DIMENSION` has been added, which has the old
behaviour.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
`MAX_WORK_GROUP_SIZE` now represents the maximum total number of work
groups the device can allocate, rather than the maximum per dimension.
`MAX_WORK_GROUP_SIZE_PER_DIMENSION` has been added, which has the old
behaviour.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Add `MAX_WORK_GROUP_SIZE` device info query (#143718)</title>
<updated>2025-07-02T15:33:54+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-07-02T15:33:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=7d52b0983e0bee3c1d5dbe04ae2adfd33f0265e5'/>
<id>7d52b0983e0bee3c1d5dbe04ae2adfd33f0265e5</id>
<content type='text'>
This adds a new device info query for the maximum workgroup/block size
for each dimension.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds a new device info query for the maximum workgroup/block size
for each dimension.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Add check-offload-unit for liboffload unittests (#137312)</title>
<updated>2025-04-29T16:21:59+00:00</updated>
<author>
<name>Callum Fare</name>
<email>callum@codeplay.com</email>
</author>
<published>2025-04-29T16:21:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=6022a5214b597561d23c26a839a210f41c246c47'/>
<id>6022a5214b597561d23c26a839a210f41c246c47</id>
<content type='text'>
Adds a `check-offload-unit` target for running the liboffload unit test
suite. This unit test binary runs the tests for every available device.
This can optionally filtered to devices from a single platform, but the
check target runs on everything.

The target is not part of `check-offload` and does not get propagated to
the top level build. I'm not sure if either of these things are
desirable, but I'm happy to look into it if we want.

Also remove the `offload/unittests/Plugins` test as it's dead code and
doesn't build.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Adds a `check-offload-unit` target for running the liboffload unit test
suite. This unit test binary runs the tests for every available device.
This can optionally filtered to devices from a single platform, but the
check target runs on everything.

The target is not part of `check-offload` and does not get propagated to
the top level build. I'm not sure if either of these things are
desirable, but I'm happy to look into it if we want.

Also remove the `offload/unittests/Plugins` test as it's dead code and
doesn't build.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Implement the remaining initial Offload API (#122106)</title>
<updated>2025-04-22T18:27:50+00:00</updated>
<author>
<name>Callum Fare</name>
<email>callum@codeplay.com</email>
</author>
<published>2025-04-22T18:27:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=800d949bb315349a116a980e99d0f36645ffefd3'/>
<id>800d949bb315349a116a980e99d0f36645ffefd3</id>
<content type='text'>
Implement the complete initial version of the Offload API, to the extent
that is usable for simple offloading programs. Tested with a basic SYCL
program.

As far as possible, these are simple wrappers over existing
functionality in the plugins.

* Allocating and freeing memory (host, device, shared).
* Creating a program 
* Creating a queue (wrapper over asynchronous stream resource)
* Enqueuing memcpy operations
* Enqueuing kernel executions
* Waiting on (optional) output events from the enqueue operations
* Waiting on a queue to finish

Objects created with the API have reference counting semantics to handle
their lifetime. They are created with an initial reference count of 1,
which can be incremented and decremented with retain and release
functions. They are freed when their reference count reaches 0. Platform
and device objects are not reference counted, as they are expected to
persist as long as the library is in use, and it's not meaningful for
users to create or destroy them.

Tests have been added to `offload.unittests`, including device code for
testing program and kernel related functionality.

The API should still be considered unstable and it's very likely we will
need to change the existing entry points.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Implement the complete initial version of the Offload API, to the extent
that is usable for simple offloading programs. Tested with a basic SYCL
program.

As far as possible, these are simple wrappers over existing
functionality in the plugins.

* Allocating and freeing memory (host, device, shared).
* Creating a program 
* Creating a queue (wrapper over asynchronous stream resource)
* Enqueuing memcpy operations
* Enqueuing kernel executions
* Waiting on (optional) output events from the enqueue operations
* Waiting on a queue to finish

Objects created with the API have reference counting semantics to handle
their lifetime. They are created with an initial reference count of 1,
which can be incremented and decremented with retain and release
functions. They are freed when their reference count reaches 0. Platform
and device objects are not reference counted, as they are expected to
persist as long as the library is in use, and it's not meaningful for
users to create or destroy them.

Tests have been added to `offload.unittests`, including device code for
testing program and kernel related functionality.

The API should still be considered unstable and it's very likely we will
need to change the existing entry points.</pre>
</div>
</content>
</entry>
<entry>
<title>Reland #118503: [Offload] Introduce offload-tblgen and initial new API implementation (#118614)</title>
<updated>2024-12-05T08:34:04+00:00</updated>
<author>
<name>Callum Fare</name>
<email>callum@codeplay.com</email>
</author>
<published>2024-12-05T08:34:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=fd3907ccb583df99e9c19d2fe84e4e7c52d75de9'/>
<id>fd3907ccb583df99e9c19d2fe84e4e7c52d75de9</id>
<content type='text'>
Reland #118503. Added a fix for builds with `-DBUILD_SHARED_LIBS=ON`
(see last commit). Otherwise the changes are identical.

---


### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
# From the runtime build directory
$ ninja LibomptUnitTests
$ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests 
```


### Open questions and future work
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reland #118503. Added a fix for builds with `-DBUILD_SHARED_LIBS=ON`
(see last commit). Otherwise the changes are identical.

---


### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
# From the runtime build directory
$ ninja LibomptUnitTests
$ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests 
```


### Open questions and future work
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "Reland of #108413: [Offload] Introduce offload-tblgen and initial new API implementation" (#118541)</title>
<updated>2024-12-03T20:42:38+00:00</updated>
<author>
<name>Jan Patrick Lehr</name>
<email>JanPatrick.Lehr@amd.com</email>
</author>
<published>2024-12-03T20:42:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=51553227f010e3a9a439b2e57af15a6797f69a90'/>
<id>51553227f010e3a9a439b2e57af15a6797f69a90</id>
<content type='text'>
Reverts llvm/llvm-project#118503

Broke bot
https://lab.llvm.org/staging/#/builders/131/builds/9701/steps/5/logs/stdio</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reverts llvm/llvm-project#118503

Broke bot
https://lab.llvm.org/staging/#/builders/131/builds/9701/steps/5/logs/stdio</pre>
</div>
</content>
</entry>
<entry>
<title>Reland of #108413: [Offload] Introduce offload-tblgen and initial new API implementation (#118503)</title>
<updated>2024-12-03T16:28:35+00:00</updated>
<author>
<name>Callum Fare</name>
<email>callum@codeplay.com</email>
</author>
<published>2024-12-03T16:28:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=8da490320f6dcb99b4efef2cdb3d21002db1d2f7'/>
<id>8da490320f6dcb99b4efef2cdb3d21002db1d2f7</id>
<content type='text'>
This is another attempt to reland the changes from #108413

The previous two attempts introduced regressions and were reverted. This
PR has been more thoroughly tested with various configurations so
shouldn't cause any problems this time. If anyone is aware of any likely
remaining problems then please let me know.

The changes are identical other than the fixes contained in the last 5
commits.

---


### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
# From the runtime build directory
$ ninja LibomptUnitTests
$ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests 
```


### Open questions and future work
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is another attempt to reland the changes from #108413

The previous two attempts introduced regressions and were reverted. This
PR has been more thoroughly tested with various configurations so
shouldn't cause any problems this time. If anyone is aware of any likely
remaining problems then please let me know.

The changes are identical other than the fixes contained in the last 5
commits.

---


### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
# From the runtime build directory
$ ninja LibomptUnitTests
$ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests 
```


### Open questions and future work
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.</pre>
</div>
</content>
</entry>
</feed>
