<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/offload/liboffload/src/OffloadImpl.cpp, branch users/mingmingl-llvm/samplefdo-profile-format</title>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[Offload] Add `OL_DEVICE_INFO_MAX_WORK_SIZE[_PER_DIMENSION]` (#155823)</title>
<updated>2025-08-29T08:39:18+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-08-29T08:39:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=ffb756dff2d2a7a7131d2edaa4437c03745c532d'/>
<id>ffb756dff2d2a7a7131d2edaa4437c03745c532d</id>
<content type='text'>
This is the total number of work items that the device supports (the
equivalent work group properties are for only a single work group).</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is the total number of work items that the device supports (the
equivalent work group properties are for only a single work group).</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Improve `olDestroyQueue` logic (#153041)</title>
<updated>2025-08-29T08:39:00+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-08-29T08:39:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=9e5d8bd3d1ecd512be3dfa233bc41f32e26c750a'/>
<id>9e5d8bd3d1ecd512be3dfa233bc41f32e26c750a</id>
<content type='text'>
Previously, `olDestroyQueue` would not actually destroy the queue,
instead leaving it for the device to clean up when it was destroyed.
Now, the queue is either released immediately if it is complete or put
into a list of "pending" queues if it is not. Whenever we create a new
queue, we check this list to see if any are now completed. If there are
any, we release their resources and use them instead of pulling from
the pool.

This prevents long-running programs that create and drop many queues
without syncing them from leaking memory all over the place.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Previously, `olDestroyQueue` would not actually destroy the queue,
instead leaving it for the device to clean up when it was destroyed.
Now, the queue is either released immediately if it is complete or put
into a list of "pending" queues if it is not. Whenever we create a new
queue, we check this list to see if any are now completed. If there are
any, we release their resources and use them instead of pulling from
the pool.

This prevents long-running programs that create and drop many queues
without syncing them from leaking memory all over the place.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Add PRODUCT_NAME device info (#155632)</title>
<updated>2025-08-28T14:16:17+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-08-28T14:16:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=41fed2d048ff67ff80c186992f98644764f26bac'/>
<id>41fed2d048ff67ff80c186992f98644764f26bac</id>
<content type='text'>
On my system, this will be "Radeon RX 7900 GRE" rather than "gfx1100". For Nvidia, the product name and device name are identical.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On my system, this will be "Radeon RX 7900 GRE" rather than "gfx1100". For Nvidia, the product name and device name are identical.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Implement olMemFill (#154102)</title>
<updated>2025-08-22T13:31:16+00:00</updated>
<author>
<name>Callum Fare</name>
<email>callum@codeplay.com</email>
</author>
<published>2025-08-22T13:31:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=0b18d2da70096fcd28e82dbce8f853232454856e'/>
<id>0b18d2da70096fcd28e82dbce8f853232454856e</id>
<content type='text'>
Implement olMemFill to support filling device memory with
arbitrary-length patterns. AMDGPU support will be added in a follow-up PR.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Implement olMemFill to support filling device memory with
arbitrary-length patterns. AMDGPU support will be added in a follow-up PR.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] `OL_EVENT_INFO_IS_COMPLETE` (#153194)</title>
<updated>2025-08-22T12:40:31+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-08-22T12:40:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=4c0c295775cff0dcfc6439c3f51991ffac0345d8'/>
<id>4c0c295775cff0dcfc6439c3f51991ffac0345d8</id>
<content type='text'>
A simple info query for events that returns whether the event is
complete or not.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A simple info query for events that returns whether the event is
complete or not.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Fix `OL_DEVICE_INFO_MAX_MEM_ALLOC_SIZE` on AMD (#154521)</title>
<updated>2025-08-21T08:37:58+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-08-21T08:37:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=273ca1f77be57f1b14b5533b632b37c3b4ee63e9'/>
<id>273ca1f77be57f1b14b5533b632b37c3b4ee63e9</id>
<content type='text'>
This wasn't handled with the normal info API, so it needs special handling.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This wasn't handled with the normal info API, so it needs special handling.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Guard olMemAlloc/Free with a mutex (#153786)</title>
<updated>2025-08-20T12:23:57+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-08-20T12:23:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=c8986d1ecbcd50a00ecdb7887f7d43141de3196a'/>
<id>c8986d1ecbcd50a00ecdb7887f7d43141de3196a</id>
<content type='text'>
Both of these functions update an `AllocInfoMap` structure in the context,
but they did not use any locks, causing random failures in threaded
code. Now they use a mutex.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Both of these functions update an `AllocInfoMap` structure in the context,
but they did not use any locks, causing random failures in threaded
code. Now they use a mutex.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Add olCalculateOptimalOccupancy (#142950)</title>
<updated>2025-08-19T14:16:47+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-08-19T14:16:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=2c11a83691b7089d7a79e9f122dc521e6ea7e51e'/>
<id>2c11a83691b7089d7a79e9f122dc521e6ea7e51e</id>
<content type='text'>
This is equivalent to `cuOccupancyMaxPotentialBlockSize`. It is currently
only implemented on CUDA; AMDGPU and Host return unsupported.

---------

Co-authored-by: Callum Fare &lt;callum@codeplay.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is equivalent to `cuOccupancyMaxPotentialBlockSize`. It is currently
only implemented on CUDA; AMDGPU and Host return unsupported.

---------

Co-authored-by: Callum Fare &lt;callum@codeplay.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Define additional device info properties (#152533)</title>
<updated>2025-08-19T12:02:01+00:00</updated>
<author>
<name>Rafal Bielski</name>
<email>rafal.bielski@codeplay.com</email>
</author>
<published>2025-08-19T12:02:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=9c9d9e4cb6dfd8a3cada7fb6c8b4dc2b77b5514c'/>
<id>9c9d9e4cb6dfd8a3cada7fb6c8b4dc2b77b5514c</id>
<content type='text'>
Add the following properties in Offload device info:
* VENDOR_ID
* NUM_COMPUTE_UNITS
* [SINGLE|DOUBLE|HALF]_FP_CONFIG
* NATIVE_VECTOR_WIDTH_[CHAR|SHORT|INT|LONG|FLOAT|DOUBLE|HALF]
* MAX_CLOCK_FREQUENCY
* MEMORY_CLOCK_RATE
* ADDRESS_BITS
* MAX_MEM_ALLOC_SIZE
* GLOBAL_MEM_SIZE

Add a bitfield option to enumerators, allowing the values to be
bit-shifted instead of incremented. Generate the per-type enums using
`foreach` to reduce code duplication.

Use macros in unit test definitions to reduce code duplication.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add the following properties in Offload device info:
* VENDOR_ID
* NUM_COMPUTE_UNITS
* [SINGLE|DOUBLE|HALF]_FP_CONFIG
* NATIVE_VECTOR_WIDTH_[CHAR|SHORT|INT|LONG|FLOAT|DOUBLE|HALF]
* MAX_CLOCK_FREQUENCY
* MEMORY_CLOCK_RATE
* ADDRESS_BITS
* MAX_MEM_ALLOC_SIZE
* GLOBAL_MEM_SIZE

Add a bitfield option to enumerators, allowing the values to be
bit-shifted instead of incremented. Generate the per-type enums using
`foreach` to reduce code duplication.

Use macros in unit test definitions to reduce code duplication.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] `olLaunchHostFunction` (#152482)</title>
<updated>2025-08-15T08:39:48+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-08-15T08:39:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=30c79511360de82c57cf9a78fff9fb10a8ccc58a'/>
<id>30c79511360de82c57cf9a78fff9fb10a8ccc58a</id>
<content type='text'>
Add an `olLaunchHostFunction` method that allows enqueueing host work
to the stream.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add an `olLaunchHostFunction` method that allows enqueueing host work
to the stream.</pre>
</div>
</content>
</entry>
</feed>
