<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/offload/plugins-nextgen/amdgpu/src/rtl.cpp, branch users/mingmingl-llvm/samplefdo-profile-format</title>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[Offload] Add `OL_DEVICE_INFO_MAX_WORK_SIZE[_PER_DIMENSION]` (#155823)</title>
<updated>2025-08-29T08:39:18+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-08-29T08:39:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=ffb756dff2d2a7a7131d2edaa4437c03745c532d'/>
<id>ffb756dff2d2a7a7131d2edaa4437c03745c532d</id>
<content type='text'>
This is the total number of work items that the device supports (the
equivalent work group properties are for only a single work group).</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is the total number of work items that the device supports (the
equivalent work group properties are for only a single work group).</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Add PRODUCT_NAME device info (#155632)</title>
<updated>2025-08-28T14:16:17+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-08-28T14:16:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=41fed2d048ff67ff80c186992f98644764f26bac'/>
<id>41fed2d048ff67ff80c186992f98644764f26bac</id>
<content type='text'>
On my system, this will be "Radeon RX 7900 GRE" rather than "gfx1100". For Nvidia, the product name and device name are identical.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On my system, this will be "Radeon RX 7900 GRE" rather than "gfx1100". For Nvidia, the product name and device name are identical.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Full AMD support for olMemFill (#154958)</title>
<updated>2025-08-26T10:49:12+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-08-26T10:49:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=1b6875ea1ff2b5a7ba3ff83482132ad99f3aaf1b'/>
<id>1b6875ea1ff2b5a7ba3ff83482132ad99f3aaf1b</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Implement olMemFill (#154102)</title>
<updated>2025-08-22T13:31:16+00:00</updated>
<author>
<name>Callum Fare</name>
<email>callum@codeplay.com</email>
</author>
<published>2025-08-22T13:31:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=0b18d2da70096fcd28e82dbce8f853232454856e'/>
<id>0b18d2da70096fcd28e82dbce8f853232454856e</id>
<content type='text'>
Implement olMemFill to support filling device memory with arbitrary
length patterns. AMDGPU support will be added in a follow-up PR.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Implement olMemFill to support filling device memory with arbitrary
length patterns. AMDGPU support will be added in a follow-up PR.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] `OL_EVENT_INFO_IS_COMPLETE` (#153194)</title>
<updated>2025-08-22T12:40:31+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-08-22T12:40:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=4c0c295775cff0dcfc6439c3f51991ffac0345d8'/>
<id>4c0c295775cff0dcfc6439c3f51991ffac0345d8</id>
<content type='text'>
A simple info query for events that returns whether the event is
complete or not.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A simple info query for events that returns whether the event is
complete or not.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Add olCalculateOptimalOccupancy (#142950)</title>
<updated>2025-08-19T14:16:47+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-08-19T14:16:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=2c11a83691b7089d7a79e9f122dc521e6ea7e51e'/>
<id>2c11a83691b7089d7a79e9f122dc521e6ea7e51e</id>
<content type='text'>
This is equivalent to `cuOccupancyMaxPotentialBlockSize`. It is currently
only implemented on CUDA; AMDGPU and Host return unsupported.

---------

Co-authored-by: Callum Fare &lt;callum@codeplay.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is equivalent to `cuOccupancyMaxPotentialBlockSize`. It is currently
only implemented on CUDA; AMDGPU and Host return unsupported.

---------

Co-authored-by: Callum Fare &lt;callum@codeplay.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Define additional device info properties (#152533)</title>
<updated>2025-08-19T12:02:01+00:00</updated>
<author>
<name>Rafal Bielski</name>
<email>rafal.bielski@codeplay.com</email>
</author>
<published>2025-08-19T12:02:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=9c9d9e4cb6dfd8a3cada7fb6c8b4dc2b77b5514c'/>
<id>9c9d9e4cb6dfd8a3cada7fb6c8b4dc2b77b5514c</id>
<content type='text'>
Add the following properties in Offload device info:
* VENDOR_ID
* NUM_COMPUTE_UNITS
* [SINGLE|DOUBLE|HALF]_FP_CONFIG
* NATIVE_VECTOR_WIDTH_[CHAR|SHORT|INT|LONG|FLOAT|DOUBLE|HALF]
* MAX_CLOCK_FREQUENCY
* MEMORY_CLOCK_RATE
* ADDRESS_BITS
* MAX_MEM_ALLOC_SIZE
* GLOBAL_MEM_SIZE

Add a bitfield option to enumerators, allowing the values to be
bit-shifted instead of incremented. Generate the per-type enums using
`foreach` to reduce code duplication.

Use macros in unit test definitions to reduce code duplication.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add the following properties in Offload device info:
* VENDOR_ID
* NUM_COMPUTE_UNITS
* [SINGLE|DOUBLE|HALF]_FP_CONFIG
* NATIVE_VECTOR_WIDTH_[CHAR|SHORT|INT|LONG|FLOAT|DOUBLE|HALF]
* MAX_CLOCK_FREQUENCY
* MEMORY_CLOCK_RATE
* ADDRESS_BITS
* MAX_MEM_ALLOC_SIZE
* GLOBAL_MEM_SIZE

Add a bitfield option to enumerators, allowing the values to be
bit-shifted instead of incremented. Generate the per-type enums using
`foreach` to reduce code duplication.

Use macros in unit test definitions to reduce code duplication.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Introduce dataFence plugin interface. (#153793)</title>
<updated>2025-08-15T18:49:35+00:00</updated>
<author>
<name>Abhinav Gaba</name>
<email>abhinav.gaba@intel.com</email>
</author>
<published>2025-08-15T18:49:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=79cf877627ec341c62f64e25a44f3ba340edad1e'/>
<id>79cf877627ec341c62f64e25a44f3ba340edad1e</id>
<content type='text'>
The purpose of this fence is to ensure that any `dataSubmit`s inserted
into a queue before a `dataFence` finish before any `dataSubmit`s
inserted after it begin.

This is a no-op for most queues, since they are in-order, and by design
any operations inserted into them occur in order.

But the interface is supposed to be functional for out-of-order queues.

The addition of the interface means that any operations that rely on
such ordering (like ATTACH map-type support in #149036) can invoke it,
without worrying about whether the underlying queue is in-order or
out-of-order.

Once a plugin supports out-of-order queues, the plugin can implement
this function, without requiring any change at the libomptarget level.

---------

Co-authored-by: Alex Duran &lt;alejandro.duran@intel.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The purpose of this fence is to ensure that any `dataSubmit`s inserted
into a queue before a `dataFence` finish before any `dataSubmit`s
inserted after it begin.

This is a no-op for most queues, since they are in-order, and by design
any operations inserted into them occur in order.

But the interface is supposed to be functional for out-of-order queues.

The addition of the interface means that any operations that rely on
such ordering (like ATTACH map-type support in #149036) can invoke it,
without worrying about whether the underlying queue is in-order or
out-of-order.

Once a plugin supports out-of-order queues, the plugin can implement
this function, without requiring any change at the libomptarget level.

---------

Co-authored-by: Alex Duran &lt;alejandro.duran@intel.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] `olLaunchHostFunction` (#152482)</title>
<updated>2025-08-15T08:39:48+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-08-15T08:39:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=30c79511360de82c57cf9a78fff9fb10a8ccc58a'/>
<id>30c79511360de82c57cf9a78fff9fb10a8ccc58a</id>
<content type='text'>
Add an `olLaunchHostFunction` method that allows enqueueing host work
to the stream.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add an `olLaunchHostFunction` method that allows enqueueing host work
to the stream.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Make olLaunchKernel test thread safe (#149497)</title>
<updated>2025-08-08T09:57:04+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-08-08T09:57:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=910d7e90bfc6aef5f974f0cf4b3fc034a2f4849a'/>
<id>910d7e90bfc6aef5f974f0cf4b3fc034a2f4849a</id>
<content type='text'>
This sprinkles a few mutexes around the plugin interface so that the
olLaunchKernel CTS test now passes when run on multiple threads.

Part of this also involved changing the interface for device synchronise
so that it can optionally not free the underlying queue (which
introduced a race condition in liboffload).</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This sprinkles a few mutexes around the plugin interface so that the
olLaunchKernel CTS test now passes when run on multiple threads.

Part of this also involved changing the interface for device synchronise
so that it can optionally not free the underlying queue (which
introduced a race condition in liboffload).</pre>
</div>
</content>
</entry>
</feed>
