<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/offload/plugins-nextgen/common/src/GlobalHandler.cpp, branch main</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[PGO][Offload] Fix offload coverage mapping  (#143490)</title>
<updated>2025-06-11T01:19:38+00:00</updated>
<author>
<name>Ethan Luis McDonough</name>
<email>ethanluismcdonough@gmail.com</email>
</author>
<published>2025-06-11T01:19:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=67ff66e67734c0b283ec676899e5b89b67fdafcb'/>
<id>67ff66e67734c0b283ec676899e5b89b67fdafcb</id>
<content type='text'>
This pull request fixes coverage mapping on GPU targets. 

- It adds an address space cast to the coverage mapping generation pass.
- It reads the profiled function names from the ELF directly. Reading it
from public globals was causing issues in cases where multiple
device-code object files are linked together.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This pull request fixes coverage mapping on GPU targets. 

- It adds an address space cast to the coverage mapping generation pass.
- It reads the profiled function names from the ELF directly. Reading it
from public globals was causing issues in cases where multiple
device-code object files are linked together.</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload] Use new error code handling mechanism and lower-case messages (#139275)</title>
<updated>2025-05-20T13:50:20+00:00</updated>
<author>
<name>Ross Brunton</name>
<email>ross@codeplay.com</email>
</author>
<published>2025-05-20T13:50:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=050892d2f879d278b3342edde028f62bf77d00d2'/>
<id>050892d2f879d278b3342edde028f62bf77d00d2</id>
<content type='text'>
[Offload] Use new error code handling mechanism

This removes the old ErrorCode-less error method and requires
every user to provide a concrete error code. All calls have been
updated.

In addition, for consistency with error messages elsewhere in LLVM, all
messages have been made to start lower case.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[Offload] Use new error code handling mechanism

This removes the old ErrorCode-less error method and requires
every user to provide a concrete error code. All calls have been
updated.

In addition, for consistency with error messages elsewhere in LLVM, all
messages have been made to start lower case.</pre>
</div>
</content>
</entry>
<entry>
<title>[PGO][Offload] Allow PGO flags to be used on GPU targets (#94268)</title>
<updated>2025-03-20T00:01:38+00:00</updated>
<author>
<name>Ethan Luis McDonough</name>
<email>ethanluismcdonough@gmail.com</email>
</author>
<published>2025-03-20T00:01:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=c50d39f073f3b0143b781be80764d183ddab88d4'/>
<id>c50d39f073f3b0143b781be80764d183ddab88d4</id>
<content type='text'>
This pull request is the third part of an ongoing effort to extends PGO
instrumentation to GPU device code and depends on
https://github.com/llvm/llvm-project/pull/93365. This PR makes the
following changes:

- Allows PGO flags to be supplied to GPU targets
- Pulls version global from device
- Modifies `__llvm_write_custom_profile` and `lprofWriteDataImpl` to
allow the PGO version to be overridden</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This pull request is the third part of an ongoing effort to extends PGO
instrumentation to GPU device code and depends on
https://github.com/llvm/llvm-project/pull/93365. This PR makes the
following changes:

- Allows PGO flags to be supplied to GPU targets
- Pulls version global from device
- Modifies `__llvm_write_custom_profile` and `lprofWriteDataImpl` to
allow the PGO version to be overridden</pre>
</div>
</content>
</entry>
<entry>
<title> [PGO][Offload] Profile profraw generation for GPU instrumentation #76587  (#93365)</title>
<updated>2025-02-12T05:30:54+00:00</updated>
<author>
<name>Ethan Luis McDonough</name>
<email>ethanluismcdonough@gmail.com</email>
</author>
<published>2025-02-12T05:30:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=9e5c136d5a1a8acee9e7acfbe14cc6d4490dee2e'/>
<id>9e5c136d5a1a8acee9e7acfbe14cc6d4490dee2e</id>
<content type='text'>
This pull request is the second part of an ongoing effort to extends PGO
instrumentation to GPU device code and depends on #76587. This PR makes
the following changes:

- Introduces `__llvm_write_custom_profile` to PGO compiler-rt library.
This is an external function that can be used to write profiles with
custom data to target-specific files.
- Adds `__llvm_write_custom_profile` as weak symbol to libomptarget so
that it can write the collected data to a profraw file.
- Adds `PGODump` debug flag and only displays dump when the
aforementioned flag is set</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This pull request is the second part of an ongoing effort to extends PGO
instrumentation to GPU device code and depends on #76587. This PR makes
the following changes:

- Introduces `__llvm_write_custom_profile` to PGO compiler-rt library.
This is an external function that can be used to write profiles with
custom data to target-specific files.
- Adds `__llvm_write_custom_profile` as weak symbol to libomptarget so
that it can write the collected data to a profraw file.
- Adds `PGODump` debug flag and only displays dump when the
aforementioned flag is set</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload][NFC] Fix typos discovered by codespell (#125119)</title>
<updated>2025-01-31T15:35:29+00:00</updated>
<author>
<name>Christian Clauss</name>
<email>cclauss@me.com</email>
</author>
<published>2025-01-31T15:35:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=1f56bb3137827d66093b66aa3a6447fdaba61783'/>
<id>1f56bb3137827d66093b66aa3a6447fdaba61783</id>
<content type='text'>
https://github.com/codespell-project/codespell

% `codespell
--ignore-words-list=archtype,hsa,identty,inout,iself,nd,te,ths,vertexes
--write-changes`</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
https://github.com/codespell-project/codespell

% `codespell
--ignore-words-list=archtype,hsa,identty,inout,iself,nd,te,ths,vertexes
--write-changes`</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload][PGO] Fix dump of array in ProfData (#122039)</title>
<updated>2025-01-14T20:46:27+00:00</updated>
<author>
<name>Jinsong Ji</name>
<email>jinsong.ji@intel.com</email>
</author>
<published>2025-01-14T20:46:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=8d1d67ec4dc957ce15a06f782c6746281e66e559'/>
<id>8d1d67ec4dc957ce15a06f782c6746281e66e559</id>
<content type='text'>
Exposed by -Warray-bounds:

In file included from
../../../../../../../llvm/offload/plugins-nextgen/common/src/GlobalHandler.cpp:252:

../../../../../../../llvm/llvm/include/llvm/ProfileData/InstrProfData.inc:109:1:
error: array index 4 is past the end of the array (that has type 'const
std::remove_const&lt;const uint16_t&gt;::type[4]' (aka 'const unsigned
short[4]')) [-Werror,-Warray-bounds]
109 | INSTR_PROF_DATA(const uint16_t, Int16ArrayTy,
NumValueSites[IPVK_Last+1], \
| ^ ~~~~~~~~~~~

../../../../../../../llvm/offload/plugins-nextgen/common/src/GlobalHandler.cpp:250:15:
note: expanded from macro 'INSTR_PROF_DATA'
250 | outs() &lt;&lt; ProfData.Name &lt;&lt; " "; \
      |               ^        ~~~~

../../../../../../../llvm/llvm/include/llvm/ProfileData/InstrProfData.inc:109:1:
note: array 'NumValueSites' declared here
109 | INSTR_PROF_DATA(const uint16_t, Int16ArrayTy,
NumValueSites[IPVK_Last+1], \
      | ^

../../../../../../../llvm/offload/plugins-nextgen/common/include/GlobalHandler.h:62:3:
note: expanded from macro 'INSTR_PROF_DATA'
   62 |   std::remove_const&lt;Type&gt;::type Name;

Avoid accessing out-of-bound data, but skip printing array data for now.
As there is no simple way to do this without hardcoding the
NumValueSites field.

---------

Co-authored-by: Ethan Luis McDonough &lt;ethanluismcdonough@gmail.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Exposed by -Warray-bounds:

In file included from
../../../../../../../llvm/offload/plugins-nextgen/common/src/GlobalHandler.cpp:252:

../../../../../../../llvm/llvm/include/llvm/ProfileData/InstrProfData.inc:109:1:
error: array index 4 is past the end of the array (that has type 'const
std::remove_const&lt;const uint16_t&gt;::type[4]' (aka 'const unsigned
short[4]')) [-Werror,-Warray-bounds]
109 | INSTR_PROF_DATA(const uint16_t, Int16ArrayTy,
NumValueSites[IPVK_Last+1], \
| ^ ~~~~~~~~~~~

../../../../../../../llvm/offload/plugins-nextgen/common/src/GlobalHandler.cpp:250:15:
note: expanded from macro 'INSTR_PROF_DATA'
250 | outs() &lt;&lt; ProfData.Name &lt;&lt; " "; \
      |               ^        ~~~~

../../../../../../../llvm/llvm/include/llvm/ProfileData/InstrProfData.inc:109:1:
note: array 'NumValueSites' declared here
109 | INSTR_PROF_DATA(const uint16_t, Int16ArrayTy,
NumValueSites[IPVK_Last+1], \
      | ^

../../../../../../../llvm/offload/plugins-nextgen/common/include/GlobalHandler.h:62:3:
note: expanded from macro 'INSTR_PROF_DATA'
   62 |   std::remove_const&lt;Type&gt;::type Name;

Avoid accessing out-of-bound data, but skip printing array data for now.
As there is no simple way to do this without hardcoding the
NumValueSites field.

---------

Co-authored-by: Ethan Luis McDonough &lt;ethanluismcdonough@gmail.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload][NFC] Reorganize `utils::` and make Device/Host/Shared clearer (#100280)</title>
<updated>2024-09-05T20:36:26+00:00</updated>
<author>
<name>Johannes Doerfert</name>
<email>johannes@jdoerfert.de</email>
</author>
<published>2024-09-05T20:36:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=08533a3ee8f3a09a59cf6ac3be59198b26b7f739'/>
<id>08533a3ee8f3a09a59cf6ac3be59198b26b7f739</id>
<content type='text'>
We had three `utils::` namespaces, all with different "meaning" (host,
device, hsa_utils). We should, when we can, keep "include/Shared"
accessible from host and device, thus RefCountTy has been moved to a
separate header. `hsa_utils` was introduced to make `utils::` less
overloaded. And common functionality was de-duplicated, e.g.,
`utils::advance` and `utils::advanceVoidPtr` -&gt; `utils:advancePtr`. Type
punning now checks for the size of the result to make sure it matches
the source type.

No functional change was intended.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We had three `utils::` namespaces, all with different "meaning" (host,
device, hsa_utils). We should, when we can, keep "include/Shared"
accessible from host and device, thus RefCountTy has been moved to a
separate header. `hsa_utils` was introduced to make `utils::` less
overloaded. And common functionality was de-duplicated, e.g.,
`utils::advance` and `utils::advanceVoidPtr` -&gt; `utils:advancePtr`. Type
punning now checks for the size of the result to make sure it matches
the source type.

No functional change was intended.</pre>
</div>
</content>
</entry>
<entry>
<title>[PGO][OpenMP] Instrumentation for GPU devices (Revision of #76587) (#102691)</title>
<updated>2024-08-22T06:10:54+00:00</updated>
<author>
<name>Ethan Luis McDonough</name>
<email>ethanluismcdonough@gmail.com</email>
</author>
<published>2024-08-22T06:10:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=fde2d23ee2a204050a210f2f7b290643a272f737'/>
<id>fde2d23ee2a204050a210f2f7b290643a272f737</id>
<content type='text'>
This pull request is a revised version of #76587. This pull request
fixes some build issues that were present in the previous version of
this change.

&gt; This pull request is the first part of an ongoing effort to extends
PGO instrumentation to GPU device code. This PR makes the following
changes:
&gt;
&gt; - Adds blank registration functions to device RTL
&gt; - Gives PGO globals protected visibility when targeting a supported
GPU
&gt; - Handles any addrspace casts for PGO calls
&gt; - Implements PGO global extraction in GPU plugins (currently only
dumps info)
&gt;
&gt; These changes can be tested by supplying `-fprofile-instrument=clang`
while targeting a GPU.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This pull request is a revised version of #76587. This pull request
fixes some build issues that were present in the previous version of
this change.

&gt; This pull request is the first part of an ongoing effort to extends
PGO instrumentation to GPU device code. This PR makes the following
changes:
&gt;
&gt; - Adds blank registration functions to device RTL
&gt; - Gives PGO globals protected visibility when targeting a supported
GPU
&gt; - Handles any addrspace casts for PGO calls
&gt; - Implements PGO global extraction in GPU plugins (currently only
dumps info)
&gt;
&gt; These changes can be tested by supplying `-fprofile-instrument=clang`
while targeting a GPU.</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "[PGO][OpenMP] Instrumentation for GPU devices (#76587)"</title>
<updated>2024-06-28T17:30:45+00:00</updated>
<author>
<name>Ethan Luis McDonough</name>
<email>ethanluismcdonough@gmail.com</email>
</author>
<published>2024-06-28T17:30:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=2c8b912f630f9ec647a4870b9c5ee922c2ec1298'/>
<id>2c8b912f630f9ec647a4870b9c5ee922c2ec1298</id>
<content type='text'>
This reverts commit 5fd2af38e461445c583d7ffc2fe23858966eee76. It caused build issues and broke the buildbot.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit 5fd2af38e461445c583d7ffc2fe23858966eee76. It caused build issues and broke the buildbot.
</pre>
</div>
</content>
</entry>
<entry>
<title>[PGO][OpenMP] Instrumentation for GPU devices (#76587)</title>
<updated>2024-06-28T15:42:19+00:00</updated>
<author>
<name>Ethan Luis McDonough</name>
<email>ethanluismcdonough@gmail.com</email>
</author>
<published>2024-06-28T15:42:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=5fd2af38e461445c583d7ffc2fe23858966eee76'/>
<id>5fd2af38e461445c583d7ffc2fe23858966eee76</id>
<content type='text'>
This pull request is the first part of an ongoing effort to extends PGO
instrumentation to GPU device code. This PR makes the following changes:

- Adds blank registration functions to device RTL
- Gives PGO globals protected visibility when targeting a supported GPU
- Handles any addrspace casts for PGO calls
- Implements PGO global extraction in GPU plugins (currently only dumps
info)

These changes can be tested by supplying `-fprofile-instrument=clang`
while targeting a GPU.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This pull request is the first part of an ongoing effort to extends PGO
instrumentation to GPU device code. This PR makes the following changes:

- Adds blank registration functions to device RTL
- Gives PGO globals protected visibility when targeting a supported GPU
- Handles any addrspace casts for PGO calls
- Implements PGO global extraction in GPU plugins (currently only dumps
info)

These changes can be tested by supplying `-fprofile-instrument=clang`
while targeting a GPU.</pre>
</div>
</content>
</entry>
</feed>
