<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/offload/DeviceRTL/src/DeviceUtils.cpp, branch release/20</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[OpenMP] Replace utilities with 'gpuintrin.h' definitions (#131644)</title>
<updated>2025-03-19T15:47:21+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2025-03-19T15:47:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=cb493d2bab40a480e3daba22ffee54d024817f05'/>
<id>cb493d2bab40a480e3daba22ffee54d024817f05</id>
<content type='text'>
Summary:
Port more instructions. AMD version is at
https://gist.github.com/jhuber6/235d7ee95f747c75f9a3cfd8eedac6aa</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
Port more instructions. AMD version is at
https://gist.github.com/jhuber6/235d7ee95f747c75f9a3cfd8eedac6aa</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP] Port the OpenMP device runtime to direct C++ compilation (#123673)</title>
<updated>2025-02-05T14:18:52+00:00</updated>
<author>
<name>Joseph Huber</name>
<email>huberjn@outlook.com</email>
</author>
<published>2025-02-05T14:18:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=bb7ab2557c485e004e619570cca7e2204b98a71b'/>
<id>bb7ab2557c485e004e619570cca7e2204b98a71b</id>
<content type='text'>
Summary:
This removes the use of OpenMP offloading to build the device runtime.
The main benefit here is that we no longer need to rely on offloading
semantics to build a device only runtime. Things like variants are now
no longer needed and can just be simple if-defs. In the future, I will
remove most of the special handling here and fold it into calls to the
`&lt;gpuintrin.h&gt;` functions instead. Additionally I will rework the
compilation to make this a separate runtime.

The current plan is to have this, but make including OpenMP and
offloading either automatically add it, or print a warning if it's
missing. This will allow us to use a normal CMake workflow and delete
all the weird 'lets pull the clang binary out of the build' business.
```
-DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES=offload
-DLLVM_RUNTIME_TARGETS=amdgcn-amd-amdhsa
```

After that, linking the OpenMP device runtime will be `-Xoffload-linker
-lomp`. I.e. no more fat binary business.

Only look at the most recent commit since this includes the two
dependencies
(fix to AMDGPUEmitPrintfBinding and the PointerToMember bug).</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
This removes the use of OpenMP offloading to build the device runtime.
The main benefit here is that we no longer need to rely on offloading
semantics to build a device only runtime. Things like variants are now
no longer needed and can just be simple if-defs. In the future, I will
remove most of the special handling here and fold it into calls to the
`&lt;gpuintrin.h&gt;` functions instead. Additionally I will rework the
compilation to make this a separate runtime.

The current plan is to have this, but make including OpenMP and
offloading either automatically add it, or print a warning if it's
missing. This will allow us to use a normal CMake workflow and delete
all the weird 'lets pull the clang binary out of the build' business.
```
-DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES=offload
-DLLVM_RUNTIME_TARGETS=amdgcn-amd-amdhsa
```

After that, linking the OpenMP device runtime will be `-Xoffload-linker
-lomp`. I.e. no more fat binary business.

Only look at the most recent commit since this includes the two
dependencies
(fix to AMDGPUEmitPrintfBinding and the PointerToMember bug).</pre>
</div>
</content>
</entry>
<entry>
<title>[Offload][NFC] Reorganize `utils::` and make Device/Host/Shared clearer (#100280)</title>
<updated>2024-09-05T20:36:26+00:00</updated>
<author>
<name>Johannes Doerfert</name>
<email>johannes@jdoerfert.de</email>
</author>
<published>2024-09-05T20:36:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=08533a3ee8f3a09a59cf6ac3be59198b26b7f739'/>
<id>08533a3ee8f3a09a59cf6ac3be59198b26b7f739</id>
<content type='text'>
We had three `utils::` namespaces, all with different "meaning" (host,
device, hsa_utils). We should, when we can, keep "include/Shared"
accessible from host and device, thus RefCountTy has been moved to a
separate header. `hsa_utils` was introduced to make `utils::` less
overloaded. And common functionality was de-duplicated, e.g.,
`utils::advance` and `utils::advanceVoidPtr` -&gt; `utils:advancePtr`. Type
punning now checks for the size of the result to make sure it matches
the source type.

No functional change was intended.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We had three `utils::` namespaces, all with different "meaning" (host,
device, hsa_utils). We should, when we can, keep "include/Shared"
accessible from host and device, thus RefCountTy has been moved to a
separate header. `hsa_utils` was introduced to make `utils::` less
overloaded. And common functionality was de-duplicated, e.g.,
`utils::advance` and `utils::advanceVoidPtr` -&gt; `utils:advancePtr`. Type
punning now checks for the size of the result to make sure it matches
the source type.

No functional change was intended.</pre>
</div>
</content>
</entry>
</feed>
