<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/lib/Target/AMDGPU/AMDGPULowerIntrinsics.cpp, branch main</title>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[AMDGPU] Add s_cluster_barrier on gfx1250 (#159175)</title>
<updated>2025-09-16T21:49:48+00:00</updated>
<author>
<name>Stanislav Mekhanoshin</name>
<email>Stanislav.Mekhanoshin@amd.com</email>
</author>
<published>2025-09-16T21:49:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=4ab8dabc252f802134bfea6193f9a274f0bdc143'/>
<id>4ab8dabc252f802134bfea6193f9a274f0bdc143</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Refactor lowering of s_barrier to split barriers (#154648)</title>
<updated>2025-08-28T14:01:20+00:00</updated>
<author>
<name>Nicolai Hähnle</name>
<email>nicolai.haehnle@amd.com</email>
</author>
<published>2025-08-28T14:01:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=353b5e43c64770d1726e8cac5f28dedf6cc7ad40'/>
<id>353b5e43c64770d1726e8cac5f28dedf6cc7ad40</id>
<content type='text'>
Let's do the lowering of non-split into split barriers in a new IR pass,
AMDGPULowerIntrinsics. That way, there is no code duplication between
SelectionDAG and GlobalISel. This simplifies some upcoming extensions to
the code.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Let's do the lowering of non-split into split barriers in a new IR pass,
AMDGPULowerIntrinsics. That way, there is no code duplication between
SelectionDAG and GlobalISel. This simplifies some upcoming extensions to
the code.</pre>
</div>
</content>
</entry>
<entry>
<title>CodeGen: Expand memory intrinsics in PreISelIntrinsicLowering</title>
<updated>2023-06-10T01:04:37+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2023-06-07T13:03:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=3c848194f28decca41b7362f9dd35d4939797724'/>
<id>3c848194f28decca41b7362f9dd35d4939797724</id>
<content type='text'>
Expand large or unknown size memory intrinsics into loops in the
default lowering pipeline if the target doesn't have the corresponding
libfunc. Previously AMDGPU had a custom pass which existed to call the
expansion utilities.

With a default no-libcall option, we can remove the libfunc checks in
LoopIdiomRecognize for these, which never made any sense. This also
provides a path to lifting the immarg restriction on
llvm.memcpy.inline.

There seems to be a bug where TLI reports functions as available if
you use -march and not -mtriple.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Expand large or unknown size memory intrinsics into loops in the
default lowering pipeline if the target doesn't have the corresponding
libfunc. Previously AMDGPU had a custom pass which existed to call the
expansion utilities.

With a default no-libcall option, we can remove the libfunc checks in
LoopIdiomRecognize for these, which never made any sense. This also
provides a path to lifting the immarg restriction on
llvm.memcpy.inline.

There seems to be a bug where TLI reports functions as available if
you use -march and not -mtriple.
</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Remove r600 local id annotations in AMDGPULowerIntrinsics</title>
<updated>2023-06-07T18:55:55+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2023-06-07T18:27:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=8eae660ae0bd0eaded7fca2bdc6002a3444aabaf'/>
<id>8eae660ae0bd0eaded7fca2bdc6002a3444aabaf</id>
<content type='text'>
With these dropped and the memory intrinsics moved into a generic pass, we
can drop the whole pass.

No tests fail with this removed. The new amdgcn intrinsics are
annotated in clang up front. This may theoretically regress r600, but that
would need new testing and support work (r600 ideally would also
follow the clang handling). The regression would be in any IR passes
making use of known bits between this point and codegen. The DAG
computeKnownBits understands the intrinsics directly now.

If we wanted to refine these values, a better place would be in
AMDGPUAttributor.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With these dropped and the memory intrinsics moved into a generic pass, we
can drop the whole pass.

No tests fail with this removed. The new amdgcn intrinsics are
annotated in clang up front. This may theoretically regress r600, but that
would need new testing and support work (r600 ideally would also
follow the clang handling). The regression would be in any IR passes
making use of known bits between this point and codegen. The DAG
computeKnownBits understands the intrinsics directly now.

If we wanted to refine these values, a better place would be in
AMDGPUAttributor.
</pre>
</div>
</content>
</entry>
<entry>
<title>[iwyu] Handle regressions in libLLVM header include</title>
<updated>2022-05-04T06:32:38+00:00</updated>
<author>
<name>serge-sans-paille</name>
<email>sguelton@redhat.com</email>
</author>
<published>2022-05-03T12:15:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=7030654296a0416bd9402a0278dbd42f1bf268b2'/>
<id>7030654296a0416bd9402a0278dbd42f1bf268b2</id>
<content type='text'>
Running iwyu-diff on the LLVM codebase since fa5a4e1b95c8f37796 detected a few
regressions; this fixes them.

Differential Revision: https://reviews.llvm.org/D124847
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Running iwyu-diff on the LLVM codebase since fa5a4e1b95c8f37796 detected a few
regressions; this fixes them.

Differential Revision: https://reviews.llvm.org/D124847
</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Directly implement computeKnownBits for workitem intrinsics</title>
<updated>2022-04-22T14:49:50+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2022-04-15T02:40:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=794a0bb547484ec33c13bd6c7c04b1dbd03d040a'/>
<id>794a0bb547484ec33c13bd6c7c04b1dbd03d040a</id>
<content type='text'>
Currently, metadata is inserted in a late pass and then lowered
to an AssertZext. The metadata would be more useful if it were
inserted earlier, after inlining but before codegen.

Probably shouldn't change anything now. Just replacing the
late metadata annotation needs more work, since we lose
out on optimizations after these are lowered to CopyFromReg.

Seems to be slightly better than relying on the AssertZext from the
metadata. The test change in cvt_f32_ubyte.ll is a quirk from it using
-start-before=amdgpu-isel instead of running the usual codegen
pipeline.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, metadata is inserted in a late pass and then lowered
to an AssertZext. The metadata would be more useful if it were
inserted earlier, after inlining but before codegen.

Probably shouldn't change anything now. Just replacing the
late metadata annotation needs more work, since we lose
out on optimizations after these are lowered to CopyFromReg.

Seems to be slightly better than relying on the AssertZext from the
metadata. The test change in cvt_f32_ubyte.ll is a quirk from it using
-start-before=amdgpu-isel instead of running the usual codegen
pipeline.
</pre>
</div>
</content>
</entry>
<entry>
<title>[AArch64, AMDGPU] Use make_early_inc_range (NFC)</title>
<updated>2021-11-03T16:22:51+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2021-11-03T16:22:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=4bef0304e153c757c9f42c2001d4c56e8f99929e'/>
<id>4bef0304e153c757c9f42c2001d4c56e8f99929e</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets</title>
<updated>2021-01-20T19:22:45+00:00</updated>
<author>
<name>dfukalov</name>
<email>daniil.fukalov@amd.com</email>
</author>
<published>2021-01-20T12:48:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=560d7e04113bf43ed0928a1fbdf328818194141e'/>
<id>560d7e04113bf43ed0928a1fbdf328818194141e</id>
<content type='text'>
... to reduce header dependencies.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D95036
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
... to reduce header dependencies.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D95036
</pre>
</div>
</content>
</entry>
<entry>
<title>[NFC][AMDGPU] Reduce include files dependency.</title>
<updated>2021-01-07T19:22:05+00:00</updated>
<author>
<name>dfukalov</name>
<email>daniil.fukalov@amd.com</email>
</author>
<published>2020-12-25T15:52:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=6a87e9b08bf093ba3ccba8650b89f4d337c497f4'/>
<id>6a87e9b08bf093ba3ccba8650b89f4d337c497f4</id>
<content type='text'>
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813
</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Use caller subtarget, not intrinsic declaration</title>
<updated>2020-08-27T20:42:09+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2020-08-27T15:14:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=a1bc37c9e54e0163bc6ccb7a438a68047310ccff'/>
<id>a1bc37c9e54e0163bc6ccb7a438a68047310ccff</id>
<content type='text'>
Intrinsic declarations use the default subtarget, but this should be
using the subtarget for the calling function. I haven't been able to
come up with a case where it matters though.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Intrinsic declarations use the default subtarget, but this should be
using the subtarget for the calling function. I haven't been able to
come up with a case where it matters though.
</pre>
</div>
</content>
</entry>
</feed>
