<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll, branch main</title>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>Revert "[RegAlloc] Fix the terminal rule check for interfere with DstReg (#168661)"</title>
<updated>2025-11-23T05:17:45+00:00</updated>
<author>
<name>Aiden Grossman</name>
<email>aidengrossman@google.com</email>
</author>
<published>2025-11-23T05:17:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=d5f3ab8ec97786476a077b0c8e35c7c337dfddf2'/>
<id>d5f3ab8ec97786476a077b0c8e35c7c337dfddf2</id>
<content type='text'>
This reverts commit 0859ac5866a0228f5607dd329f83f4a9622dedcc.

This caused a couple test failures, likely due to a mid-air collision.
Reverting for now to get the tree back to green and allow the original
author to run UTC/friends and verify the output.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit 0859ac5866a0228f5607dd329f83f4a9622dedcc.

This caused a couple test failures, likely due to a mid-air collision.
Reverting for now to get the tree back to green and allow the original
author to run UTC/friends and verify the output.
</pre>
</div>
</content>
</entry>
<entry>
<title>[RegAlloc] Fix the terminal rule check for interfere with DstReg (#168661)</title>
<updated>2025-11-23T02:11:24+00:00</updated>
<author>
<name>hstk30-hw</name>
<email>hanwei62@huawei.com</email>
</author>
<published>2025-11-23T02:11:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=0859ac5866a0228f5607dd329f83f4a9622dedcc'/>
<id>0859ac5866a0228f5607dd329f83f4a9622dedcc</id>
<content type='text'>
This may be a bug introduced by commit
6749ae36b4a33769e7a77cf812d7cd0a908ae3b9 that has been present ever
since.
In this case, `OtherReg` always overlaps with `DstReg` because they both
come from the `Copy`.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This may be a bug introduced by commit
6749ae36b4a33769e7a77cf812d7cd0a908ae3b9 that has been present ever
since.
In this case, `OtherReg` always overlaps with `DstReg` because they both
come from the `Copy`.</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Use v_mov_b32 to implement divergent zext i32-&gt;i64 (#168166)</title>
<updated>2025-11-15T04:19:24+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2025-11-15T04:19:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=0fa6a67a4200ea1516f56e298df4a671af8a0642'/>
<id>0fa6a67a4200ea1516f56e298df4a671af8a0642</id>
<content type='text'>
Some cases are relying on SIFixSGPRCopies to force VALU
reg_sequence inputs with SGPR inputs to use all VGPR inputs,
but this doesn't always happen if the reg_sequence isn't
invalid. Make sure we use a vgpr up-front here so we don't
rely on something later.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some cases are relying on SIFixSGPRCopies to force VALU
reg_sequence inputs with SGPR inputs to use all VGPR inputs,
but this doesn't always happen if the reg_sequence isn't
invalid. Make sure we use a vgpr up-front here so we don't
rely on something later.</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Relax shouldCoalesce to allow more register tuple widening (#166475)</title>
<updated>2025-11-11T21:50:57+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2025-11-11T21:50:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=bbde792786dc93fc07cf245dd118f9d8b018de11'/>
<id>bbde792786dc93fc07cf245dd118f9d8b018de11</id>
<content type='text'>
Allow widening up to 128-bit registers or if the new register class
is at least as large as one of the existing register classes.

This was artificially limiting. In particular this was doing the wrong
thing with sequences involving copies between VGPRs and AV registers.
Nearly all test changes are improvements.

The coalescer does not just widen registers out of nowhere. If it's
trying to "widen" a register, it's generally packing a register into an
existing register tuple, or in a situation where the constraints imply the wider
class anyway. 067a11015 addressed the allocation failure concern by
rejecting coalescing if there are no available registers. The original
change in a4e63ead4b didn't include a realistic testcase to judge if
this is harmful for pressure. I would expect any issues from this to
be garden-variety subreg handling issues. We could use more dynamic
state information here if it really is an issue.

I get the best results by removing this override completely. This is
a smaller step for patch splitting purposes.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Allow widening up to 128-bit registers or if the new register class
is at least as large as one of the existing register classes.

This was artificially limiting. In particular this was doing the wrong
thing with sequences involving copies between VGPRs and AV registers.
Nearly all test changes are improvements.

The coalescer does not just widen registers out of nowhere. If it's
trying to "widen" a register, it's generally packing a register into an
existing register tuple, or in a situation where the constraints imply the wider
class anyway. 067a11015 addressed the allocation failure concern by
rejecting coalescing if there are no available registers. The original
change in a4e63ead4b didn't include a realistic testcase to judge if
this is harmful for pressure. I would expect any issues from this to
be garden-variety subreg handling issues. We could use more dynamic
state information here if it really is an issue.

I get the best results by removing this override completely. This is
a smaller step for patch splitting purposes.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Rework GFX11 VALU Mask Write Hazard (#138663)</title>
<updated>2025-10-28T07:09:28+00:00</updated>
<author>
<name>Carl Ritson</name>
<email>carl.ritson@amd.com</email>
</author>
<published>2025-10-28T07:09:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=385c12134aa6b3215d92ce6034d99fd7aec45dd7'/>
<id>385c12134aa6b3215d92ce6034d99fd7aec45dd7</id>
<content type='text'>
Apply additional counter waits to address VALU writes to SGPRs. Rework
expiry detection and apply wait coalescing to mitigate some of the
additional waits.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Apply additional counter waits to address VALU writes to SGPRs. Rework
expiry detection and apply wait coalescing to mitigate some of the
additional waits.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Reland "Remove redundant s_cmp_lg_* sX, 0" (#164201)</title>
<updated>2025-10-22T13:42:29+00:00</updated>
<author>
<name>LU-JOHN</name>
<email>John.Lu@amd.com</email>
</author>
<published>2025-10-22T13:42:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=9abbec66bfa34922521ef88fad1d6fcd43c1c462'/>
<id>9abbec66bfa34922521ef88fad1d6fcd43c1c462</id>
<content type='text'>
Reland PR https://github.com/llvm/llvm-project/pull/162352. Fix by
excluding SI_PC_ADD_REL_OFFSET from instructions that set SCC = DST!=0.
Passes check-libc-amdgcn-amd-amdhsa now.

Distribution of instructions that allowed a redundant S_CMP to be
deleted in check-libc-amdgcn-amd-amdhsa test:

```
S_AND_B32      485
S_AND_B64      47
S_ANDN2_B32    42
S_ANDN2_B64    277492
S_CSELECT_B64  17631
S_LSHL_B32     6
S_OR_B64       11
```

---------

Signed-off-by: John Lu &lt;John.Lu@amd.com&gt;
Co-authored-by: Matt Arsenault &lt;arsenm2@gmail.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reland PR https://github.com/llvm/llvm-project/pull/162352. Fix by
excluding SI_PC_ADD_REL_OFFSET from instructions that set SCC = DST!=0.
Passes check-libc-amdgcn-amd-amdhsa now.

Distribution of instructions that allowed a redundant S_CMP to be
deleted in check-libc-amdgcn-amd-amdhsa test:

```
S_AND_B32      485
S_AND_B64      47
S_ANDN2_B32    42
S_ANDN2_B64    277492
S_CSELECT_B64  17631
S_LSHL_B32     6
S_OR_B64       11
```

---------

Signed-off-by: John Lu &lt;John.Lu@amd.com&gt;
Co-authored-by: Matt Arsenault &lt;arsenm2@gmail.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "[AMDGPU] Remove redundant s_cmp_lg_* sX, 0 " (#164116)</title>
<updated>2025-10-18T20:38:14+00:00</updated>
<author>
<name>Jan Patrick Lehr</name>
<email>JanPatrick.Lehr@amd.com</email>
</author>
<published>2025-10-18T20:38:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=023b1f6a8ed79b9a0d415522dbb3032a5c5df791'/>
<id>023b1f6a8ed79b9a0d415522dbb3032a5c5df791</id>
<content type='text'>
Reverts llvm/llvm-project#162352

Broke our buildbot:
https://lab.llvm.org/buildbot/#/builders/10/builds/15674
To reproduce:

cd llvm-project
cmake -S llvm -B thebuild -C offload/cmake/caches/AMDGPULibcBot.cmake -GNinja
cd thebuild
ninja
ninja check-libc-amdgcn-amd-amdhsa</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reverts llvm/llvm-project#162352

Broke our buildbot:
https://lab.llvm.org/buildbot/#/builders/10/builds/15674
To reproduce:

cd llvm-project
cmake -S llvm -B thebuild -C offload/cmake/caches/AMDGPULibcBot.cmake -GNinja
cd thebuild
ninja
ninja check-libc-amdgcn-amd-amdhsa</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Remove redundant s_cmp_lg_* sX, 0  (#162352)</title>
<updated>2025-10-18T14:33:47+00:00</updated>
<author>
<name>LU-JOHN</name>
<email>John.Lu@amd.com</email>
</author>
<published>2025-10-18T14:33:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=8e5f6dd37cc7d5312a00c24af42026d239c1e9f8'/>
<id>8e5f6dd37cc7d5312a00c24af42026d239c1e9f8</id>
<content type='text'>
Remove redundant s_cmp_lg_* sX, 0 if a SALU instruction already sets SCC
when sX != 0.

---------

Signed-off-by: John Lu &lt;John.Lu@amd.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove redundant s_cmp_lg_* sX, 0 if a SALU instruction already sets SCC
when sX != 0.

---------

Signed-off-by: John Lu &lt;John.Lu@amd.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>PeepholeOpt: Fix losing subregister indexes on full copies (#161310)</title>
<updated>2025-10-02T04:36:47+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2025-10-02T04:36:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=c6e280e7ed9e120ba5e8c141bd5c4fd116d076a2'/>
<id>c6e280e7ed9e120ba5e8c141bd5c4fd116d076a2</id>
<content type='text'>
Previously if we had a subregister extract reading from a
full copy, the no-subregister incoming copy would overwrite
the DefSubReg index of the folding context.

There's one ugly rvv regression, but it's a downstream
issue of this; an unnecessary same class reg-to-reg full copy
was avoided.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Previously if we had a subregister extract reading from a
full copy, the no-subregister incoming copy would overwrite
the DefSubReg index of the folding context.

There's one ugly rvv regression, but it's a downstream
issue of this; an unnecessary same class reg-to-reg full copy
was avoided.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU][True16][Codegen] remove another build_vector pattern from true16 (#149861)</title>
<updated>2025-09-04T22:08:18+00:00</updated>
<author>
<name>Brox Chen</name>
<email>guochen2@amd.com</email>
</author>
<published>2025-09-04T22:08:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=dfdfc4e490f211155ffd581ea0ba8458e4290db6'/>
<id>dfdfc4e490f211155ffd581ea0ba8458e4290db6</id>
<content type='text'>
Remove another build_vector pattern, which takes an i16 placed in a
VGPR_32, from true16 mode. This stops isel from generating the illegal
"vgpr_32 = COPY vgpr_16".

ISel will use the vgpr16 build vector pattern in true16 mode instead.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove another build_vector pattern, which takes an i16 placed in a
VGPR_32, from true16 mode. This stops isel from generating the illegal
"vgpr_32 = COPY vgpr_16".

ISel will use the vgpr16 build vector pattern in true16 mode instead.</pre>
</div>
</content>
</entry>
</feed>
