<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp, branch users/fmayer/spr/main.flowsensitive-statusor-2n-add-minimal-model</title>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[AMDGPU][NFC] Use `getScoreUB` for XCNT insertion. (#162448)</title>
<updated>2025-10-13T05:37:05+00:00</updated>
<author>
<name>Aaditya</name>
<email>115080342+easyonaadit@users.noreply.github.com</email>
</author>
<published>2025-10-13T05:37:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=982c9e6ac52a13483a08fdcf007a565d41cf4615'/>
<id>982c9e6ac52a13483a08fdcf007a565d41cf4615</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Account for implicit XCNT insertion (#160812)</title>
<updated>2025-10-03T08:08:37+00:00</updated>
<author>
<name>Aaditya</name>
<email>115080342+easyonaadit@users.noreply.github.com</email>
</author>
<published>2025-10-03T08:08:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=19cd5bd350b730da35629de5095764861f70ecee'/>
<id>19cd5bd350b730da35629de5095764861f70ecee</id>
<content type='text'>
Hardware inserts an implicit `S_WAIT_XCNT 0` between
alternating SMEM and VMEM instructions, so there are
never outstanding address translations for both SMEM 
and VMEM at the same time.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Hardware inserts an implicit `S_WAIT_XCNT 0` between
alternating SMEM and VMEM instructions, so there are
never outstanding address translations for both SMEM 
and VMEM at the same time.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU][SIInsertWaitCnts] Remove redundant TII/TRI/MRI arguments (NFC) (#161357)</title>
<updated>2025-10-01T10:08:12+00:00</updated>
<author>
<name>Pierre van Houtryve</name>
<email>pierre.vanhoutryve@amd.com</email>
</author>
<published>2025-10-01T10:08:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=a3f667bc08cb2f547ff2139c975dd684cf005885'/>
<id>a3f667bc08cb2f547ff2139c975dd684cf005885</id>
<content type='text'>
WaitCntBrackets already has a pointer to its SIInsertWaitCnt instance.
With a small change, it can directly access TII/TRI/MRI that way.
This simplifies a lot of call sites, which makes the code easier to
follow.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
WaitCntBrackets already has a pointer to its SIInsertWaitCnt instance.
With a small change, it can directly access TII/TRI/MRI that way.
This simplifies a lot of call sites, which makes the code easier to
follow.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU][SIInsertWaitCnts] De-duplicate code (NFC) (#161161)</title>
<updated>2025-10-01T08:53:32+00:00</updated>
<author>
<name>Pierre van Houtryve</name>
<email>pierre.vanhoutryve@amd.com</email>
</author>
<published>2025-10-01T08:53:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=88c668d050aceb27c161f82474efa0004eced9b2'/>
<id>88c668d050aceb27c161f82474efa0004eced9b2</id>
<content type='text'>
I'm reading through the pass over and over again to try and learn how it works. I noticed some code duplication here and there while doing that.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I'm reading through the pass over and over again to try and learn how it works. I noticed some code duplication here and there while doing that.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU][InsertWaitCnts] Refactor some helper functions, NFC (#161160)</title>
<updated>2025-10-01T08:51:00+00:00</updated>
<author>
<name>Pierre van Houtryve</name>
<email>pierre.vanhoutryve@amd.com</email>
</author>
<published>2025-10-01T08:51:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=14fcd81861aa1576d204b9146345c4426d81fc49'/>
<id>14fcd81861aa1576d204b9146345c4426d81fc49</id>
<content type='text'>
- Remove one-line wrappers around a simple function call when they're
only used once or twice.
- Move very generic helpers into SIInstrInfo
- Delete unused functions

The goal is simply to reduce the noise in SIInsertWaitCnts without
hiding functionality. I focused on moving trivial helpers, or helpers
with very descriptive/verbose names (so it doesn't hide too much logic
away from the pass), and that have some reusability potential.

I'm also trying to make the code style more consistent. It doesn't make
sense to see a function call `TII-&gt;isXXX` then suddenly call a random
`isY` method that just wraps around `TII-&gt;isY`.

The context of this work is that I'm trying to learn how this pass
works, and while going through the code I noticed some little things
here and there that I thought would be good to fix.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Remove one-line wrappers around a simple function call when they're
only used once or twice.
- Move very generic helpers into SIInstrInfo
- Delete unused functions

The goal is simply to reduce the noise in SIInsertWaitCnts without
hiding functionality. I focused on moving trivial helpers, or helpers
with very descriptive/verbose names (so it doesn't hide too much logic
away from the pass), and that have some reusability potential.

I'm also trying to make the code style more consistent. It doesn't make
sense to see a function call `TII-&gt;isXXX` then suddenly call a random
`isY` method that just wraps around `TII-&gt;isY`.

The context of this work is that I'm trying to learn how this pass
works, and while going through the code I noticed some little things
here and there that I thought would be good to fix.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Skip debug uses in SIInsertWaitcnts::shouldFlushVmCnt (#160818)</title>
<updated>2025-09-26T07:39:11+00:00</updated>
<author>
<name>Jay Foad</name>
<email>jay.foad@amd.com</email>
</author>
<published>2025-09-26T07:39:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=8cd917bc80eb9882cdc38d49ed82d855820d7e6c'/>
<id>8cd917bc80eb9882cdc38d49ed82d855820d7e6c</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU][SIInsertWaitcnts] Track SCC. Insert KM_CNT waits for SCC writes. (#157843)</title>
<updated>2025-09-18T12:41:01+00:00</updated>
<author>
<name>Petar Avramovic</name>
<email>Petar.Avramovic@amd.com</email>
</author>
<published>2025-09-18T12:41:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=2ec7959b96ecc85fef3dcb50bab54b9f76f603d4'/>
<id>2ec7959b96ecc85fef3dcb50bab54b9f76f603d4</id>
<content type='text'>
Add a new event, SCC_WRITE, for s_barrier_signal_isfirst and s_barrier_leave,
instructions that write to SCC; the counter is KM_CNT.
Also start tracking SCC for reads and writes.
An s_barrier_wait on the same barrier guarantees that the SCC write from
s_barrier_signal_isfirst has landed, so there is no need to insert s_wait_kmcnt.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a new event, SCC_WRITE, for s_barrier_signal_isfirst and s_barrier_leave,
instructions that write to SCC; the counter is KM_CNT.
Also start tracking SCC for reads and writes.
An s_barrier_wait on the same barrier guarantees that the SCC write from
s_barrier_signal_isfirst has landed, so there is no need to insert s_wait_kmcnt.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU][CodeGen][True16] Track waitcnt as vgpr32 instead of vgpr16 for D16 Instructions in GFX11 (#157795)</title>
<updated>2025-09-17T14:09:06+00:00</updated>
<author>
<name>Brox Chen</name>
<email>guochen2@amd.com</email>
</author>
<published>2025-09-17T14:09:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=2b2b580c8d4560e724cca7ca856ce7171c3a4628'/>
<id>2b2b580c8d4560e724cca7ca856ce7171c3a4628</id>
<content type='text'>
It seems that a VMEM access to the hi or lo half of a register can interfere
with the other half. Track waitcnt on vgpr32 instead of vgpr16 for 16-bit
registers on GFX11.

---------

Co-authored-by: Joe Nash &lt;joseph.nash@amd.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It seems that a VMEM access to the hi or lo half of a register can interfere
with the other half. Track waitcnt on vgpr32 instead of vgpr16 for 16-bit
registers on GFX11.

---------

Co-authored-by: Joe Nash &lt;joseph.nash@amd.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Remove scope check in SIInsertWaitcnts::generateWaitcntInstBefore (#157821)</title>
<updated>2025-09-12T18:51:36+00:00</updated>
<author>
<name>choikwa</name>
<email>5455710+choikwa@users.noreply.github.com</email>
</author>
<published>2025-09-12T18:51:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=ef7de8d1447c822dec72d685d85053216936b895'/>
<id>ef7de8d1447c822dec72d685d85053216936b895</id>
<content type='text'>
This change was motivated by CK, where many VMCNT(0)s were generated due
to instructions lacking !alias.scope metadata. The two causes of this
were:
1) LowerLDSModule not attaching scope metadata to a single LDS variable.
2) The IPSCCP pass before the inliner replacing a noalias ptr derivative
   with a global value, which made the inliner unable to track it back to
   the noalias ptr argument.

However, it turns out that IPSCCP losing the scope information had
little effect, as ScopedNoAliasAA is able to handle the asymmetric
case, where one MemLoc is missing scope, and still return a NoAlias
result.

AMDGPU, however, was checking for the existence of scope in SIInsertWaitcnts,
conservatively treating a missing scope as aliasing everything, and inserting
VMCNT(0) before DS_READs, forcing them to wait for all previous LDS DMA
instructions.

Since we know that ScopedNoAliasAA can handle asymmetry, we should also
let the AA query determine whether two MIs may alias.

Passed PSDB.

A previous attempt to address the issue in IPSCCP has likely stalled:
https://github.com/llvm/llvm-project/pull/154522
This solution may be preferable to that one, as the issue only affects AMDGPU.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This change was motivated by CK, where many VMCNT(0)s were generated due
to instructions lacking !alias.scope metadata. The two causes of this
were:
1) LowerLDSModule not attaching scope metadata to a single LDS variable.
2) The IPSCCP pass before the inliner replacing a noalias ptr derivative
   with a global value, which made the inliner unable to track it back to
   the noalias ptr argument.

However, it turns out that IPSCCP losing the scope information had
little effect, as ScopedNoAliasAA is able to handle the asymmetric
case, where one MemLoc is missing scope, and still return a NoAlias
result.

AMDGPU, however, was checking for the existence of scope in SIInsertWaitcnts,
conservatively treating a missing scope as aliasing everything, and inserting
VMCNT(0) before DS_READs, forcing them to wait for all previous LDS DMA
instructions.

Since we know that ScopedNoAliasAA can handle asymmetry, we should also
let the AA query determine whether two MIs may alias.

Passed PSDB.

A previous attempt to address the issue in IPSCCP has likely stalled:
https://github.com/llvm/llvm-project/pull/154522
This solution may be preferable to that one, as the issue only affects AMDGPU.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Define 1024 VGPRs on gfx1250 (#156765)</title>
<updated>2025-09-03T23:25:18+00:00</updated>
<author>
<name>Stanislav Mekhanoshin</name>
<email>Stanislav.Mekhanoshin@amd.com</email>
</author>
<published>2025-09-03T23:25:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=6aebbb0a85a6a675f58e4e727e2e161e03e6e13a'/>
<id>6aebbb0a85a6a675f58e4e727e2e161e03e6e13a</id>
<content type='text'>
This is baseline support; it is not usable yet.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is baseline support; it is not usable yet.</pre>
</div>
</content>
</entry>
</feed>
