<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/lib/Target/AMDGPU/AMDGPUInsertDelayAlu.cpp, branch users/mingmingl-llvm/samplefdo-profile-format</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>AMDGPU: Treat WMMA XDL ops as TRANS in S_DELAY_ALU insertion for gfx1250 (#149208)</title>
<updated>2025-07-17T00:07:48+00:00</updated>
<author>
<name>Changpeng Fang</name>
<email>changpeng.fang@amd.com</email>
</author>
<published>2025-07-17T00:07:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=b52cf756ced2aefd05b7e2f01026c941f9a04c47'/>
<id>b52cf756ced2aefd05b7e2f01026c941f9a04c47</id>
<content type='text'>
WMMA XDL instructions are tracked as TRANS ops and the compiler should
consider them the same as TRANS in S_DELAY_ALU insertion. We use a searchable
table for the InsertDelayAlu pass to recognize these WMMA XDL instructions.

Co-authored-by: Stefan Stipanovic &lt;Stefan.Stipanovic@amd.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
WMMA XDL instructions are tracked as TRANS ops and the compiler should
consider them the same as TRANS in S_DELAY_ALU insertion. We use a searchable
table for the InsertDelayAlu pass to recognize these WMMA XDL instructions.

Co-authored-by: Stefan Stipanovic &lt;Stefan.Stipanovic@amd.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Fix comment on DelayInfo::advance (#146718)</title>
<updated>2025-07-02T16:18:57+00:00</updated>
<author>
<name>Jay Foad</name>
<email>jay.foad@amd.com</email>
</author>
<published>2025-07-02T16:18:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=e717e503cae19e7e965972724a1476b2fa66ad2d'/>
<id>e717e503cae19e7e965972724a1476b2fa66ad2d</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Fix bad removal of s_delay_alu (#145728)</title>
<updated>2025-06-27T14:15:10+00:00</updated>
<author>
<name>Ana Mihajlovic</name>
<email>Ana.Mihajlovic@amd.com</email>
</author>
<published>2025-06-27T14:15:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=08d747c1ef659074549def24a9e92d6604e08e61'/>
<id>08d747c1ef659074549def24a9e92d6604e08e61</id>
<content type='text'>
The instructionWaitsForSGPRWrites function covers ALL SALU instructions,
including those like s_waitcnt that don't read from an SGPR. This results
in removing s_delay_alu instructions in cases like VALU-&gt;SGPR-&gt;VALU, which
causes a performance regression. This change modifies the function so that
it also checks whether the instruction reads an SGPR.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The instructionWaitsForSGPRWrites function covers ALL SALU instructions,
including those like s_waitcnt that don't read from an SGPR. This results
in removing s_delay_alu instructions in cases like VALU-&gt;SGPR-&gt;VALU, which
causes a performance regression. This change modifies the function so that
it also checks whether the instruction reads an SGPR.</pre>
</div>
</content>
</entry>
<entry>
<title>[llvm] Use *Set::insert_range (NFC) (#132509)</title>
<updated>2025-03-22T15:07:33+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2025-03-22T15:07:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=1b189cab5e582a183f6946dcb3e20913add58476'/>
<id>1b189cab5e582a183f6946dcb3e20913add58476</id>
<content type='text'>
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range.  This patch uses insert_range in
conjunction with llvm::{predecessors,successors} and
MachineBasicBlock::{predecessors,successors}.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range.  This patch uses insert_range in
conjunction with llvm::{predecessors,successors} and
MachineBasicBlock::{predecessors,successors}.</pre>
</div>
</content>
</entry>
<entry>
<title>Reland "[AMDGPU] Remove s_delay_alu for VALU-&gt;SGPR-&gt;SALU (#127212)" (#131111)</title>
<updated>2025-03-13T09:26:20+00:00</updated>
<author>
<name>Ana Mihajlovic</name>
<email>Ana.Mihajlovic@amd.com</email>
</author>
<published>2025-03-13T09:26:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=459b4e3fe10805b110bc89aa857532715bfe54e0'/>
<id>459b4e3fe10805b110bc89aa857532715bfe54e0</id>
<content type='text'>
Consider a VALU-&gt;SGPR-&gt;SALU dependency (a VALU writing to an SGPR and an
SALU reading from it). When the VALU is issued, it increments the internal
counter VA_SDST, which tracks use of this SGPR. The SALU will not issue
until VA_SDST is zero, that is, when the VALU has finished writing.
Therefore, the delays added by s_delay_alu are not needed in this situation.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Consider a VALU-&gt;SGPR-&gt;SALU dependency (a VALU writing to an SGPR and an
SALU reading from it). When the VALU is issued, it increments the internal
counter VA_SDST, which tracks use of this SGPR. The SALU will not issue
until VA_SDST is zero, that is, when the VALU has finished writing.
Therefore, the delays added by s_delay_alu are not needed in this situation.</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "[AMDGPU] Remove s_delay_alu for VALU-&gt;SGPR-&gt;SALU (#127212)"</title>
<updated>2025-03-12T19:09:09+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2025-03-12T19:09:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=aa008e00085a260b2ed130b3430cc4640144ab30'/>
<id>aa008e00085a260b2ed130b3430cc4640144ab30</id>
<content type='text'>
This reverts commit 71582c6667a6334c688734cae628e906b3c1ac1d.

Multiple buildbot failures have been reported:
https://github.com/llvm/llvm-project/pull/127212
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit 71582c6667a6334c688734cae628e906b3c1ac1d.

Multiple buildbot failures have been reported:
https://github.com/llvm/llvm-project/pull/127212
</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Remove s_delay_alu for VALU-&gt;SGPR-&gt;SALU (#127212)</title>
<updated>2025-03-12T16:33:07+00:00</updated>
<author>
<name>Ana Mihajlovic</name>
<email>Ana.Mihajlovic@amd.com</email>
</author>
<published>2025-03-12T16:33:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=71582c6667a6334c688734cae628e906b3c1ac1d'/>
<id>71582c6667a6334c688734cae628e906b3c1ac1d</id>
<content type='text'>
Consider a VALU-&gt;SGPR-&gt;SALU dependency (a VALU writing to an SGPR and an
SALU reading from it). When the VALU is issued, it increments the internal
counter VA_SDST, which tracks use of this SGPR. The SALU will not issue
until VA_SDST is zero, that is, when the VALU has finished writing.
Therefore, the delays added by s_delay_alu are not needed in this situation.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Consider a VALU-&gt;SGPR-&gt;SALU dependency (a VALU writing to an SGPR and an
SALU reading from it). When the VALU is issued, it increments the internal
counter VA_SDST, which tracks use of this SGPR. The SALU will not issue
until VA_SDST is zero, that is, when the VALU has finished writing.
Therefore, the delays added by s_delay_alu are not needed in this situation.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Avoid repeated hash lookups (NFC) (#130235)</title>
<updated>2025-03-07T07:58:16+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2025-03-07T07:58:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=6cb2f6de9b3cf0e72b7d45c9fc149457b3462ca3'/>
<id>6cb2f6de9b3cf0e72b7d45c9fc149457b3462ca3</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU][NewPM] Port AMDGPUInsertDelayAlu to NPM (#128003)</title>
<updated>2025-02-26T04:20:09+00:00</updated>
<author>
<name>Akshat Oke</name>
<email>Akshat.Oke@amd.com</email>
</author>
<published>2025-02-26T04:20:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=852923822fd085d304988c24f9b02edebe5e7903'/>
<id>852923822fd085d304988c24f9b02edebe5e7903</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Use the SchedModel available in SIInstrInfo (#110859)</title>
<updated>2024-10-02T16:17:27+00:00</updated>
<author>
<name>Juan Manuel Martinez Caamaño</name>
<email>jmartinezcaamao@gmail.com</email>
</author>
<published>2024-10-02T16:17:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=d617371375cba53e44eccedbac976f2b74df4f23'/>
<id>d617371375cba53e44eccedbac976f2b74df4f23</id>
<content type='text'>
Instead of allocating and initializing a new instance in
`GCNHazardRecognizer` and `AMDGPUInsertDelayAlu`.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of allocating and initializing a new instance in
`GCNHazardRecognizer` and `AMDGPUInsertDelayAlu`.</pre>
</div>
</content>
</entry>
</feed>
