<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/lib/Target/AMDGPU/AMDGPUPerfHintAnalysis.cpp, branch main</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[llvm] Remove unused includes of SmallSet.h (NFC) (#154893)</title>
<updated>2025-08-22T17:33:46+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2025-08-22T17:33:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=11b4f110e0f9f1a0f987faa121c594af1bcd43cc'/>
<id>11b4f110e0f9f1a0f987faa121c594af1bcd43cc</id>
<content type='text'>
We just replaced SmallSet&lt;T *, N&gt; with SmallPtrSet&lt;T *, N&gt;, bypassing
the redirection found in SmallSet.h.  With that, we no longer need to
include SmallSet.h in many files.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We just replaced SmallSet&lt;T *, N&gt; with SmallPtrSet&lt;T *, N&gt;, bypassing
the redirection found in SmallSet.h.  With that, we no longer need to
include SmallSet.h in many files.</pre>
</div>
</content>
</entry>
<entry>
<title>[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068)</title>
<updated>2025-08-18T14:01:29+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2025-08-18T14:01:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=07eb7b76928d6873c60859a0339591ed9e0f512a'/>
<id>07eb7b76928d6873c60859a0339591ed9e0f512a</id>
<content type='text'>
This patch replaces SmallSet&lt;T *, N&gt; with SmallPtrSet&lt;T *, N&gt;.  Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:

  template &lt;typename PointeeType, unsigned N&gt;
class SmallSet&lt;PointeeType*, N&gt; : public SmallPtrSet&lt;PointeeType*, N&gt;
{};

We only have 140 instances that rely on this "redirection", with the
vast majority of them under llvm/. Since relying on the redirection
doesn't improve readability, this patch replaces SmallSet with
SmallPtrSet for pointer element types.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch replaces SmallSet&lt;T *, N&gt; with SmallPtrSet&lt;T *, N&gt;.  Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:

  template &lt;typename PointeeType, unsigned N&gt;
class SmallSet&lt;PointeeType*, N&gt; : public SmallPtrSet&lt;PointeeType*, N&gt;
{};

We only have 140 instances that rely on this "redirection", with the
vast majority of them under llvm/. Since relying on the redirection
doesn't improve readability, this patch replaces SmallSet with
SmallPtrSet for pointer element types.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Qualify auto. NFC. (#110878)</title>
<updated>2024-10-03T12:07:54+00:00</updated>
<author>
<name>Jay Foad</name>
<email>jay.foad@amd.com</email>
</author>
<published>2024-10-03T12:07:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=8d13e7b8c382499c1cf0c2a3184b483e760f266b'/>
<id>8d13e7b8c382499c1cf0c2a3184b483e760f266b</id>
<content type='text'>
Generated automatically with:
$ clang-tidy -fix -checks=-*,llvm-qualified-auto $(find
lib/Target/AMDGPU/ -type f)</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Generated automatically with:
$ clang-tidy -fix -checks=-*,llvm-qualified-auto $(find
lib/Target/AMDGPU/ -type f)</pre>
</div>
</content>
</entry>
<entry>
<title>NewPM/AMDGPU: Port AMDGPUPerfHintAnalysis to new pass manager (#102645)</title>
<updated>2024-08-11T11:11:10+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2024-08-11T11:11:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=dd094b2647976050295384ce7e72582607090602'/>
<id>dd094b2647976050295384ce7e72582607090602</id>
<content type='text'>
This was much more difficult than I anticipated. The pass is
not in a good state, with poor test coverage. The legacy PM
does seem to be relying on maintaining the map state between
different SCCs, which seems bad. The pass is going out of its
way to avoid putting the attributes it introduces onto non-callee
functions. If it just added them, we could use them directly
instead of relying on the map, I would think.

The NewPM path uses a ModulePass; I'm not sure if we should be
using CGSCC here but there seems to be some missing infrastructure
to support backend defined ones.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This was much more difficult than I anticipated. The pass is
not in a good state, with poor test coverage. The legacy PM
does seem to be relying on maintaining the map state between
different SCCs, which seems bad. The pass is going out of its
way to avoid putting the attributes it introduces onto non-callee
functions. If it just added them, we could use them directly
instead of relying on the map, I would think.

The NewPM path uses a ModulePass; I'm not sure if we should be
using CGSCC here but there seems to be some missing infrastructure
to support backend defined ones.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Use member initializers. NFC.</title>
<updated>2024-07-16T14:29:10+00:00</updated>
<author>
<name>Jay Foad</name>
<email>jay.foad@amd.com</email>
</author>
<published>2024-07-16T14:29:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=6bba44e8dcf5bb73e6e82068b194b14aa7d1880e'/>
<id>6bba44e8dcf5bb73e6e82068b194b14aa7d1880e</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Set amdgpu-memory-bound if a basic block has dense global memory access</title>
<updated>2022-07-19T09:46:28+00:00</updated>
<author>
<name>Abinav Puthan Purayil</name>
<email>abinavpp@gmail.com</email>
</author>
<published>2022-07-13T06:40:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=9fa425c1ab2f763da14953f22f005dd1ca581c1c'/>
<id>9fa425c1ab2f763da14953f22f005dd1ca581c1c</id>
<content type='text'>
AMDGPUPerfHintAnalysis doesn't set the memory bound attribute if
FuncInfo::InstCost outweighs MemInstCost even if we have a basic block
with relatively high global memory access. GCNSchedStrategy could revert
optimal scheduling in favour of occupancy which seems to degrade
performance for some kernels. This change introduces the
HasDenseGlobalMemAcc metric in the heuristic that makes the analysis
more conservative in these cases.

This fixes SWDEV-334259/SWDEV-343932

Differential Revision: https://reviews.llvm.org/D129759
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
AMDGPUPerfHintAnalysis doesn't set the memory bound attribute if
FuncInfo::InstCost outweighs MemInstCost even if we have a basic block
with relatively high global memory access. GCNSchedStrategy could revert
optimal scheduling in favour of occupancy which seems to degrade
performance for some kernels. This change introduces the
HasDenseGlobalMemAcc metric in the heuristic that makes the analysis
more conservative in these cases.

This fixes SWDEV-334259/SWDEV-343932

Differential Revision: https://reviews.llvm.org/D129759
</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Use default member initialization (NFC)</title>
<updated>2022-07-16T19:44:35+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2022-07-16T19:44:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=deac0ac52395ebe81d528e6d2b638694d85061ae'/>
<id>deac0ac52395ebe81d528e6d2b638694d85061ae</id>
<content type='text'>
Identified with modernize-use-default-member-init.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Identified with modernize-use-default-member-init.
</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU][NFC] Remove isConstantAddr</title>
<updated>2022-06-17T00:49:29+00:00</updated>
<author>
<name>LiaoChunyu</name>
<email>chunyu@iscas.ac.cn</email>
</author>
<published>2022-06-16T12:58:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=6181c192837d799c320030bb227b655b83577130'/>
<id>6181c192837d799c320030bb227b655b83577130</id>
<content type='text'>
fix isConstantAddr defined but not used

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D127959
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
fix isConstantAddr defined but not used

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D127959
</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Only count global-to-global as indirect accesses</title>
<updated>2022-04-01T12:48:13+00:00</updated>
<author>
<name>Jay Foad</name>
<email>jay.foad@amd.com</email>
</author>
<published>2022-03-31T12:39:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=c246b7bd4a5191d48f68ce12b50e03bfadd2a0b5'/>
<id>c246b7bd4a5191d48f68ce12b50e03bfadd2a0b5</id>
<content type='text'>
Previously any load (global, local or constant) feeding into a
global load or store would be counted as an indirect access. This
patch only counts global loads feeding into a global load or store.
The rationale is that the latency for global loads is generally
much larger than the other kinds.

As a side effect this makes it easier to write small kernels test
cases that are not counted as having indirect accesses, despite
the fact that arguments to the kernel are accessed with an SMEM
load.

Differential Revision: https://reviews.llvm.org/D122804
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Previously any load (global, local or constant) feeding into a
global load or store would be counted as an indirect access. This
patch only counts global loads feeding into a global load or store.
The rationale is that the latency for global loads is generally
much larger than the other kinds.

As a side effect this makes it easier to write small kernels test
cases that are not counted as having indirect accesses, despite
the fact that arguments to the kernel are accessed with an SMEM
load.

Differential Revision: https://reviews.llvm.org/D122804
</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Return better Changed status from AMDGPUPerfHintAnalysis</title>
<updated>2022-02-17T09:31:42+00:00</updated>
<author>
<name>Jay Foad</name>
<email>jay.foad@amd.com</email>
</author>
<published>2022-02-16T11:00:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=1822a5ecdd363ffcf465a7ad2e4e6fb92cab69f7'/>
<id>1822a5ecdd363ffcf465a7ad2e4e6fb92cab69f7</id>
<content type='text'>
Differential Revision: https://reviews.llvm.org/D119944
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Differential Revision: https://reviews.llvm.org/D119944
</pre>
</div>
</content>
</entry>
</feed>
