<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp, branch users/mingmingl-llvm/samplefdo-profile-format</title>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>AMDGPU/UniformityAnalysis: fix G_ZEXTLOAD and G_SEXTLOAD (#157845)</title>
<updated>2025-09-10T15:57:15+00:00</updated>
<author>
<name>Petar Avramovic</name>
<email>Petar.Avramovic@amd.com</email>
</author>
<published>2025-09-10T15:57:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=41c685975e17704b25e461744ebd57429cdd95f1'/>
<id>41c685975e17704b25e461744ebd57429cdd95f1</id>
<content type='text'>
Use the same rules for G_ZEXTLOAD and G_SEXTLOAD as for G_LOAD.
Flat addrspace(0) and private addrspace(5) G_ZEXTLOAD and G_SEXTLOAD
should always be divergent.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use the same rules for G_ZEXTLOAD and G_SEXTLOAD as for G_LOAD.
Flat addrspace(0) and private addrspace(5) G_ZEXTLOAD and G_SEXTLOAD
should always be divergent.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Restrict scale operands of WMMA to low 256 VGPRs (#157526)</title>
<updated>2025-09-08T22:44:51+00:00</updated>
<author>
<name>Stanislav Mekhanoshin</name>
<email>Stanislav.Mekhanoshin@amd.com</email>
</author>
<published>2025-09-08T22:44:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=b0ee92be94746e05e9c015fcc6f7533e6b222685'/>
<id>b0ee92be94746e05e9c015fcc6f7533e6b222685</id>
<content type='text'>
These cannot accept high registers.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
These cannot accept high registers.</pre>
</div>
</content>
</entry>
<entry>
<title>CodeGen: Pass SubtargetInfo to TargetGenInstrInfo constructors (#157337)</title>
<updated>2025-09-08T03:12:19+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2025-09-08T03:12:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=727e9f5ea5b2bb9d2fa37619ad2f19b21af7ce4d'/>
<id>727e9f5ea5b2bb9d2fa37619ad2f19b21af7ce4d</id>
<content type='text'>
This will make it possible for tablegen to make subtarget
dependent decisions without adding new arguments to every
target.

---------

Co-authored-by: Sergei Barannikov &lt;barannikov88@gmail.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This will make it possible for tablegen to make subtarget
dependent decisions without adding new arguments to every
target.

---------

Co-authored-by: Sergei Barannikov &lt;barannikov88@gmail.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Allow folding multiple uses of some immediates into copies (#154757)</title>
<updated>2025-09-05T23:22:09+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2025-09-05T23:22:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=884130bf9309872d3d472d984a3c3bb90f454083'/>
<id>884130bf9309872d3d472d984a3c3bb90f454083</id>
<content type='text'>
In some cases this will require an avoidable re-defining of
a register, but it works out better most of the time. Also allow
folding 64-bit immediates into subregister extracts, unless it would
break an inline constant.

We could be more aggressive here, but this set of conditions seems
to do a reasonable job without introducing too many regressions.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In some cases this will require an avoidable re-defining of
a register, but it works out better most of the time. Also allow
folding 64-bit immediates into subregister extracts, unless it would
break an inline constant.

We could be more aggressive here, but this set of conditions seems
to do a reasonable job without introducing too many regressions.</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Remove flat special case in getRegClass (#156991)</title>
<updated>2025-09-05T22:42:16+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2025-09-05T22:42:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=d096b1d48e0b433b503eebe61237b66169bcc16d'/>
<id>d096b1d48e0b433b503eebe61237b66169bcc16d</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] High VGPR lowering on gfx1250 (#156965)</title>
<updated>2025-09-04T23:20:47+00:00</updated>
<author>
<name>Stanislav Mekhanoshin</name>
<email>Stanislav.Mekhanoshin@amd.com</email>
</author>
<published>2025-09-04T23:20:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=1f0f3473e60a7f0ce13ce30994d8ca66cdb02326'/>
<id>1f0f3473e60a7f0ce13ce30994d8ca66cdb02326</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU][gfx1250] Add 128B cooperative atomics (#156418)</title>
<updated>2025-09-04T09:19:25+00:00</updated>
<author>
<name>Pierre van Houtryve</name>
<email>pierre.vanhoutryve@amd.com</email>
</author>
<published>2025-09-04T09:19:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=e2bd10cf16c3f90813de5b64f348ece035a6bb68'/>
<id>e2bd10cf16c3f90813de5b64f348ece035a6bb68</id>
<content type='text'>
- Add clang built-ins + sema/codegen
- Add IR Intrinsic + verifier
- Add DAG/GlobalISel codegen for the intrinsics
- Add lowering in SIMemoryLegalizer using an MMO flag.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Add clang built-ins + sema/codegen
- Add IR Intrinsic + verifier
- Add DAG/GlobalISel codegen for the intrinsics
- Add lowering in SIMemoryLegalizer using an MMO flag.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Tail call support for whole wave functions (#145860)</title>
<updated>2025-09-04T08:34:43+00:00</updated>
<author>
<name>Diana Picus</name>
<email>Diana-Magda.Picus@amd.com</email>
</author>
<published>2025-09-04T08:34:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=018dc1b3977bb249d55a6808bb45802a10f818fa'/>
<id>018dc1b3977bb249d55a6808bb45802a10f818fa</id>
<content type='text'>
Support tail calls to whole wave functions (trivial) and from whole wave
functions (slightly more involved because we need a new pseudo for the
tail call return that patches up the EXEC mask).

Move the expansion of whole wave function return pseudos (regular and
tail call returns) to prolog epilog insertion, since that's where we
patch up the EXEC mask.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Support tail calls to whole wave functions (trivial) and from whole wave
functions (slightly more involved because we need a new pseudo for the
tail call return that patches up the EXEC mask).

Move the expansion of whole wave function return pseudos (regular and
tail call returns) to prolog epilog insertion, since that's where we
patch up the EXEC mask.</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Remove the DS special case in getRegClass (#156696)</title>
<updated>2025-09-04T06:14:17+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2025-09-04T06:14:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=a23a5b06839011569590af5c9bbfb5197b24261b'/>
<id>a23a5b06839011569590af5c9bbfb5197b24261b</id>
<content type='text'>
These instructions should now have proper representation
with separate instructions for operands which must be paired.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
These instructions should now have proper representation
with separate instructions for operands which must be paired.</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Special case align requirement for AV_MOV_B64_IMM_PSEUDO</title>
<updated>2025-09-04T00:55:39+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2025-09-04T00:45:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=dc170c7e315ee3f6a194ba81d044d7e8784b0221'/>
<id>dc170c7e315ee3f6a194ba81d044d7e8784b0221</id>
<content type='text'>
This should not require aligned registers. Fixes expensive_checks
test failure. I don't see a better way until the new system
to specify the alignment per register is done.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This should not require aligned registers. Fixes expensive_checks
test failure. I don't see a better way until the new system
to specify the alignment per register is done.
</pre>
</div>
</content>
</entry>
</feed>
