<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp, branch users/chapuni/cov/single/unify</title>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[AMDGPU] Do not fold into v_accvgpr_mov/write/read (#120475)</title>
<updated>2025-01-07T15:25:01+00:00</updated>
<author>
<name>bcahoon</name>
<email>59846893+bcahoon@users.noreply.github.com</email>
</author>
<published>2025-01-07T15:25:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=17c8c1c5098bd1fa68809d686867d01d56d5e564'/>
<id>17c8c1c5098bd1fa68809d686867d01d56d5e564</id>
<content type='text'>
In SIFoldOperands, leave copies for moving between agpr and vgpr
registers. The register coalescer is able to handle the copies
more efficiently than v_accvgpr_mov, v_accvgpr_write, and
v_accvgpr_read. Otherwise, the compiler generates unnecessary
instructions such as v_accvgpr_mov a0, a0.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In SIFoldOperands, leave copies for moving between agpr and vgpr
registers. The register coalescer is able to handle the copies
more efficiently than v_accvgpr_mov, v_accvgpr_write, and
v_accvgpr_read. Otherwise, the compiler generates unnecessary
instructions such as v_accvgpr_mov a0, a0.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU][True16][MC] true16 for v_fma_f16 (#119477)</title>
<updated>2025-01-06T20:02:04+00:00</updated>
<author>
<name>Brox Chen</name>
<email>guochen2@amd.com</email>
</author>
<published>2025-01-06T20:02:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=ce831a231a7509b558121808ab03407916bf1dff'/>
<id>ce831a231a7509b558121808ab03407916bf1dff</id>
<content type='text'>
Support true16 format for v_fma_f16 in MC.

Since v_fma_f16 is being replaced with v_fma_f16_t16/v_fma_f16_fake16
post-GFX11, the CodeGen pattern for v_fma_f16_fake16 has to be updated
to keep the CodeGen tests passing. No pattern is modified or created;
v_fma_f16 is only replaced with its fake16 format.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Support true16 format for v_fma_f16 in MC.

Since v_fma_f16 is being replaced with v_fma_f16_t16/v_fma_f16_fake16
post-GFX11, the CodeGen pattern for v_fma_f16_fake16 has to be updated
to keep the CodeGen tests passing. No pattern is modified or created;
v_fma_f16 is only replaced with its fake16 format.</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Do not fold copy to physreg from operation on frame index (#115977)</title>
<updated>2024-11-13T05:35:51+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2024-11-13T05:35:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=5911fbb39d615b39f1bf6fd732503ab433de5f27'/>
<id>5911fbb39d615b39f1bf6fd732503ab433de5f27</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Fold more scalar operations on frame index to VALU (#115059)</title>
<updated>2024-11-08T03:02:20+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2024-11-08T03:02:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=4fb43c47ddf0138bf5cb64ec64dfb530bc7db051'/>
<id>4fb43c47ddf0138bf5cb64ec64dfb530bc7db051</id>
<content type='text'>
Further extend workaround for the lack of proper regbankselect
for frame indexes.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Further extend workaround for the lack of proper regbankselect
for frame indexes.</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Fold copy of scalar add of frame index (#115058)</title>
<updated>2024-11-06T17:10:58+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2024-11-06T17:10:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=aa7941289ee5b7d9bdf47e1b0ebf2130a86d9522'/>
<id>aa7941289ee5b7d9bdf47e1b0ebf2130a86d9522</id>
<content type='text'>
This is a pre-optimization to avoid a regression in a future
commit. Currently we almost always emit frame index with
a v_mov_b32 and use vector adds for the pointer operations. We
need to consider the users of the frame index (or rather, the
transitive users of derived pointer operations) to know whether
the value will be used in a vector or scalar context. This saves
an sgpr-&gt;vgpr copy.

This optimization could be more general for any opcode that's
trivially convertible from a scalar to vector form (although this
is a workaround for a proper regbankselect).</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is a pre-optimization to avoid a regression in a future
commit. Currently we almost always emit frame index with
a v_mov_b32 and use vector adds for the pointer operations. We
need to consider the users of the frame index (or rather, the
transitive users of derived pointer operations) to know whether
the value will be used in a vector or scalar context. This saves
an sgpr-&gt;vgpr copy.

This optimization could be more general for any opcode that's
trivially convertible from a scalar to vector form (although this
is a workaround for a proper regbankselect).</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU][True16][MC] VOP2 update instructions with fake16 format (#114436)</title>
<updated>2024-11-05T21:12:49+00:00</updated>
<author>
<name>Brox Chen</name>
<email>guochen2@amd.com</email>
</author>
<published>2024-11-05T21:12:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=e8644e3b474136da43344a5afeeae63268f980e1'/>
<id>e8644e3b474136da43344a5afeeae63268f980e1</id>
<content type='text'>
Some old "t16" VOP2 instructions are actually in fake16 format. Correct
them and update the test file.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some old "t16" VOP2 instructions are actually in fake16 format. Correct
them and update the test file.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Fix machine verification failure after SIFoldOperandsImpl::tryFoldOMod (#113544)</title>
<updated>2024-10-29T14:59:37+00:00</updated>
<author>
<name>Jay Foad</name>
<email>jay.foad@amd.com</email>
</author>
<published>2024-10-29T14:59:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=a156362e93eba9513611dc0989d516e9946cae48'/>
<id>a156362e93eba9513611dc0989d516e9946cae48</id>
<content type='text'>
Fixes #54201</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fixes #54201</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Handle folding frame indexes into add with immediate (#110738)</title>
<updated>2024-10-19T19:33:03+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2024-10-19T19:33:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=ef91cd3f018411e0ba7989003d7617041e35f650'/>
<id>ef91cd3f018411e0ba7989003d7617041e35f650</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU/NewPM: Port SIFoldOperands to new pass manager (#105801)</title>
<updated>2024-08-29T06:04:54+00:00</updated>
<author>
<name>Akshat Oke</name>
<email>76596238+Akshat-Oke@users.noreply.github.com</email>
</author>
<published>2024-08-29T06:04:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=2adc94cd6c3dd1fc713a6ba8301fc04f21908700'/>
<id>2adc94cd6c3dd1fc713a6ba8301fc04f21908700</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU][True16][CodeGen] support v_mov_b16 and v_swap_b16 in true16 format (#102198)</title>
<updated>2024-08-08T20:52:59+00:00</updated>
<author>
<name>Brox Chen</name>
<email>broxigarchen@outlook.com</email>
</author>
<published>2024-08-08T20:52:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=ae059a1f9f1e501b08a99cb636ec0869ec204c6f'/>
<id>ae059a1f9f1e501b08a99cb636ec0869ec204c6f</id>
<content type='text'>
Support v_swap_b16 in true16 format.
Update the TableGen pattern and folding for v_mov_b16.

---------

Co-authored-by: guochen2 &lt;guochen2@amd.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Support v_swap_b16 in true16 format.
Update the TableGen pattern and folding for v_mov_b16.

---------

Co-authored-by: guochen2 &lt;guochen2@amd.com&gt;</pre>
</div>
</content>
</entry>
</feed>
