<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/lib/CodeGen/SplitKit.h, branch main</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>SplitKit: Fix rematerialization undoing subclass based split (#122110)</title>
<updated>2025-03-04T03:04:14+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2025-03-04T03:04:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=8476a5d480304bf7bd934c660a159e1c6906a69d'/>
<id>8476a5d480304bf7bd934c660a159e1c6906a69d</id>
<content type='text'>
This fixes an allocation failure in the new test.

In cases where getLargestLegalSuperClass can inflate the register class,
rematerialization could effectively undo a split which was done to
inflate
the register class, if the defining instruction can only write a
subclass
and the use can read the superclass.

Some of the x86 tests changes look like improvements, but some are
likely regressions.

I'm not entirely sure this is the correct place to fix this. It also
seems more complicated than necessary, but the decision to change
the register class is far removed from the point where the decision
to split the virtual register is made. I'm also also not sure if this
should be considering the register classes of all the use indexes
in getUseSlots, rather than just checking if this use index instruction
reads the register.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This fixes an allocation failure in the new test.

In cases where getLargestLegalSuperClass can inflate the register class,
rematerialization could effectively undo a split which was done to
inflate
the register class, if the defining instruction can only write a
subclass
and the use can read the superclass.

Some of the x86 tests changes look like improvements, but some are
likely regressions.

I'm not entirely sure this is the correct place to fix this. It also
seems more complicated than necessary, but the decision to change
the register class is far removed from the point where the decision
to split the virtual register is made. I'm also also not sure if this
should be considering the register classes of all the use indexes
in getUseSlots, rather than just checking if this use index instruction
reads the register.</pre>
</div>
</content>
</entry>
<entry>
<title>[GreedyRA] Improve RA for nested loop induction variables (#72093)</title>
<updated>2023-11-18T09:55:19+00:00</updated>
<author>
<name>David Green</name>
<email>david.green@arm.com</email>
</author>
<published>2023-11-18T09:55:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=303a7835ff833278a0de20cf5a70085b2ae8fee1'/>
<id>303a7835ff833278a0de20cf5a70085b2ae8fee1</id>
<content type='text'>
Imagine a loop of the form:
```
  preheader:
    %r = def
  header:
    bcc latch, inner
  inner1:
    ..
  inner2:
    b latch
  latch:
    %r = subs %r
    bcc header
```

It can be possible for code to spend a decent amount of time in the
header&lt;-&gt;latch loop, not going into the inner part of the loop as much.
The greedy register allocator can prefer to spill _around_ %r though,
adding spills around the subs in the loop, which can be very detrimental
for performance. (The case I am looking at is actually a very deeply
nested set of loops that repeat the header&lt;-&gt;latch pattern at multiple
different levels).

The greedy RA will apply a preference to spill to the IV, as it is live
through the header block. This patch attempts to add a heuristic to
prevent that in this case for variables that look like IVs, in a similar
regard to the extra spill weight that gets added to variables that look
like IVs, that are expensive to spill. That will mean spills are more
likely to be pushed into the inner blocks, where they are less likely to
be executed and not as expensive as spills around the IV.

This gives a 8% speedup in the exchange benchmark from spec2017 when
compiled with flang-new, whilst importantly stabilising the scores to be
less chaotic to other changes. Running ctmark showed no difference in
the compile time. I've tried to run a range of benchmarking for
performance, most of which were relatively flat not showing many large
differences. One matrix multiply case improved 21.3% due to removing a
cascading chains of spills, and some other knock-on effects happen which
usually cause small differences in the scores.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Imagine a loop of the form:
```
  preheader:
    %r = def
  header:
    bcc latch, inner
  inner1:
    ..
  inner2:
    b latch
  latch:
    %r = subs %r
    bcc header
```

It can be possible for code to spend a decent amount of time in the
header&lt;-&gt;latch loop, not going into the inner part of the loop as much.
The greedy register allocator can prefer to spill _around_ %r though,
adding spills around the subs in the loop, which can be very detrimental
for performance. (The case I am looking at is actually a very deeply
nested set of loops that repeat the header&lt;-&gt;latch pattern at multiple
different levels).

The greedy RA will apply a preference to spill to the IV, as it is live
through the header block. This patch attempts to add a heuristic to
prevent that in this case for variables that look like IVs, in a similar
regard to the extra spill weight that gets added to variables that look
like IVs, that are expensive to spill. That will mean spills are more
likely to be pushed into the inner blocks, where they are less likely to
be executed and not as expensive as spills around the IV.

This gives a 8% speedup in the exchange benchmark from spec2017 when
compiled with flang-new, whilst importantly stabilising the scores to be
less chaotic to other changes. Running ctmark showed no difference in
the compile time. I've tried to run a range of benchmarking for
performance, most of which were relatively flat not showing many large
differences. One matrix multiply case improved 21.3% due to removing a
cascading chains of spills, and some other knock-on effects happen which
usually cause small differences in the scores.</pre>
</div>
</content>
</entry>
<entry>
<title>Reapply "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"</title>
<updated>2023-08-01T00:15:45+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2023-07-28T17:05:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=4d42e8b5d1fa87e49768d100dd1bc53515391e89'/>
<id>4d42e8b5d1fa87e49768d100dd1bc53515391e89</id>
<content type='text'>
This reverts commit a496c8be6e638ae58bb45f13113dbe3a4b7b23fd.

The workaround in c26dfc81e254c78dc23579cf3d1336f77249e1f6 should work
around the underlying problem with SUBREG_TO_REG.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit a496c8be6e638ae58bb45f13113dbe3a4b7b23fd.

The workaround in c26dfc81e254c78dc23579cf3d1336f77249e1f6 should work
around the underlying problem with SUBREG_TO_REG.
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting"</title>
<updated>2023-07-27T05:13:32+00:00</updated>
<author>
<name>Vitaly Buka</name>
<email>vitalybuka@google.com</email>
</author>
<published>2023-07-26T22:41:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=a496c8be6e638ae58bb45f13113dbe3a4b7b23fd'/>
<id>a496c8be6e638ae58bb45f13113dbe3a4b7b23fd</id>
<content type='text'>
And dependent commits.

Details in D150388.

This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048c225c.
This reverts commit 7a98f084c4d121244ef7286bc6503b6a181d446e.
This reverts commit b4a62b1fa546312d882fa12dfdcd015177d66826.
This reverts commit b7836d856206ec39509d42529f958c920368166b.

No conflicts in the code, few tests had conflicts in autogenerated CHECKs:
llvm/test/CodeGen/Thumb2/mve-float32regloops.ll
llvm/test/CodeGen/AMDGPU/fix-frame-reg-in-custom-csr-spills.ll

Reviewed By: alexfh

Differential Revision: https://reviews.llvm.org/D156381
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
And dependent commits.

Details in D150388.

This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048c225c.
This reverts commit 7a98f084c4d121244ef7286bc6503b6a181d446e.
This reverts commit b4a62b1fa546312d882fa12dfdcd015177d66826.
This reverts commit b7836d856206ec39509d42529f958c920368166b.

No conflicts in the code, few tests had conflicts in autogenerated CHECKs:
llvm/test/CodeGen/Thumb2/mve-float32regloops.ll
llvm/test/CodeGen/AMDGPU/fix-frame-reg-in-custom-csr-spills.ll

Reviewed By: alexfh

Differential Revision: https://reviews.llvm.org/D156381
</pre>
</div>
</content>
</entry>
<entry>
<title>[CodeGen]Allow targets to use target specific COPY instructions for live range splitting</title>
<updated>2023-07-07T16:59:50+00:00</updated>
<author>
<name>Yashwant Singh</name>
<email>Yashwant.Singh@amd.com</email>
</author>
<published>2023-07-07T16:59:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=b7836d856206ec39509d42529f958c920368166b'/>
<id>b7836d856206ec39509d42529f958c920368166b</id>
<content type='text'>
Replacing D143754. Right now the LiveRangeSplitting during register allocation uses
TargetOpcode::COPY instruction for splitting. For AMDGPU target that creates a
problem as we have both vector and scalar copies. Vector copies perform a copy over
a vector register but only on the lanes(threads) that are active. This is mostly sufficient
however we do run into cases when we have to copy the entire vector register and
not just active lane data. One major place where we need that is live range splitting.

Allowing targets to use their own copy instructions(if defined) will provide a lot of
flexibility and ease to lower these pseudo instructions to correct MIR.

- Introduce getTargetCopyOpcode() virtual function and use if to generate copy in Live range
 splitting.
- Replace necessary MI.isCopy() checks with TII.isCopyInstr() in register allocator pipeline.

Reviewed By: arsenm, cdevadas, kparzysz

Differential Revision: https://reviews.llvm.org/D150388
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replacing D143754. Right now the LiveRangeSplitting during register allocation uses
TargetOpcode::COPY instruction for splitting. For AMDGPU target that creates a
problem as we have both vector and scalar copies. Vector copies perform a copy over
a vector register but only on the lanes(threads) that are active. This is mostly sufficient
however we do run into cases when we have to copy the entire vector register and
not just active lane data. One major place where we need that is live range splitting.

Allowing targets to use their own copy instructions(if defined) will provide a lot of
flexibility and ease to lower these pseudo instructions to correct MIR.

- Introduce getTargetCopyOpcode() virtual function and use if to generate copy in Live range
 splitting.
- Replace necessary MI.isCopy() checks with TII.isCopyInstr() in register allocator pipeline.

Reviewed By: arsenm, cdevadas, kparzysz

Differential Revision: https://reviews.llvm.org/D150388
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix uninitialized scalar members in CodeGen</title>
<updated>2023-04-21T04:22:34+00:00</updated>
<author>
<name>Akshay Khadse</name>
<email>akshayskhadse@gmail.com</email>
</author>
<published>2023-04-21T04:22:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=aab0ca3e794a212a2ac0a24a4a1a9ed18ad06267'/>
<id>aab0ca3e794a212a2ac0a24a4a1a9ed18ad06267</id>
<content type='text'>
This change fixes some static code analysis warnings.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D148811
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This change fixes some static code analysis warnings.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D148811
</pre>
</div>
</content>
</entry>
<entry>
<title>Correct typos (NFC)</title>
<updated>2022-12-16T18:51:26+00:00</updated>
<author>
<name>Sprite</name>
<email>SpriteOvO@gmail.com</email>
</author>
<published>2022-12-16T18:51:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=a9f9f3dff474b7bdb19129eaf625d3ef0084a975'/>
<id>a9f9f3dff474b7bdb19129eaf625d3ef0084a975</id>
<content type='text'>
Just found some typos while reading the llvm/circt project.

compliment -&gt; complement
emitsd -&gt; emits
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Just found some typos while reading the llvm/circt project.

compliment -&gt; complement
emitsd -&gt; emits
</pre>
</div>
</content>
</entry>
<entry>
<title>[llvm] Remove unused forward declarations (NFC)</title>
<updated>2022-11-20T17:59:36+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2022-11-20T17:59:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=7524db4d447aeee582291cb849174a052a62dfbc'/>
<id>7524db4d447aeee582291cb849174a052a62dfbc</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>CodeGen: Remove AliasAnalysis from regalloc</title>
<updated>2022-07-18T21:23:41+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2022-06-24T16:09:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=8d0383eb694e13a999c9c95adc4b56771429e551'/>
<id>8d0383eb694e13a999c9c95adc4b56771429e551</id>
<content type='text'>
This was stored in LiveIntervals, but not actually used for anything
related to LiveIntervals. It was only used in one check for if a load
instruction is rematerializable. I also don't think this was entirely
correct, since it was implicitly assuming constant loads are also
dereferenceable.

Remove this and rely only on the invariant+dereferenceable flags in
the memory operand. Set the flag based on the AA query upfront. This
should have the same net benefit, but has the possible disadvantage of
making this AA query nonlazy.

Preserve the behavior of assuming pointsToConstantMemory implying
dereferenceable for now, but maybe this should be changed.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This was stored in LiveIntervals, but not actually used for anything
related to LiveIntervals. It was only used in one check for if a load
instruction is rematerializable. I also don't think this was entirely
correct, since it was implicitly assuming constant loads are also
dereferenceable.

Remove this and rely only on the invariant+dereferenceable flags in
the memory operand. Set the flag based on the AA query upfront. This
should have the same net benefit, but has the possible disadvantage of
making this AA query nonlazy.

Preserve the behavior of assuming pointsToConstantMemory implying
dereferenceable for now, but maybe this should be changed.
</pre>
</div>
</content>
</entry>
<entry>
<title>Cleanup codegen includes</title>
<updated>2022-03-16T07:43:00+00:00</updated>
<author>
<name>serge-sans-paille</name>
<email>sguelton@redhat.com</email>
</author>
<published>2022-03-15T09:54:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=989f1c72e0f4236ac35a35cc9998ea34bc62d5cd'/>
<id>989f1c72e0f4236ac35a35cc9998ea34bc62d5cd</id>
<content type='text'>
This is a (fixed) recommit of https://reviews.llvm.org/D121169

after:  1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is a (fixed) recommit of https://reviews.llvm.org/D121169

after:  1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
</pre>
</div>
</content>
</entry>
</feed>
