<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/lib/CodeGen/TailDuplicator.cpp, branch main</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[CodeGen] Use DenseMap::try_emplace (NFC) (#165165)</title>
<updated>2025-10-26T20:34:15+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2025-10-26T20:34:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=160b72787cde6e9c0964cd1751af77e20696889b'/>
<id>160b72787cde6e9c0964cd1751af77e20696889b</id>
<content type='text'>
With try_emplace, we can pass the key and the arguments for the
value's constructor, which is a lot shorter than:

  Map.insert(std::make_pair(Key, ValueType(Arg1, Arg2)))</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With try_emplace, we can pass the key and the arguments for the
value's constructor, which is a lot shorter than:

  Map.insert(std::make_pair(Key, ValueType(Arg1, Arg2)))</pre>
</div>
</content>
</entry>
<entry>
<title>[CodeGen] Extract copy-paste on PHI MachineInstr income removal. (#158634)</title>
<updated>2025-09-25T05:59:36+00:00</updated>
<author>
<name>Afanasyev Ivan</name>
<email>ivafanas@gmail.com</email>
</author>
<published>2025-09-25T05:59:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=3e639930d3ba3d6401992ab1d54dc625d5a299a5'/>
<id>3e639930d3ba3d6401992ab1d54dc625d5a299a5</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[CodeGen] Fix partial phi input removal in TailDuplicator. (#158265)</title>
<updated>2025-09-13T01:45:54+00:00</updated>
<author>
<name>Afanasyev Ivan</name>
<email>ivafanas@gmail.com</email>
</author>
<published>2025-09-13T01:45:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=ffcaeca90a3c0965acace6645f775ab1d876fa6e'/>
<id>ffcaeca90a3c0965acace6645f775ab1d876fa6e</id>
<content type='text'>
The tail duplicator removes only the first PHI incoming value for the
predecessor basic block, whereas it should remove all operands for that
block.

PHI instructions can have duplicate incoming values for the same
predecessor block:
* `UnreachableMachineBlockElim` assumes that a PHI instruction might have
duplicates:
https://github.com/llvm/llvm-project/blob/7289f2cd0c371b2539faa628ec0eea58fa61892c/llvm/lib/CodeGen/UnreachableBlockElim.cpp#L160
* `AArch64` directly states that a PHI instruction might have duplicates:
https://github.com/llvm/llvm-project/blob/7289f2cd0c371b2539faa628ec0eea58fa61892c/llvm/lib/Target/AArch64/AArch64ConditionalCompares.cpp#L244
* And `Hexagon`:
https://github.com/llvm/llvm-project/blob/7289f2cd0c371b2539faa628ec0eea58fa61892c/llvm/lib/Target/Hexagon/HexagonConstPropagation.cpp#L844

We caught the bug in a custom out-of-tree backend. `TailDuplicator`
should remove all operands corresponding to the block being removed.
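
The difference can be sketched with a simplified Python model rather
than the actual LLVM code. A PHI is modelled here as a list of
(value, predecessor) pairs, and both helper names are hypothetical.

```python
# Simplified model: a PHI is a list of (value, predecessor) pairs.
# These helpers are hypothetical and are not LLVM APIs.

def remove_first_incoming(phi, pred):
    """Buggy behaviour: drop only the first operand coming from pred."""
    for i, (_, block) in enumerate(phi):
        if block == pred:
            return phi[:i] + phi[i + 1:]
    return list(phi)


def remove_all_incoming(phi, pred):
    """Intended behaviour: drop every operand coming from pred."""
    return [(value, block) for value, block in phi if block != pred]


# A PHI with a duplicated entry for predecessor "bb1".
phi = [("%a", "bb1"), ("%b", "bb2"), ("%a", "bb1")]
stale = remove_first_incoming(phi, "bb1")
fixed = remove_all_incoming(phi, "bb1")
```

With the duplicated `bb1` entry, `remove_first_incoming` leaves one
stale `bb1` operand behind, while `remove_all_incoming` keeps only the
`bb2` operand.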

Please note that the bug likely does not affect in-tree backends, because:
* It happens only in the scenario of **partial** tail duplication (i.e.
the tail block is duplicated into some predecessors, but not all of them).
* It happens in **Pre-RA** tail duplication only (Post-RA code does not
contain PHIs).
* The only backend I know of that uses Pre-RA tail duplication is X86. It
does so via the `early-tailduplication` pass, which declines partial
tail duplication through the `canCompletelyDuplicateBB` check, because it
uses the `TailDuplicator::tailDuplicateBlocks` public API.

So the bug happens only in Pre-RA partial tail duplication, when a
backend uses the `TailDuplicator::tailDuplicate` public API directly.

That's why I cannot add a reproducer test for in-tree backends.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The tail duplicator removes only the first PHI incoming value for the
predecessor basic block, whereas it should remove all operands for that
block.

PHI instructions can have duplicate incoming values for the same
predecessor block:
* `UnreachableMachineBlockElim` assumes that a PHI instruction might have
duplicates:
https://github.com/llvm/llvm-project/blob/7289f2cd0c371b2539faa628ec0eea58fa61892c/llvm/lib/CodeGen/UnreachableBlockElim.cpp#L160
* `AArch64` directly states that a PHI instruction might have duplicates:
https://github.com/llvm/llvm-project/blob/7289f2cd0c371b2539faa628ec0eea58fa61892c/llvm/lib/Target/AArch64/AArch64ConditionalCompares.cpp#L244
* And `Hexagon`:
https://github.com/llvm/llvm-project/blob/7289f2cd0c371b2539faa628ec0eea58fa61892c/llvm/lib/Target/Hexagon/HexagonConstPropagation.cpp#L844

We caught the bug in a custom out-of-tree backend. `TailDuplicator`
should remove all operands corresponding to the block being removed.

Please note that the bug likely does not affect in-tree backends, because:
* It happens only in the scenario of **partial** tail duplication (i.e.
the tail block is duplicated into some predecessors, but not all of them).
* It happens in **Pre-RA** tail duplication only (Post-RA code does not
contain PHIs).
* The only backend I know of that uses Pre-RA tail duplication is X86. It
does so via the `early-tailduplication` pass, which declines partial
tail duplication through the `canCompletelyDuplicateBB` check, because it
uses the `TailDuplicator::tailDuplicateBlocks` public API.

So the bug happens only in Pre-RA partial tail duplication, when a
backend uses the `TailDuplicator::tailDuplicate` public API directly.

That's why I cannot add a reproducer test for in-tree backends.</pre>
</div>
</content>
</entry>
<entry>
<title>[TailDup] Delay aggressive computed-goto taildup to after RegAlloc. (#150911)</title>
<updated>2025-07-31T18:20:05+00:00</updated>
<author>
<name>Florian Hahn</name>
<email>flo@fhahn.com</email>
</author>
<published>2025-07-31T18:20:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=078d214672e23691566137fa88b851c7022666b7'/>
<id>078d214672e23691566137fa88b851c7022666b7</id>
<content type='text'>
https://github.com/llvm/llvm-project/pull/114990 allowed more aggressive
tail duplication for computed-gotos in both pre- and post-regalloc tail
duplication.

In some cases, performing tail-duplication too early can lead to worse
results, especially if we duplicate blocks with a number of phi nodes.

This is causing a ~3% performance regression in some workloads using
Python 3.12.

This patch updates TailDup to delay aggressive tail-duplication for
computed gotos to after register allocation.

This means we can keep the non-duplicated version for a bit longer
throughout the backend, which should reduce compile time and allow a
number of optimizations and simplifications to trigger before
drastically expanding the CFG.

For the case in https://github.com/llvm/llvm-project/issues/106846, I
get the same performance with and without this patch on Skylake.

PR: https://github.com/llvm/llvm-project/pull/150911</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
https://github.com/llvm/llvm-project/pull/114990 allowed more aggressive
tail duplication for computed-gotos in both pre- and post-regalloc tail
duplication.

In some cases, performing tail-duplication too early can lead to worse
results, especially if we duplicate blocks with a number of phi nodes.

This is causing a ~3% performance regression in some workloads using
Python 3.12.

This patch updates TailDup to delay aggressive tail-duplication for
computed gotos to after register allocation.

This means we can keep the non-duplicated version for a bit longer
throughout the backend, which should reduce compile time and allow a
number of optimizations and simplifications to trigger before
drastically expanding the CFG.

For the case in https://github.com/llvm/llvm-project/issues/106846, I
get the same performance with and without this patch on Skylake.

PR: https://github.com/llvm/llvm-project/pull/150911</pre>
</div>
</content>
</entry>
<entry>
<title>[MachineBB] Make sure there are successors in terminatorIsComputedGoto. (#151342)</title>
<updated>2025-07-31T16:52:45+00:00</updated>
<author>
<name>Florian Hahn</name>
<email>flo@fhahn.com</email>
</author>
<published>2025-07-31T16:52:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=69f3ea08522eca4b8617145fdafb8fc6595ddf97'/>
<id>69f3ea08522eca4b8617145fdafb8fc6595ddf97</id>
<content type='text'>
Currently terminatorIsComputedGoto will return true for blocks with an
indirect branch terminator and no successors. If there are no successors,
the terminator is likely not a computed goto, so return false in that case.
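
The check described above can be sketched as a simplified Python model.
The `Block` class and its fields are hypothetical stand-ins, not the
MachineBasicBlock API, and the real predicate may check more than this.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    # Hypothetical stand-in for a machine basic block.
    ends_in_indirect_branch: bool = False
    address_taken: bool = False
    successors: list = field(default_factory=list)


def terminator_is_computed_goto(block):
    """Model of the check: an indirect branch with no successors is
    treated as not being a computed goto."""
    if not block.ends_in_indirect_branch:
        return False
    if not block.successors:
        return False  # the case this change adds
    # Assumption: every target must have its address taken.
    return all(succ.address_taken for succ in block.successors)


target = Block(address_taken=True)
dispatch = Block(ends_in_indirect_branch=True, successors=[target])
dead_end = Block(ends_in_indirect_branch=True)
```

In this model, dropping the explicit empty-successor check would make
`all` vacuously true for `dead_end`, which is the kind of result the
change is meant to rule out.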

Note that this is currently NFC, as the only use checks it only if there
are successors, but it will be needed in
https://github.com/llvm/llvm-project/pull/150911.

PR: https://github.com/llvm/llvm-project/pull/151342</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently terminatorIsComputedGoto will return true for blocks with an
indirect branch terminator and no successors. If there are no successors,
the terminator is likely not a computed goto, so return false in that case.

Note that this is currently NFC, as the only use checks it only if there
are successors, but it will be needed in
https://github.com/llvm/llvm-project/pull/150911.

PR: https://github.com/llvm/llvm-project/pull/151342</pre>
</div>
</content>
</entry>
<entry>
<title>Use early return/continue in TailDuplicator::duplicateInstruction [nfc]</title>
<updated>2025-05-16T20:33:11+00:00</updated>
<author>
<name>Philip Reames</name>
<email>preames@rivosinc.com</email>
</author>
<published>2025-05-16T20:23:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=3def9976ebeb1dec7fb867a927f3e2e4adf1816b'/>
<id>3def9976ebeb1dec7fb867a927f3e2e4adf1816b</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[llvm] Use range constructors of *Set (NFC) (#137552)</title>
<updated>2025-04-27T22:59:57+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2025-04-27T22:59:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=5cfd81b0cc9f92f3d4903f4e7b97769fe7b565b9'/>
<id>5cfd81b0cc9f92f3d4903f4e7b97769fe7b565b9</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[TailDuplicator] Determine if computed gotos using `blockaddress` (#132536)</title>
<updated>2025-03-26T13:27:43+00:00</updated>
<author>
<name>dianqk</name>
<email>dianqk@dianqk.net</email>
</author>
<published>2025-03-26T13:27:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=66f158d91803875de63d8f2a437ce8ecb22c4141'/>
<id>66f158d91803875de63d8f2a437ce8ecb22c4141</id>
<content type='text'>
Using `blockaddress` should be more reliable than determining if an
operand comes from a jump table index.

Alternative: Add the `MachineInstr::MIFlag::ComputedGoto` flag when
lowering `indirectbr`. But I don't think this approach is suitable to
backport.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Using `blockaddress` should be more reliable than determining if an
operand comes from a jump table index.

Alternative: Add the `MachineInstr::MIFlag::ComputedGoto` flag when
lowering `indirectbr`. But I don't think this approach is suitable to
backport.</pre>
</div>
</content>
</entry>
<entry>
<title>[llvm] Use *Set::insert_range (NFC) (#132509)</title>
<updated>2025-03-22T15:07:33+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2025-03-22T15:07:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=1b189cab5e582a183f6946dcb3e20913add58476'/>
<id>1b189cab5e582a183f6946dcb3e20913add58476</id>
<content type='text'>
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range.  This patch uses insert_range in
conjunction with llvm::{predecessors,successors} and
MachineBasicBlock::{predecessors,successors}.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range.  This patch uses insert_range in
conjunction with llvm::{predecessors,successors} and
MachineBasicBlock::{predecessors,successors}.</pre>
</div>
</content>
</entry>
<entry>
<title>[llvm] Use *Set::insert_range (NFC) (#132325)</title>
<updated>2025-03-21T05:24:06+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2025-03-21T05:24:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=599005686a1c27ffe97bb4eb07fcd98359a2af99'/>
<id>599005686a1c27ffe97bb4eb07fcd98359a2af99</id>
<content type='text'>
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range.  This patch replaces:

  Dest.insert(Src.begin(), Src.end());

with:

  Dest.insert_range(Src);

This patch does not touch custom begin functions such as succ_begin for now.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range.  This patch replaces:

  Dest.insert(Src.begin(), Src.end());

with:

  Dest.insert_range(Src);

This patch does not touch custom begin functions such as succ_begin for now.</pre>
</div>
</content>
</entry>
</feed>
