<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/lib/CodeGen/MachineLICM.cpp, branch main</title>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[CodeGen] Turn MCRegUnit into an enum class (NFC) (#167943)</title>
<updated>2025-11-16T17:46:44+00:00</updated>
<author>
<name>Sergei Barannikov</name>
<email>barannikov88@gmail.com</email>
</author>
<published>2025-11-16T17:46:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=97a60aa37a048155fec0c560fc51ed52dbd84e44'/>
<id>97a60aa37a048155fec0c560fc51ed52dbd84e44</id>
<content type='text'>
This changes the `MCRegUnit` type from `unsigned` to `enum class : unsigned`
and inserts the necessary casts.
The added `MCRegUnitToIndex` functor is used with `SparseSet`,
`SparseMultiSet` and `IndexedMap` in a few places.

`MCRegUnit` is opaque to users, so it didn't seem worth making it a
full-fledged class like `Register`.

Static type checking has detected one issue in
`PrologueEpilogueInserter.cpp`, where `BitVector` created for
`MCRegister` is indexed by both `MCRegister` and `MCRegUnit`.

The number of casts could be reduced by using `IndexedMap` in more
places and/or adding a `BitVector` adaptor, but the number of casts *per
file* is still small and `IndexedMap` has limitations, so it didn't seem
worth the effort.

Pull Request: https://github.com/llvm/llvm-project/pull/167943</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This changes the `MCRegUnit` type from `unsigned` to `enum class : unsigned`
and inserts the necessary casts.
The added `MCRegUnitToIndex` functor is used with `SparseSet`,
`SparseMultiSet` and `IndexedMap` in a few places.

`MCRegUnit` is opaque to users, so it didn't seem worth making it a
full-fledged class like `Register`.

Static type checking has detected one issue in
`PrologueEpilogueInserter.cpp`, where `BitVector` created for
`MCRegister` is indexed by both `MCRegister` and `MCRegUnit`.

The number of casts could be reduced by using `IndexedMap` in more
places and/or adding a `BitVector` adaptor, but the number of casts *per
file* is still small and `IndexedMap` has limitations, so it didn't seem
worth the effort.

Pull Request: https://github.com/llvm/llvm-project/pull/167943</pre>
</div>
</content>
</entry>
<entry>
<title>CodeGen: Remove TRI argument from getRegClass (#158225)</title>
<updated>2025-11-10T23:43:55+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2025-11-10T23:43:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=55422e804b3bd2fcb1a330673af40240e359540f'/>
<id>55422e804b3bd2fcb1a330673af40240e359540f</id>
<content type='text'>
TargetInstrInfo now directly holds a reference to TargetRegisterInfo
and does not need TRI passed in anywhere.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
TargetInstrInfo now directly holds a reference to TargetRegisterInfo
and does not need TRI passed in anywhere.</pre>
</div>
</content>
</entry>
<entry>
<title>[MachineLICM] Use structured bindings for reg pressure cost map. NFC (#164368)</title>
<updated>2025-10-22T00:32:53+00:00</updated>
<author>
<name>Luke Lau</name>
<email>luke@igalia.com</email>
</author>
<published>2025-10-22T00:32:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=25ba59e6ceb1681bb700c9a146ae6ac3d613ef6e'/>
<id>25ba59e6ceb1681bb700c9a146ae6ac3d613ef6e</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[TII] Split isTrivialReMaterializable into two versions [nfc] (#160377)</title>
<updated>2025-09-25T01:52:17+00:00</updated>
<author>
<name>Philip Reames</name>
<email>preames@rivosinc.com</email>
</author>
<published>2025-09-25T01:52:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=ea721e2fa1cd2a35652082dae1d0987de531883d'/>
<id>ea721e2fa1cd2a35652082dae1d0987de531883d</id>
<content type='text'>
This change builds on https://github.com/llvm/llvm-project/pull/160319
which tries to clarify which *callers* (not backends) assume that the
result is actually trivial.

This change itself should be NFC. Essentially, I'm just renaming the
existing isTrivialRematerializable to the non-trivial version and then
adding a new trivial version (with the same name as the prior function)
and simplifying a few callers which want that semantic.

This change does *not* enable non-trivial remat any more broadly than
was already done for our targets which were lying through the old APIs;
that will come separately. The goal here is simply to make the code
easier to follow in terms of what assumptions are being made where.

---------

Co-authored-by: Luke Lau &lt;luke_lau@icloud.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This change builds on https://github.com/llvm/llvm-project/pull/160319
which tries to clarify which *callers* (not backends) assume that the
result is actually trivial.

This change itself should be NFC. Essentially, I'm just renaming the
existing isTrivialRematerializable to the non-trivial version and then
adding a new trivial version (with the same name as the prior function)
and simplifying a few callers which want that semantic.

This change does *not* enable non-trivial remat any more broadly than
was already done for our targets which were lying through the old APIs;
that will come separately. The goal here is simply to make the code
easier to follow in terms of what assumptions are being made where.

---------

Co-authored-by: Luke Lau &lt;luke_lau@icloud.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>CodeGen: Remove MachineFunction argument from getRegClass (#158188)</title>
<updated>2025-09-12T10:22:02+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2025-09-12T10:22:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=7289f2cd0c371b2539faa628ec0eea58fa61892c'/>
<id>7289f2cd0c371b2539faa628ec0eea58fa61892c</id>
<content type='text'>
This is a low level utility to parse the MCInstrInfo and should
not depend on the state of the function.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is a low level utility to parse the MCInstrInfo and should
not depend on the state of the function.</pre>
</div>
</content>
</entry>
<entry>
<title>[llvm] Remove unused includes (NFC) (#150265)</title>
<updated>2025-07-23T22:18:46+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2025-07-23T22:18:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=3e53d4d386626d78bf930307f0a65b6aebb48ee9'/>
<id>3e53d4d386626d78bf930307f0a65b6aebb48ee9</id>
<content type='text'>
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.</pre>
</div>
</content>
</entry>
<entry>
<title>MachineLICM: Merge logic for implicit and explicit definitions.</title>
<updated>2025-07-09T20:02:15+00:00</updated>
<author>
<name>Peter Collingbourne</name>
<email>peter@pcc.me.uk</email>
</author>
<published>2025-07-09T20:02:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=8b9bbd9ed6f8267ad1e5aa76a8a96b8749cf16d9'/>
<id>8b9bbd9ed6f8267ad1e5aa76a8a96b8749cf16d9</id>
<content type='text'>
Anatoly Trosinenko found that when hasSideEffects was set to 0 in the
definition of LOADgotAUTH, the MultiSource/Benchmarks/Ptrdist/ks/ks test
from llvm-test-suite started to crash. The issue was traced to the
MachineLICM pass placing LOADgotAUTH right after an unrelated copy to
x16, rewriting this code:

````
bb.0:
  renamable $x16 = COPY renamable $x12
  B %bb.1

bb.1:
  ...
  /* use $x16 */
  ...
  renamable $x20 = LOADgotAUTH target-flags(aarch64-got) @some_variable, implicit-def dead $x16, implicit-def dead $x17, implicit-def dead $nzcv
  /* use $x20 */
  ...
````

into the following:

````
bb.0:
  renamable $x16 = COPY renamable $x12
  renamable $x20 = LOADgotAUTH target-flags(aarch64-got) @some_variable, implicit-def dead $x16, implicit-def dead $x17, implicit-def dead $nzcv
  B %bb.1

bb.1:
  ...
  /* use $x16 */
  ...
  /* use $x20 */
  ...
````

The issue was caused by inconsistent logic between implicit and explicit
operand definitions, where the implicit side was incorrectly skipping
checking RUDefs for dead operands, leading to RuledOut not being set
for the X16 operand.

Because there isn't really a semantic difference between implicit and
explicit operands at this point, let's remove the isImplicit check and
adjust the logic to do the same thing in both cases:

- For implicit operands, we now check and update RUDefs in the same way
  as explicit operands.
- For explicit operands, we now allow dead operands to be skipped.
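
As a rough illustration of the unified check, here is a simplified Python model (not the actual C++ in MachineLICM.cpp). Registers are plain strings rather than register units, `seen_defs` stands in for RUDefs, the function name is hypothetical, and dead-operand handling is left out. The sketch also assumes x16 is already recorded as defined when LOADgotAUTH is examined.

```python
def is_hoist_candidate(defs, seen_defs):
    """Return False if any register written by the instruction is already defined.

    defs: registers the instruction writes, explicit and implicit alike.
    seen_defs: set of registers already known to be defined (stand-in for
               RUDefs); it is updated in place with this instruction's defs.
    """
    ruled_out = False
    for reg in defs:
        # Explicit and implicit defs take the same path: a second def of a
        # register rules the instruction out.
        if reg in seen_defs:
            ruled_out = True
        seen_defs.add(reg)
    return not ruled_out


# Assumption for the sketch: x16 is already recorded as defined.
seen = {"x16"}
loadgot_defs = ["x20", "x16", "x17", "nzcv"]
hoistable = is_hoist_candidate(loadgot_defs, seen)
print(hoistable)  # False: the implicit def of x16 conflicts
```

With the implicit def of x16 going through the same check as the explicit def of x20, the instruction is ruled out instead of being hoisted past the copy.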

Reviewers: arsenm, s-barannikov, atrosinenko

Reviewed By: arsenm, s-barannikov

Pull Request: https://github.com/llvm/llvm-project/pull/147624
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Anatoly Trosinenko found that when hasSideEffects was set to 0 in the
definition of LOADgotAUTH, the MultiSource/Benchmarks/Ptrdist/ks/ks test
from llvm-test-suite started to crash. The issue was traced to the
MachineLICM pass placing LOADgotAUTH right after an unrelated copy to
x16, rewriting this code:

````
bb.0:
  renamable $x16 = COPY renamable $x12
  B %bb.1

bb.1:
  ...
  /* use $x16 */
  ...
  renamable $x20 = LOADgotAUTH target-flags(aarch64-got) @some_variable, implicit-def dead $x16, implicit-def dead $x17, implicit-def dead $nzcv
  /* use $x20 */
  ...
````

into the following:

````
bb.0:
  renamable $x16 = COPY renamable $x12
  renamable $x20 = LOADgotAUTH target-flags(aarch64-got) @some_variable, implicit-def dead $x16, implicit-def dead $x17, implicit-def dead $nzcv
  B %bb.1

bb.1:
  ...
  /* use $x16 */
  ...
  /* use $x20 */
  ...
````

The issue was caused by inconsistent logic between implicit and explicit
operand definitions, where the implicit side was incorrectly skipping
checking RUDefs for dead operands, leading to RuledOut not being set
for the X16 operand.

Because there isn't really a semantic difference between implicit and
explicit operands at this point, let's remove the isImplicit check and
adjust the logic to do the same thing in both cases:

- For implicit operands, we now check and update RUDefs in the same way
  as explicit operands.
- For explicit operands, we now allow dead operands to be skipped.

Reviewers: arsenm, s-barannikov, atrosinenko

Reviewed By: arsenm, s-barannikov

Pull Request: https://github.com/llvm/llvm-project/pull/147624
</pre>
</div>
</content>
</entry>
<entry>
<title>[MachineLICM] Let targets decide if copy instructions are cheap (#146599)</title>
<updated>2025-07-05T11:06:33+00:00</updated>
<author>
<name>Guy David</name>
<email>49722543+guy-david@users.noreply.github.com</email>
</author>
<published>2025-07-05T11:06:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=a38cf8573890103c8a26227bb9c395fd00102273'/>
<id>a38cf8573890103c8a26227bb9c395fd00102273</id>
<content type='text'>
When checking whether it is profitable to hoist an instruction, the pass
may override a target's ruling because it assumes that all COPY
instructions are cheap, and that may not be the case for all
micro-architectures (especially for when copying between different
register classes).
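
A minimal sketch of the difference, in Python for brevity. The `Instr` fields and both function names are hypothetical, and the pass's other profitability checks are ignored; only the COPY assumption described above is modelled.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    is_copy: bool            # the instruction is a COPY
    target_says_cheap: bool  # the target's own ruling for this instruction


def is_cheap_before(mi):
    # Old behaviour as described: a COPY counts as cheap even when the
    # target disagrees.
    return mi.is_copy or mi.target_says_cheap


def is_cheap_after(mi):
    # New behaviour as described: the target's ruling is not overridden.
    return mi.target_says_cheap


# A copy between register classes that the target considers expensive.
cross_class_copy = Instr(is_copy=True, target_says_cheap=False)
print(is_cheap_before(cross_class_copy), is_cheap_after(cross_class_copy))
```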

On AArch64 there's 0% difference in performance in LLVM's test-suite
with this change. Additionally, very few tests were affected, which
suggests the assumption is not worth keeping.

x86 performance is slightly better (but maybe that's just noise) for an
A/B comparison consisting of five iterations on LLVM's test suite (Ryzen
5950X on Ubuntu):
```
$ ./utils/compare.py build-a/results* vs build-b/results* --lhs-name base --rhs-name patch --absolute-diff
Tests: 3341
Metric: exec_time

Program                                       exec_time                 
                                              base      patch     diff  
LoopVector...meChecks4PointersDBeforeA/1000   824613.68 825394.06 780.38
LoopVector...timeChecks4PointersDBeforeA/32    18763.60  19486.02 722.42
LCALS/Subs...test:BM_MAT_X_MAT_LAMBDA/44217    37109.92  37572.52 462.60
LoopVector...ntimeChecks4PointersDAfterA/32    14211.35  14562.14 350.79
LoopVector...timeChecks4PointersDEqualsA/32    14221.44  14562.85 341.40
LoopVector...intersAllDisjointIncreasing/32    14222.73  14562.20 339.47
LoopVector...intersAllDisjointDecreasing/32    14223.85  14563.17 339.32
LoopVector...nLoopFrom_uint32_t_To_uint8_t_      739.60    807.45  67.86
harris/har...est:BENCHMARK_HARRIS/2048/2048    15953.77  15998.94  45.17
LoopVector...nLoopFrom_uint8_t_To_uint16_t_      301.94    331.21  29.27
LCALS/Subs...Raw.test:BM_DISC_ORD_RAW/44217      616.35    637.13  20.78
LCALS/Subs...Raw.test:BM_MAT_X_MAT_RAW/5001     3814.95   3833.70  18.75
LCALS/Subs...Raw.test:BM_HYDRO_2D_RAW/44217      812.98    830.64  17.66
LCALS/Subs...test:BM_IMP_HYDRO_2D_RAW/44217      811.26    828.13  16.87
ImageProce...ENCHMARK_BILATERAL_FILTER/64/4      714.77    726.23  11.46
           exec_time                            
l/r             base          patch         diff
count  3341.000000    3341.000000    3341.000000
mean   903.866450     899.732349    -4.134101   
std    20635.900959   20565.289417   115.346928 
min    0.000000       0.000000      -3380.455787
25%    0.000000       0.000000       0.000000   
50%    0.000000       0.000000       0.000000   
75%    1.806500       1.836397       0.000100   
max    824613.680801  825394.062500  780.381699
```</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When checking whether it is profitable to hoist an instruction, the pass
may override a target's ruling because it assumes that all COPY
instructions are cheap, and that may not be the case for all
micro-architectures (especially for when copying between different
register classes).

On AArch64 there's 0% difference in performance in LLVM's test-suite
with this change. Additionally, very few tests were affected, which
suggests the assumption is not worth keeping.

x86 performance is slightly better (but maybe that's just noise) for an
A/B comparison consisting of five iterations on LLVM's test suite (Ryzen
5950X on Ubuntu):
```
$ ./utils/compare.py build-a/results* vs build-b/results* --lhs-name base --rhs-name patch --absolute-diff
Tests: 3341
Metric: exec_time

Program                                       exec_time                 
                                              base      patch     diff  
LoopVector...meChecks4PointersDBeforeA/1000   824613.68 825394.06 780.38
LoopVector...timeChecks4PointersDBeforeA/32    18763.60  19486.02 722.42
LCALS/Subs...test:BM_MAT_X_MAT_LAMBDA/44217    37109.92  37572.52 462.60
LoopVector...ntimeChecks4PointersDAfterA/32    14211.35  14562.14 350.79
LoopVector...timeChecks4PointersDEqualsA/32    14221.44  14562.85 341.40
LoopVector...intersAllDisjointIncreasing/32    14222.73  14562.20 339.47
LoopVector...intersAllDisjointDecreasing/32    14223.85  14563.17 339.32
LoopVector...nLoopFrom_uint32_t_To_uint8_t_      739.60    807.45  67.86
harris/har...est:BENCHMARK_HARRIS/2048/2048    15953.77  15998.94  45.17
LoopVector...nLoopFrom_uint8_t_To_uint16_t_      301.94    331.21  29.27
LCALS/Subs...Raw.test:BM_DISC_ORD_RAW/44217      616.35    637.13  20.78
LCALS/Subs...Raw.test:BM_MAT_X_MAT_RAW/5001     3814.95   3833.70  18.75
LCALS/Subs...Raw.test:BM_HYDRO_2D_RAW/44217      812.98    830.64  17.66
LCALS/Subs...test:BM_IMP_HYDRO_2D_RAW/44217      811.26    828.13  16.87
ImageProce...ENCHMARK_BILATERAL_FILTER/64/4      714.77    726.23  11.46
           exec_time                            
l/r             base          patch         diff
count  3341.000000    3341.000000    3341.000000
mean   903.866450     899.732349    -4.134101   
std    20635.900959   20565.289417   115.346928 
min    0.000000       0.000000      -3380.455787
25%    0.000000       0.000000       0.000000   
50%    0.000000       0.000000       0.000000   
75%    1.806500       1.836397       0.000100   
max    824613.680801  825394.062500  780.381699
```</pre>
</div>
</content>
</entry>
<entry>
<title>[llvm] Use llvm::fill instead of std::fill(NFC) (#146911)</title>
<updated>2025-07-04T06:10:28+00:00</updated>
<author>
<name>Austin</name>
<email>zhenhangwang@huawei.com</email>
</author>
<published>2025-07-04T06:10:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=a550fef9061f3628e75825306759b13365cb50e3'/>
<id>a550fef9061f3628e75825306759b13365cb50e3</id>
<content type='text'>
Use llvm::fill instead of std::fill</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use llvm::fill instead of std::fill</pre>
</div>
</content>
</entry>
<entry>
<title>[CodeGen] Remove unused includes (NFC) (#141320)</title>
<updated>2025-05-24T07:00:00+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2025-05-24T07:00:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=3bc174ba772c551352004417c11c35503d6283ad'/>
<id>3bc174ba772c551352004417c11c35503d6283ad</id>
<content type='text'>
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.</pre>
</div>
</content>
</entry>
</feed>
