<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp, branch users/mingmingl-llvm/samplefdo-profile-format</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[AArch64] Don't run loop-idiom-vectorize pass in the O0 pipeline (#156802)</title>
<updated>2025-09-06T07:12:36+00:00</updated>
<author>
<name>Fangrui Song</name>
<email>i@maskray.me</email>
</author>
<published>2025-09-06T07:12:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=0d28b925064e3b4e14555e137dd97651c1067b7c'/>
<id>0d28b925064e3b4e14555e137dd97651c1067b7c</id>
<content type='text'>
As noted in #156787</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As noted in #156787</pre>
</div>
</content>
</entry>
<entry>
<title>[AArch64][SME] Implement the SME ABI (ZA state management) in Machine IR (#149062)</title>
<updated>2025-08-19T09:00:28+00:00</updated>
<author>
<name>Benjamin Maxwell</name>
<email>benjamin.maxwell@arm.com</email>
</author>
<published>2025-08-19T09:00:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=eb764040bccc763afb2f6429baab63d1c73a5c85'/>
<id>eb764040bccc763afb2f6429baab63d1c73a5c85</id>
<content type='text'>
## Short Summary

This patch adds a new pass `aarch64-machine-sme-abi` to handle the ABI
for ZA state (e.g., lazy saves and agnostic ZA functions). This is
currently not enabled by default (but aims to be by LLVM 22). The goal
is for this new pass to more optimally place ZA saves/restores and to
work with exception handling.

## Long Description

This patch reimplements management of ZA state for functions with
private and shared ZA state. Agnostic ZA functions will be handled in a
later patch. For now, this is under the flag `-aarch64-new-sme-abi`,
however, we intend for this to replace the current SelectionDAG
implementation once complete.

The approach taken here is to mark instructions as needing ZA to be in a
specific ("ACTIVE" or "LOCAL_SAVED"). Machine instructions implicitly
defining or using ZA registers (such as $zt0 or $zab0) require the
"ACTIVE" state. Function calls may need the "LOCAL_SAVED" or "ACTIVE"
state depending on the callee (having shared or private ZA).

We already add ZA register uses/definitions to machine instructions, so
no extra work is needed to mark these.

Calls need to be marked by glueing Arch64ISD::INOUT_ZA_USE or
Arch64ISD::REQUIRES_ZA_SAVE to the CALLSEQ_START.

These markers are then used by the MachineSMEABIPass to find
instructions where there is a transition between required ZA states.
These are the points we need to insert code to set up or restore a ZA
save (or initialize ZA).

To handle control flow between blocks (which may have different ZA state
requirements), we bundle the incoming and outgoing edges of blocks.
Bundles are formed by assigning each block an incoming and outgoing
bundle (initially, all blocks have their own two bundles). Bundles are
then combined by joining the outgoing bundle of a block with the
incoming bundle of all successors.

These bundles are then assigned a ZA state based on the blocks that
participate in the bundle. Blocks whose incoming edges are in a bundle
"vote" for a ZA state that matches the state required at the first
instruction in the block, and likewise, blocks whose outgoing edges are
in a bundle vote for the ZA state that matches the last instruction in
the block. The ZA state with the most votes is used, which aims to
minimize the number of state transitions.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
## Short Summary

This patch adds a new pass `aarch64-machine-sme-abi` to handle the ABI
for ZA state (e.g., lazy saves and agnostic ZA functions). This is
currently not enabled by default (but aims to be by LLVM 22). The goal
is for this new pass to more optimally place ZA saves/restores and to
work with exception handling.

## Long Description

This patch reimplements management of ZA state for functions with
private and shared ZA state. Agnostic ZA functions will be handled in a
later patch. For now, this is under the flag `-aarch64-new-sme-abi`,
however, we intend for this to replace the current SelectionDAG
implementation once complete.

The approach taken here is to mark instructions as needing ZA to be in a
specific ("ACTIVE" or "LOCAL_SAVED"). Machine instructions implicitly
defining or using ZA registers (such as $zt0 or $zab0) require the
"ACTIVE" state. Function calls may need the "LOCAL_SAVED" or "ACTIVE"
state depending on the callee (having shared or private ZA).

We already add ZA register uses/definitions to machine instructions, so
no extra work is needed to mark these.

Calls need to be marked by glueing Arch64ISD::INOUT_ZA_USE or
Arch64ISD::REQUIRES_ZA_SAVE to the CALLSEQ_START.

These markers are then used by the MachineSMEABIPass to find
instructions where there is a transition between required ZA states.
These are the points we need to insert code to set up or restore a ZA
save (or initialize ZA).

To handle control flow between blocks (which may have different ZA state
requirements), we bundle the incoming and outgoing edges of blocks.
Bundles are formed by assigning each block an incoming and outgoing
bundle (initially, all blocks have their own two bundles). Bundles are
then combined by joining the outgoing bundle of a block with the
incoming bundle of all successors.

These bundles are then assigned a ZA state based on the blocks that
participate in the bundle. Blocks whose incoming edges are in a bundle
"vote" for a ZA state that matches the state required at the first
instruction in the block, and likewise, blocks whose outgoing edges are
in a bundle vote for the ZA state that matches the last instruction in
the block. The ZA state with the most votes is used, which aims to
minimize the number of state transitions.</pre>
</div>
</content>
</entry>
<entry>
<title>[MachineOutliner] Remove LOHs from outlined candidates (#143617)</title>
<updated>2025-06-30T21:29:06+00:00</updated>
<author>
<name>Ellis Hoag</name>
<email>ellis.sparky.hoag@gmail.com</email>
</author>
<published>2025-06-30T21:29:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=0d1392e979ba113304211832265613b81e21dc89'/>
<id>0d1392e979ba113304211832265613b81e21dc89</id>
<content type='text'>
Remove Linker Optimization Hints (LOHs) from outlining candidates
instead of simply preventing outlining if LOH labels are found in the
candidate. This will improve the effectiveness of the machine outliner
when LOHs are enabled (which is the default).

In
https://discourse.llvm.org/t/loh-conflicting-with-machineoutliner/83279/1
it was observed that the machine outliner is much more effective when
LOHs are disabled. Rather than completely disabling LOH, this PR aims to
keep LOH in most places and removing them from outlined functions where
it could be illegal. Note that we are conservatively removing all LOHs
from outlined functions for simplicity, but I believe we could retain
LOHs that are in the intersection of all candidates.

It should be ok to remove these LOHs since these blocks are being
outlined anyway, which will harm performance much more than the gain
from keeping the LOHs.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove Linker Optimization Hints (LOHs) from outlining candidates
instead of simply preventing outlining if LOH labels are found in the
candidate. This will improve the effectiveness of the machine outliner
when LOHs are enabled (which is the default).

In
https://discourse.llvm.org/t/loh-conflicting-with-machineoutliner/83279/1
it was observed that the machine outliner is much more effective when
LOHs are disabled. Rather than completely disabling LOH, this PR aims to
keep LOH in most places and removing them from outlined functions where
it could be illegal. Note that we are conservatively removing all LOHs
from outlined functions for simplicity, but I believe we could retain
LOHs that are in the intersection of all candidates.

It should be ok to remove these LOHs since these blocks are being
outlined anyway, which will harm performance much more than the gain
from keeping the LOHs.</pre>
</div>
</content>
</entry>
<entry>
<title>[AArch64][SME] Use reportFatalUsageError rather than assert (NFC) (#145491)</title>
<updated>2025-06-24T23:08:36+00:00</updated>
<author>
<name>Benjamin Maxwell</name>
<email>benjamin.maxwell@arm.com</email>
</author>
<published>2025-06-24T23:08:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=f37d944152ab177e923112871b97e84530aa7057'/>
<id>f37d944152ab177e923112871b97e84530aa7057</id>
<content type='text'>
Fixes #144351</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fixes #144351</pre>
</div>
</content>
</entry>
<entry>
<title>[llvm] annotate interfaces in llvm/Target for DLL export (#143615)</title>
<updated>2025-06-17T20:28:45+00:00</updated>
<author>
<name>Andrew Rogers</name>
<email>andrurogerz@gmail.com</email>
</author>
<published>2025-06-17T20:28:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=19658d14749876cf0b6633f210c923be3709323b'/>
<id>19658d14749876cf0b6633f210c923be3709323b</id>
<content type='text'>
## Purpose

This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the `llvm/Target` library.
These annotations currently have no meaningful impact on the LLVM build;
however, they are a prerequisite to support an LLVM Windows DLL (shared
library) build.

## Background

This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).

A sub-set of these changes were generated automatically using the
[Interface Definition Scanner (IDS)](https://github.com/compnerd/ids)
tool, followed formatting with `git clang-format`.

The bulk of this change is manual additions of `LLVM_ABI` to
`LLVMInitializeX` functions defined in .cpp files under llvm/lib/Target.
Adding `LLVM_ABI` to the function implementation is required here
because they do not `#include "llvm/Support/TargetSelect.h"`, which
contains the declarations for this functions and was already updated
with `LLVM_ABI` in a previous patch. I considered patching these files
with `#include "llvm/Support/TargetSelect.h"` instead, but since
TargetSelect.h is a large file with a bunch of preprocessor x-macro
stuff in it I was concerned it would unnecessarily impact compile times.

In addition, a number of unit tests under llvm/unittests/Target required
additional dependencies to make them build correctly against the LLVM
DLL on Windows using MSVC.

## Validation

Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:

- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
## Purpose

This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the `llvm/Target` library.
These annotations currently have no meaningful impact on the LLVM build;
however, they are a prerequisite to support an LLVM Windows DLL (shared
library) build.

## Background

This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).

A sub-set of these changes were generated automatically using the
[Interface Definition Scanner (IDS)](https://github.com/compnerd/ids)
tool, followed formatting with `git clang-format`.

The bulk of this change is manual additions of `LLVM_ABI` to
`LLVMInitializeX` functions defined in .cpp files under llvm/lib/Target.
Adding `LLVM_ABI` to the function implementation is required here
because they do not `#include "llvm/Support/TargetSelect.h"`, which
contains the declarations for this functions and was already updated
with `LLVM_ABI` in a previous patch. I considered patching these files
with `#include "llvm/Support/TargetSelect.h"` instead, but since
TargetSelect.h is a large file with a bunch of preprocessor x-macro
stuff in it I was concerned it would unnecessarily impact compile times.

In addition, a number of unit tests under llvm/unittests/Target required
additional dependencies to make them build correctly against the LLVM
DLL on Windows using MSVC.

## Validation

Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:

- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang</pre>
</div>
</content>
</entry>
<entry>
<title>[MISched] Add templates for creating custom schedulers (#141935)</title>
<updated>2025-06-03T03:37:40+00:00</updated>
<author>
<name>Pengcheng Wang</name>
<email>wangpengcheng.pp@bytedance.com</email>
</author>
<published>2025-06-03T03:37:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=f393986b53b108457529213c1559346fdb8120ae'/>
<id>f393986b53b108457529213c1559346fdb8120ae</id>
<content type='text'>
We rename `createGenericSchedLive` and `createGenericSchedPostRA`
to `createSchedLive` and `createSchedPostRA`, and add a template
parameter `Strategy` which is the generic implementation by default.

This can simplify some code for targets that have custom scheduler
strategy.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We rename `createGenericSchedLive` and `createGenericSchedPostRA`
to `createSchedLive` and `createSchedPostRA`, and add a template
parameter `Strategy` which is the generic implementation by default.

This can simplify some code for targets that have custom scheduler
strategy.</pre>
</div>
</content>
</entry>
<entry>
<title>Register assembly printer passes (#138348)</title>
<updated>2025-05-07T01:01:17+00:00</updated>
<author>
<name>Matthias Braun</name>
<email>matze@braunis.de</email>
</author>
<published>2025-05-07T01:01:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=675cb706411ced3172bd21def5b38f5ee7cca308'/>
<id>675cb706411ced3172bd21def5b38f5ee7cca308</id>
<content type='text'>
Register assembly printer passes in the pass registry.

This makes it possible to use `llc -start-before=&lt;target&gt;-asm-printer ...` in tests.

Adds a `char &amp;ID` parameter to the AssemblyPrinter constructor to allow
targets to use the `INITIALIZE_PASS` macros and register the pass in the
pass registry. This currently has a default parameter so it won't break
any targets that have not been updated.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Register assembly printer passes in the pass registry.

This makes it possible to use `llc -start-before=&lt;target&gt;-asm-printer ...` in tests.

Adds a `char &amp;ID` parameter to the AssemblyPrinter constructor to allow
targets to use the `INITIALIZE_PASS` macros and register the pass in the
pass registry. This currently has a default parameter so it won't break
any targets that have not been updated.</pre>
</div>
</content>
</entry>
<entry>
<title>[TTI] Simplify implementation (NFCI) (#136674)</title>
<updated>2025-04-26T12:25:40+00:00</updated>
<author>
<name>Sergei Barannikov</name>
<email>barannikov88@gmail.com</email>
</author>
<published>2025-04-26T12:25:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=bb1765179e1fe7d671edf92eba22da2ed4173848'/>
<id>bb1765179e1fe7d671edf92eba22da2ed4173848</id>
<content type='text'>
Replace "concept based polymorphism" with simpler PImpl idiom.

This pursues two goals:
* Enforce static type checking. Previously, target implementations hid
base class methods and type checking was impossible. Now that they
override the methods, the compiler will complain on mismatched
signatures.
* Make the code easier to navigate. Previously, if you asked your
favorite LSP server to show a method (e.g. `getInstructionCost()`), it
would show you methods from `TTI`, `TTI::Concept`, `TTI::Model`,
`TTIImplBase`, and target overrides. Now it is two less :)

There are three commits to hopefully simplify the review.

The first commit removes `TTI::Model`. This is done by deriving
`TargetTransformInfoImplBase` from `TTI::Concept`. This is possible
because they implement the same set of interfaces with identical
signatures.

The first commit makes `TargetTransformImplBase` polymorphic, which
means all derived classes should `override` its methods. This is done in
second commit to make the first one smaller. It appeared infeasible to
extract this into a separate PR because the first commit landed
separately would result in tons of `-Woverloaded-virtual` warnings (and
break `-Werror` builds).

The third commit eliminates `TTI::Concept` by merging it with the only
derived class `TargetTransformImplBase`. This commit could be extracted
into a separate PR, but it touches the same lines in
`TargetTransformInfoImpl.h` (removes `override` added by the second
commit and adds `virtual`), so I thought it may make sense to land these
two commits together.

Pull Request: https://github.com/llvm/llvm-project/pull/136674
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replace "concept based polymorphism" with simpler PImpl idiom.

This pursues two goals:
* Enforce static type checking. Previously, target implementations hid
base class methods and type checking was impossible. Now that they
override the methods, the compiler will complain on mismatched
signatures.
* Make the code easier to navigate. Previously, if you asked your
favorite LSP server to show a method (e.g. `getInstructionCost()`), it
would show you methods from `TTI`, `TTI::Concept`, `TTI::Model`,
`TTIImplBase`, and target overrides. Now it is two less :)

There are three commits to hopefully simplify the review.

The first commit removes `TTI::Model`. This is done by deriving
`TargetTransformInfoImplBase` from `TTI::Concept`. This is possible
because they implement the same set of interfaces with identical
signatures.

The first commit makes `TargetTransformImplBase` polymorphic, which
means all derived classes should `override` its methods. This is done in
second commit to make the first one smaller. It appeared infeasible to
extract this into a separate PR because the first commit landed
separately would result in tons of `-Woverloaded-virtual` warnings (and
break `-Werror` builds).

The third commit eliminates `TTI::Concept` by merging it with the only
derived class `TargetTransformImplBase`. This commit could be extracted
into a separate PR, but it touches the same lines in
`TargetTransformInfoImpl.h` (removes `override` added by the second
commit and adds `virtual`), so I thought it may make sense to land these
two commits together.

Pull Request: https://github.com/llvm/llvm-project/pull/136674
</pre>
</div>
</content>
</entry>
<entry>
<title>[NFC][LLVM][AArch64] Cleanup pass initialization for AArch64 (#134315)</title>
<updated>2025-04-07T17:24:27+00:00</updated>
<author>
<name>Rahul Joshi</name>
<email>rjoshi@nvidia.com</email>
</author>
<published>2025-04-07T17:24:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=23c27f3efcdda730b365698ade5fd0c1c283f2e7'/>
<id>23c27f3efcdda730b365698ade5fd0c1c283f2e7</id>
<content type='text'>
- Remove calls to pass initialization from pass constructors.
- https://github.com/llvm/llvm-project/issues/111767</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Remove calls to pass initialization from pass constructors.
- https://github.com/llvm/llvm-project/issues/111767</pre>
</div>
</content>
</entry>
<entry>
<title>Initialize aarch64-cond-br-tuning pass (#132087)</title>
<updated>2025-03-20T21:48:53+00:00</updated>
<author>
<name>Shubham Sandeep Rastogi</name>
<email>srastogi22@apple.com</email>
</author>
<published>2025-03-20T21:48:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=cc86d7cb191a64489e837c68f299abb930f5c6cb'/>
<id>cc86d7cb191a64489e837c68f299abb930f5c6cb</id>
<content type='text'>
The call to the initializeAArch64CondBrTuningPass function is missing in
the AArch64TargetMachine LLVMInitializeAArch64Target function.

This means that the pass is not in the pass registry and options such as
-run-pass=aarch64-cond-br-tuning and
-stop-after=aarch64-cond-br-tuning cannot be used. This patch fixes that
issue.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The call to the initializeAArch64CondBrTuningPass function is missing in
the AArch64TargetMachine LLVMInitializeAArch64Target function.

This means that the pass is not in the pass registry and options such as
-run-pass=aarch64-cond-br-tuning and
-stop-after=aarch64-cond-br-tuning cannot be used. This patch fixes that
issue.</pre>
</div>
</content>
</entry>
</feed>
