<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/lib/CodeGen/StackFrameLayoutAnalysisPass.cpp, branch users/ojhunt/ptrauth-additions</title>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[NFC][LLVM][CodeGen] Namespace related cleanups (#162999)</title>
<updated>2025-10-13T14:54:50+00:00</updated>
<author>
<name>Rahul Joshi</name>
<email>rjoshi@nvidia.com</email>
</author>
<published>2025-10-13T14:54:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=2a4f5b2751efbddd7bfe9818ab9ea57d03a13752'/>
<id>2a4f5b2751efbddd7bfe9818ab9ea57d03a13752</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[AArch64][SME] Support split ZPR and PPR area allocation (#142392)</title>
<updated>2025-10-02T18:05:14+00:00</updated>
<author>
<name>Benjamin Maxwell</name>
<email>benjamin.maxwell@arm.com</email>
</author>
<published>2025-10-02T18:05:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=8f67cdd9b7f4ffa3cca552b00d58e72dba66b924'/>
<id>8f67cdd9b7f4ffa3cca552b00d58e72dba66b924</id>
<content type='text'>
For a while we have supported the `-aarch64-stack-hazard-size=&lt;size&gt;`
option, which adds "hazard padding" between GPRs and FPR/ZPRs. However,
there is currently a hole in this mitigation as PPR and FPR/ZPR accesses
to the same area also cause streaming memory hazards (this is noted by
`-pass-remarks-analysis=sme -aarch64-stack-hazard-remark-size=&lt;val&gt;`),
and the current stack layout places PPRs and ZPRs within the same area.

Which looks like:

```
------------------------------------  Higher address
| callee-saved gpr registers        |
|-----------------------------------|
| lr,fp  (a.k.a. "frame record")    |
|-----------------------------------| &lt;- fp(=x29)
|   &lt;hazard padding&gt;                |
|-----------------------------------|
| callee-saved fp/simd/SVE regs     |
|-----------------------------------|
|        SVE stack objects          |
|-----------------------------------|
| local variables of fixed size     |
|   &lt;FPR&gt;                           |
|   &lt;hazard padding&gt;                |
|   &lt;GPR&gt;                           |
------------------------------------| &lt;- sp
                                    | Lower address
```

With this patch the stack (and hazard padding) is rearranged so that
hazard padding is placed between the PPRs and ZPRs rather than within
the (fixed size) callee-save region. Which looks something like this:

```
------------------------------------  Higher address
| callee-saved gpr registers        |
|-----------------------------------|
| lr,fp  (a.k.a. "frame record")    |
|-----------------------------------| &lt;- fp(=x29)
|        callee-saved PPRs          |
|        PPR stack objects          | (These are SVE predicates)
|-----------------------------------|
|   &lt;hazard padding&gt;                |
|-----------------------------------|
|       callee-saved ZPR regs       | (These are SVE vectors)
|        ZPR stack objects          | Note: FPRs are promoted to ZPRs
|-----------------------------------|
| local variables of fixed size     |
|   &lt;FPR&gt;                           |
|   &lt;hazard padding&gt;                |
|   &lt;GPR&gt;                           |
------------------------------------| &lt;- sp
                                    | Lower address
```

This layout is only enabled if:

 * SplitSVEObjects are enabled (`-aarch64-split-sve-objects`)
   - (This may be enabled by default in a later patch)
 * Streaming memory hazards are present
   - (`-aarch64-stack-hazard-size=&lt;val&gt;` != 0)
 * PPRs and FPRs/ZPRs are on the stack
 * There's no stack realignment or variable-sized objects
   - This is left as a TODO for now

Additionally, any FPR callee-saves that are present will be promoted to
ZPRs. This is to prevent stack hazards between FPRs and GPRs in the
fixed size callee-save area (which would otherwise require more hazard
padding, or moving the FPR callee-saves).

This layout should resolve the hole in the hazard padding mitigation,
and is not intended to change codegen for non-SME code.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For a while we have supported the `-aarch64-stack-hazard-size=&lt;size&gt;`
option, which adds "hazard padding" between GPRs and FPR/ZPRs. However,
there is currently a hole in this mitigation as PPR and FPR/ZPR accesses
to the same area also cause streaming memory hazards (this is noted by
`-pass-remarks-analysis=sme -aarch64-stack-hazard-remark-size=&lt;val&gt;`),
and the current stack layout places PPRs and ZPRs within the same area.

Which looks like:

```
------------------------------------  Higher address
| callee-saved gpr registers        |
|-----------------------------------|
| lr,fp  (a.k.a. "frame record")    |
|-----------------------------------| &lt;- fp(=x29)
|   &lt;hazard padding&gt;                |
|-----------------------------------|
| callee-saved fp/simd/SVE regs     |
|-----------------------------------|
|        SVE stack objects          |
|-----------------------------------|
| local variables of fixed size     |
|   &lt;FPR&gt;                           |
|   &lt;hazard padding&gt;                |
|   &lt;GPR&gt;                           |
------------------------------------| &lt;- sp
                                    | Lower address
```

With this patch the stack (and hazard padding) is rearranged so that
hazard padding is placed between the PPRs and ZPRs rather than within
the (fixed size) callee-save region. Which looks something like this:

```
------------------------------------  Higher address
| callee-saved gpr registers        |
|-----------------------------------|
| lr,fp  (a.k.a. "frame record")    |
|-----------------------------------| &lt;- fp(=x29)
|        callee-saved PPRs          |
|        PPR stack objects          | (These are SVE predicates)
|-----------------------------------|
|   &lt;hazard padding&gt;                |
|-----------------------------------|
|       callee-saved ZPR regs       | (These are SVE vectors)
|        ZPR stack objects          | Note: FPRs are promoted to ZPRs
|-----------------------------------|
| local variables of fixed size     |
|   &lt;FPR&gt;                           |
|   &lt;hazard padding&gt;                |
|   &lt;GPR&gt;                           |
------------------------------------| &lt;- sp
                                    | Lower address
```

This layout is only enabled if:

 * SplitSVEObjects are enabled (`-aarch64-split-sve-objects`)
   - (This may be enabled by default in a later patch)
 * Streaming memory hazards are present
   - (`-aarch64-stack-hazard-size=&lt;val&gt;` != 0)
 * PPRs and FPRs/ZPRs are on the stack
 * There's no stack realignment or variable-sized objects
   - This is left as a TODO for now

Additionally, any FPR callee-saves that are present will be promoted to
ZPRs. This is to prevent stack hazards between FPRs and GPRs in the
fixed size callee-save area (which would otherwise require more hazard
padding, or moving the FPR callee-saves).

This layout should resolve the hole in the hazard padding mitigation,
and is not intended to change codegen for non-SME code.</pre>
</div>
</content>
</entry>
<entry>
<title>[Codegen] Add a separate stack ID for scalable predicates (#142390)</title>
<updated>2025-10-02T13:43:07+00:00</updated>
<author>
<name>Benjamin Maxwell</name>
<email>benjamin.maxwell@arm.com</email>
</author>
<published>2025-10-02T13:43:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=9f5abd38dd1782a6fd3b8ed1c2f76aa62dc850b1'/>
<id>9f5abd38dd1782a6fd3b8ed1c2f76aa62dc850b1</id>
<content type='text'>
This splits out "ScalablePredicateVector" from the "ScalableVector"
StackID. This is primarily to allow easy differentiation between vectors
and predicates (without inspecting instructions).

This new stack ID is not used in many places yet, but will be used in a
later patch to mark stack slots that are known to contain predicates.

Co-authored-by: Kerry McLaughlin &lt;kerry.mclaughlin@arm.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This splits out "ScalablePredicateVector" from the "ScalableVector"
StackID. This is primarily to allow easy differentiation between vectors
and predicates (without inspecting instructions).

This new stack ID is not used in many places yet, but will be used in a
later patch to mark stack slots that are known to contain predicates.

Co-authored-by: Kerry McLaughlin &lt;kerry.mclaughlin@arm.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>[CodeGen][NPM] Port StackFrameLayoutAnalysisPass to NPM (#130070)</title>
<updated>2025-04-15T07:07:19+00:00</updated>
<author>
<name>Akshat Oke</name>
<email>Akshat.Oke@amd.com</email>
</author>
<published>2025-04-15T07:07:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=a388395b869ada3a4d104aa9963fa233b45522ea'/>
<id>a388395b869ada3a4d104aa9963fa233b45522ea</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[CodeGen] Remove unused includes (NFC) (#115996)</title>
<updated>2024-11-13T07:15:06+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2024-11-13T07:15:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=735ab61ac828bd61398e6847d60e308fdf2b54ec'/>
<id>735ab61ac828bd61398e6847d60e308fdf2b54ec</id>
<content type='text'>
Identified with misc-include-cleaner.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Identified with misc-include-cleaner.</pre>
</div>
</content>
</entry>
<entry>
<title>[StackFrameLayoutAnalysis] Support more SlotTypes (#100562)</title>
<updated>2024-07-25T17:54:24+00:00</updated>
<author>
<name>Hari Limaye</name>
<email>hari.limaye@arm.com</email>
</author>
<published>2024-07-25T17:54:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=e31794f99d72dd764c4bc5c5583a0a4c89df22c3'/>
<id>e31794f99d72dd764c4bc5c5583a0a4c89df22c3</id>
<content type='text'>
Add new SlotTypes to StackFrameLayoutAnalysis to disambiguate Fixed and
Variable-Sized stack slots from Variable slots. As Offsets are
unreliable for VLA-area objects, sort these to the end of the list -
using the Frame Index to ensure a deterministic order when Offsets are
equal.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add new SlotTypes to StackFrameLayoutAnalysis to disambiguate Fixed and
Variable-Sized stack slots from Variable slots. As Offsets are
unreliable for VLA-area objects, sort these to the end of the list -
using the Frame Index to ensure a deterministic order when Offsets are
equal.</pre>
</div>
</content>
</entry>
<entry>
<title>[StackFrameLayoutAnalysis] Use target-specific hook for SP offsets (#100386)</title>
<updated>2024-07-25T08:03:48+00:00</updated>
<author>
<name>Hari Limaye</name>
<email>hari.limaye@arm.com</email>
</author>
<published>2024-07-25T08:03:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=dc1c00f6b13f724154f9883990f8b21fb8dcccef'/>
<id>dc1c00f6b13f724154f9883990f8b21fb8dcccef</id>
<content type='text'>
StackFrameLayoutAnalysis currently calculates SP-relative offsets in a
target-independent way via MachineFrameInfo offsets. This is incorrect
for some Targets, e.g. AArch64, when there are scalable vector stack
slots.

This patch adds a virtual function to TargetFrameLowering to provide
offsets from SP, with a default implementation matching what is
currently used in StackFrameLayoutAnalysis, and refactors
StackFrameLayoutAnalysis to use this function. Only non-zero scalable
offsets are output by the analysis pass.

An implementation of this function is added for AArch64 targets, which
aims to provide correct SP offsets in most cases.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
StackFrameLayoutAnalysis currently calculates SP-relative offsets in a
target-independent way via MachineFrameInfo offsets. This is incorrect
for some Targets, e.g. AArch64, when there are scalable vector stack
slots.

This patch adds a virtual function to TargetFrameLowering to provide
offsets from SP, with a default implementation matching what is
currently used in StackFrameLayoutAnalysis, and refactors
StackFrameLayoutAnalysis to use this function. Only non-zero scalable
offsets are output by the analysis pass.

An implementation of this function is added for AArch64 targets, which
aims to provide correct SP offsets in most cases.</pre>
</div>
</content>
</entry>
<entry>
<title>[StackFrameLayoutAnalysis] Add basic Scalable stack slot output (#99883)</title>
<updated>2024-07-22T19:45:18+00:00</updated>
<author>
<name>David Green</name>
<email>david.green@arm.com</email>
</author>
<published>2024-07-22T19:45:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=e09032f7a36ffb5eb8638a3933aeca7015a9579a'/>
<id>e09032f7a36ffb5eb8638a3933aeca7015a9579a</id>
<content type='text'>
The existing StackFrameLayoutAnalysis details do not do well with
Scalable vector stack slots, which are not marked as scalable and
are intertwined with the other fixed-size slots. This patch adds some very
basic support, marking them as scalable and sorting them to the end of
the list. The slot addresses are not really correct (for fixed as well
as scalable), but this prints something a little better with the limited
information currently available.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The existing StackFrameLayoutAnalysis details do not do well with
Scalable vector stack slots, which are not marked as scalable and
are intertwined with the other fixed-size slots. This patch adds some very
basic support, marking them as scalable and sorting them to the end of
the list. The slot addresses are not really correct (for fixed as well
as scalable), but this prints something a little better with the limited
information currently available.</pre>
</div>
</content>
</entry>
<entry>
<title>[MachineFunction][DebugInfo][nfc] Introduce EntryValue variable kind</title>
<updated>2023-05-11T11:29:57+00:00</updated>
<author>
<name>Felipe de Azevedo Piovezan</name>
<email>fpiovezan@apple.com</email>
</author>
<published>2023-04-30T13:48:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=3db7d0dffb9875e8e180567b079f7d8e3fc5843f'/>
<id>3db7d0dffb9875e8e180567b079f7d8e3fc5843f</id>
<content type='text'>
MachineFunction keeps a table of variables whose addresses never change
throughout the function. Today, the only kinds of locations it can
handle are stack slots.

However, we could expand this for variables whose address is derived
from the value a register had upon function entry. One case where this
happens is with variables alive across coroutine funclets: these can
be placed in a coroutine frame object whose pointer is placed in a
register that is an argument to coroutine funclets.

```
define @foo(ptr %frame_ptr) {
  dbg.declare(%frame_ptr, !some_var,
              !DIExpression(EntryValue, &lt;ptr_arithmetic&gt;))
```

This is a patch in a series that aims to improve the debug information
generated by the CoroSplit pass in the context of `swiftasync`
arguments. Variables stored in the coroutine frame _must_ be described
by the entry_value of the ABI-defined register containing a pointer to the
coroutine frame. Since these variables have a single location throughout
their lifetime, they are candidates for being stored in the
MachineFunction table.

Differential Revision: https://reviews.llvm.org/D149879
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
MachineFunction keeps a table of variables whose addresses never change
throughout the function. Today, the only kinds of locations it can
handle are stack slots.

However, we could expand this for variables whose address is derived
from the value a register had upon function entry. One case where this
happens is with variables alive across coroutine funclets: these can
be placed in a coroutine frame object whose pointer is placed in a
register that is an argument to coroutine funclets.

```
define @foo(ptr %frame_ptr) {
  dbg.declare(%frame_ptr, !some_var,
              !DIExpression(EntryValue, &lt;ptr_arithmetic&gt;))
```

This is a patch in a series that aims to improve the debug information
generated by the CoroSplit pass in the context of `swiftasync`
arguments. Variables stored in the coroutine frame _must_ be described
by the entry_value of the ABI-defined register containing a pointer to the
coroutine frame. Since these variables have a single location throughout
their lifetime, they are candidates for being stored in the
MachineFunction table.

Differential Revision: https://reviews.llvm.org/D149879
</pre>
</div>
</content>
</entry>
<entry>
<title>[llvm][codegen] Fix non-determinism in StackFrameLayoutAnalysisPass output</title>
<updated>2023-01-19T20:04:14+00:00</updated>
<author>
<name>Paul Kirth</name>
<email>paulkirth@google.com</email>
</author>
<published>2023-01-19T16:14:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=af9a452e57554c2c5e876986e33c2a75314259e8'/>
<id>af9a452e57554c2c5e876986e33c2a75314259e8</id>
<content type='text'>
We were iterating over a SmallPtrSet when outputting slot variables.
This is still correct but made the test fail under reverse iteration.
This patch replaces the SmallPtrSet with a SmallVector.

Also remove the "Stack Frame Layout" lines from arm64-opt-remarks-lazy-bfi test,
since those also break under reverse iteration.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D142127
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We were iterating over a SmallPtrSet when outputting slot variables.
This is still correct but made the test fail under reverse iteration.
This patch replaces the SmallPtrSet with a SmallVector.

Also remove the "Stack Frame Layout" lines from arm64-opt-remarks-lazy-bfi test,
since those also break under reverse iteration.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D142127
</pre>
</div>
</content>
</entry>
</feed>
