<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/lib/IR/DataLayout.cpp, branch main</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[DataLayout][LangRef] Split non-integral and unstable pointer properties</title>
<updated>2025-09-23T18:16:47+00:00</updated>
<author>
<name>Alexander Richardson</name>
<email>alexrichardson@google.com</email>
</author>
<published>2025-09-23T18:16:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=dde000a7d619ab2032f3e721edc850fb421e50cd'/>
<id>dde000a7d619ab2032f3e721edc850fb421e50cd</id>
<content type='text'>
This commit adds finer-grained versions of isNonIntegralAddressSpace() and
isNonIntegralPointerType() where the current semantics prohibit
introduction of both ptrtoint and inttoptr instructions. The current
semantics are too strict for some targets (e.g. AMDGPU/CHERI) where
ptrtoint has a stable value, but the pointer has additional metadata.
Currently, marking a pointer address space as non-integral also marks it
as having an unstable bitwise representation (e.g. when pointers can be
changed by a copying GC). This property inhibits a lot of
optimizations that are perfectly legal for other non-integral pointers
such as fat pointers or CHERI capabilities that have a well-defined
bitwise representation but can't be created with only an address.

This change splits the properties of non-integral pointers and allows
for address spaces to be marked as unstable or non-integral (or both)
independently using the 'p' part of the DataLayout string.
A 'u' following the p marks the address space as unstable and specifying
a index width != representation width marks it as non-integral.
Finally, we also add an 'e' flag to mark pointers with external state
(such as the CHERI capability validity) state. These pointers require
special handling of loads and stores in addition to being non-integral.

This does not change the checks in any of the passes yet - we
currently keep the existing non-integral behaviour. In the future I plan
to audit calls to DL.isNonIntegral[PointerType]() and replace them with
the DL.mustNotIntroduce{IntToPtr,PtrToInt}() checks that allow for more
optimizations.

RFC: https://discourse.llvm.org/t/rfc-finer-grained-non-integral-pointer-properties/83176

Reviewed By: nikic, krzysz00

Pull Request: https://github.com/llvm/llvm-project/pull/105735
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This commit adds finer-grained versions of isNonIntegralAddressSpace() and
isNonIntegralPointerType() where the current semantics prohibit
introduction of both ptrtoint and inttoptr instructions. The current
semantics are too strict for some targets (e.g. AMDGPU/CHERI) where
ptrtoint has a stable value, but the pointer has additional metadata.
Currently, marking a pointer address space as non-integral also marks it
as having an unstable bitwise representation (e.g. when pointers can be
changed by a copying GC). This property inhibits a lot of
optimizations that are perfectly legal for other non-integral pointers
such as fat pointers or CHERI capabilities that have a well-defined
bitwise representation but can't be created with only an address.

This change splits the properties of non-integral pointers and allows
for address spaces to be marked as unstable or non-integral (or both)
independently using the 'p' part of the DataLayout string.
A 'u' following the p marks the address space as unstable and specifying
a index width != representation width marks it as non-integral.
Finally, we also add an 'e' flag to mark pointers with external state
(such as the CHERI capability validity) state. These pointers require
special handling of loads and stores in addition to being non-integral.

This does not change the checks in any of the passes yet - we
currently keep the existing non-integral behaviour. In the future I plan
to audit calls to DL.isNonIntegral[PointerType]() and replace them with
the DL.mustNotIntroduce{IntToPtr,PtrToInt}() checks that allow for more
optimizations.

RFC: https://discourse.llvm.org/t/rfc-finer-grained-non-integral-pointer-properties/83176

Reviewed By: nikic, krzysz00

Pull Request: https://github.com/llvm/llvm-project/pull/105735
</pre>
</div>
</content>
</entry>
<entry>
<title>[llvm] Move data layout string computation to TargetParser (#157612)</title>
<updated>2025-09-11T18:05:29+00:00</updated>
<author>
<name>Reid Kleckner</name>
<email>rnk@google.com</email>
</author>
<published>2025-09-11T18:05:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=f3efbce4a73c595a038a6778a28c307ea987c2a7'/>
<id>f3efbce4a73c595a038a6778a28c307ea987c2a7</id>
<content type='text'>
Clang and other frontends generally need the LLVM data layout string in
order to generate LLVM IR modules for LLVM. MLIR clients often need it
as well, since MLIR users often lower to LLVM IR.

Before this change, the LLVM datalayout string was computed in the
LLVM${TGT}CodeGen library in the relevant TargetMachine subclass.
However, none of the logic for computing the data layout string requires
any details of code generation. Clients who want to avoid duplicating
this information were forced to link in LLVMCodeGen and all registered
targets, leading to bloated binaries. This happened in PR #145899,
which measurably increased binary size for some of our users.

By moving this information to the TargetParser library, we
can delete the duplicate datalayout strings in Clang, and retain the
ability to generate IR for unregistered targets.

This is intended to be a very mechanical LLVM-only change, but there is
an immediately obvious follow-up to clang, which will be prepared
separately.

The vast majority of data layouts are computable with two inputs: the
triple and the "ABI name". There is only one exception, NVPTX, which has
a cl::opt to enable short device pointers. I invented a "shortptr" ABI
name to pass this option through the target independent interface.
Everything else fits. Mips is a bit awkward because it uses a special
MipsABIInfo abstraction, which includes members with codegen-like
concepts like ABI physical registers that can't live in TargetParser. I
think the string logic of looking for "n32" "n64" etc is reasonable to
duplicate. We have plenty of other minor duplication to preserve
layering.

---------

Co-authored-by: Matt Arsenault &lt;arsenm2@gmail.com&gt;
Co-authored-by: Sergei Barannikov &lt;barannikov88@gmail.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Clang and other frontends generally need the LLVM data layout string in
order to generate LLVM IR modules for LLVM. MLIR clients often need it
as well, since MLIR users often lower to LLVM IR.

Before this change, the LLVM datalayout string was computed in the
LLVM${TGT}CodeGen library in the relevant TargetMachine subclass.
However, none of the logic for computing the data layout string requires
any details of code generation. Clients who want to avoid duplicating
this information were forced to link in LLVMCodeGen and all registered
targets, leading to bloated binaries. This happened in PR #145899,
which measurably increased binary size for some of our users.

By moving this information to the TargetParser library, we
can delete the duplicate datalayout strings in Clang, and retain the
ability to generate IR for unregistered targets.

This is intended to be a very mechanical LLVM-only change, but there is
an immediately obvious follow-up to clang, which will be prepared
separately.

The vast majority of data layouts are computable with two inputs: the
triple and the "ABI name". There is only one exception, NVPTX, which has
a cl::opt to enable short device pointers. I invented a "shortptr" ABI
name to pass this option through the target independent interface.
Everything else fits. Mips is a bit awkward because it uses a special
MipsABIInfo abstraction, which includes members with codegen-like
concepts like ABI physical registers that can't live in TargetParser. I
think the string logic of looking for "n32" "n64" etc is reasonable to
duplicate. We have plenty of other minor duplication to preserve
layering.

---------

Co-authored-by: Matt Arsenault &lt;arsenm2@gmail.com&gt;
Co-authored-by: Sergei Barannikov &lt;barannikov88@gmail.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>[DataLayout] Remove i1 alignment entry (#156657)</title>
<updated>2025-09-08T07:51:16+00:00</updated>
<author>
<name>Nikita Popov</name>
<email>npopov@redhat.com</email>
</author>
<published>2025-09-08T07:51:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=6a571a1fb3d7bc58ae5f0ad8b85ece707ae5fcc3'/>
<id>6a571a1fb3d7bc58ae5f0ad8b85ece707ae5fcc3</id>
<content type='text'>
I don't think we need to explicitly specify i1 alignment, as this is
going to fall back to i8 alignment.

This may change behavior if a data layout explicitly sets i8 alignment
without also setting i1 layout, but I'd expect this to be a bug fix in
that case.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I don't think we need to explicitly specify i1 alignment, as this is
going to fall back to i8 alignment.

This may change behavior if a data layout explicitly sets i8 alignment
without also setting i1 layout, but I'd expect this to be a bug fix in
that case.</pre>
</div>
</content>
</entry>
<entry>
<title>[DataLayout] Specialize the getTypeAllocSize() implementation (#156687)</title>
<updated>2025-09-04T07:27:33+00:00</updated>
<author>
<name>Nikita Popov</name>
<email>npopov@redhat.com</email>
</author>
<published>2025-09-04T07:27:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=4d927a5faf42d025410586f0cdc3bf60ef198a86'/>
<id>4d927a5faf42d025410586f0cdc3bf60ef198a86</id>
<content type='text'>
getTypeAllocSize() currently works by taking the type store size and
aligning it to the ABI alignment. However, this ends up doing redundant
work in various cases, for example arrays will unnecessarily repeat the
alignment step, and structs will fetch the StructLayout multiple times.

As this code is rather hot (it is called every time we need to calculate
GEP offsets for example), specialize the implementation. This repeats a
small amount of logic from getAlignment(), but I think that's
worthwhile.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
getTypeAllocSize() currently works by taking the type store size and
aligning it to the ABI alignment. However, this ends up doing redundant
work in various cases, for example arrays will unnecessarily repeat the
alignment step, and structs will fetch the StructLayout multiple times.

As this code is rather hot (it is called every time we need to calculate
GEP offsets for example), specialize the implementation. This repeats a
small amount of logic from getAlignment(), but I think that's
worthwhile.</pre>
</div>
</content>
</entry>
<entry>
<title>[DataLayout] Use linear scan to determine integer alignment (NFC)</title>
<updated>2025-09-03T12:04:54+00:00</updated>
<author>
<name>Nikita Popov</name>
<email>npopov@redhat.com</email>
</author>
<published>2025-09-03T11:09:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=38b376f1927df5c1dea1065041779b28b13b9dd9'/>
<id>38b376f1927df5c1dea1065041779b28b13b9dd9</id>
<content type='text'>
The number of alignment entries is usually very small (5-7), so
it is more efficient to use a linear scan than a binary search.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The number of alignment entries is usually very small (5-7), so
it is more efficient to use a linear scan than a binary search.
</pre>
</div>
</content>
</entry>
<entry>
<title>[DataLayout] Explicitly call getFixedValue() (NFC)</title>
<updated>2025-09-02T07:52:16+00:00</updated>
<author>
<name>Nikita Popov</name>
<email>npopov@redhat.com</email>
</author>
<published>2025-09-01T16:14:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=a7ff60788df8d86f930c4588c9f2c4130da55630'/>
<id>a7ff60788df8d86f930c4588c9f2c4130da55630</id>
<content type='text'>
Instead of relying on the implicit cast. The scalable case has
been explicitly checked beforehand.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of relying on the implicit cast. The scalable case has
been explicitly checked beforehand.
</pre>
</div>
</content>
</entry>
<entry>
<title>[IR] Use llvm::upper_bound (NFC) (#139656)</title>
<updated>2025-05-13T05:59:05+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2025-05-13T05:59:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=0fedccf3899778c9bf09964b0eab5fb35927c39e'/>
<id>0fedccf3899778c9bf09964b0eab5fb35927c39e</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[NFC][llvm] Drop isOsWindowsOrUEFI API (#138733)</title>
<updated>2025-05-06T22:41:35+00:00</updated>
<author>
<name>Prabhu Rajasekaran</name>
<email>prabhukr@google.com</email>
</author>
<published>2025-05-06T22:41:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=aa77f7a923780fc512977bb7df149e9313658106'/>
<id>aa77f7a923780fc512977bb7df149e9313658106</id>
<content type='text'>
The Triple and SubTarget API functions isOsWindowsOrUEFI is not
preferred. Dropping them.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The Triple and SubTarget API functions isOsWindowsOrUEFI is not
preferred. Dropping them.</pre>
</div>
</content>
</entry>
<entry>
<title>[nfc][llvm] Clean up isUEFI checks (#124845)</title>
<updated>2025-01-28T23:18:10+00:00</updated>
<author>
<name>Prabhuk</name>
<email>prabhukr@google.com</email>
</author>
<published>2025-01-28T23:18:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=e7f02241ad4d38fc50b63ad589a024323e6bc3c6'/>
<id>e7f02241ad4d38fc50b63ad589a024323e6bc3c6</id>
<content type='text'>
The check for `isOSWindows() || isUEFI()` is used in several places
across the codebase. Introducing `isOSWindowsOrUEFI()` in Triple.h
to simplify these checks.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The check for `isOSWindows() || isUEFI()` is used in several places
across the codebase. Introducing `isOSWindowsOrUEFI()` in Triple.h
to simplify these checks.</pre>
</div>
</content>
</entry>
<entry>
<title>[DataLayout] Remove getMaxIndexSizeInBits() API</title>
<updated>2024-12-13T12:01:01+00:00</updated>
<author>
<name>Nikita Popov</name>
<email>npopov@redhat.com</email>
</author>
<published>2024-12-13T12:01:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=07aab4a3cdab3d46caab270845413c5ba4546b50'/>
<id>07aab4a3cdab3d46caab270845413c5ba4546b50</id>
<content type='text'>
The last use was removed in #119365, and we should not add more
uses of this concept in the future either.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The last use was removed in #119365, and we should not add more
uses of this concept in the future either.
</pre>
</div>
</content>
</entry>
</feed>
