<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git, branch users/guy-david/aarch64-post-truncating-store-simd</title>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[AArch64] Post-truncating store for i8/i16 on lane zero</title>
<updated>2025-10-20T12:45:11+00:00</updated>
<author>
<name>Guy David</name>
<email>guyda96@gmail.com</email>
</author>
<published>2025-10-19T11:32:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=cc5b39683eb235d4a58d1b9e1a1886d5ec304a75'/>
<id>cc5b39683eb235d4a58d1b9e1a1886d5ec304a75</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[InstCombine] Move ptrtoaddr tests to InstSimplify (NFC)</title>
<updated>2025-10-20T12:39:40+00:00</updated>
<author>
<name>Nikita Popov</name>
<email>npopov@redhat.com</email>
</author>
<published>2025-10-20T12:37:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=2e7afb11706e474af6801e63daa8085479cdd08d'/>
<id>2e7afb11706e474af6801e63daa8085479cdd08d</id>
<content type='text'>
All the existing tests test code either in ConstantFolding or
InstSimplify, so move them to use -passes=instsimplify instead of
-passes=instcombine. This makes sure we keep InstSimplify coverage
even if there are subsuming InstCombine folds.

This requires writing some of the constant folding tests in a
different way, as InstSimplify does not try to re-fold already
existing constant expressions.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
All the existing tests test code either in ConstantFolding or
InstSimplify, so move them to use -passes=instsimplify instead of
-passes=instcombine. This makes sure we keep InstSimplify coverage
even if there are subsuming InstCombine folds.

This requires writing some of the constant folding tests in a
different way, as InstSimplify does not try to re-fold already
existing constant expressions.
</pre>
</div>
</content>
</entry>
<entry>
<title>[Clang][NFC] Rename UnqualPtrTy to DefaultPtrTy (#163207)</title>
<updated>2025-10-20T12:34:21+00:00</updated>
<author>
<name>Juan Manuel Martinez Caamaño</name>
<email>jmartinezcaamao@gmail.com</email>
</author>
<published>2025-10-20T12:34:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=74d77dc2ec2f93c151bd98687799ed90e9bea849'/>
<id>74d77dc2ec2f93c151bd98687799ed90e9bea849</id>
<content type='text'>
`UnqualPtrTy` didn't always match `llvm::PointerType::getUnqual`:
sometimes it returned a pointer that is not in address space 0 (notably
for SPIRV).

Since `UnqualPtrTy` was used as the "generic" or "default" pointer type,
this patch renames it to `DefaultPtrTy` to avoid confusion with LLVM's
`PointerType::getUnqual`.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
`UnqualPtrTy` didn't always match `llvm::PointerType::getUnqual`:
sometimes it returned a pointer that is not in address space 0 (notably
for SPIRV).

Since `UnqualPtrTy` was used as the "generic" or "default" pointer type,
this patch renames it to `DefaultPtrTy` to avoid confusion with LLVM's
`PointerType::getUnqual`.</pre>
</div>
</content>
</entry>
<entry>
<title>[SLP]Do not pack div-like copyable values</title>
<updated>2025-10-20T12:19:42+00:00</updated>
<author>
<name>Alexey Bataev</name>
<email>a.bataev@outlook.com</email>
</author>
<published>2025-10-20T11:15:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=154138c25f358ed812eafc2880225c3d88221e8a'/>
<id>154138c25f358ed812eafc2880225c3d88221e8a</id>
<content type='text'>
If the main instruction in the copyables is a div-like instruction, the
compiler cannot pack duplicates by extending them with poison: once
vectorized, these instructions would result in undefined behavior.

Fixes #164185
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If the main instruction in the copyables is a div-like instruction, the
compiler cannot pack duplicates by extending them with poison: once
vectorized, these instructions would result in undefined behavior.

Fixes #164185
</pre>
</div>
</content>
</entry>
<entry>
<title>[InstSimplify] Support ptrtoaddr in simplifyCastInst()</title>
<updated>2025-10-20T12:18:34+00:00</updated>
<author>
<name>Nikita Popov</name>
<email>npopov@redhat.com</email>
</author>
<published>2025-10-10T12:57:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=ee50839700af4800a2d72702a5583b72e1ffb81e'/>
<id>ee50839700af4800a2d72702a5583b72e1ffb81e</id>
<content type='text'>
Handle ptrtoaddr the same way as ptrtoint. The fold already only
operates on the index/address bits.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Handle ptrtoaddr the same way as ptrtoint. The fold already only
operates on the index/address bits.
</pre>
</div>
</content>
</entry>
<entry>
<title>[mlir][docs] Add documentation for No-rollback Conversion Driver (#164071)</title>
<updated>2025-10-20T12:04:30+00:00</updated>
<author>
<name>Matthias Springer</name>
<email>me@m-sp.org</email>
</author>
<published>2025-10-20T12:04:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=565e9fa1956802f9c4aefe2dea9a1061f52667b0'/>
<id>565e9fa1956802f9c4aefe2dea9a1061f52667b0</id>
<content type='text'>
Add documentation for the no-rollback conversion driver. Also improve
the documentation of the old rollback driver, in particular which
modifications are performed immediately and which are delayed.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add documentation for the no-rollback conversion driver. Also improve
the documentation of the old rollback driver, in particular which
modifications are performed immediately and which are delayed.</pre>
</div>
</content>
</entry>
<entry>
<title>[AArch64] Improve lowering of GPR zeroing in copyPhysReg (#163059)</title>
<updated>2025-10-20T11:59:58+00:00</updated>
<author>
<name>Tomer Shafir</name>
<email>tomer.shafir8@gmail.com</email>
</author>
<published>2025-10-20T11:59:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=5ac616f3327e0d5a7871b92c91b17fd021b35d0d'/>
<id>5ac616f3327e0d5a7871b92c91b17fd021b35d0d</id>
<content type='text'>
This patch pivots GPR32 and GPR64 zeroing into distinct branches to
simplify the code and improve the lowering.

Zeroing GPR moves are now handled separately from non-zeroing ones. The
zero source registers WZR and XZR do not require the undef, implicit and
kill register annotations. The non-zeroing path can no longer receive WZR
as its source, which removes the ternary expression. This patch also
moves the GPR64 logic right after GPR32 for better organization.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch pivots GPR32 and GPR64 zeroing into distinct branches to
simplify the code and improve the lowering.

Zeroing GPR moves are now handled separately from non-zeroing ones. The
zero source registers WZR and XZR do not require the undef, implicit and
kill register annotations. The non-zeroing path can no longer receive WZR
as its source, which removes the ternary expression. This patch also
moves the GPR64 logic right after GPR32 for better organization.</pre>
</div>
</content>
</entry>
<entry>
<title>[AArch64][GlobalISel] Add rax1.ll test coverage. NFC</title>
<updated>2025-10-20T11:53:30+00:00</updated>
<author>
<name>David Green</name>
<email>david.green@arm.com</email>
</author>
<published>2025-10-20T11:53:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=324bd1588123f7b168e8a9842a96a6f799e4a0db'/>
<id>324bd1588123f7b168e8a9842a96a6f799e4a0db</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[lldb] Fix the "RegisterValue::SetValueFromData" method for 128-bit integer registers (#163646)</title>
<updated>2025-10-20T11:48:00+00:00</updated>
<author>
<name>Matej Košík</name>
<email>m4tej.kosik@gmail.com</email>
</author>
<published>2025-10-20T11:48:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=b2ad90b8dcafdf3baea10457c4cac73bb34d06ef'/>
<id>b2ad90b8dcafdf3baea10457c4cac73bb34d06ef</id>
<content type='text'>
Fix the `RegisterValue::SetValueFromData` method so that it also works
for 128-bit registers that contain integers.

Without this change, the `RegisterValue::SetValueFromData` method does
not work correctly for 128-bit registers that contain (signed or
unsigned) integers.

---

Steps to reproduce the problem:

(1)

Create a program that writes a 128-bit number to the 128-bit register
`xmm0`. E.g.:
```
#include &lt;stdint.h&gt;

int main() {
  __asm__ volatile (
      "pinsrq $0, %[lo], %%xmm0\n\t"  // insert low 64 bits
      "pinsrq $1, %[hi], %%xmm0"    // insert high 64 bits
      :
      : [lo]"r"(0x7766554433221100),
        [hi]"r"(0xffeeddccbbaa9988)
  );
  return 0;
}
```

(2)

Compile this program with the LLVM compiler:
```
$ $YOUR/clang -g -o main main.c
```

(3)

Modify LLDB so that, when it reads a value from the `xmm0` register, it
treats it as if it contained an integer instead of assuming that it is a
vector register. This can be achieved, for example, as follows:
```
diff --git a/lldb/source/Utility/RegisterValue.cpp b/lldb/source/Utility/RegisterValue.cpp
index 0e99451c3b70..a4b51db3e56d 100644
--- a/lldb/source/Utility/RegisterValue.cpp
+++ b/lldb/source/Utility/RegisterValue.cpp
@@ -188,6 +188,7 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &amp;reg_info,
     break;
   case eEncodingUint:
   case eEncodingSint:
+  case eEncodingVector:
     if (reg_info.byte_size == 1)
       SetUInt8(src.GetMaxU32(&amp;src_offset, src_len));
     else if (reg_info.byte_size &lt;= 2)
@@ -217,23 +218,6 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &amp;reg_info,
     else if (reg_info.byte_size == sizeof(long double))
       SetLongDouble(src.GetLongDouble(&amp;src_offset));
     break;
-  case eEncodingVector: {
-    m_type = eTypeBytes;
-    assert(reg_info.byte_size &lt;= kMaxRegisterByteSize);
-    buffer.bytes.resize(reg_info.byte_size);
-    buffer.byte_order = src.GetByteOrder();
-    if (src.CopyByteOrderedData(
-            src_offset,          // offset within "src" to start extracting data
-            src_len,             // src length
-            buffer.bytes.data(), // dst buffer
-            buffer.bytes.size(), // dst length
-            buffer.byte_order) == 0) // dst byte order
-    {
-      error = Status::FromErrorStringWithFormat(
-          "failed to copy data for register write of %s", reg_info.name);
-      return error;
-    }
-  }
   }
 
   if (m_type == eTypeInvalid)
```

(4)

Rebuild LLDB.

(5)

Observe how LLDB prints the content of this register after it has been
initialized with a 128-bit value.
```
$YOUR/lldb --source ./main
(lldb) target create main
Current executable set to '.../main' (x86_64).
(lldb) breakpoint set --file main.c --line 11
Breakpoint 1: where = main`main + 45 at main.c:11:3, address = 0x000000000000164d
(lldb) settings set stop-line-count-before 20
(lldb) process launch
Process 2568735 launched: '.../main' (x86_64)
Process 2568735 stopped
* thread #1, name = 'main', stop reason = breakpoint 1.1
    frame #0: 0x000055555555564d main`main at main.c:11:3
   1   	#include &lt;stdint.h&gt;
   2   	
   3   	int main() {
   4   	  __asm__ volatile (
   5   	      "pinsrq $0, %[lo], %%xmm0\n\t"  // insert low 64 bits
   6   	      "pinsrq $1, %[hi], %%xmm0"    // insert high 64 bits
   7   	      :
   8   	      : [lo]"r"(0x7766554433221100),
   9   	        [hi]"r"(0xffeeddccbbaa9988)
   10  	  );
-&gt; 11  	  return 0;
   12  	}
(lldb) register read --format hex xmm0
    xmm0 = 0x7766554433221100ffeeddccbbaa9988
```

You can see that the upper and lower 64-bit wide halves are swapped.

---------

Co-authored-by: Matej Košík &lt;matej.kosik@codasip.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix the `RegisterValue::SetValueFromData` method so that it also works
for 128-bit registers that contain integers.

Without this change, the `RegisterValue::SetValueFromData` method does
not work correctly for 128-bit registers that contain (signed or
unsigned) integers.

---

Steps to reproduce the problem:

(1)

Create a program that writes a 128-bit number to the 128-bit register
`xmm0`. E.g.:
```
#include &lt;stdint.h&gt;

int main() {
  __asm__ volatile (
      "pinsrq $0, %[lo], %%xmm0\n\t"  // insert low 64 bits
      "pinsrq $1, %[hi], %%xmm0"    // insert high 64 bits
      :
      : [lo]"r"(0x7766554433221100),
        [hi]"r"(0xffeeddccbbaa9988)
  );
  return 0;
}
```

(2)

Compile this program with the LLVM compiler:
```
$ $YOUR/clang -g -o main main.c
```

(3)

Modify LLDB so that, when it reads a value from the `xmm0` register, it
treats it as if it contained an integer instead of assuming that it is a
vector register. This can be achieved, for example, as follows:
```
diff --git a/lldb/source/Utility/RegisterValue.cpp b/lldb/source/Utility/RegisterValue.cpp
index 0e99451c3b70..a4b51db3e56d 100644
--- a/lldb/source/Utility/RegisterValue.cpp
+++ b/lldb/source/Utility/RegisterValue.cpp
@@ -188,6 +188,7 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &amp;reg_info,
     break;
   case eEncodingUint:
   case eEncodingSint:
+  case eEncodingVector:
     if (reg_info.byte_size == 1)
       SetUInt8(src.GetMaxU32(&amp;src_offset, src_len));
     else if (reg_info.byte_size &lt;= 2)
@@ -217,23 +218,6 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &amp;reg_info,
     else if (reg_info.byte_size == sizeof(long double))
       SetLongDouble(src.GetLongDouble(&amp;src_offset));
     break;
-  case eEncodingVector: {
-    m_type = eTypeBytes;
-    assert(reg_info.byte_size &lt;= kMaxRegisterByteSize);
-    buffer.bytes.resize(reg_info.byte_size);
-    buffer.byte_order = src.GetByteOrder();
-    if (src.CopyByteOrderedData(
-            src_offset,          // offset within "src" to start extracting data
-            src_len,             // src length
-            buffer.bytes.data(), // dst buffer
-            buffer.bytes.size(), // dst length
-            buffer.byte_order) == 0) // dst byte order
-    {
-      error = Status::FromErrorStringWithFormat(
-          "failed to copy data for register write of %s", reg_info.name);
-      return error;
-    }
-  }
   }
 
   if (m_type == eTypeInvalid)
```

(4)

Rebuild LLDB.

(5)

Observe how LLDB prints the content of this register after it has been
initialized with a 128-bit value.
```
$YOUR/lldb --source ./main
(lldb) target create main
Current executable set to '.../main' (x86_64).
(lldb) breakpoint set --file main.c --line 11
Breakpoint 1: where = main`main + 45 at main.c:11:3, address = 0x000000000000164d
(lldb) settings set stop-line-count-before 20
(lldb) process launch
Process 2568735 launched: '.../main' (x86_64)
Process 2568735 stopped
* thread #1, name = 'main', stop reason = breakpoint 1.1
    frame #0: 0x000055555555564d main`main at main.c:11:3
   1   	#include &lt;stdint.h&gt;
   2   	
   3   	int main() {
   4   	  __asm__ volatile (
   5   	      "pinsrq $0, %[lo], %%xmm0\n\t"  // insert low 64 bits
   6   	      "pinsrq $1, %[hi], %%xmm0"    // insert high 64 bits
   7   	      :
   8   	      : [lo]"r"(0x7766554433221100),
   9   	        [hi]"r"(0xffeeddccbbaa9988)
   10  	  );
-&gt; 11  	  return 0;
   12  	}
(lldb) register read --format hex xmm0
    xmm0 = 0x7766554433221100ffeeddccbbaa9988
```

You can see that the upper and lower 64-bit wide halves are swapped.

---------

Co-authored-by: Matej Košík &lt;matej.kosik@codasip.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>[libcxx] Optimize std::generate for segmented iterators (#163006)</title>
<updated>2025-10-20T11:37:33+00:00</updated>
<author>
<name>Connector Switch</name>
<email>c8ef@outlook.com</email>
</author>
<published>2025-10-20T11:37:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=46e88169284aadb06fafcbe18ff440ff0fdebfa3'/>
<id>46e88169284aadb06fafcbe18ff440ff0fdebfa3</id>
<content type='text'>
Part of #102817.

This patch attempts to optimize the performance of `std::generate` for
segmented iterators. Below are the benchmark numbers from
`libcxx\test\benchmarks\algorithms\modifying\generate.bench.cpp`. Test
cases that use segmented iterators have also been added.

- before

```
std::generate(deque&lt;int&gt;)/32           194 ns          193 ns      3733333
std::generate(deque&lt;int&gt;)/50           276 ns          276 ns      2488889
std::generate(deque&lt;int&gt;)/1024        5096 ns         5022 ns       112000
std::generate(deque&lt;int&gt;)/8192       40806 ns        40806 ns        17231
```

- after

```
std::generate(deque&lt;int&gt;)/32           106 ns          105 ns      6400000
std::generate(deque&lt;int&gt;)/50           139 ns          138 ns      4977778
std::generate(deque&lt;int&gt;)/1024        2713 ns         2699 ns       248889
std::generate(deque&lt;int&gt;)/8192       18983 ns        19252 ns        37333
```

---------

Co-authored-by: A. Jiang &lt;de34@live.cn&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Part of #102817.

This patch attempts to optimize the performance of `std::generate` for
segmented iterators. Below are the benchmark numbers from
`libcxx\test\benchmarks\algorithms\modifying\generate.bench.cpp`. Test
cases that use segmented iterators have also been added.

- before

```
std::generate(deque&lt;int&gt;)/32           194 ns          193 ns      3733333
std::generate(deque&lt;int&gt;)/50           276 ns          276 ns      2488889
std::generate(deque&lt;int&gt;)/1024        5096 ns         5022 ns       112000
std::generate(deque&lt;int&gt;)/8192       40806 ns        40806 ns        17231
```

- after

```
std::generate(deque&lt;int&gt;)/32           106 ns          105 ns      6400000
std::generate(deque&lt;int&gt;)/50           139 ns          138 ns      4977778
std::generate(deque&lt;int&gt;)/1024        2713 ns         2699 ns       248889
std::generate(deque&lt;int&gt;)/8192       18983 ns        19252 ns        37333
```

---------

Co-authored-by: A. Jiang &lt;de34@live.cn&gt;</pre>
</div>
</content>
</entry>
</feed>
