Age | Commit message | Author
2025-10-30 | fix after rebase (users/ro-i/callbr-amdgpu_2) | Robert Imschweiler
2025-10-30 | [AMDGPU][UnifyDivergentExitNodes][StructurizeCFG] Add support for callbr instruction with basic inline-asm | Robert Imschweiler
Finishes adding basic inline-asm callbr support for AMDGPU, started by https://github.com/llvm/llvm-project/pull/149308.
2025-10-30 | bunch of small changes to fix a number of LIT tests on z/OS (#165567) | Sean Perry
A collection of small changes to get a number of lit tests working on z/OS.
2025-10-30 | [AMDGPU][FixIrreducible][UnifyLoopExits] Support callbr with inline-asm (#149308) | Robert Imschweiler
First batch of changes to add support for inline-asm callbr for the AMDGPU backend.
2025-10-30 | [X86] combinePTESTCC - fold PTESTZ(X,SIGNMASK) -> VTESTPD/PSZ(X,X) on AVX targets (#165676) | Simon Pilgrim
If the PTEST is just using the ZF result and one of the operands is an i32/i64 sign mask, we can use the TESTPD/PS instructions instead and avoid the use of an extra constant. Fixes some codegen identified in #156233.
2025-10-30 | [CIR] Upstream handling for __builtin_prefetch (Typo Fix) (#165209) | Shawn K
Not sure if this warrants a PR, but I realized there was a typo in a test filename from my previous PR #164387.
2025-10-30 | Reapply "[HIP][Clang] Remove __AMDGCN_WAVEFRONT_SIZE macros" (#164217) | Fabian Ritter
This reverts commit 78bf682cb9033cf6a5bbc733e062c7b7d825fdaf.
Original PR: #157463
Revert PR: #158566
The relevant buildbots have been updated to a ROCm version that does not use the macros anymore to avoid the failures.
Implements SWDEV-522062.
2025-10-30 | [X86] Narrow BT/BTC/BTR/BTS compare + RMW patterns on very large integers (#165540) | Simon Pilgrim
This patch allows us to narrow single bit-test/twiddle operations for larger than legal scalar integers to efficiently operate just on the i32 sub-integer block actually affected.
The BITOP(X,SHL(1,IDX)) patterns are split, with the IDX used to access the specific i32 block as well as the specific bit within that block.
BT comparisons are relatively simple, and build on the truncated shifted loads fold from #165266.
BTC/BTR/BTS bit twiddling patterns need to match the entire RMW pattern to safely confirm only one block is affected, but a similar approach is taken and creates codegen that should allow us to further merge with matching BT opcodes in a future patch (see #165291).
The resulting codegen is notably more efficient than the heavily micro-coded memory folded variants of BT/BTC/BTR/BTS.
There is still some work to improve the bit insert 'init' patterns included in bittest-big-integer.ll, but I'm expecting this to be a straightforward future extension.
Fixes #164225
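The block-splitting idea can be illustrated outside of LLVM. The following is a minimal sketch, assuming the wide integer is stored as little-endian 32-bit words; all helper names are hypothetical and do not appear in the patch. The index selects the word with `idx >> 5` and the bit inside that word with `idx & 31`, so each operation touches a single i32 block.

```python
# Illustrative model only: a wide integer as a list of little-endian i32 blocks.
MASK32 = 0xFFFFFFFF


def to_words(value, num_words):
    """Split a non-negative integer into num_words little-endian i32 blocks."""
    return [(value >> (32 * i)) & MASK32 for i in range(num_words)]


def from_words(words):
    """Reassemble the wide integer from its i32 blocks."""
    return sum(w << (32 * i) for i, w in enumerate(words))


def bit_test(words, idx):
    """BT-like: read only the block that contains bit idx."""
    block, bit = idx >> 5, idx & 31
    return (words[block] >> bit) & 1


def bit_set(words, idx):
    """BTS-like: read-modify-write of a single block."""
    block, bit = idx >> 5, idx & 31
    words[block] |= 1 << bit


def bit_reset(words, idx):
    """BTR-like: clear one bit in a single block."""
    block, bit = idx >> 5, idx & 31
    words[block] &= ~(1 << bit) & MASK32


def bit_complement(words, idx):
    """BTC-like: flip one bit in a single block."""
    block, bit = idx >> 5, idx & 31
    words[block] ^= 1 << bit
```

The model only shows why one block suffices; the pattern matching that proves a single block is affected is the part the patch actually implements.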
2025-10-30 | [X86] combinePTESTCC - ensure repeated operands are frozen (#165697) | Simon Pilgrim
As noticed on #165676 - if we're increasing the use of an operand we should freeze it
2025-10-30 | [llvm-cxxfilt] update docs to reflect #106233 (#165709) | Mads Marquart
It looks like the documentation for `llvm-cxxfilt`'s `--[no-]strip-underscore` options wasn't updated when https://github.com/llvm/llvm-project/pull/106233 was made. CC @Michael137 (I don't have merge rights myself).
2025-10-30 | [X86] Add ldexp test coverage for avx512 targets (#165698) | Simon Pilgrim
Pulled out of the abandoned patch #69710 to act as a baseline for #165694
2025-10-30 | [LoongArch][NFC] Pre-commit tests for vector type average (#161076) | ZhaoQi
2025-10-30 | [DA] Add tests where dependencies are missed due to overflow (NFC) (#164246) | Ryotaro Kasuga
This patch adds test cases that demonstrate missing dependencies in DA caused by the lack of overflow handling. These issues will be addressed by properly inserting overflow checks and bailing out when one is detected.
It covers the following dependence test functions:
- Strong SIV
- Weak-Crossing SIV
- Weak-Zero SIV
- Symbolic RDIV
- GCD MIV
It does NOT cover:
- Exact SIV
- Exact RDIV
- Banerjee MIV
2025-10-30 | [Clang][AArch64] Lower NEON vaddv/vminv/vmaxv builtins to llvm.vector.reduce intrinsics. (#165400) | Paul Walker
This is the first step in removing some NEON reduction intrinsics that duplicate the behaviour of their llvm.vector.reduce counterparts.
NOTE: The i8/i16 variants differ in that the NEON versions return an i32 result. However, this looks to be more about making their code generation convenient, with SelectionDAG discarding the extra bits. This is only relevant for the next phase because the Clang usage always truncates the result, making llvm.vector.reduce a drop-in replacement.
2025-10-30 | [lldb-dap][test] skip io_redirection in debug builds (#165593) | Ebuka Ezike
Currently all `runInTerminal` tests are skipped in debug builds because attaching times out while parsing the debug symbols of lldb-dap. Add this test to the skipped set, since it also runs in a terminal.
2025-10-30 | Revert "[lldb-dap] Improving consistency of tests by removing concurrency." (#165688) | David Spickett
Reverts llvm/llvm-project#165496
Due to flaky failures on Arm 32-bit since this change. Detailed in https://github.com/llvm/llvm-project/pull/165496#issuecomment-3467209089.
2025-10-30 | [clang][OpenMP] New OpenMP 6.0 threadset clause (#135807) | Ritanya-B-Bharadwaj
Initial parsing/sema/codegen support for threadset clause in task and taskloop directives [Section 14.8 in OpenMP 6.0 spec]
2025-10-30 | [clang][NFC] Make ellipse strings constexpr (#165680) | Timm Baeder
Also rename map to Map, remove the m_ prefix from member variables and fix the naming of the existing color variables.
2025-10-30 | [ORC] Fix missing include for MemoryAccess interface (NFC) (#165576) | Stefan Gränitz
MemoryAccess base class was included from Core.h when it was a subclass of ExecutorProcessControl, but this changed in 0faa181434cf959110651fe974bef31e7390eba8
2025-10-30 | [AMDGPU] insert eof white space (#165673) | Pankaj Dwivedi
2025-10-30 | [GVN] Add tests for pointer replacement with different addr size (NFC) | Nikita Popov
2025-10-30 | [libc++] Fix LLVM 22 TODOs (#153367) | Nikolas Klauser
We've upgraded to LLVM 22 now, so we can remove a bunch of TODOs.
2025-10-30 | [DeveloperPolicy] Add guidelines for adding/enabling passes (#158591) | Nikita Popov
This documents two things:
* The recommended way to go about adding a new pass.
* The criteria for enabling a pass.
RFC: https://discourse.llvm.org/t/rfc-guidelines-for-adding-enabling-new-passes/88290
2025-10-30 | [MemCpyOpt] Allow stack move optimization if one address captured (#165527) | Nikita Popov
Allow the stack move optimization (which merges two allocas) when the address of only one alloca is captured (and the provenance is not captured). Both addresses need to be captured to observe that the allocas were merged. Fixes https://github.com/llvm/llvm-project/issues/165484.
2025-10-30 | [DebugInfo] Add bit size to _BitInt name in debug info (#165583) | Orlando Cazalet-Hyams
Follow-on from #164372.
This changes the DW_AT_name for `_BitInt(N)` from `_BitInt` to `_BitInt(N)`.
2025-10-30 | [AArch64][GlobalISel] Add some GISel test coverage for icmp-and tests. NFC | David Green
2025-10-30 | [clang] Update C++ DR status page | Vlad Serebrennikov
2025-10-30 | [utils][UpdateTestChecks] Extract MIR functionality into separate mir.py module (#165535) | Valery Pykhtin
This commit extracts some MIR-related code from `common.py` and `update_mir_test_checks.py` into a dedicated `mir.py` module to improve code organization. This is a preparation step for https://github.com/llvm/llvm-project/pull/164965 and also moves some pieces already moved by https://github.com/llvm/llvm-project/pull/140296.
All code is intentionally moved verbatim with minimal necessary adaptations:
* `log()` calls converted to `print(..., file=sys.stderr)` at `mir.py` lines 62, 64 due to a `log` locality.
2025-10-30 | [clang] Add Bytes/Columns types to TextDiagnostic (#165541) | Timm Baeder
In `TextDiagnostic.cpp`, we're using column and byte indices everywhere, but we were using integers for them, which made it hard to know what to pass where and what was produced. To make matters worse, what `SourceManager` considers a "column" is a byte in `TextDiagnostic`.
Add `Bytes` and `Columns` structs, which are not related, so APIs using them can differentiate between values interpreted as columns or as bytes.
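The benefit of unrelated wrapper types can be sketched in a few lines. This is only an illustration of the idea, not the Clang code: the two classes borrow the names from the commit message, while `byte_to_column` and its one-code-point-per-column rule are assumptions made for the example.

```python
# Illustrative sketch only; not the TextDiagnostic implementation.
from dataclasses import dataclass


@dataclass(frozen=True)
class Bytes:
    """A byte offset into a UTF-8 encoded source line."""
    value: int


@dataclass(frozen=True)
class Columns:
    """A display column; unrelated to Bytes on purpose."""
    value: int


def byte_to_column(line: bytes, offset: Bytes) -> Columns:
    """Convert a byte offset to a column (hypothetical helper).

    Assumes the offset lies on a character boundary and that every
    code point occupies exactly one column.
    """
    if not isinstance(offset, Bytes):
        raise TypeError("expected a Bytes offset")
    return Columns(len(line[: offset.value].decode("utf-8")))
```

Because the two wrappers never compare equal and are rejected where the other is expected, passing a column where a byte offset is required fails immediately instead of silently producing a wrong position.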
2025-10-30 | [mlir] Fix use-after-move issues (#165660) | Slava Gurevich
This patch addresses two use-after-move issues:
1. `Timing.cpp`: A variable was std::moved and then immediately passed to an `assert()` check. Since the moved-from state made the assertion condition trivially true, the check was effectively useless. The `assert()` is removed.
2. `Query.cpp`: The `matcher` object was moved-from and then subsequently used as if it still retained valid state. The fix ensures no subsequent use for the moved-from variable.
Testing: `ninja check-mlir`
2025-10-30 | [AMDGPU] Enable "amdgpu-uniform-intrinsic-combine" pass in pipeline. (#162819) | Pankaj Dwivedi
This PR enables the AMDGPUUniformIntrinsicCombine pass in the llc pipeline. It also introduces the "amdgpu-uniform-intrinsic-combine" command-line flag to enable/disable the pass. See the PR: https://github.com/llvm/llvm-project/pull/116953
2025-10-29 | [MLIR] Apply clang-tidy fixes for llvm-qualified-auto in Vectorization.cpp (NFC) | Mehdi Amini
2025-10-30 | [LV] Only skip scalarization overhead for members used as address. | Florian Hahn
Refine the logic for scalarizing interleave group members: only skip scalarization overhead for members used as addresses. For the others, use the regular scalar memory op cost.
This currently doesn't trigger in practice as far as I could find, but fixes a potential divergence between the VPlan and legacy cost models. It fixes a concrete divergence with a follow-up patch, https://github.com/llvm/llvm-project/pull/161276.
2025-10-30 | [bazel][mlir] Port #165629: ControlFlowTransforms deps (#165646) | Jordan Rupprecht
2025-10-29 | [mlir][CF] Add structural type conversion patterns (#165629) | Matthias Springer
Add structural type conversion patterns for CF dialect ops. These patterns are similar to the SCF structural type conversion patterns. This commit adds missing functionality and is in preparation of #165180, which changes the way blocks are converted. (Only entry blocks are converted.)
2025-10-29 | [clang-shlib] Fix linking libclang-cpp on Haiku (#156401) | Brad Smith
Haiku requires linking in libnetwork.
Co-authored-by: Jérôme Duval <jerome.duval@gmail.com>
2025-10-29 | [llvm] Proofread HowToSubmitABug.rst (#165511) | Kazu Hirata
2025-10-29 | [WebAssembly] Remove a redundant cast (NFC) (#165508) | Kazu Hirata
Local is already of type unsigned.
2025-10-29 | [MC] Remove a duplicate #include (NFC) (#165507) | Kazu Hirata
Identified with readability-duplicate-include.
2025-10-29 | [acc] Expand OpenACCSupport to provide getRecipeName and emitNYI (#165628) | Razvan Lupusoru
Extends OpenACCSupport utilities to include recipe name generation and error reporting for unsupported features, providing foundation for variable privatization handling.
Changes:
- Add RecipeKind enum (private, firstprivate, reduction) for APIs that request a specific kind of recipe
- Add getRecipeName() API to OpenACCSupport and OpenACCUtils that generates recipe names from types (e.g., "privatization_memref_5x10xf32_")
- Add emitNYI() API to OpenACCSupport for graceful handling of not-yet-implemented cases
- Generalize MemRefPointerLikeModel template to support UnrankedMemRefType
- Add unit tests and integration tests for new APIs
2025-10-29 | [LLDB][Windows]: Don't pass duplicate HANDLEs to CreateProcess (#165281) | lb90
CreateProcess fails with ERROR_INVALID_PARAMETER when duplicate HANDLEs are passed via `PROC_THREAD_ATTRIBUTE_HANDLE_LIST`. This can happen, for example, if stdout and stdin are the same device (e.g. a bidirectional named pipe), or if stdout and stderr are the same device. Fixes https://github.com/msys2/MINGW-packages/issues/26030
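The remedy amounts to removing repeated entries before building the handle list. The sketch below shows an order-preserving deduplication on plain integers; `unique_handles` is a hypothetical name, and the actual fix lives in LLDB's C++ Windows launcher rather than in code like this.

```python
# Illustrative sketch only: handles are modelled as plain integers.
def unique_handles(handles):
    """Drop repeated handle values while keeping first-seen order."""
    return list(dict.fromkeys(handles))
```

For example, when stdout and stderr refer to the same device, a list such as `[stdin, out, out]` collapses to `[stdin, out]`.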
2025-10-29 | [flang][rt] Add install target for header files (#165610) | Valentin Clement (バレンタイン クレメン)
2025-10-29 | [HashRecognize] Forbid optz when data.next has exit-block user (#165574) | Ramkumar Ramachandra
The CRC optimization relies on stripping the auxiliary data completely, and should hence be forbidden when it has a user in the exit-block. Forbid this case, fixing a miscompile. Fixes #165382.
2025-10-29 | [ARM] Add instruction selection for strict FP (#160696) | Erik Enikeev
This consists of marking the various strict opcodes as legal, and adjusting instruction selection patterns so that 'op' is 'any_op'. The changes are similar to those in D114946 for AArch64. Custom lowering and promotion are set for some FP16 strict ops to work correctly. This PR is part of the work on adding strict FP support in ARM, which was previously discussed in #137101.
2025-10-29 | [dfsan] Fix getShadowAddress computation (#162864) | anoopkg6
Fix getShadowAddress computation by adding ShadowBase if it is not zero.
Co-authored-by: anoopkg6 <anoopkg6@github.com>
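As a rough model of what the fix changes, the sketch below assumes an MSan-style mapping in which the application address is first masked into an offset and the shadow base is then added. Only `ShadowBase` is named in the commit message; the mask parameters, the function names, and any constants used with them are placeholders chosen for illustration.

```python
# Illustrative model only; assumes an offset-plus-base shadow mapping.
def shadow_offset(addr, and_mask, xor_mask):
    """Map an application address to an offset (assumed masking scheme)."""
    offset = addr
    if and_mask:
        offset &= ~and_mask
    if xor_mask:
        offset ^= xor_mask
    return offset


def shadow_address(addr, and_mask, xor_mask, shadow_base):
    """Add the shadow base to the offset when the base is non-zero."""
    offset = shadow_offset(addr, and_mask, xor_mask)
    if shadow_base:
        offset += shadow_base
    return offset
```

With a zero base the result equals the bare offset, which is consistent with the omission going unnoticed on targets whose shadow base is zero.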
2025-10-29 | [clang-format][NFC] Port FormatTestComments to verifyFormat (#164310) | Björn Schäpers
And reduce the number of getLLVMStyleWithColumnLimit calls.
2025-10-29 | [lldb-dap] Improving consistency of tests by removing concurrency. (#165496) | John Harrison
We currently use a background thread to read the DAP output. This means the test thread and the background thread can race at times and we may have inconsistent timing due to these races. To improve the consistency I've removed the reader thread and instead switched to using the `selectors` module that wraps `select` in a platform independent way.
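A single-threaded read with a timeout built on `selectors` can look like the sketch below. It is not the test harness code: `read_available` is a hypothetical helper, and a socket stands in for the DAP output stream so the example stays portable.

```python
# Illustrative sketch only; a socket stands in for the DAP output stream.
import selectors
import socket


def read_available(sock, timeout):
    """Return data read from sock, or None if nothing arrives before timeout."""
    with selectors.DefaultSelector() as sel:
        sel.register(sock, selectors.EVENT_READ)
        if not sel.select(timeout):
            return None  # timed out; no background thread is involved
        return sock.recv(4096)
```

Because the caller both waits and reads on the same thread, there is no second thread to race with.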
2025-10-29 | [DirectX] Use an allow-list of DXIL compatible module metadata (#165290) | Finn Plummer
This PR introduces an allow-list for module metadata. It encompasses the LLVM metadata nodes `llvm.ident` and `llvm.module.flags`, as well as the generated `dx.` options. Resolves: #164473.
2025-10-29 | [mlir][sparse] Include sparse emit strategy in wrapping iterator (#165611) | Jordan Rupprecht
When we create a `SparseIterator`, we sometimes wrap it in a `FilterIterator`, which delegates _some_ calls to the underlying `SparseIterator`. After construction, e.g. in `makeNonEmptySubSectIterator()`, we call `setSparseEmitStrategy()`. This sets the strategy only in one of the iterators -- if we call `setSparseEmitStrategy()` immediately after creating the `SparseIterator`, then the wrapped `SparseIterator` will have the right strategy, and the `FilterIterator` strategy will be uninitialized; if we call `setSparseEmitStrategy()` after wrapping the iterator in `FilterIterator`, then the opposite happens.
If we make `setSparseEmitStrategy()` a virtual method so that it's included in the `FilterIterator` pattern, and then do all reads of `emitStrategy` via a virtual method as well, it's pretty simple to ensure that the value of `strategy` is being set consistently and correctly.
Without this, the UB of strategy being uninitialized manifests as a sporadic test failure in mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_strided_conv_2d_nhwc_hwcf.mlir, when run downstream with the right flags (e.g. asan + assertions off). The test sometimes fails with `'ne_sub<trivial<dense[0,1]>>.begin' op created with unregistered dialect`. It can also be directly observed w/ msan that this uninitialized read is the cause of that issue, but msan causes other problems w/ this test.
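The delegation pattern described above can be sketched in a few lines. The class names mirror the commit message, but the method names and bodies are assumptions for illustration only and are not the MLIR implementation. In Python every method is dispatched dynamically, which plays the role of the C++ virtual methods.

```python
# Illustrative sketch only; not the MLIR sparse iterator classes.
class SparseIterator:
    def __init__(self):
        self._emit_strategy = None  # stands in for the uninitialized field

    def set_sparse_emit_strategy(self, strategy):
        self._emit_strategy = strategy

    def emit_strategy(self):
        return self._emit_strategy


class FilterIterator(SparseIterator):
    """Wrapper that forwards both the setter and the getter."""

    def __init__(self, wrapped):
        super().__init__()
        self._wrapped = wrapped

    def set_sparse_emit_strategy(self, strategy):
        self._wrapped.set_sparse_emit_strategy(strategy)

    def emit_strategy(self):
        return self._wrapped.emit_strategy()
```

Since all reads and writes go through the wrapped object, the value is the same whether the strategy is set before or after wrapping.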
2025-10-29 | [gn build] Port e9389436e5ea | LLVM GN Syncbot