Age | Commit message | Author
2025-11-05[dwarf] make dwarf fission compatible with RISCV relaxations 1/2users/dlav-sc/dwarf_split_relaxable_rangeDaniil Avdeev
Currently, -gsplit-dwarf and -mrelax are incompatible options in Clang. The issue is that .dwo files should not contain any relocations, as they are not processed by the linker. However, relaxable code emits relocations in DWARF for debug ranges that reside in the .dwo file when DWARF fission is enabled. This patch makes DWARF fission compatible with RISC-V relaxations. It uses the StartxEndx DWARF forms in .debug_rnglists.dwo, which allow referencing addresses from .debug_addr instead of using absolute addresses. This approach eliminates relocations from .dwo files.
2025-11-05[CIR] Add support for storing into _Atomic variables (#165872)Sirui Mu
2025-11-05[clang-tidy][doc] add more information in twine-local's document (#166266)Congcong Cai
Explain more about use-after-free in llvm-twine-local. Add a note about manually adjusting code after applying the fix-it. Fixes #154810.
2025-11-05[clang][bytecode] Remove dummy variables once they are proper globals (#166174)Timm Baeder
Dummy variables have an entry in `Program::Globals`, but they are not added to `GlobalIndices`. When registering redeclarations, we used to only patch up the global indices, but that left the dummy variables alone. Update the dummy variables of all redeclarations as well. Fixes https://github.com/llvm/llvm-project/issues/165952
2025-11-05[libc][math] Disable `FEnvSafeTest.cpp` if AArch64 target has no FP support (#166370)Victor Campos
The `FEnvSafeTest.cpp` test fails on AArch64 soft nofp configurations because LLVM libc does not provide a floating-point environment in these configurations. This patch adds another preprocessor guard on `__ARM_FP` to disable the test on those.
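A minimal sketch of what a guard on `__ARM_FP` can look like. `__ARM_FP` is the ACLE macro that is defined when hardware floating point is available; the exact condition used in the patch is not shown in this log, so the `#if` below and the helper name `shouldRunFEnvSafeTest` are assumptions made for illustration only.

```cpp
#include <cassert>

// Hypothetical helper, not part of LLVM libc: reports whether a test that
// needs a floating-point environment should be built for this target.
constexpr bool shouldRunFEnvSafeTest() {
#if defined(__aarch64__) && !defined(__ARM_FP)
  // AArch64 without hardware FP (soft nofp): no floating-point environment.
  return false;
#else
  // Any other target, or AArch64 with FP support.
  return true;
#endif
}
```

In a real test file the same condition would typically wrap the test body itself rather than a helper function.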
2025-11-05[clang][bytecode] Print primitive arrays in Descriptor::dumpFull() (#166393)Timm Baeder
And recurse into records properly.
2025-11-05[BOLT][AArch64] Fix search to proceed upwards from memcpy call (#166182)Elvina Yakubova
The search should proceed from CallInst to the beginning of BB since X2 can be rewritten and we need to catch the most recent write before the call. Patch by Yafet Beyene alulayafet@gmail.com
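The backward search described above can be sketched on a toy model. `ToyInst` and `findLastWriteBeforeCall` are hypothetical names invented for this illustration and are not BOLT's data structures or API; the sketch only shows why scanning from the call towards the start of the basic block finds the most recent write to a register.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical, simplified instruction (not BOLT's MCInst).
struct ToyInst {
  bool IsCall;    // true for the call instruction itself
  int WrittenReg; // register written by this instruction, or -1 for none
  long Value;     // immediate written to WrittenReg in this toy model
};

// Walk from the instruction just before the call up to the start of the
// block and return the value of the most recent write to Reg, if any.
std::optional<long> findLastWriteBeforeCall(const std::vector<ToyInst> &BB,
                                            std::size_t CallIdx, int Reg) {
  assert(CallIdx < BB.size() && BB[CallIdx].IsCall);
  for (std::size_t I = CallIdx; I-- > 0;) {
    if (BB[I].WrittenReg == Reg)
      return BB[I].Value; // first hit when scanning upwards is the latest write
  }
  return std::nullopt; // Reg is not written in this block before the call
}
```

A forward scan from the start of the block would instead stop at the oldest write, which may have been overwritten before the call.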
2025-11-05[clang] Delete duplicate code in sourcemanager (#166236)SKill
Now that the `SourceManager::getExpansionLoc` and `SourceManager::getSpellingLoc` functions are efficient, delete the unnecessary duplicated code in the `SourceManager::getDecomposedExpansionLoc` and `SourceManager::getDecomposedSpellingLoc` methods.
2025-11-05[X86] narrowBitOpRMW - allow additional uses of the BTC/R/S result (#166376)Simon Pilgrim
If there are additional uses of the bit twiddled value as well as the rmw store, we can replace them with a (re)loaded copy of the full width integer value after the store. There's some memory op chain handling to handle here - the additional (re)load is chained after the new store and then any dependencies of the original store are chained after the (re)load.
2025-11-05[AMDGPU] Another test for missing S_WAIT_XCNT (#166154)Jay Foad
2025-11-05Revert "CodeGen: Record MMOs in finalizeBundle" (#166520)Jan Patrick Lehr
Reverts llvm/llvm-project#166210. Buildbot failures in the libc on GPU bot: https://lab.llvm.org/buildbot/#/builders/10/builds/16711
2025-11-05[MLIR][NVVM] Update mbarrier Ops to use AnyTypeOf[] (2/n) (#165993)Durgadoss R
This is a follow up of PR #165558. (1/n)
This patch updates the below mbarrier Ops to use AnyTypeOf[] construct:
```
* mbarrier.arrive
* mbarrier.arrive.noComplete
* mbarrier.test.wait
* cp.async.mbarrier.arrive
```
* Updated existing tests accordingly.
* Verified locally that there are no new regressions in the `integration` tests.
* TODO: Two more Ops remain and will be migrated in a subsequent PR.
Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2025-11-05[Clang] Add constexpr support for AVX512 permutex2 intrinsics (#165085)NagaChaitanya Vellanki
This patch enables compile-time evaluation of AVX512 permutex2var intrinsics in constexpr contexts. Extend shuffle generic to handle both integer immediate and vector mask operands. Resolves #161335
2025-11-05[Headers][X86] avx ifma - move constexpr to the end of the function attribute lists. NFC. (#166523)Simon Pilgrim
Makes it easier to compare constexpr/non-constexpr attribute defines. Allows clang-format to pack the attributes more efficiently.
2025-11-05[LV][NFC] Remove undef values in some test cases (#164401)David Sherwood
Split off from PR #163525, this standalone patch replaces simple cases where undef is used as a value for arithmetic or getelementptr instructions. This will reduce the likelihood of contributors hitting the `undef deprecator` warning on GitHub.
2025-11-05[MLIR][ODS] Re-enable direct implementation of type interfaces with method bodies (#166335)Andi Drebes
Since commit 842622bf8bea782e9d9865ed78b0d8643f098122, which added support for overloading interface methods, a `using` directive is emitted for any interface method that does not require emission of a trait method, including for methods that define a method body. However, methods directly specifying a body (e.g., via the `methodBody` parameter of `InterfaceMethod`) are implemented directly in the interface class and are therefore not present in the associated trait. The generated `using` directive then refers to a non-existent method of the trait, resulting in an error upon compilation of the generated code. This patch changes `DefGen::emitTraitMethods()` so that `genTraitMethodUsingDecl()` is no longer invoked for interface methods with a body.
2025-11-05[InstCombine] Enable FoldOpIntoSelect and foldOpIntoPhi when the Op's other parameter is non-const (#166102)Gábor Spaits
This patch enables `FoldOpIntoSelect` and `foldOpIntoPhi` for the cases when Op's second parameter is a non-constant. It doesn't seem to bring significant improvements, but the compile time impact is negligible.
2025-11-05[libc++][NFC] Make __type_info_implementations a namespace (#166339)Nikolas Klauser
There doesn't seem to be much of a reason why this should be a struct. Make it a namespace instead.
2025-11-05[clang-tidy] Rename `cert-dcl58-cpp` to `bugprone-std-namespace-modification` (#165659)mitchell
Closes [#157290](https://github.com/llvm/llvm-project/issues/157290)
2025-11-05[libc++] Remove <cstdlib> include from <exception> (#166340)Nikolas Klauser
2025-11-05Fix bazel build issue caused by #166259 (#166519)Karlo Basioli
2025-11-05[clang] Call ActOnCaseExpr even if the 'case' is missing (#166326)Timm Baeder
This otherwise happens in ParseCaseExpression. If we don't call this, we don't perform the usual arithmetic conversions, etc.
2025-11-05test: correct typo in RUN line (#166511)Saleem Abdulrasool
Correct a typo in the triple that is used for the test. Because the OS was not recognised, it would fall to the non-Windows code generation.
2025-11-05CodeGen: Record MMOs in finalizeBundle (#166210)Nicolai Hähnle
This allows more accurate alias analysis to apply at the bundle level. This has a bunch of minor effects in post-RA scheduling that look mostly beneficial to me, all of them in AMDGPU (the Thumb2 change is cosmetic). The pre-existing (and unchanged) test in CodeGen/MIR/AMDGPU/custom-pseudo-source-values.ll tests that MIR with a bundle with MMOs can be parsed successfully.
v2:
- use cloneMergedMemRefs
- add another test to explicitly check the MMO bundling behavior
v3:
- use poison instead of undef to initialize the global variable in the test
2025-11-05[clang] Accept empty enum in MSVC compatible C (#159981)yicuixi
Fixes https://github.com/llvm/llvm-project/issues/114402. This patch accepts an empty enum in C as a Microsoft extension and introduces a new warning, `-Wmicrosoft-empty-enum`.
Signed-off-by: yicuixi <qin_17914@126.com>
Co-authored-by: Erich Keane <ekeane@nvidia.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
2025-11-04[msan][test] Add some avx512bf16 tests (#166219)Thurston Dang
Forked from llvm/test/CodeGen/X86
2025-11-04[AMDGPU][NFC] Avoid copying MachineOperands (#166293)LU-JOHN
Avoid copying machine operands. Signed-off-by: John Lu <John.Lu@amd.com>
2025-11-05Revert "IR: Remove null UseList checks in hasNUses methods (#165929)" (#166500)Matt Arsenault
This reverts commit 93e860e694770f52a9eeecda88ba11173c291ef8. hasOneUse still has the null check, and it seems bad to be logically inconsistent across multiple of these predicate functions.
2025-11-05AMDGPU: Do not infer implicit inputs for !nocallback intrinsics (#131759)Matt Arsenault
This isn't really the right check; we want to know that the intrinsic does not perform a true function call to any code (in the module or not). nocallback appears to be the closest thing to this property we have now, though. Theoretically fixes miscompiles with intrinsics like statepoint, which hide a call to a real function. Also do the same for inferring no-agpr usage.
2025-11-05[RISCV] Implement shouldFoldMaskToVariableShiftPair (#166159)Sudharsan Veeravalli
Folding a mask to a variable shift pair results in better code size as long as they are scalars that are <= XLen. Similar to https://github.com/llvm/llvm-project/pull/158069
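As a worked illustration, assuming the hook has the meaning its name suggests (choosing between the mask form `x & (-1 << y)` and the two-shift form `(x >> y) << y` for clearing low bits), the two forms compute the same value. The function names below are hypothetical and exist only for this sketch; it uses 32-bit scalars and does not model the RISC-V cost decision itself.

```cpp
#include <cassert>
#include <cstdint>

// Mask form: clear the low Y bits of X by ANDing with (-1 << Y).
uint32_t clearLowBitsMask(uint32_t X, unsigned Y) {
  assert(Y < 32 && "shift amount must be smaller than the bit width");
  return X & (~uint32_t{0} << Y);
}

// Shift-pair form: shift the low Y bits out, then shift zeros back in.
uint32_t clearLowBitsShifts(uint32_t X, unsigned Y) {
  assert(Y < 32 && "shift amount must be smaller than the bit width");
  return (X >> Y) << Y;
}
```

The shift-pair form avoids materializing the all-ones constant and the mask, which is consistent with the code-size argument made in the commit message.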
2025-11-04[CodeGen] Register-coalescer remat fix subreg liveness (#165662)Vigneshwar Jayakumar
This is a bugfix in rematerialization where the liveness of the subreg mask was incorrectly updated, causing a crash in the scheduler.
2025-11-04[msan][NFCI] Generalize handleVectorPmaddIntrinsic() (#166282)Thurston Dang
This generalizes `handleVectorPmaddIntrinsic()`:
- potentially handle floating-point type intrinsics (e.g., `llvm.x86.avx512bf16.dpbf16ps.512`). This usage is not enabled yet.
- "multiplication with an initialized zero guarantees that the corresponding output becomes initialized" is now gated by a parameter
2025-11-04Revert commit d8e5698 and 15b19c7 (#166498)Kewen Meng
2025-11-04[MLIR][XeGPU] Support order attribute and add pattern for vector.transpose in WgToSg Pass (#165307)Nishant Patel
This PR does the following:
1. Handles the order attribute during the delinearization from linear subgroup Id to multi-dim id.
2. Adds a transformation pattern for vector.transpose in the wg to sg pass.
3. Updates CHECKS in the wg to sg tests.
2025-11-05[CIR] Fix assignment ignore in ScalarExprEmitter (#166118)Morris Hafner
We are missing a couple of cases where we are not supposed to ignore assignment results but did so, which results in compiler crashes. Fix that. Also start ignoring IgnoredExprs unless there are side effects (assignments) inside.
2025-11-05[WebAssembly] TableGen-erate SDNode descriptions (#166259)Sergei Barannikov
This allows SDNodes to be validated against their expected type profiles and reduces the number of changes required to add a new node. CALL and RET_CALL do not have a description in td files, and it is not currently possible to add one as these nodes have both variable operands and variable results. This also fixes a subtle bug detected by the enabled verification functionality. `LOCAL_GET` is declared with `SDNPHasChain` property, and thus should have both a chain operand and a chain result. The original code created a node without a chain result, which caused a check in `SDNodeInfo::verifyNode()` to fail. Part of #119709. Pull Request: https://github.com/llvm/llvm-project/pull/166259
2025-11-04[HLSL] Layout Initalizer list in Column order via index conversion (#166277)Farzon Lotfi
Fixes #165663. The bug was that we were using the initializer list's index to populate the matrix. This meant that [0..n] would correlate to the [0..n] indices of the flattened matrix, which is why we were seeing the row-major order: [ 0 1 2 3 4 5 ]. To fix this, we simply convert these indices to the column-major order: [ 0 3 1 4 2 5 ]. The net effect is that the layout of the matrix is now correct and we don't need to change the MatrixSubscriptExpr indexing scheme.
Co-authored-by: Deric C. <cheung.deric@gmail.com>
Co-authored-by: Helena Kotas <hekotas@microsoft.com>
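The index conversion can be sketched as follows. The function name `rowMajorToColumnMajor` is hypothetical and this is not the code from the patch; the matrix shape (3 rows by 2 columns) is also an assumption, chosen because it reproduces the [ 0 3 1 4 2 5 ] sequence quoted above.

```cpp
#include <cassert>
#include <cstddef>

// Map the I-th element of a row-major initializer list to its position in a
// column-major flattened matrix with the given number of rows and columns.
std::size_t rowMajorToColumnMajor(std::size_t I, std::size_t Rows,
                                  std::size_t Cols) {
  assert(I < Rows * Cols && "index out of range for the matrix");
  std::size_t Row = I / Cols; // row of element I in row-major order
  std::size_t Col = I % Cols; // column of element I in row-major order
  return Col * Rows + Row;    // same element's offset in column-major order
}
```

For a 3x2 matrix, initializer elements 0 through 5 land at flattened offsets 0, 3, 1, 4, 2, 5, so only the initializer path changes and subscript indexing can keep assuming column-major storage.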
2025-11-05[libc++][NFC] Removed unsupported compilers from tests (#166403)Hristo Hristov
2025-11-04[MLIR] Fix generate-test-checks.py to not remove every blank lines (#166493)Mehdi Amini
The stripping of the notes was done on a line-by-line basis, which was fragile and led to empty lines being removed everywhere in the file. Instead, we can strip it as a single block before splitting the input into multiple lines.
2025-11-04[flang][cuda][NFC] Move CUDA intrinsics lowering to a separate file (#166461)Valentin Clement (バレンタイン クレメン)
Just move all CUDA related intrinsics lowering to a separate file to avoid clobbering the main Fortran intrinsic file.
2025-11-04[CodeGen] MachineVerifier to check early-clobber constraint (#151421)Abhay Kanhere
Currently, MachineVerifier does not verify the early-clobber operand constraint. The only other machine operand constraint, TiedTo, is already verified.
2025-11-05CodeGen: Record tied virtual register operands in finalizeBundle (#166209)Nicolai Hähnle
This is in preparation of a future AMDGPU change where we are going to create bundles before register allocation and want to rely on the TwoAddressInstructionPass handling those bundles correctly. v2: - simplify the virtual register check and the test
2025-11-05AMDGPU: Add and clarify reserved address spaces (#166486)Nicolai Hähnle
Address spaces 10 and 11 are reserved for future use in the sense that we plan to upstream their use. Address space 12 is used by LLPC. It is used in a workaround for an issue with SMEM accesses to PRT buffers that is specific to the LLPC ecosystem and makes no sense to upstream.
2025-11-05[WebAssembly] Use IRBuilder in FixFunctionBitcasts (NFC) (#164268)Kleis Auke Wolthuizen
Simplifies the code a bit.
2025-11-04[SLU][profcheck] Propagate profile for branches on injected conditions. (#164476)Mircea Trofin
This patch addresses the profile of 2 branches:
- one that compares the 2 limits, for which we have no information (the C1, C2, see https://reviews.llvm.org/D136233)
- one that is conditioned on a condition for which we have a profile, so we reuse it
Issue #147390
2025-11-05AMDGPU: Pre-commit a test (#166414)Nicolai Hähnle
2025-11-04[BOLT] Fix impute-fall-throughs (#166305)Amir Ayupov
BOLT expects pre-aggregated profile entries to be unique, which holds for externally aggregated traces (or branches+fall-through ranges). Therefore, BOLT doesn't merge duplicate entries for faster processing. However, such traces are not expressly prohibited and could come from concatenated pre-aggregated profiles or otherwise. Relax the assumption about no duplicate (branch-only) traces in fall-through imputing. Test Plan: updated callcont-fallthru.s
2025-11-05[libc] Fix fprintf_test assuming specific errnos. (#166479)Michael Jones
The patch #166382 fixed most of these, but missed the fprintf_test ones.
2025-11-05[ProfCheck] Disable X86 AMX Test CaseAiden Grossman
4776451693f4a6bd18e50106edb4b3cfa766484f broke this because it started running an existing pass using the NewPM, which caused ProfCheck to catch existing issues. Disable it for now because we have not started looking at anything in the Codegen pipeline. This pass is also only enabled at O0 or if a function has optnone, so not super critical.
2025-11-05[llvm][mustache] Avoid extra allocations in parseSection (#159199)Paul Kirth
We don't need to have extra allocations when concatenating raw bodies.