path: root/llvm/lib/CodeGen/BranchFolding.cpp
Age | Commit message | Author
2025-11-12 | Reland "[LoongArch] Add `isSafeToMove` hook to prevent unsafe instruction motion" (#167465) | hev
This patch introduces a new virtual method `TargetInstrInfo::isSafeToMove()` to allow backends to control whether a machine instruction can be safely moved by optimization passes. The `BranchFolder` pass now respects this hook when hoisting common code. By default, all instructions are considered safe to move. For LoongArch, `isSafeToMove()` is overridden to prevent relocation-related instruction sequences (e.g. PC-relative addressing and calls) from being broken by instruction motion. Correspondingly, `isSchedulingBoundary()` is updated to reuse this logic for consistency. Relands #163725
2025-11-11 | Revert "[LoongArch] Add `isSafeToMove` hook to prevent unsafe instruction motion" (#167463) | hev
Reverts llvm/llvm-project#163725
2025-11-07 | [LoongArch] Add `isSafeToMove` hook to prevent unsafe instruction motion (#163725) | hev
This patch introduces a new virtual method `TargetInstrInfo::isSafeToMove()` to allow backends to control whether a machine instruction can be safely moved by optimization passes. The `BranchFolder` pass now respects this hook when hoisting common code. By default, all instructions are considered safe to move. For LoongArch, `isSafeToMove()` is overridden to prevent relocation-related instruction sequences (e.g. PC-relative addressing and calls) from being broken by instruction motion. Correspondingly, `isSchedulingBoundary()` is updated to reuse this logic for consistency. Fixes #163681
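The hook pattern described in this commit message can be sketched roughly as follows. This is a minimal toy model, not the LLVM code: `ToyInstr`, `ToyInstrInfo`, `ToyLoongArchInstrInfo`, the `PartOfRelocSequence` flag, and `countHoistable` are all hypothetical names introduced for illustration, and the real `TargetInstrInfo::isSafeToMove()` signature may differ.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine instruction (not llvm::MachineInstr).
struct ToyInstr {
  std::string Opcode;
  bool PartOfRelocSequence; // assumed flag: part of a relocation-related sequence
};

// Hypothetical stand-in for TargetInstrInfo with the new hook.
struct ToyInstrInfo {
  virtual ~ToyInstrInfo() = default;
  // Default behaviour: every instruction is considered safe to move.
  virtual bool isSafeToMove(const ToyInstr &) const { return true; }
};

// A backend override that refuses to move relocation-related instructions.
struct ToyLoongArchInstrInfo : ToyInstrInfo {
  bool isSafeToMove(const ToyInstr &MI) const override {
    return !MI.PartOfRelocSequence;
  }
};

// Count the leading instructions that are identical in both successors and
// that the target allows to be moved; hoisting stops at the first mismatch
// or the first instruction the hook rejects.
std::size_t countHoistable(const ToyInstrInfo &TII,
                           const std::vector<ToyInstr> &TBB,
                           const std::vector<ToyInstr> &FBB) {
  std::size_t N = 0;
  while (N < TBB.size() && N < FBB.size() &&
         TBB[N].Opcode == FBB[N].Opcode &&
         TII.isSafeToMove(TBB[N]) && TII.isSafeToMove(FBB[N]))
    ++N;
  return N;
}
```

With the default hook every common leading instruction is a hoisting candidate; with the override, hoisting stops at the first instruction marked as part of a relocation sequence.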
2025-08-14 | [BranchFolding] Avoid moving blocks to fall through to an indirect target (#152916) | XChy
Depends on #152591 to fix https://github.com/llvm/llvm-project/issues/149023. As with an EH pad, there is no real advantage in "falling through" to an indirect target of an INLINEASM_BR, and multiple indirect targets of inline asm at the end of a function may be rotated infinitely. Therefore, this patch avoids moving blocks so that they fall through to an indirect target of inline asm.
2025-07-29 | [BranchFolding] Follow up #149999 crash fix | Orlando Cazalet-Hyams
fbf6271c7da20356d7b34583b3711b4126ca1dbb introduced an assertion failure as setDebugValueUndef was called on DBG_LABELs, which isn't allowed and doesn't make sense. Fix by skipping the call for DBG_LABELs and hoisting, in line with the original behaviour.
2025-07-28 | Reapply (2) [BranchFolding] Kill common hoisted debug instructions (#149999) | Orlando Cazalet-Hyams
Reapply #140091. branch-folder hoists common instructions from TBB and FBB into their pred. Without this patch it achieves this by splicing the instructions from TBB and deleting the common ones in FBB. That moves the debug locations and debug instructions from TBB into the pred without modification, which is not ideal. Debug locations are handled in #140063. This patch handles debug instructions - in the simplest way possible, which is to just kill (undef) them. We kill and hoist the ones in FBB as well as TBB because otherwise the fact there's an assignment on the code path is deleted (which might lead to a prior location extending further than it should). There's possibly something we could do to preserve some variable locations in some cases, but this is the easiest not-incorrect thing to do. Note I had to replace the constant DBG_VALUEs to use registers in the test- it turns out setDebugValueUndef doesn't undef constant DBG_VALUEs... which feels wrong to me, but isn't something I want to touch right now. --- Fix end-iterator-dereference and add test.
2025-07-25 | Revert "[BranchFolding] Kill common hoisted debug instructions" (#150632) | Orlando Cazalet-Hyams
Reverts llvm/llvm-project#149999 https://lab.llvm.org/buildbot/#/builders/139/builds/17622
2025-07-25 | Reapply [BranchFolding] Kill common hoisted debug instructions (#149999) | Orlando Cazalet-Hyams
Reapply #140091. branch-folder hoists common instructions from TBB and FBB into their pred. Without this patch it achieves this by splicing the instructions from TBB and deleting the common ones in FBB. That moves the debug locations and debug instructions from TBB into the pred without modification, which is not ideal. Debug locations are handled in #140063. This patch handles debug instructions - in the simplest way possible, which is to just kill (undef) them. We kill and hoist the ones in FBB as well as TBB because otherwise the fact there's an assignment on the code path is deleted (which might lead to a prior location extending further than it should). There's possibly something we could do to preserve some variable locations in some cases, but this is the easiest not-incorrect thing to do. Note I had to replace the constant DBG_VALUEs to use registers in the test- it turns out setDebugValueUndef doesn't undef constant DBG_VALUEs... which feels wrong to me, but isn't something I want to touch right now.
2025-07-21 | Revert "[BranchFolding] Kill common hoisted debug instructions" (#149845) | Orlando Cazalet-Hyams
Reverts llvm/llvm-project#140091 due to crash (see comments for reproducer)
2025-07-21 | [BranchFolding] Kill common hoisted debug instructions (#140091) | Orlando Cazalet-Hyams
branch-folder hoists common instructions from TBB and FBB into their pred. Without this patch it achieves this by splicing the instructions from TBB and deleting the common ones in FBB. That moves the debug locations and debug instructions from TBB into the pred without modification, which is not ideal. Debug locations are handled in pull request 140063. This patch handles debug instructions - in the simplest way possible, which is to just kill (undef) them. We kill and hoist the ones in FBB as well as TBB because otherwise the fact there's an assignment on the code path is deleted (which might lead to a prior location extending further than it should). We might be able to do something smarter to preserve some variable locations in some cases, but this is the easiest not-incorrect thing to do.
2025-07-03 | [DLCov] Origin-Tracking: Collect stack traces in DebugLoc (#146678) | Stephen Tozer
This patch is part of a series that adds origin-tracking to the debugify source location coverage checks, allowing us to report symbolized stack traces of the point where missing source locations appear. This patch adds the logic for collecting stack traces in DebugLoc instances. We do not symbolize the stack traces in this patch - that only happens when we decide to actually print them, which will be the responsibility of debugify. The collection happens in the constructor of a DebugLoc that has neither a valid location nor an annotation; we also collect an extra stack trace every time we call setDebugLoc, as sometimes the more interesting point is not where the DebugLoc was constructed, but where it was applied to an instruction. This takes the form of a getCopied() method on DebugLoc, which is the identity function in normal builds, but adds an extra stack trace in origin-tracking builds.
2025-06-12 | [DLCov][NFC] Propagate annotated DebugLocs through transformations (#138047) | Stephen Tozer
Part of the coverage-tracking feature, following #107279. In order for DebugLoc coverage testing to work, we firstly have to set annotations for intentionally-empty DebugLocs, and secondly we have to ensure that we do not drop these annotations as we propagate DebugLocs throughout compilation. As the annotations exist as part of the DebugLoc class, and not the underlying DILocation, they will not survive a DebugLoc->DILocation->DebugLoc roundtrip. Therefore this patch modifies a number of places in the compiler to propagate DebugLocs directly rather than via the underlying DILocation. This has no effect on the output of normal builds; it only ensures that during coverage builds, we do not incorrectly drop annotations and therefore create false positives. The bulk of these changes are in replacing DILocation::getMergedLocation(s) with a DebugLoc equivalent, and in changing the IRBuilder to store a DebugLoc directly rather than storing DILocations in its general Metadata array. We also use a new function, `DebugLoc::orElse`, which selects the "best" DebugLoc out of a pair (valid location > annotated > empty), preferring the current DebugLoc on a tie - this encapsulates the existing behaviour at a few sites where we _may_ assign a DebugLoc to an existing instruction, while extending the logic to handle annotation DebugLocs at the same time.
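The selection rule stated for `DebugLoc::orElse` (valid location > annotated > empty, current one wins on a tie) can be illustrated with a small toy model. This is only a sketch under that stated rule: `ToyDebugLoc`, `LocKind`, and the `Line` field are hypothetical and do not reflect how the real `DebugLoc` class is implemented.

```cpp
// Hypothetical ranking of location kinds, lowest to highest preference.
enum class LocKind { Empty = 0, Annotated = 1, Valid = 2 };

// Toy stand-in for a debug location (not llvm::DebugLoc).
struct ToyDebugLoc {
  LocKind Kind;
  int Line; // only meaningful when Kind == LocKind::Valid

  // Return the "better" of *this and Other. Other is chosen only when it
  // ranks strictly higher, so the current location is kept on a tie.
  ToyDebugLoc orElse(const ToyDebugLoc &Other) const {
    return static_cast<int>(Other.Kind) > static_cast<int>(Kind) ? Other
                                                                 : *this;
  }
};
```

For example, an empty location combined with an annotated one yields the annotated one, while two valid locations yield the one the method was called on.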
2025-05-22 | [LLVM][CodeGen] Add convenience accessors for MachineFunctionProperties (#140002) | Rahul Joshi
Add per-property has<Prop>/set<Prop>/reset<Prop> functions to MachineFunctionProperties.
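As a rough illustration of what per-property has<Prop>/set<Prop>/reset<Prop> accessors look like, here is a toy class built on `std::bitset`. It is a sketch only: `ToyFunctionProperties` and its two-entry `Property` enum are hypothetical, and the real MachineFunctionProperties class has more properties and may be structured differently.

```cpp
#include <bitset>
#include <cstddef>

// Toy stand-in for a property set (not llvm::MachineFunctionProperties).
class ToyFunctionProperties {
public:
  // Hypothetical subset of properties, chosen for illustration.
  enum class Property : std::size_t { NoPHIs = 0, NoVRegs = 1 };

  // Generic accessors that take the property as an argument.
  bool hasProperty(Property P) const {
    return Bits.test(static_cast<std::size_t>(P));
  }
  ToyFunctionProperties &set(Property P) {
    Bits.set(static_cast<std::size_t>(P));
    return *this;
  }
  ToyFunctionProperties &reset(Property P) {
    Bits.reset(static_cast<std::size_t>(P));
    return *this;
  }

  // Per-property convenience accessors: shorter at the call site than
  // passing the enumerator to the generic functions above.
  bool hasNoPHIs() const { return hasProperty(Property::NoPHIs); }
  ToyFunctionProperties &setNoPHIs() { return set(Property::NoPHIs); }
  ToyFunctionProperties &resetNoPHIs() { return reset(Property::NoPHIs); }

private:
  std::bitset<2> Bits; // one bit per property in the toy enum
};
```

A caller can then write `Props.setNoPHIs()` instead of `Props.set(ToyFunctionProperties::Property::NoPHIs)`.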
2025-05-22 | [BranchFolding] Fix assertion failure in HoistCommonCodeInSuccs (#141028) | Orlando Cazalet-Hyams
Assertion failure introduced in #140063, which didn't account for TBB and FBB being the same block.
2025-05-20 | [BranchFolding] Merge debug locs on common hoisted code (#140063) | Orlando Cazalet-Hyams
branch-folder hoists common instructions from TBB and FBB into their pred. Without this patch it achieves this by splicing the instructions from TBB and deleting the common ones in FBB. That moves the debug locations and debug instructions from TBB into the pred without modification, which is not ideal. The merged instructions should get merged debug locations for debugging and PGO purposes, which is handled in this patch. Debug instructions also need to be handled differently. That'll come in another patch. This issue was found by @omern1.
2025-03-13 | [CodeGen][NPM] Port BranchFolder to NPM (#128858) | Akshat Oke
EnableTailMerge is false by default and is handled by the pass builder. Passes are independent of target pipeline options. This completes the generic `MachineLateOptimization` passes for the NPM pipeline.
2025-03-02 | [CodeGen] Use Register::id() to avoid implicit cast. NFC | Craig Topper
2025-01-22 | [BranchFolding] Remove getBranchDebugLoc() (#114613) | Ellis Hoag
2025-01-21 | [CodeGen] Use MCRegister instead of MCPhysReg in RegisterMaskPair. NFC (#123688) | Craig Topper
Update some other places to avoid implicit conversions this introduces, but I probably missed some.
2025-01-13 | [aarch64][win] Update Called Globals info when updating Call Site info (#122762) | Daniel Paoliello
Fixes the "use after poison" issue introduced by #121516 (see <https://github.com/llvm/llvm-project/pull/121516#issuecomment-2585912395>). The root cause of this issue is that #121516 introduced "Called Global" information for call instructions modeling how "Call Site" info is stored in the machine function, HOWEVER it didn't copy the copy/move/erase operations for call site information. The fix is to rename and update the existing copy/move/erase functions so they also take care of Called Global info.
2024-12-16 | [NFC] Remove some unnecessary semicolons | David Green
All inside LLVM_DEBUG, some of which have been cleaned up by adding block scopes to allow them to format more nicely.
2024-10-28 | Check hasOptSize() in shouldOptimizeForSize() (#112626) | Ellis Hoag
2024-07-26 | [CodeGen] Remove AA parameter of isSafeToMove (#100691) | Pengcheng Wang
This `AA` parameter is not used, and most callers just pass a nullptr. The use of `AA` was removed in 8d0383e.
2024-07-22 | [BranchFolding] Add a hook to override tail merge size (#99025) | Pengcheng Wang
A new hook `TargetInstrInfo::getTailMergeSize()` is added so that targets can override it. This removes an existing TODO.
2024-07-12 | [CodeGen][NewPM] Port `machine-block-freq` to new pass manager (#98317) | paperchalice
- Add `MachineBlockFrequencyAnalysis`.
- Add `MachineBlockFrequencyPrinterPass`.
- Use `MachineBlockFrequencyInfoWrapperPass` in legacy pass manager.
- `LazyMachineBlockFrequencyInfo::print` is empty, drop it due to new pass manager migration.
2024-07-09 | [CodeGen][NewPM] Port `machine-loops` to new pass manager (#97793) | paperchalice
- Add `MachineLoopAnalysis`.
- Add `MachineLoopPrinterPass`.
- Convert to `MachineLoopInfoWrapperPass` in legacy pass manager.
2024-06-28 | Reapply "[CodeGen][NewPM] Port machine-branch-prob to new pass manager" (#96858) (#96869) | paperchalice
This reverts commit ab58b6d58edf6a7c8881044fc716ca435d7a0156. In `CodeGen/Generic/MachineBranchProb.ll`, `llc` crashed with dumped MIR when targeting PowerPC. Move test to `llc/new-pm`, which is X86 specific.
2024-06-27 | Revert "[CodeGen][NewPM] Port machine-branch-prob to new pass manager" (#96858) | paperchalice
Reverts llvm/llvm-project#96389. Some ppc bots failed.
2024-06-27 | [CodeGen][NewPM] Port machine-branch-prob to new pass manager (#96389) | paperchalice
Like IR version `print<branch-prob>`, there is also a `print<machine-branch-prob>`.
2024-06-20 | [BranchFolder] Fix missing debug info with tail merging (#94715) | Alan Zhao
`BranchFolder::TryTailMergeBlocks(...)` removes unconditional branch instructions and then recreates them. However, this process loses debug source location information from the previous branch instruction, even if tail merging doesn't change IR. This patch preserves the debug information from the removed instruction and inserts it into the recreated instruction. Fixes #94050
2024-04-15 | [NFC] Refactor looping over recomputeLiveIns into function (#88040) | Kai Nacke
https://github.com/llvm/llvm-project/pull/79940 put calls to recomputeLiveIns into a loop, to repeatedly call the function until the computation converges. However, this repeats a lot of code. This change moves the loop into a function to simplify the handling. Note that this changes the order in which recomputeLiveIns is called. For example,
```
bool anyChange = false;
do {
  anyChange = recomputeLiveIns(*ExitMBB) || recomputeLiveIns(*LoopMBB);
} while (anyChange);
```
only begins to recompute the live-ins for LoopMBB after the computation for ExitMBB has converged. With this change, the live-ins of all basic blocks are recomputed in each loop iteration. This can result in fewer or more calls, depending on the situation.
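The ordering difference described above comes from `||` short-circuiting. The following toy model makes the call order visible; it is a sketch only, and `ToyBlock`, `PendingChanges`, `recompute`, `shortCircuitOrder`, and `everyBlockOrder` are hypothetical names that do not correspond to the LLVM implementation.

```cpp
#include <initializer_list>
#include <string>
#include <vector>

// Toy block: PendingChanges is the number of recompute calls that will
// still report "live-ins changed" before the block has converged.
struct ToyBlock {
  std::string Name;
  int PendingChanges;
};

// Toy stand-in for recomputeLiveIns: logs the call, returns true on change.
bool recompute(ToyBlock &B, std::vector<std::string> &Log) {
  Log.push_back(B.Name);
  if (B.PendingChanges == 0)
    return false;
  --B.PendingChanges;
  return true;
}

// Pattern from the quoted snippet: because of short-circuit evaluation,
// Loop is only visited in iterations where Exit reported no change.
std::vector<std::string> shortCircuitOrder(ToyBlock Exit, ToyBlock Loop) {
  std::vector<std::string> Log;
  bool AnyChange = false;
  do {
    AnyChange = recompute(Exit, Log) || recompute(Loop, Log);
  } while (AnyChange);
  return Log;
}

// Alternative pattern: every block is recomputed in every iteration.
std::vector<std::string> everyBlockOrder(ToyBlock Exit, ToyBlock Loop) {
  std::vector<std::string> Log;
  bool AnyChange = false;
  do {
    AnyChange = false;
    for (ToyBlock *B : {&Exit, &Loop})
      if (recompute(*B, Log))
        AnyChange = true;
  } while (AnyChange);
  return Log;
}
```

In this toy model, when each block needs one more update the short-circuit version makes five calls and the every-block version four; when only the first block needs two updates, the counts are four and six. That matches the message's remark that the change can go either way.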
2024-01-30 | Refactor recomputeLiveIns to converge on added MachineBasicBlocks (#79940) | Oskar Wirga
This is a fix for the regression seen in https://github.com/llvm/llvm-project/pull/79498
> Currently, recomputeLiveIns recomputes the live-in registers for a single MachineBasicBlock, but the order in which recomputeLiveIns is called matters, which can result in incorrect register allocations down the line.
Now we do not recompute the entire CFG, but we do ensure that the newly added MBBs reach convergence.
2024-01-26 | Revert "Refactor recomputeLiveIns to operate on whole CFG (#79498)" | Nikita Popov
This reverts commit 59bf60519fc30d9d36c86abd83093b068f6b1e4b. Introduces a major compile-time regression.
2024-01-26 | Refactor recomputeLiveIns to operate on whole CFG (#79498) | Oskar Wirga
Currently, recomputeLiveIns recomputes the live-in registers for a single MachineBasicBlock, but the order in which recomputeLiveIns is called matters, which can result in incorrect register allocations down the line. This PR fixes that by simply recomputing the live-ins for the entire CFG until convergence is achieved. This makes it harder to introduce subtle bugs which alter liveness.
2024-01-18 | [BranchFolding] Use isSuccessor to confirm fall through (#77923) | Haohai Wen
When merging blocks, if the previous block has no branch instruction and has one successor, the successor may be an SEH landing pad; the block will then always raise an exception and never fall through to the next block. We cannot merge them in that case. isSuccessor should be used to confirm that the block can fall through to the next block.
2024-01-11 | [BranchFolding] Fix missing predecessors of landing-pad (#77608) | HaohaiWen
When removing an empty machine basic block, all of its successors should be inherited by its fall-through MBB. This keeps the CFG with only one entry, which is required by LiveDebugValues. Reland #77441 as LiveDebugValues test.
2023-11-09 | [BranchFolding] Remove dubious assert from operator< (#71639) | Nikita Popov
`MergePotentialElts::operator<` asserts that the two elements being compared are not equal. However, sorting functions are allowed to invoke the comparison function with equal arguments (though they usually don't for efficiency reasons). There is an existing special-case that disables the assert if _GLIBCXX_DEBUG is used, which may invoke the comparator with equal args to verify strict weak ordering. I believe libc++ also has strict weak ordering checks under some options nowadays. Recently, #71312 was reported, where a change to glibc's qsort_r implementation can also result in comparison between equal elements. From what I understood, this is an inefficiency that will be fixed on the glibc side as well, but I think at this point we should just remove this assertion. Fixes https://github.com/llvm/llvm-project/issues/71312.
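The point about comparators being invoked with equal arguments can be illustrated with a toy element type. This is a sketch under assumptions: `ToyMergeElt`, its `Hash` and `BlockNumber` fields, and `sortedCopy` are hypothetical and only loosely modelled on the idea of ordering merge candidates; they are not the LLVM definitions.

```cpp
#include <algorithm>
#include <vector>

// Toy merge candidate (hypothetical fields, not the LLVM type).
struct ToyMergeElt {
  unsigned Hash;
  int BlockNumber;

  // A strict weak ordering: compare by Hash, then by BlockNumber.
  // There is deliberately no assertion that the operands differ, because a
  // sorting routine may legally call the comparator with equal arguments;
  // in that case the function simply returns false.
  bool operator<(const ToyMergeElt &O) const {
    if (Hash != O.Hash)
      return Hash < O.Hash;
    return BlockNumber < O.BlockNumber;
  }
};

// Sort a copy of the candidates using the comparator above.
std::vector<ToyMergeElt> sortedCopy(std::vector<ToyMergeElt> V) {
  std::sort(V.begin(), V.end());
  return V;
}
```

Comparing an element with itself yields `false`, which is what irreflexivity requires, and sorting works whether or not the library happens to make such a call.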
2023-06-01 | [CodeGen] Make use of MachineInstr::all_defs and all_uses. NFCI. | Jay Foad
Differential Revision: https://reviews.llvm.org/D151424
2023-04-27 | [BranchFolder] Skip redundant IMPLICIT_DEFs of subregs | Jay Foad
Differential Revision: https://reviews.llvm.org/D148509
2023-04-18 | [MC] Use subregs/superregs instead of MCSubRegIterator/MCSuperRegIterator. NFC. | Jay Foad
Differential Revision: https://reviews.llvm.org/D148613
2023-03-29 | Reland "[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 2" | Phoebe Wang
This reverts commit db6a979ae82410e42430e47afa488936ba8e3025. Reland D102817 without any change. The previous revert was a mistake. Differential Revision: https://reviews.llvm.org/D102817
2023-02-06 | Recommit "Improve and enable folding of conditional branches with tail calls." (2nd Try) | Noah Goldstein
Improve and enable folding of conditional branches with tail calls.
1. Make it so that conditional tail calls can be emitted even when there are multiple predecessors.
2. Don't guard the transformation behind -Os. The rationale for guarding it was that static prediction can be affected by whether the branch is forward or backward. This is no longer true for almost any X86 CPU (anything newer than `SnB`), so it is no longer a meaningful concern.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D140931
2023-02-01 | Revert "Improve and enable folding of conditional branches with tail calls." | Mikhail Goncharov
This reverts commit c05ddc9cbc12b1f2038380f57a16c4ca98c614b7. Fails under asan: https://lab.llvm.org/buildbot/#/builders/168/builds/11637
Failed Tests (3):
  LLVM :: CodeGen/X86/jump_sign.ll
  LLVM :: CodeGen/X86/or-branch.ll
  LLVM :: CodeGen/X86/tailcall-extract.ll
2023-02-01 | Improve and enable folding of conditional branches with tail calls. | Noah Goldstein
Improve and enable folding of conditional branches with tail calls.
1. Make it so that conditional tail calls can be emitted even when there are multiple predecessors.
2. Don't guard the transformation behind -Os. The rationale for guarding it was that static prediction can be affected by whether the branch is forward or backward. This is no longer true for almost any X86 CPU (anything newer than `SnB`), so it is no longer a meaningful concern.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D140931
2023-01-13 | [CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC | Craig Topper
Use isPhysical/isVirtual methods.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D141715
2022-12-02 | Revert "[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 2" | tentzen
This reverts commit 1a949c871ab4a6b6d792849d3e8c0fa6958d27f5.
2022-12-01 | [Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 2 | tentzen
This patch is the Part-2 (BE LLVM) implementation of HW Exception handling. Part-1 (FE Clang) was committed in 797ad701522988e212495285dade8efac41a24d4. This new feature adds support for hardware exceptions in Microsoft Windows SEH (Structured Exception Handling).

Compiler options: For clang-cl.exe, the option is -EHa, the same as MSVC. For clang.exe, the extra option is -fasync-exceptions, plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual. NOTE: Without -EHa or -fasync-exceptions, this patch is a NO-DIFF change.

The rules for C code: For C code, one way (the MSVC approach) to achieve SEH -EHa semantics is to follow three rules. First, no exception can move in or out of a _try region, i.e., no potentially faulting instruction can be moved across a _try boundary. Second, the order of exceptions for instructions 'directly' under a _try must be preserved (this does not apply to those in callees). Finally, global state (local/global/heap variables) that can be read outside of the _try region must be updated in memory (not just in a register) before the subsequent exception occurs.

The impact on C++ code: Although SEH is a feature for C code, -EHa does have a profound effect on the C++ side. When a C++ function (in the same compilation unit with option -EHa) is called by a SEH C function, a hardware exception that occurs in C++ code can also be handled properly by an upstream SEH _try-handler or a C++ catch(...). As such, when that happens in the middle of an object's life scope, the dtor must be invoked the same way as for a C++ synchronous exception during the unwinding process.

Design: A natural way to achieve the rules above in LLVM today is to allow an EH edge to be added on memory/computation instructions (the previous iload/istore idea) so that the exception path is modeled precisely in the flow graph. However, tracking every single memory instruction and potentially faulting instruction can create many Invokes, complicate the flow graph, and possibly have a negative performance impact on downstream optimization and code generation. Making all optimizations aware of the new semantics is also a substantial effort. This design does not intend to model the exception path at instruction level. Instead, the proposed design tracks and reports EH state at BLOCK level to reduce the complexity of the flow graph and minimize the performance impact on C++ code under the -EHa option.

One key element of this design is the ability to compute the State number at block level. Our algorithm is based on the following rationales: A _try scope is always a SEME (Single Entry Multiple Exits) region, as jumping into a _try is not allowed. The single entry must start with a seh_try_begin() invoke with a correct State number that is the initial state of the SEME. Through control flow, the state number is propagated into all blocks. Side exits marked by seh_try_end() will unwind to the parent state based on the existing SEHUnwindMap[]. Note that side exits can ONLY jump into parent scopes (lower state number). Thus, when a block receives different states from its predecessors, the lowest State wins. If some exits flow to unreachable, propagation on those paths terminates, not affecting the remaining blocks. For C++ code, an object lifetime region is usually a SEME, like a SEH _try. However, there is one rare exception: jumping into a lifetime that has a Dtor but no Ctor is warned about, but allowed: "Warning: jump bypasses variable with a non-trivial destructor". In that case, the region is actually a MEME (Multiple Entry Multiple Exits). Our solution is to inject an eha_scope_begin() invoke in the side entry block to ensure a correct State.

Implementation: Part-1: Clang implementation (already in; please see commit 797ad701522988e212495285dade8efac41a24d4). Part-2: LLVM implementation, described below. For both C++ and C code, the state of each block is computed at the same place in the BE (WinEHPreparing pass) where all other EH tables/maps are calculated. In addition to _scope_begin & _scope_end, the computation of block state also relies on the existing State tracking code (UnwindMap and InvokeStateMap). For both C++ and C code, the state of each block with a potentially trapping instruction is marked and reported in the DAG Instruction Selection pass, the same place where the state for -EHsc (synchronous exceptions) is handled. If the first instruction in a reported block scope can trap, a Nop is injected before this instruction. This nop is needed to accommodate the LLVM Windows EH implementation, in which the address in the IPToState table is offset by +1 (the purpose of that offset is to ensure the return address of a call is in the same scope as the call address). The handler for catch(...) for -EHa must handle HW exceptions, so its 'adjective' flag is reset (it cannot be IsStdDotDot (0x40), which only catches C++ exceptions). Suppress the push/popTerminate() scope (from noexcept/nothrow) so that HW exceptions can be passed through.

Original llvm-dev [RFC] discussions can be found in these two threads: https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html
Differential Revision: https://reviews.llvm.org/D102817/new/
2022-06-01 | BranchFolder: Require NoPHIs | Matt Arsenault
The pass doesn't handle SSA and breaks any phis.
2022-04-27 | Revert "BranchFolder: Assert on SSA functions" | Matt Arsenault
This reverts commit 6ff91d17d66da46572e97f9a0b042182762cbe9e.
2022-04-27 | BranchFolder: Assert on SSA functions | Matt Arsenault
We probably should have the opposite of getRequiredProperties for this