path: root/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
Age  Commit message  Author
2025-11-03Reapply: [AMDGPU][UnifyDivergentExitNodes][StructurizeCFG] Add support for ↵Robert Imschweiler
callbr instruction with inline-asm (#152161) (#166195) Reapply #152161 with fixed 'changed' flags.
2025-11-03Revert "[AMDGPU][UnifyDivergentExitNodes][StructurizeCFG] Add support for ↵Robert Imschweiler
callbr instruction with inline-asm" (#166186) Reverts llvm/llvm-project#152161 Need to revert to fix changed logic for the expensive checks.
2025-11-03[AMDGPU][UnifyDivergentExitNodes][StructurizeCFG] Add support for callbr ↵Robert Imschweiler
instruction with inline-asm (#152161) Finishes adding inline-asm callbr support for AMDGPU, started by https://github.com/llvm/llvm-project/pull/149308.
2025-11-02[llvm] Remove redundant typename (NFC) (#166087)Kazu Hirata
Identified with readability-redundant-typename.
2025-10-14[AMDGPU] Improve StructurizeCFG pass performance by using SSAUpdaterBulk. ↵Valery Pykhtin
(#150937) SSAUpdaterBulk replaces legacy SSAUpdater.
2025-09-15[StructurizeCFG] bug fix in zero cost hoist (#157969)Vigneshwar Jayakumar
This fixes a bug where a zero-cost instruction was hoisted to the nearest common dominator, but the hoisted instruction's operands did not dominate that common dominator, causing poison values.
2025-08-28[StructurizeCFG] nested-if zerocost hoist bugfix (#155408)Vigneshwar Jayakumar
When zero-cost instructions were hoisted, the simplifyHoistedPhi function set incoming phi values that did not dominate their use, causing runtime failures. Those values were then set to poison by the rebuildSSA function. This commit fixes the issue.
2025-08-22[llvm] Remove unused includes of SmallSet.h (NFC) (#154893)Kazu Hirata
We just replaced SmallSet<T *, N> with SmallPtrSet<T *, N>, bypassing the redirection found in SmallSet.h. With that, we no longer need to include SmallSet.h in many files.
2025-08-18[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068)Kazu Hirata
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer element types:

    template <typename PointeeType, unsigned N>
    class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N> {};

We only have 140 instances that rely on this "redirection", with the vast majority of them under llvm/. Since relying on the redirection doesn't improve readability, this patch replaces SmallSet with SmallPtrSet for pointer element types.
2025-07-25reland "[StructurizeCFG] Hoist and simplify zero-cost incoming else p… ↵Vigneshwar Jayakumar
(#149744) …hi values (#139605)" This relands commit b11523b494b with the fix for llvm-buildbot failures "clang-hip-vega20" and "openmp-offload-amdgpu-runtime-2". The reland prevents hoisting the phi node, which fixes the issue.

Original PR description:

The order of if and else blocks can introduce unnecessary VGPR copies. Consider the case of an if-else block where the incoming phi from the 'Else block' only contains zero-cost instructions, and the 'Then' block modifies some value. There would be no interference when coalescing because only one value is live at any point before structurization. However, in the structurized CFG, the Then value is live at 'Else' block due to the path if→flow→else, leading to additional VGPR copies.

This patch addresses the issue by:
- Identifying PHI nodes with zero-cost incoming values from the Else block and hoisting those values to the nearest common dominator of the Then and Else blocks.
- Updating Flow PHI nodes by replacing poison entries (on the if→flow edge) with the correct hoisted values.
2025-07-10Revert "[StructurizeCFG] Hoist and simplify zero-cost incoming else phi ↵Vigneshwar Jayakumar
values" (#148016) reverting to fix Buildbot failures.
2025-07-10[StructurizeCFG] Hoist and simplify zero-cost incoming else phi values (#139605)Vigneshwar Jayakumar
The order of if and else blocks can introduce unnecessary VGPR copies. Consider the case of an if-else block where the incoming phi from the 'Else block' only contains zero-cost instructions, and the 'Then' block modifies some value. There would be no interference when coalescing because only one value is live at any point before structurization. However, in the structurized CFG, the Then value is live at 'Else' block due to the path if→flow→else, leading to additional VGPR copies.

This patch addresses the issue by:
- Identifying PHI nodes with zero-cost incoming values from the Else block and hoisting those values to the nearest common dominator of the Then and Else blocks.
- Updating Flow PHI nodes by replacing poison entries (on the if→flow edge) with the correct hoisted values.
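The "nearest common dominator" step mentioned above can be illustrated with a minimal, self-contained sketch. It assumes a toy dominator tree stored as an immediate-dominator array; `ToyDomTree`, `depth`, and `makeDiamond` are hypothetical names for illustration and are not LLVM's `DominatorTree` API or the pass's actual code.

```cpp
#include <cassert>
#include <vector>

// Toy dominator tree (assumption, not LLVM's DominatorTree):
// Idom[B] is the immediate dominator of block B, and the entry
// block (index 0) is recorded as its own immediate dominator.
struct ToyDomTree {
  std::vector<int> Idom;

  // Depth of a block in the dominator tree (the entry has depth 0).
  int depth(int B) const {
    int D = 0;
    while (Idom[B] != B) {
      B = Idom[B];
      ++D;
    }
    return D;
  }

  // Lift the deeper block until both are at the same depth, then
  // walk both upward in lockstep until they meet.
  int nearestCommonDominator(int A, int B) const {
    int DA = depth(A), DB = depth(B);
    while (DA > DB) { A = Idom[A]; --DA; }
    while (DB > DA) { B = Idom[B]; --DB; }
    while (A != B) { A = Idom[A]; B = Idom[B]; }
    return A;
  }
};

// Diamond CFG: 0 = entry, 1 = then, 2 = else, 3 = merge.
// Every block is immediately dominated by the entry.
inline ToyDomTree makeDiamond() { return ToyDomTree{{0, 0, 0, 0}}; }
```

For the diamond, the nearest common dominator of the Then and Else blocks is the entry block, which is where a hoisted zero-cost value would have to be placed so that it is available on both paths.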
2025-05-09[StructurizeCFG] Stop setting DebugLocs in flow blocks (#139088)Emma Pilkington
Flow blocks are generated code that doesn't really correspond to any location in the source, so in principle they should have empty DebugLocs. In practice, setting these debug locs leads to redundant is_stmts being generated after #108251, causing stepping test failures in the ROCm GDB test suite. Fixes SWDEV-502134.
2025-04-19Revert "[StructurizeCFG] Refactor insertConditions. NFC. (#115476)" (#136370)Shilei Tian
2025-03-29[Transforms] Use llvm::append_range (NFC) (#133607)Kazu Hirata
2025-03-27[llvm] Use *Set::insert_range (NFC) (#133353)Kazu Hirata
We can use *Set::insert_range to collapse:

    for (auto Elem : Range)
      Set.insert(Elem.first);

down to:

    Set.insert_range(llvm::make_first_range(Range));

In some cases, we can further fold that into the set declaration.
2025-03-11[AMDGPU] Improve StructurizeCFG pass performance: avoid redundant DebugLoc ↵Valery Pykhtin
map initialization. NFC. (#130568) Previously, the TermDL (BB terminator → DebugLoc) map was initialized at the start of processing each function's region, creating entries for the entire function. This could be inefficient for large functions. This patch improves performance by creating map entries only when needed—when a terminator is being killed or when a flow block is created. Additionally, entries are removed immediately after use, preventing unnecessary map growth and ensuring DebugLocs are not "retracked." A mapless variant was also explored, but due to limited familiarity with the structurizer, it was not pursued further. In my cases, this change improves performance by 2-3×.
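The "create entries only when needed, remove them immediately after use" idea can be sketched as follows. This is a simplified illustration, not the pass's code: block IDs are plain integers, a debug location is a string, and `LazyLocMap`, `remember`, and `take` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a terminator -> DebugLoc map.
class LazyLocMap {
  std::unordered_map<int, std::string> Map;

public:
  // Record a location only at the moment it becomes relevant,
  // instead of pre-populating entries for the whole function.
  void remember(int Block, std::string Loc) { Map[Block] = std::move(Loc); }

  // Hand the location back and erase the entry in the same step,
  // so the map does not grow and the entry cannot be reused later.
  std::optional<std::string> take(int Block) {
    auto It = Map.find(Block);
    if (It == Map.end())
      return std::nullopt;
    std::string Loc = std::move(It->second);
    Map.erase(It);
    return Loc;
  }

  std::size_t size() const { return Map.size(); }
};
```

With this shape the map's size tracks the number of pending entries rather than the number of blocks in the function.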
2025-03-10StructurizeCFG: Use poison instead of undef (#130459)Matt Arsenault
There are a surprising number of codegen changes from this.
2025-03-09[Scalar] Avoid repeated hash lookups (NFC) (#130463)Kazu Hirata
2024-12-10[StructurizeCFG] Use `poison` instead of `undef` as placeholder [NFC] (#119137)Pedro Lobo
2024-11-26[StructurizeCFG] Refactor insertConditions. NFC. (#115476)Jay Foad
This just makes it more obvious that having Parent as the single predecessor is a special case, instead of checking for it in the middle of a loop that finds the nearest common dominator of multiple predecessors.
2024-11-08[StructurizeCFG] Remove one SSAUpdater::AddAvailableValue. NFCI. (#115472)Jay Foad
2024-11-08[StructurizeCFG] Introduce struct PredInfo. NFC. (#115457)Jay Foad
This just provides a neater encapsulation of the info about the predicate for an edge, rather than ValueWeightPair aka std::pair.
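As a rough illustration of why a named struct reads better than a pair, the sketch below contrasts the two shapes. The element types and field names here (`ToyValue`, `ToyWeights`, `Pred`, `Weights`) are assumptions made for the example and may not match the actual definitions in StructurizeCFG.cpp.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical stand-ins for the real IR value and branch-weight types.
using ToyValue = int;
using ToyWeights = std::pair<unsigned, unsigned>;

// Pair-based form: the reader has to remember what .first and
// .second mean at every use site.
using ToyValueWeightPair = std::pair<ToyValue, std::optional<ToyWeights>>;

// Struct-based form: the field names document themselves.
struct ToyPredInfo {
  ToyValue Pred;
  std::optional<ToyWeights> Weights;
};

inline bool hasWeights(const ToyPredInfo &PI) { return PI.Weights.has_value(); }
```

A use site such as `PI.Weights` states its intent directly, whereas `P.second` requires looking up the alias definition.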
2024-11-02[Scalar] Remove unused includes (NFC) (#114645)Kazu Hirata
Identified with misc-include-cleaner.
2024-11-01Reapply "StructurizeCFG: Optimize phi insertion during ssa reconstruction ↵Ruiling, Song
(#101301)" (#114347) This reverts commit be40c723ce2b7bf2690d22039d74d21b2bd5b7cf.
2024-09-25[AMDGPU][StructurizeCFG] Maintain branch MD_prof metadata (#109813)Juan Manuel Martinez Caamaño
Currently `StructurizeCFG` drops branch_weight metadata. This metadata can be generated from user annotations in the source code like:
```cpp
if (...) [[likely]] {
}
```
2024-09-09[StructurizeCFG] Avoid repeated hash lookups (NFC) (#107797)Kazu Hirata
2024-08-12StructurizeCFG: Add SkipUniformRegions pass parameter to new PM version ↵Matt Arsenault
(#102812) Keep respecting the old cl::opt for now.
2024-08-08Revert "StructurizeCFG: Optimize phi insertion during ssa reconstruction ↵Yaxun (Sam) Liu
(#101301)" This reverts commit c62e2a2a4ed69d53a3c6ca5c24ee8d2504d6ba2b. Since it caused regression in HIP buildbot: https://lab.llvm.org/buildbot/#/builders/123/builds/3282
2024-08-08StructurizeCFG: Optimize phi insertion during ssa reconstruction (#101301)Ruiling, Song
After investigating more while-break cases, I think we should try to optimize the way we reconstruct phi nodes. Previously, we reconstructed each phi node separately, but this is not optimal. For example:
```
header:
  %v.1 = phi float [ %v, %entry ], [ %v.2, %latch ]
  br i1 %cc, label %if, label %latch

if:
  %v.if = fadd float %v.1, 1.0
  br i1 %cc2, label %latch, label %exit

latch:
  %v.2 = phi float [ %v.if, %if ], [ %v.1, %header ]
  br i1 %cc3, label %exit, label %header

exit:
  %v.3 = phi float [ %v.2, %latch ], [ %v.if, %if ]
```
For this case, we have different copies of value `v`, but there is at most one copy of value `v` alive at any program point shown above. The existing ssa reconstruction will use the incoming values from the old deleted phi. Below is a possible output after ssa reconstruction.
```
header:
  %v.1 = phi float [ %v, %entry ], [ %v.loop, %flow1 ]
  br i1 %cc, label %if, label %flow

if:
  %v.if = fadd float %v.1, 1.0
  br label %flow

flow:
  %v.exit.if = phi float [ %v.if, %if ], [ undef, %header ]
  %v.latch = phi float [ %v.if, %if ], [ %v.1, %header ]

latch:
  br label %flow1

flow1:
  %v.loop = phi float [ %v.latch, %latch ], [ undef, %flow ]
  %v.exit = phi float [ %v.latch, %latch ], [ %v.exit.if, %flow ]

exit:
  %v.3 = phi float [ %v.exit, %flow1 ]
```
If we look closely, in order to reconstruct `v.1` `v.2` `v.3`, we have two simultaneous copies of `v` alive at `flow` and `flow1`. We depend heavily on the register coalescer to coalesce them together, but the register coalescer may not always be able to do so because of the complexity of the phi chain. On the other hand, given that only one copy of `v` is alive at any program point before the transform, why not simplify the phi network as much as we can?

Look at the incoming values of these PHIs:
```
       header   if     latch
v.1:   --       --     v.2
v.2:   v.1      v.if   --
v.3:   --       v.if   v.2
```
If we let them share the same incoming values for these three different incoming blocks, then we would have only one copy of alive `v` at any program point after ssa reconstruction. Something like:
```
header:
  %v.1 = phi float [ %v, %entry ], [ %v.2, %flow1 ]
  br i1 %cc, label %if, label %flow

if:
  %v.if = fadd float %v.1, 1.0
  br label %flow

flow:
  %v.2 = phi float [ %v.if, %if ], [ %v.1, %header ]

latch:
  br label %flow1

flow1:
  ...

exit:
  %v.3 = phi float [ %v.2, %flow1 ]
```
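The incoming-value table above suggests a simple compatibility check: several phis can share one set of incoming values if no two of them disagree about the value arriving from the same block. The sketch below models only that check, with block and value names as strings. `Incoming` and `mergeIncoming` are hypothetical names, and the real pass may apply further conditions not described in this commit message.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Incoming values of one phi, keyed by incoming block name.
// A missing key corresponds to a "--" cell in the table.
using Incoming = std::map<std::string, std::string>;

// Merge the incoming maps of several phis into one shared map.
// Returns std::nullopt if two phis disagree on the value that
// arrives from the same block, in which case they cannot share.
inline std::optional<Incoming>
mergeIncoming(const std::vector<Incoming> &Phis) {
  Incoming Merged;
  for (const Incoming &Phi : Phis) {
    for (const auto &[Block, Val] : Phi) {
      auto [It, Inserted] = Merged.emplace(Block, Val);
      if (!Inserted && It->second != Val)
        return std::nullopt; // conflicting values for one block
    }
  }
  return Merged;
}
```

Feeding in the three rows of the table (`v.1`, `v.2`, `v.3`) yields a single merged map with `header -> v.1`, `if -> v.if`, and `latch -> v.2`, matching the observation that only one copy of `v` needs to be alive at a time.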
2024-06-28[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)Nikita Popov
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds `getDataLayout()` helpers to Function and GlobalValue, replacing the current `getParent()->getDataLayout()` pattern.
2023-10-25[LowerSwitch] Don't let pass manager handle the dependency (#68662)Ruiling, Song
Some passes have the limitation that they only support simple terminators: branch/unreachable/return. Right now, they ask the pass manager to add the LowerSwitch pass to eliminate `switch`. Let's manage this kind of pass dependency ourselves. Also add assertions in the related passes.
2023-07-22[StructurizeCFG] Use poison instead of undef as placeholder [NFC]Nuno Lopes
These are used to create branch instructions. The condition is patched later
2023-03-14[StructurizeCFG] Correctly depend on UniformityAnalysispvanhout
Small oversight in https://reviews.llvm.org/D145688 - the pass' dependency was not updated to reflect the change to UA. Also, change DivergenceAnalysis to UniformityAnalysis in a comment. That way, StructurizeCFG only refers to UA and not DA anymore.
2023-03-13[StructurizeCFG] Use UniformityAnalysis instead of DivergenceAnalysispvanhout
Depends on D145572 Reviewed By: foad, sameerds Differential Revision: https://reviews.llvm.org/D145688
2022-11-04[StructurizeCFG][DebugInfo] Avoid use-after-freeJuan Manuel MARTINEZ CAAMAÑO
Reviewed By: dstuttard Differential Revision: https://reviews.llvm.org/D137408
2022-10-28[StructurizeCFG][DebugInfo] Maintain DILocations in the branches created by ↵Juan Manuel MARTINEZ CAAMAÑO
StructurizeCFG Make StructurizeCFG preserve the debug locations of the branch instructions it introduces. Differential Revision: https://reviews.llvm.org/D135967
2022-09-29[StructurizeCFG] Remove impossible case and replace by assertJuan Manuel MARTINEZ CAAMAÑO
In addition, replace outdated XFAIL test by a new one. Differential Revision: https://reviews.llvm.org/D134439
2022-09-26StructurizeCFG: Set Undef for non-predecessors in setPhiValues()Ruiling Song
During the structurization process, we may place non-predecessor blocks between the predecessors of a block in the structurized CFG. Take the typical while-break case as an example:
```
 /---A(v=...)
 |  / \
 ^ B   C
 |  \ /|
 \---L |
     \ /
      E (r = phi (v:C)...)
```
After structurization, the CFG would look like:
```
 /---A
 |   |\
 |   | C
 |   |/
 |   F1
 ^   |\
 |   | B
 |   |/
 |   F2
 |   |\
 |   | L
 \   |/
  \--F3
     |
     E
```
We can see that block B is placed between the predecessors (C/L) of E. During phi reconstruction, to achieve the same semantics as before, we are reconstructing the PHIs as:

    F1: v1 = phi (v:C), (undef:A)
    F3: r = phi (v1:F2), ...

But this is also saying that `v1` would be live through B, which is not quite necessary. The idea in the change is to say the incoming value from B is Undef for the PHI in E. With this change, the reconstructed PHI would be:

    F1: v1 = phi (v:C), (undef:A)
    F2: v2 = phi (v1:F1), (undef:B)
    F3: r = phi (v2:F2), ...

Reviewed by: sameerds
Differential Revision: https://reviews.llvm.org/D132450
2022-09-26StructurizeCFG: prefer reduced number of live valuesRuiling Song
The instruction simplification will try to simplify the affected phis. In some cases, this might extend the liveness of values. For example:

    BB0:
     |  \
     |  BB1
     |  /
    BB2: phi (BB0, v), (BB1, undef)

The phi in BB2 will be simplified to v, as v dominates BB2, but this increases the number of active values in BB1. By setting CanUseUndef to false, we will not simplify the phi in this way, which helps register pressure. This is mandatory for the later change that helps reduce VGPR pressure for AMDGPU.

Reviewed by: foad, sameerds
Differential Revision: https://reviews.llvm.org/D132449
2022-08-20[Scalar] Qualify auto in range-based for loops (NFC)Kazu Hirata
Identified with readability-qualified-auto.
2022-08-07[llvm] Qualify auto (NFC)Kazu Hirata
Identified with readability-qualified-auto.
2022-07-14Revert "[StructurizeCFG] Improve basic block ordering"Brendon Cahoon
This reverts commit f1b05a0a2bbbea160002be709f8a1c59de366761. Need to revert due to issues identified with testing. The transformation is incorrect for blocks that contain convergent instructions.
2022-06-22[StructurizeCFG] Improve basic block orderingBrendon Cahoon
StructurizeCFG linearizes the successors of a branching basic block by adding Flow blocks to record the true/false path for branches and back edges. This patch reduces the number of Phi values needed to capture the control flow path by improving the basic block ordering.

Previously, StructurizeCFG added loop exit blocks outside of the loop. StructurizeCFG sets a boolean value to indicate the path taken, and all exit block live values extend to after the loop. For loops with a large number of exit blocks, this creates a huge number of values that are maintained, which increases compilation time and register pressure. This is a problem especially with ASAN, which adds early exits to blocks with unreachable instructions for each instrumented check in the loop.

In specific cases, this patch reduces the number of values needed after the loop by moving the exit block into the loop. This is done for blocks that have a single predecessor and single successor by moving the block to appear just after the predecessor.

Differential Revision: https://reviews.llvm.org/D123231
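The reordering rule described above (a block with a single predecessor and a single successor is moved to sit just after its predecessor) can be sketched on a plain list of block IDs. This is only an illustration of the ordering step under simplified assumptions. `ToyBlock` and `moveAfterSinglePred` are hypothetical names, and the sketch ignores every other legality condition the pass would need.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical block description: an ID plus predecessor and
// successor IDs. Not an LLVM type.
struct ToyBlock {
  int Id;
  std::vector<int> Preds;
  std::vector<int> Succs;
};

// If B has exactly one predecessor and one successor, move it so that
// it appears directly after that predecessor in Order.
// Returns true if the ordering was changed according to the rule.
inline bool moveAfterSinglePred(std::vector<int> &Order, const ToyBlock &B) {
  if (B.Preds.size() != 1 || B.Succs.size() != 1)
    return false;
  auto Self = std::find(Order.begin(), Order.end(), B.Id);
  auto Pred = std::find(Order.begin(), Order.end(), B.Preds.front());
  if (Self == Order.end() || Pred == Order.end())
    return false;
  Order.erase(Self);
  // erase() invalidated the iterators, so look the predecessor up again.
  Pred = std::find(Order.begin(), Order.end(), B.Preds.front());
  Order.insert(Pred + 1, B.Id);
  return true;
}
```

For an ordering `0, 1, 2, 3` where block 3 is an exit block whose only predecessor is block 1, the call produces `0, 1, 3, 2`, placing the exit block inside the loop body's position in the order.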
2022-06-09[NFC] format InstructionSimplify & lowerCaseFunctionNamesSimon Moll
Clang-format InstructionSimplify and convert all "FunctionName"s to "functionName". This patch does touch a lot of files but gets done with the cleanup of InstructionSimplify in one commit. This is the alternative to the less invasive clang-format only patch: D126783 Reviewed By: spatel, rengolin Differential Revision: https://reviews.llvm.org/D126889
2022-03-03Cleanup includes: Transform/Scalarserge-sans-paille
Estimated impact on preprocessor output line: before: 1062981579 after: 1062494547 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120817
2022-02-22[StructurizeCFG] Fix boolean not bugJay Foad
D118623 added code to fold not-of-compare into a compare with the inverted predicate, if the compare had no other uses. This relies on accurate use lists in the IR but it was run before setPhiValues, when some phi inputs are still stored in a data structure on the side, instead of being real uses in the IR. The effect was that a phi that should be using the original compare result would now get an inverted result instead. Fix this by moving simplifyConditions after setPhiValues. Differential Revision: https://reviews.llvm.org/D120312
2022-02-01[StructurizeCFG] Clean up some boolean not instructionsJay Foad
In some cases StructurizeCFG inserts i1 xor instructions to invert predicates. Add a quick loop to clean these up afterwards if we can get away with modifying an existing compare instruction instead. (StructurizeCFG is generally run late in the pipeline so instcombine does not clean them up for us.) Differential Revision: https://reviews.llvm.org/D118623
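A minimal sketch of the cleanup idea: a boolean "not" of a compare can be removed by inverting the compare's predicate, but only when the "not" is the compare's sole user, since any other user still expects the original result. The types below (`Pred`, `ToyCmp`) and the function `foldNotOfCmp` are toy stand-ins invented for this example, not LLVM's `CmpInst` API.

```cpp
#include <cassert>

// Toy integer-compare predicates (assumption, not CmpInst::Predicate).
enum class Pred { EQ, NE, LT, GE, GT, LE };

// Logical inverse of each predicate.
inline Pred inverse(Pred P) {
  switch (P) {
  case Pred::EQ: return Pred::NE;
  case Pred::NE: return Pred::EQ;
  case Pred::LT: return Pred::GE;
  case Pred::GE: return Pred::LT;
  case Pred::GT: return Pred::LE;
  case Pred::LE: return Pred::GT;
  }
  return P; // not reached for valid enumerators
}

struct ToyCmp {
  Pred P;
  int NumUses; // number of instructions reading this compare's result
};

// Models folding `xor (cmp P a, b), true` into `cmp inverse(P) a, b`.
// The fold is only applied when the xor is the compare's single user.
inline bool foldNotOfCmp(ToyCmp &C) {
  if (C.NumUses != 1)
    return false;
  C.P = inverse(C.P);
  return true;
}
```

The guard on `NumUses` shows why the fold depends on accurate use counts: if some users are not yet visible as real uses, the count is too low and the fold can wrongly invert a result that another instruction still needs.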
2021-02-25[Scalar] Use range-based for loops (NFC)Kazu Hirata
2021-02-04[Transforms/Scalar] Use range-based for loops (NFC)Kazu Hirata