path: root/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
Age Commit message Author
2025-09-24[AMDGPU] SILowerControlFlow: ensure EXEC/SCC interval recompute (#160459)Carl Ritson
Ensure live intervals for EXEC and SCC are removed on all paths which generate instructions.
2025-09-16[AMDGPU] Refactor out common exec mask opcode patterns (NFCI) (#154718)Carl Ritson
Create utility mechanism for finding wave size dependent opcodes used to manipulate exec/lane masks.
2025-08-18[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068)Kazu Hirata
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer element types: template <typename PointeeType, unsigned N> class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N> {}; We only have 140 instances that rely on this "redirection", with the vast majority of them under llvm/. Since relying on the redirection doesn't improve readability, this patch replaces SmallSet with SmallPtrSet for pointer element types.
2025-08-14[AMDGPU] Preserve post dominator tree through SILowerControlFlow (#153528)Carl Ritson
Change dominator tree updates to also handle post dominator tree.
2025-01-16[AMDGPU][NewPM] Port SILowerControlFlow pass into NPM. (#123045)Christudasan Devadasan
2025-01-16[AMDGPU] Use LV wrapperPass in getAnalysisUsage. (#123044)Christudasan Devadasan
2024-10-18[AMDGPU][NFC] Correct description (#112847)Mariusz Sikora
2024-09-24[AMDGPU][NFC] Update comment referring to SIRemoveShortExecBranches pass (#109756)Fabian Ritter
That pass no longer exists, since 5df2af8b0ef33f48b1ee72bcd27bc609b898da52 has merged it into SIPreEmitPeephole.
2024-07-10[CodeGen][NewPM] Port `LiveIntervals` to new pass manager (#98118)paperchalice
- Add `LiveIntervalsAnalysis`. - Add `LiveIntervalsPrinterPass`. - Use `LiveIntervalsWrapperPass` in legacy pass manager. - Use `std::unique_ptr` instead of raw pointer for `LICalc`, so destructor and default move constructor can handle it correctly. This would be the last analysis required by `PHIElimination`.
2024-07-09[CodeGen][NewPM] Port `SlotIndexes` to new pass manager (#97941)paperchalice
- Add `SlotIndexesAnalysis`. - Add `SlotIndexesPrinterPass`. - Use `SlotIndexesWrapperPass` in legacy pass.
2024-07-09[CodeGen][NewPM] Port `LiveVariables` to new pass manager (#97880)paperchalice
- Port `LiveVariables` to new pass manager. - Convert to `LiveVariablesWrapperPass` in legacy pass manager.
2024-06-11[CodeGen][NewPM] Split `MachineDominatorTree` into a concrete analysis result (#94571)paperchalice
Prepare for new pass manager version of `MachineDominatorTreeAnalysis`. We may need a machine dominator tree version of `DomTreeUpdater` to handle `SplitCriticalEdge` in some CodeGen passes.
2024-06-06[AMDGPU] Move INIT_EXEC lowering from SILowerControlFlow to SIWholeQuadMode (#94452)Jay Foad
NFCI; this just preserves SI_INIT_EXEC and SI_INIT_EXEC_FROM_INPUT instructions a little longer so that we can reliably identify them in SIWholeQuadMode.
2023-09-14[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295)Arthur Eubanks
This will make it easy for callers to see issues with and fix up calls to createTargetMachine after a future change to the params of TargetMachine. This matches other nearby enums. For downstream users, this should be a fairly straightforward replacement, e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive or s/CGFT_/CodeGenFileType::
2023-09-11[AMDGPU] SILowerControlFlow: fix preservation of LiveIntervalsCarl Ritson
In emitElse, the live interval for the SI_ELSE source must be recalculated because SI_ELSE is removed and a new user is placed at the block start. In emitIfBreak, the live interval for the newly created AndReg must be computed. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D158141
2023-06-05[AMDGPU] Make use of MachineInstr::all_defs and all_uses. NFCI.Jay Foad
2023-05-03Revert "Revert "[AMDGPU] Update LiveVariables in SILowerControlFlow""Mateja Marjanovic
Accidental error. This reverts commit 2e823da8dc652b23738e2d3b8e7e7f21335816eb.
2023-05-03Revert "[AMDGPU] Update LiveVariables in SILowerControlFlow"Mateja Marjanovic
This reverts commit 069f027e1e6b1db9e3e6dcf4193c670e2be3d5d5.
2023-05-03[AMDGPU] Update LiveVariables in SILowerControlFlowJay Foad
Update kills in one place that was missed. Fixes a test failure that would otherwise be introduced by D149651.
2023-04-08AMDGPU: Fix LiveVariables verifier error for values defined before SI_END_CFMatt Arsenault
GlobalISel happens to insert some constant materializations before SI_END_CF in one test. These need to be excluded from AliveBlocks since they are defined in the original block and used in the split block, so they aren't fully alive through either block. The case where a value defined in the first block was originally used in a later block is still broken. Avoids a verifier error in a future patch.
2022-07-12[AMDGPU] SILowerControlFlow uses LiveIntervalsJay Foad
The availability of LiveIntervals affects kill flags in the output, so declare the use to avoid strange effects where the output of this pass is different depending on what other passes are scheduled after it. Differential Revision: https://reviews.llvm.org/D129555
2022-04-05AMDGPU: Fix LiveVariables error after lowering SI_END_CFMatt Arsenault
This wasn't accounting for the block change in updating LiveVariables.
2022-03-16[NFC][CodeGen] Rename some functions in MachineInstr.h and remove duplicated commentsShengchen Kan
2022-02-18[AMDGPU][NFC] Fix typosSebastian Neubauer
Fix some typos in the amdgpu backend. Differential Revision: https://reviews.llvm.org/D119235
2022-02-18[AMDGPU] Return better Changed status from SILowerControlFlowJay Foad
Differential Revision: https://reviews.llvm.org/D120025
2022-01-18[AMDGPU] Disable optimizeEndCf at -O0Christudasan Devadasan
Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D116819
2022-01-06[AMDGPU] Iterate LoweredEndCf in the reverse orderChristudasan Devadasan
The function that optimally inserts the exec mask restore operations by combining the blocks currently visits the lowered END_CF pseudos in the forward direction as it iterates the setvector in the order the entries are inserted in it. Due to the absence of BranchFolding at -O0, the irregularly placed BBs cause the forward traversal to incorrectly place two unconditional branches in certain BBs while combining them, especially when an intervening block later gets optimized away in subsequent iterations. It is avoided by reverse iterating the setvector. The blocks at the bottom of a function will get optimized first before processing those at the top. Fixes: SWDEV-315215 Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D116273
2021-11-12[AMDGPU][NFC] Fix typosNeubauer, Sebastian
Differential Revision: https://reviews.llvm.org/D113672
2021-11-02[AMDGPU] Really preserve LiveVariables in SILowerControlFlowJay Foad
https://bugs.llvm.org/show_bug.cgi?id=52204 Differential Revision: https://reviews.llvm.org/D112731
2021-10-26[AMDGPU] Use standard MachineBasicBlock::getFallThrough method. NFCI.Jay Foad
Differential Revision: https://reviews.llvm.org/D101825
2021-10-18[AMDGPU] Add link to bugJay Foad
2021-10-18Add new MachineFunction property FailsVerificationJay Foad
TargetPassConfig::addPass takes a "bool verifyAfter" argument which lets you skip machine verification after a particular pass. Unfortunately this is used in generic code in TargetPassConfig itself to skip verification after a generic pass, only because some previous target-specific pass damaged the MIR on that specific target. This is bad because problems in one target cause lack of verification for all targets. This patch replaces that mechanism with a new MachineFunction property called "FailsVerification" which can be set by (usually target-specific) passes that are known to introduce problems. Later passes can reset it again if they are known to clean up the previous problems. Differential Revision: https://reviews.llvm.org/D111397
2021-10-15[amdgpu] Fix a crash case when preserving MDT in SILowerControlFlowMichael Liao
- When a redundant MBB is being erased from MDT, check whether its single successor is dominated by it. If yes, update that successor's idom before erasing MBB; otherwise, it implies MBB is a leaf node and could be erased directly. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D111831
2021-10-07[AMDGPU] Preserve MachineDominatorTree in SILowerControlFlowJay Foad
Updating the MachineDominatorTree is easy since SILowerControlFlow only splits and removes basic blocks. This should save a bit of compile time because previously we would recompute the dominator tree from scratch after this pass. Another reason for doing this is that SILowerControlFlow preserves LiveIntervals which transitively requires MachineDominatorTree. I think that means that SILowerControlFlow is obliged to preserve MachineDominatorTree too as explained here: https://lists.llvm.org/pipermail/llvm-dev/2020-November/146923.html although it does not seem to have caused any problems in practice yet. Differential Revision: https://reviews.llvm.org/D111313
2021-07-13[AMDGPU] Mark waterfall loops as SI_WATERFALL_LOOPSebastian Neubauer
This way, they can be detected later, e.g. by the SIOptimizeVGPRLiveRange pass. Differential Revision: https://reviews.llvm.org/D105467
2021-07-06[AMDGPU] Remove outdated comment and tidy up. NFC.Jay Foad
This was left over from D94746.
2021-02-15[AMDGPU] Add llvm.amdgcn.wqm.demote intrinsicCarl Ritson
Add intrinsic which demotes all active lanes to helper lanes. This is used to implement demote to helper Vulkan extension. In practice demoting a lane to helper simply means removing it from the mask of live lanes used for WQM/WWM/Exact mode. Where the shader does not use WQM, demotes just become kills. Additionally add llvm.amdgcn.live.mask intrinsic to complement demote operations. In theory llvm.amdgcn.ps.live can be used to detect helper lanes; however, ps.live can be moved by LICM. The movement of ps.live cannot be remedied without changing its type signature and such a change would require ps.live users to update as well. Reviewed By: piotr Differential Revision: https://reviews.llvm.org/D94747
2021-02-11[AMDGPU] Move kill lowering to WQM pass and add live mask trackingCarl Ritson
Move implementation of kill intrinsics to WQM pass. Add live lane tracking by updating a stored exec mask when lanes are killed. Use live lane tracking to enable early termination of shader at any point in control flow. Reviewed By: piotr Differential Revision: https://reviews.llvm.org/D94746
2021-01-25[AMDGPU] Fix llvm.amdgcn.init.exec and frame materializationCarl Ritson
Frame-base materialization may insert vector instructions before EXEC is initialised. Fix this by moving lowering of llvm.amdgcn.init.exec later in backend. Also remove SI_INIT_EXEC_LO pseudo as this is not necessary. Reviewed By: ruiling Differential Revision: https://reviews.llvm.org/D94645
2021-01-20[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargetsdfukalov
... to reduce headers dependency. Reviewed By: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D95036
2021-01-07[NFC][AMDGPU] Reduce include files dependency.dfukalov
Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D93813
2021-01-03[Target] Construct SmallVector with iterator ranges (NFC)Kazu Hirata
2020-10-30[AMDGPU] SILowerControlFlow::removeMBBifRedundant. Refactoring plus fix for the null MBB pointer in MF->splicealex-t
Detailed description: This change addresses the refactoring advised by foad. It also contains the fix for the case when getNextNode is null if the successor block is the last in MachineFunction. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D90314
2020-10-20[AMDGPU] Remove fix up operand from SI_ELSECarl Ritson
Remove immediate operand from SI_ELSE which indicates if EXEC has been modified. Instead always emit code that handles EXEC and remove unnecessary instructions during pre-RA optimisation. This facilitates passes (i.e. SIWholeQuadMode) adding exec mask manipulation post control flow lowering, and pre control flow lower passes do not need to be aware of SI_ELSE handling. Reviewed By: nhaehnle Differential Revision: https://reviews.llvm.org/D89644
2020-10-15[AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB layout if it can fallthroughalex-t
removeMBBifRedundant normally tries to keep the predecessor's fallthrough when removing a redundant MBB. To do so it has to change the MBB layout so that the new successor immediately follows the predecessor of the removed MBB. This may only be allowed when the new successor itself has no successor to which it falls through. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D89397
2020-09-18CodeGen: Move split block utility to MachineBasicBlockMatt Arsenault
AMDGPU needs this in several places, so consolidate them here.
2020-09-18AMDGPU: Don't sometimes allow instructions before lowered si_end_cfMatt Arsenault
Since 6524a7a2b9ca072bd7f7b4355d1230e70c679d2f, this would sometimes not emit the or to exec at the beginning of the block, where it really has to be. If there is an instruction that defines one of the source operands, split the block and turn the si_end_cf into a terminator. This avoids regressions when regalloc fast is switched to inserting reloads at the beginning of the block, instead of spills at the end of the block. In a future change, this should always split the block.
2020-09-07[AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic blockalex-t
optimizeEndCF removes the EXEC-restoring instruction when it is the only instruction in its block other than the branch to the single successor, and that successor contains an EXEC mask restoring instruction that was lowered from an END_CF belonging to an IF_ELSE. As a result of this optimization we get a basic block whose only instruction is a branch to its single successor. If control flow can reach such an empty block from S_CBRANCH_EXECZ/EXECNZ, spill/reload instructions inserted later by the register allocator may be placed under the exec == 0 condition and never execute. Removing the empty block solves the problem. This change requires further work to re-implement LIS updates. Currently, LIS is always nullptr in this pass. To enable it we need another patch to fix many places across the codegen. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D86634
2020-09-03AMDGPU: Remove code to handle tied si_else operandsMatt Arsenault
This has not used tied operands for a long time.
2020-08-21[AMDGPU] Apply llvm-prefer-register-over-unsigned from clang-tidyJay Foad