| Age | Commit message (Collapse) | Author |
|
Ensure live intervals for EXEC and SCC are removed on all paths which
generate instructions.
|
|
Create utility mechanism for finding wave size dependent opcodes used to
manipulate exec/lane masks.
|
|
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:
template <typename PointeeType, unsigned N>
class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N>
{};
We only have 140 instances that rely on this "redirection", with the
vast majority of them under llvm/. Since relying on the redirection
doesn't improve readability, this patch replaces SmallSet with
SmallPtrSet for pointer element types.
|
|
Change dominator tree updates to also handle post dominator tree.
|
|
|
|
|
|
|
|
(#109756)
That pass no longer exists, since
5df2af8b0ef33f48b1ee72bcd27bc609b898da52 has merged it into
SIPreEmitPeephole.
|
|
- Add `LiveIntervalsAnalysis`.
- Add `LiveIntervalsPrinterPass`.
- Use `LiveIntervalsWrapperPass` in legacy pass manager.
- Use `std::unique_ptr` instead of raw pointer for `LICalc`, so
destructor and default move constructor can handle it correctly.
This would be the last analysis required by `PHIElimination`.
|
|
- Add `SlotIndexesAnalysis`.
- Add `SlotIndexesPrinterPass`.
- Use `SlotIndexesWrapperPass` in legacy pass.
|
|
- Port `LiveVariables` to new pass manager.
- Convert to `LiveVariablesWrapperPass` in legacy pass manager.
|
|
result (#94571)
Prepare for new pass manager version of `MachineDominatorTreeAnalysis`.
We may need a machine dominator tree version of `DomTreeUpdater` to
handle `SplitCriticalEdge` in some CodeGen passes.
|
|
(#94452)
NFCI; this just preserves SI_INIT_EXEC and SI_INIT_EXEC_FROM_INPUT
instructions a little longer so that we can reliably identify them in
SIWholeQuadMode.
|
|
(#66295)
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.
This matches other nearby enums.
For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
|
|
In emitElse live interval for SI_ELSE source must be recalculated
as SI_ELSE is removed, and new user is placed at block start.
In emitIfBreak live interval for new created AndReg must be
computed.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D158141
|
|
|
|
Accidental error.
This reverts commit 2e823da8dc652b23738e2d3b8e7e7f21335816eb.
|
|
This reverts commit 069f027e1e6b1db9e3e6dcf4193c670e2be3d5d5.
|
|
Update kills in one place that was missed. Fixes a test failure that
would otherwise be introduced by D149651.
|
|
GlobalISel happens to insert some constant materializes before SI_END_CF
in one test. These need to be excluded from AliveBlocks since they
are defined in the original block and used in the split block,
so they aren't fully alive through either block.
The case where the value defined in the first block which was originally used
in a later block is still broken.
Avoids a verifier error in a future patch.
|
|
The availability of LiveIntervals affects kill flags in the output, so
declare the use to avoid strange effects where the output of this pass
is different depending on what other passes are scheduled after it.
Differential Revision: https://reviews.llvm.org/D129555
|
|
This wasn't accounting for the block change in updating LiveVariables.
|
|
comments
|
|
Fix some typos in the amdgpu backend.
Differential Revision: https://reviews.llvm.org/D119235
|
|
Differential Revision: https://reviews.llvm.org/D120025
|
|
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D116819
|
|
The function that optimally inserts the exec mask
restore operations by combining the blocks currently
visits the lowered END_CF pseudos in the forward
direction as it iterates the setvector in the order
the entries are inserted in it.
Due to the absence of BranchFolding at -O0, the
irregularly placed BBs cause the forward traversal
to incorrectly place two unconditional branches in
certain BBs while combining them, especially when
an intervening block later gets optimized away in
subsequent iterations.
It is avoided by reverse iterating the setvector.
The blocks at the bottom of a function will get
optimized first before processing those at the top.
Fixes: SWDEV-315215
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D116273
|
|
Differential Revision: https://reviews.llvm.org/D113672
|
|
https://bugs.llvm.org/show_bug.cgi?id=52204
Differential Revision: https://reviews.llvm.org/D112731
|
|
Differential Revision: https://reviews.llvm.org/D101825
|
|
|
|
TargetPassConfig::addPass takes a "bool verifyAfter" argument which lets
you skip machine verification after a particular pass. Unfortunately
this is used in generic code in TargetPassConfig itself to skip
verification after a generic pass, only because some previous target-
specific pass damaged the MIR on that specific target. This is bad
because problems in one target cause lack of verification for all
targets.
This patch replaces that mechanism with a new MachineFunction property
called "FailsVerification" which can be set by (usually target-specific)
passes that are known to introduce problems. Later passes can reset it
again if they are known to clean up the previous problems.
Differential Revision: https://reviews.llvm.org/D111397
|
|
- When a redundant MBB is being erased from MDT, check whether its
single successor is dominiated by it. If yes, update that successor's
idom before erasing MBB; otherwise, it implies MBB is a leaf node and
could be erased directly.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D111831
|
|
Updating the MachineDominatorTree is easy since SILowerControlFlow only
splits and removes basic blocks. This should save a bit of compile time
because previously we would recompute the dominator tree from scratch
after this pass.
Another reason for doing this is that SILowerControlFlow preserves
LiveIntervals which transitively requires MachineDominatorTree. I think
that means that SILowerControlFlow is obliged to preserve
MachineDominatorTree too as explained here:
https://lists.llvm.org/pipermail/llvm-dev/2020-November/146923.html
although it does not seem to have caused any problems in practice yet.
Differential Revision: https://reviews.llvm.org/D111313
|
|
This way, they can be detected later, e.g. by the
SIOptimizeVGPRLiveRange pass.
Differential Revision: https://reviews.llvm.org/D105467
|
|
This was left over from D94746.
|
|
Add intrinsic which demotes all active lanes to helper lanes.
This is used to implement demote to helper Vulkan extension.
In practice demoting a lane to helper simply means removing it
from the mask of live lanes used for WQM/WWM/Exact mode.
Where the shader does not use WQM, demotes just become kills.
Additionally add llvm.amdgcn.live.mask intrinsic to complement
demote operations. In theory llvm.amdgcn.ps.live can be used
to detect helper lanes; however, ps.live can be moved by LICM.
The movement of ps.live cannot be remedied without changing
its type signature and such a change would require ps.live
users to update as well.
Reviewed By: piotr
Differential Revision: https://reviews.llvm.org/D94747
|
|
Move implementation of kill intrinsics to WQM pass. Add live lane
tracking by updating a stored exec mask when lanes are killed.
Use live lane tracking to enable early termination of shader
at any point in control flow.
Reviewed By: piotr
Differential Revision: https://reviews.llvm.org/D94746
|
|
Frame-base materialization may insert vector instructions before EXEC is initialised.
Fix this by moving lowering of llvm.amdgcn.init.exec later in backend.
Also remove SI_INIT_EXEC_LO pseudo as this is not necessary.
Reviewed By: ruiling
Differential Revision: https://reviews.llvm.org/D94645
|
|
... to reduce headers dependency.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D95036
|
|
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
|
|
|
the null MBB pointer in MF->splice
Detailed description: This change addresses the refactoring adviced by foad. It also contain the fix for the case when getNextNode is null if the successor block is the last in MachineFunction.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D90314
|
|
Remove immediate operand from SI_ELSE which indicates if EXEC has
been modified. Instead always emit code that handles EXEC and
remove unnecessary instructions during pre-RA optimisation.
This facilitates passes (i.e. SIWholeQuadMode) adding exec mask
manipulation post control flow lowering, and pre control flow
lower passes do not need to be aware of SI_ELSE handling.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D89644
|
|
MBB layout if it can fallthrough
removeMBBifRedundant normally tries to keep predecessors fallthrough when removing redundant MBB.
It has to change MBBs layout to keep the new successor to immediately follow the predecessor of removed MBB.
It only may be allowed in case the new successor itself has no successors to which it fall through.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D89397
|
|
AMDGPU needs this in several places, so consolidate them here.
|
|
Since 6524a7a2b9ca072bd7f7b4355d1230e70c679d2f, this would sometimes
not emit the or to exec at the beginning of the block, where it really
has to be. If there is an instruction that defines one of the source
operands, split the block and turn the si_end_cf into a terminator.
This avoids regressions when regalloc fast is switched to inserting
reloads at the beginning of the block, instead of spills at the end of
the block.
In a future change, this should always split the block.
|
|
optimizeEndCF removes EXEC restoring instruction case this instruction is the only one except the branch to the single successor and that successor contains EXEC mask restoring instruction that was lowered from END_CF belonging to IF_ELSE.
As a result of such optimization we get the basic block with the only one instruction that is a branch to the single successor.
In case the control flow can reach such an empty block from S_CBRANCH_EXEZ/EXECNZ it might happen that spill/reload instructions that were inserted later by register allocator are placed under exec == 0 condition and never execute.
Removing empty block solves the problem.
This change require further work to re-implement LIS updates. Recently, LIS is always nullptr in this pass. To enable it we need another patch to fix many places across the codegen.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D86634
|
|
This has not used tied operands for a long time.
|
|
|