| Age | Commit message (Collapse) | Author |
|
This changes `MCRegUnit` type from `unsigned` to `enum class : unsigned`
and inserts necessary casts.
The added `MCRegUnitToIndex` functor is used with `SparseSet`,
`SparseMultiSet` and `IndexedMap` in a few places.
`MCRegUnit` is opaque to users, so it didn't seem worth making it a
full-fledged class like `Register`.
Static type checking has detected one issue in
`PrologueEpilogueInserter.cpp`, where `BitVector` created for
`MCRegister` is indexed by both `MCRegister` and `MCRegUnit`.
The number of casts could be reduced by using `IndexedMap` in more
places and/or adding a `BitVector` adaptor, but the number of casts *per
file* is still small and `IndexedMap` has limitations, so it didn't seem
worth the effort.
Pull Request: https://github.com/llvm/llvm-project/pull/167943
|
|
(#167931)
The methods will help reduce the number of static_casts after changing
MCRegUnit to a strong typedef.
|
|
|
|
This is directly available in TargetInstrInfo
|
|
Identified with modernize-use-equals-default.
|
|
Now that the old llvm::identity has moved into IndexedMap.h under a
different name, this patch renames identity_cxx20 to identity. Note
that llvm::identity closely models std::identity from C++20.
|
|
The legacy llvm::identity is not quite the same as std::identity from
C++20. llvm::identity is a template struct with an ::argument_type
member. In contrast, llvm::identity_cxx20 (and std::identity) is a
non-template struct with a templated call operator and no
::argument_type.
This patch modernizes llvm::SparseSet by updating its default
key-extraction functor to llvm::identity_cxx20. A new template
parameter KeyT takes over the role of ::argument_type.
Existing uses of SparseSet are updated for the new template signature.
Most use sites are of the form SparseSet<T>, requiring no update.
|
|
This is the fast regalloc equivalent of
773771ba382b1fbcf6acccc0046bfe731541a599.
|
|
(#140002)
Add per-property has<Prop>/set<Prop>/reset<Prop> functions to
MachineFunctionProperties.
|
|
|
|
We have already ensured in 9cec2b246e719533723562950e56c292fe5dd5ad
that `INLINEASM_BR` output operands get spilled onto the stack, both
in the fallthrough path and in the indirect targets. Since reloads of
live-ins values into physical registers contextually happen after all
MIR instructions (and ops) have been visited, make sure such loads are
placed at the start of the block, but after prologues or `INLINEASM_BR`
spills, as otherwise this may cause stale values to be read from the
stack.
Fixes: #74483, #110251.
|
|
|
|
assertions
|
|
Found by msan.
==8138==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x559016395beb in allocVirtRegUndef llvm/lib/CodeGen/RegAllocFast.cpp:1010:6
|
|
(#128281)
|
|
Register::isPhysicalRegister. NFC"
This reverts commit 5fadb3d680909ab30b37eb559f80046b5a17045e.
|
|
Prefer the nonstatic member by converting unsigned to Register instead.
|
|
Register::virtReg2Index. NFC (#125031)
These are the the ones where we already had a Register object being
used. Some places are still using unsigned which I did not convert.
|
|
Update some other places to avoid implicit conversions this introduces,
but I probably missed some.
|
|
|
|
(#120268)
There was a buildbot breakage
(https://lab.llvm.org/buildbot/#/builders/24/builds/3329/steps/11/logs/stdio):
/home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/test/CodeGen/AMDGPU/ran-out-of-registers-error-all-regs-reserved.ll:9:10:
error: CHECK: expected string not found in input
; CHECK: error: <unknown>:0:0: no registers from class available to
allocate in function 'no_registers_from_class_available_to_allocate'
2: ==75198==ERROR: AddressSanitizer: stack-use-after-scope on address
0xfa23f9f1c270 at pc 0xb2660dda9340 bp 0xfffffe8ab340 sp 0xfffffe8ab338
caused by https://github.com/llvm/llvm-project/pull/120184, which made a
partial fix but also renabled the tests. This patch attempts to fix
forward by applying the same fix to the error message highlighted in the
buildbot.
|
|
This reverts commit 1297933f35b4948b4d281259627a72094c407a75.
|
|
Greedy and fast would hit different assertions on undef uses if all
registers in a class were reserved.
|
|
(#119640)
Try to use DiagnosticInfo if every register in the class is reserved
by forcing assignment to a reserved register. Also reduces the number
of redundant errors emitted, particularly with fast.
This is still broken in the case of undef uses. There are additional
complications in greedy and fast, so leave it for a separate fix.
|
|
Improve the non-fatal cases to use DiagnosticInfo, which will now
provide a location. The allocators attempt to report different errors
if it happens to see inline assembly is involved (this detection is
quite unreliable) using srcloc instead of dbgloc. For now, leave this
behavior unchanged. I think reporting the full location and context
function would be more useful.
|
|
(#119575) (#119634)
This reverts commit 40986feda8b1437ed475b144d5b9a208b008782a.
Reapply with fix to prevent temporary Twine from going out of scope.
|
|
Reverts llvm/llvm-project#119485
Breaks builders, details in llvm/llvm-project#119485
|
|
Currently LLVMContext::emitError emits any error as an "inline asm"
error which does not make any sense. InlineAsm appears to be special,
in that it uses a "LocCookie" from srcloc metadata, which looks like
a parallel mechanism to ordinary source line locations. This meant
that other types of failures had degraded source information reported
when available.
Introduce some new generic error types, and only use inline asm
in the appropriate contexts. The DiagnosticInfo types are still
a bit of a mess, and I'm not sure why DiagnosticInfoWithLocationBase
exists instead of just having an optional DiagnosticLocation in the
base class.
DK_Generic is for any error that derives from an IR level instruction,
and thus can pull debug locations directly from it. DK_GenericWithLoc
is functionally the generic codegen error, since it does not depend
on the IR and instead can construct a DiagnosticLocation from the
MI debug location.
|
|
Reverts llvm/llvm-project#106404
Breaks:
https://lab.llvm.org/buildbot/#/builders/169/builds/2590
https://lab.llvm.org/buildbot/#/builders/164/builds/2454
|
|
|
|
|
|
Modifying `auto` to `auto&` to avoid unnecessary copying
|
|
[CodeGen] Change the prototype of regalloc filter function
Change the prototype of the filter function so that we can
filter not just by RegClass. We need to implement more
complicated filter based upon some other info associated
with each register.
Patch provided by: Gang Chen (gangc@amd.com)
|
|
RAII class (#94854)
Modify MachineFunctionProperties in PassModel makes `PassT P;
P.run(...);` not work properly. This is a necessary compromise.
|
|
A SparseSet adds an avoidable layer of indirection and possibly looping
control flow. Avoid this overhead by using a vector to store
UsedInInstrs and PhysRegUses.
To avoid clearing the vector after every instruction, use a
monotonically increasing counter. The two maps are now merged and the
lowest bit indicates whether the use is relevant for the livethrough
handling code only.
|
|
Previously, there was at least one virtual function call for every
allocated register. The only users of this feature are AMDGPU and RISC-V
(RVV), other targets don't use this. To easily identify these cases,
change the default functor to nullptr and don't call it for every
allocated register.
|
|
On x86, many instructions have tied operands, so allocateInstruction
uses the more complex assignment strategy, which computes the assignment
order of virtual defs first. This involves iterating over all register
classes (or register aliases for physical defs) to compute the possible
number of defs per register class.
However, this information is only used for sorting virtual defs and
therefore not required when there's only one virtual def -- which is a
very common case. As iterating over all register classes/aliases is not
cheap, do this only when there's more than one virtual def.
|
|
MachineInstr operand indices can be up 24 bits currently. Use unsigned
as consistent data type for operand indices instead of uint16_t.
|
|
This pull request port `regallocfast` to new pass manager. It exposes
the parameter `filter` to handle different register classes for AMDGPU.
IIUC AMDGPU need to allocate different register classes separately so it
need implement its own `--<reg-class>-regalloc`. Now users can use e.g.
`-passe=regallocfast<filter=sgpr>` to allocate specific register class.
The command line option `--regalloc-npm` is still in work progress, plan
to reuse the syntax of passes, e.g. use
`--regalloc-npm=regallocfast<filter=sgpr>,greedy<filter=vgpr>` to
replace `--sgpr-regalloc` and `--vgpr-regalloc`.
|
|
MachineRegisterInfo already knows the MF so there is no need to pass it
in as an argument.
|
|
Most basic block do not need to query dominates. Defer initialization of
InstrPosIndexes to first query for each MBB.
|
|
|
|
The original brute force dominates algorithm is O(n) complexity so it is
very slow for very large machine basic block which is very common with
O0. This patch added InstrPosIndexes to assign index for each
instruction and use it to determine dominance. The complexity is now
O(1).
|
|
- use more range for
- avoid capturing lambda
- prefer Register type to unsigned
- remove braces around single statement if
|
|
|
|
`MachineInstr.h` is a commonly included file and this includes
`llvm/ADT/SmallSet.h` for one function `getUsedDebugRegs()`, which is
used only in one place.
According to `ClangBuildAnalyzer` (run solely on building LLVM, no other
projects) the second most expensive template to instantiate is the
`SmallSet::insert` method used in the `inline` implementation in
`getUsedDebugRegs()`:
```
**** Templates that took longest to instantiate:
554239 ms: std::unordered_map<int, int> (2826 times, avg 196 ms)
521187 ms: llvm::SmallSet<llvm::Register, 4>::insert (930 times, avg 560
ms)
...
```
By removing this method and putting its implementation in the one call
site we greatly reduce the template instantiation time and reduce the
number of includes.
When copying the implementation, I removed a check on `MO.getReg()` as
this is checked within `MO.isVirtual()`.
Differential Revision: https://reviews.llvm.org/D157720
|
|
When inline assembly code requests more registers than available, the
MachineInstr::emitError function in the RegAllocFast pass emits an error
but doesn't stop the pass, and then the compiler crashes later with an
assertion failure. This commit, mimicking the RegAllocGreedy pass, assigns
a random physical register, and therefore avoids the crash after producing
the diagnostic. This problem has been observed for both rustc and clang,
while it doesn't occur in gcc.
|
|
Differential Revision: https://reviews.llvm.org/D153122
|
|
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D152098
|
|
Differential Revision: https://reviews.llvm.org/D151424
|