| Age | Commit message (Collapse) | Author |
|
This is directly available in TargetInstrInfo
|
|
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
|
|
This PR changes how the call to `mcount` is inserted in `emitPrologue`.
It is now emitted as an external symbol rather than a global variable,
preventing potentially unexpected IR modification.
Fixes: https://github.com/llvm/llvm-project/issues/152238
|
|
getInstrInfo() already returns const SystemZInstrInfo *.
|
|
|
|
Commit `083b4a3d66` introduced a store-and-load pair around the `BRASL`
call to mcount. That load instruction did not properly declare its
target register as defined, leading to a bad machine instruction.
This commit fixes this by explicitly labeling `%r14` on the load as
`def`.
|
|
When compiling with `-pg`, the `EntryExitInstrumenterPass` will insert
calls to the glibc function `mcount` at the begining of each
`MachineFunction`.
On SystemZ, these calls require special handling:
- The call to `mcount` needs to happen at the beginning of the prologue.
- Prior to the call to `mcount`, register `%r14`, the return address of
the callee function, must be stored 8 bytes above the stack pointer
`%r15`. After the call to `mcount` returns, that register needs to be
restored.
This commit adds some special handling to the EntryExitInstrumenterPass
that keeps the insertion of the mcount function into the module, but
skips over insertion of the actual call in order to perform this
insertion in the `emitPrologue` function. There, a simple sequence of
store/call/load is inserted, which implements the above.
The desired change in the `EntryExitInstrumenterPass` necessitated the
addition of a new attribute and attribute kind to each function, which
is used to trigger the postprocessing, aka call insertion, in
`emitPrologue`. Note that the new attribute must be of a different kind
than the `mcount` atribute, since otherwise it would replace that
attribute and later be deleted by the code that intended to delete
`mcount`. The new attribnute is called `insert-mcount`, while the
attribute kind is `systemz-backend`, to clearly mark it as a
SystemZ-specific backend concern.
This PR should address issue #121137 . The test inserted here is derived
from the example given in that issue.
|
|
(#128095)
Callee saved registers should always be phyiscal registers. They are
often passed directly to other functions that take MCRegister like
getMinimalPhysRegClass or TargetRegisterClass::contains.
Unfortunately, sometimes the MCRegister is compared to a Register which
gave an ambiguous comparison error when the MCRegister is on the LHS.
Adding a MCRegister==Register comparison operator created more ambiguous
comparison errors elsewhere. These cases were usually comparing against
a base or frame pointer register that is a physical register in a
Register. For those I added an explicit conversion of Register to
MCRegister to fix the error.
|
|
This seems like an oversight when copying code from other backends.
|
|
Identified with misc-include-cleaner.
|
|
Pointed out by @redstar here:
https://github.com/llvm/llvm-project/pull/106014/files#r1816845388
|
|
`TargetFrameLowering::hasFP()` (#106014)
Some targets (e.g. PPC and Hexagon) already did this. I think it's best
to do this consistently so that frontend authors don't run into
inconsistent results when they emit `naked` functions. For example, in
Zig, we had to change our emit code to also set `frame-pointer=none` to
get reliable results across targets.
Note: I don't have commit access.
|
|
MachineFunction's probably should not include a backreference to
the owning MachineModuleInfo. Most of these references were used
just to query the MCContext, which MachineFunction already directly
stores. Other contexts are using it to query the LLVMContext, which
can already be accessed through the IR function reference.
|
|
This PR fixes the following failure by adjusting the calculation of
maximum displacement from Stack Pointer.
`LLVM ERROR: Error while trying to spill R5D from class ADDR64Bit:
Cannot scavenge register without an emergency spill slot!
`
|
|
Use the new getPointerSize() function throughout the frame lowering class.
|
|
The implementation follows the ELF implementation.
|
|
The implementation follows the ELF implementation.
|
|
https://github.com/llvm/llvm-project/pull/79940 put calls to
recomputeLiveIns into
a loop, to repeatedly call the function until the computation converges.
However,
this repeats a lot of code. This changes moves the loop into a function
to simplify
the handling.
Note that this changes the order in which recomputeLiveIns is called.
For example,
```
bool anyChange = false;
do {
anyChange = recomputeLiveIns(*ExitMBB) || recomputeLiveIns(*LoopMBB);
} while (anyChange);
```
only begins to recompute the live-ins for LoopMBB after the computation
for ExitMBB
has converged. With this change, all basic blocks have a recomputation
of the live-ins
for each loop iteration. This can result in less or more calls,
depending on the
situation.
|
|
On SystemZ, the outgoing argument area which is big enough for all calls
in the function is created once during the prolog, as opposed to
adjusting the stack around each call. The call-sequence instructions are
therefore not really useful any more than to compute the maximum call
frame size, which has so far been done by PEI, but can just as well be
done at an earlier point.
This patch removes the mapping of the CallFrameSetupOpcode and
CallFrameDestroyOpcode and instead computes the MaxCallFrameSize
directly after instruction selection and then removes the ADJCALLSTACK
pseudos. This removes the confusing pseudos and also avoids the problem
of having to keep the call frame size accurate when creating new MBBs.
This fixes #76618 which exposed the need to maintain the call frame size
when splitting blocks (which was not done).
|
|
This is a fix for the regression seen in
https://github.com/llvm/llvm-project/pull/79498
> Currently, the way that recomputeLiveIns works is that it will
recompute the livein registers for that MachineBasicBlock but it matters
what order you call recomputeLiveIn which can result in incorrect
register allocations down the line.
Now we do not recompute the entire CFG but we do ensure that the newly
added MBB do reach convergence.
|
|
This reverts commit 59bf60519fc30d9d36c86abd83093b068f6b1e4b.
Introduces a major compile-time regression.
|
|
Currently, the way that recomputeLiveIns works is that it will recompute
the livein registers for that MachineBasicBlock but it matters what
order you call recomputeLiveIn which can result in incorrect register
allocations down the line.
This PR fixes that by simply recomputing the liveins for the entire CFG
until convergence is achieved. This makes it harder to introduce subtle
bugs which alter liveness.
|
|
Adds emitting the exception table and the EH registers for XPLINK.
---------
Co-authored-by: Yusra Syeda <yusra.syeda@ibm.com>
|
|
GCC supports building individual functions with backchain using the
__attribute__((target("backchain"))) syntax, and Clang should too.
Clang translates this into the "target-features"="+backchain" attribute,
and the -mbackchain command-line option into the "backchain" attribute.
The backend currently checks only the latter; furthermore, the backchain
target feature is not defined.
Handle backchain like soft-float. Define a target feature, convert
function attribute into it in getSubtargetImpl(), and check for target
feature instead of function attribute everywhere. Add an end-to-end test
to the Clang testsuite.
|
|
This PR adds vararg support to z/OS and updates the call-zos-vararg.ll
lit test.
Co-authored-by: Yusra Syeda <yusra.syeda@ibm.com>
|
|
This PR moves some calculation out of `LowerCall` and into
`SystemZXPLINKFrameLowering::processFunctionBeforeFrameFinalized`.
We need to make this change because LowerCall isn't invoked for
functions that don't have function calls, and it is required for some
tooling to work correctly. A function that does not make any calls is
required to allocate 32 bytes for the parameter area required by the
ABI. However, we allocate 64 bytes because this additional space is
utilized by certain tools, like the debugger.
Co-authored-by: Yusra Syeda <yusra.syeda@ibm.com>
|
|
call stack frame extension is invoked
When the stack frame extension routine is used, the contents of r3 is overwritten.
However, if r3 is live in the prologue (ie. one of the function's parameters
resides in r3), it needs to be saved. We save r3 in r0 if r0 is available
(ie. r0 is not used as temporary storage for r4), and in the corresponding
stack slot for the third parameter otherwise.
Differential Revision: https://reviews.llvm.org/D150332
Reviewed By: uweigand
|
|
when call stack frame extension is invoked"
This reverts commit 1aec3d15aaa25c39fae026688708d7353d488974.
|
|
call stack frame extension is invoked
When the stack frame extension routine is used, the contents of r3 is
overwritten. However, if r3 is live in the prologue (ie. one of the
function's parameters resides in r3), it needs to be saved. We save
r3 in r0 if r0 is available (ie. r0 is not used as temporary storage
for r4), and in the corresponding stack slot for the third parameter otherwise.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D150332
|
|
storeRegToStackSlot/loadRegFromStackSlot
With D134950, targets get notified when a virtual register is created and/or
cloned. Targets can do the needful with the delegate callback. AMDGPU propagates
the virtual register flags maintained in the target file itself. They are useful
to identify a certain type of machine operands while inserting spill stores and
reloads. Since RegAllocFast spills the physical register itself, there is no way
its virtual register can be mapped back to retrieve the flags. It can be solved
by passing the virtual register as an additional argument. This argument has no
use when the spill interfaces are called during the greedy allocator or even the
PrologEpilogInserter and can pass a null register in such cases.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D138656
|
|
|
|
|
|
This PR adds support for creating leaf functions when there are no CSRs used, no function calls are made, no stack frame is acquired, and contain no try/catch/throw statements.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D129687
|
|
|
|
registers in CSR list instead of determineCalleeSave
This PR moves the handling of special registers that need to be saved/restored in the prolog/epilog respectively from determineCalleeSaves to assignCalleeSavedSpillSlots. The documentation of the parent function of assignCalleeSavedSpillSlots explicitly allows the modification of the CSI hence adding the special registers (the stack pointer register, the return address register, and the entry point register) to the CSI list at that stage should be permissible.
This cleans up the code a bit and makes it so that we do not have to place registers that are not actually considered CSRs by the spec in the CSR list, which is something of a hack.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D125044
|
|
Review: Ulrich Weigand
|
|
This is a (fixed) recommit of https://reviews.llvm.org/D121169
after: 1061034926
before: 1063332844
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
|
|
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
|
|
after: 1061034926
before: 1063332844
Differential Revision: https://reviews.llvm.org/D121169
|
|
XPLINK64
This patch extends support for generating huge stack frames on 64-bit XPLINK by implementing the ABI-mandated call to the stack extension routine.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D120450
|
|
By reordering the objects on the stack frame after looking at the users, a
better utilization of displacement operands will result. This means less
needed Load Address instructions for the accessing of these objects.
This is important for very large functions where otherwise small changes
could cause a lot more/less accesses go out of range.
Note: this is not yet enabled for SystemZXPLINKFrameLowering, but should be.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D115690
|
|
C++17"
This reverts commit ef8206320769ad31422a803a0d6de6077fd231d2.
- It conflicts with the existing llvm::size in STLExtras, which will now
never be called.
- Calling it without llvm:: breaks C++17 compat
|
|
As a conquence move llvm::array_lengthof from STLExtras.h to
STLForwardCompat.h (which is included by STLExtras.h so no build
breakage expected).
|
|
This flag was set in the presence of stacksave/stackrestore in order to force
a frame pointer.
This should however not be needed per the comment in MachineFrameInfo.h
stating that a a variable sized object "...is the sole condition which
prevents frame pointer elimination", and experiments have also shown that
there seems to be no effect whatsoever on code generation with ManipulatesSP.
Review: Ulrich Weigand
|
|
|
|
This patch adds support for prologue and epilogue generation for the z/OS target under the XPLINK64 ABI for functions with a stack size of less than 1048576 bytes (huge stack frames).
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D114457
|
|
This reverts commit ffad4d777b227f91be04020e2cd86ab38e969e39 because it introduced buildbot failures.
|
|
This patch adds support for prologue and epilogue generation for
the z/OS target under the XPLINK64 ABI for functions with a stack
size of less than 1048576 bytes (huge stack frames).
Reviewed by: uweigand, Kai
Differential Revision: https://reviews.llvm.org/D114457
|
|
|
|
This PR implements the save of the XPLINK callee-saved registers
on z/OS.
Reviewed By: uweigand, Kai
Differential Revision: https://reviews.llvm.org/D111653
|