| Age | Commit message (Collapse) | Author |
|
Previously, sign-extending a 1-bit boolean operand in `#DBG_VALUE` would
convert `true` to -1 (i.e., 0xffffffffffffffff). However, DWARF treats
booleans as unsigned values, so this resulted in the attribute
`DW_AT_const_value(0xffffffffffffffff)` being emitted. As a result, the
debugger would display the value as `255` instead of `true`.
This change modifies the behavior to use zero-extension for 1-bit values
instead, ensuring that `true` is represented as 1. Consequently, the
DWARF attribute emitted is now `DW_AT_const_value(1)`, which allows the
debugger to correctly display the boolean as `true`.
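
The difference between the two extensions can be sketched in plain C++. The helper names below are hypothetical and are not LLVM APIs; they only model widening a 1-bit value to 64 bits.

```cpp
#include <cstdint>

// Hypothetical helpers (not LLVM APIs) that widen a 1-bit value to 64 bits.

// Sign extension copies the single bit into every higher bit.
constexpr std::uint64_t signExtend1Bit(bool Bit) {
  return Bit ? ~std::uint64_t{0} : std::uint64_t{0};
}

// Zero extension fills the higher bits with zeros.
constexpr std::uint64_t zeroExtend1Bit(bool Bit) {
  return Bit ? std::uint64_t{1} : std::uint64_t{0};
}

// `true` becomes all-ones under sign extension, but 1 under zero extension.
static_assert(signExtend1Bit(true) == 0xffffffffffffffffULL, "sext(true)");
static_assert(zeroExtend1Bit(true) == 1, "zext(true)");
static_assert(signExtend1Bit(false) == zeroExtend1Bit(false), "false is 0");
```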
|
|
I'm not sure if this is the best way forward or not, but we have a lot
of issues with forgetting that shuffle_vectors can be scalar again and
again. (There is another example in the recently added known-bits
code.) As a scalar-dst shuffle vector is just an extract, and a
scalar-source shuffle vector is just a build vector, this patch makes
scalar shuffle vector illegal and adjusts the irbuilder to create the
correct node as required.
Most targets do this already through lowering or combines. Making scalar
shuffles illegal simplifies gisel as a whole; it just requires
transforms that create shuffles of new sizes to account for the scalar
shuffle being illegal (mostly IRBuilder and LessElements).
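
A toy model of the two scalar cases, using `std::vector<int>` in place of MIR registers. All names here are hypothetical, and undef mask entries are not modelled; this is a sketch of the equivalence, not the GlobalISel implementation.

```cpp
#include <cstddef>
#include <vector>

// Toy two-input shuffle: each mask entry indexes the concatenation of A and B.
// Assumes every mask entry is in range; undef entries are not modelled.
std::vector<int> shuffleVectors(const std::vector<int> &A,
                                const std::vector<int> &B,
                                const std::vector<std::size_t> &Mask) {
  std::vector<int> Result;
  Result.reserve(Mask.size());
  for (std::size_t Idx : Mask)
    Result.push_back(Idx < A.size() ? A[Idx] : B[Idx - A.size()]);
  return Result;
}

// Scalar destination: a one-entry mask selects a single element,
// which is exactly an extract.
int extractViaShuffle(const std::vector<int> &A, const std::vector<int> &B,
                      std::size_t Idx) {
  return shuffleVectors(A, B, {Idx}).front();
}

// Scalar sources: both inputs hold one element, so the mask only chooses
// between two scalars, which is exactly a build vector.
std::vector<int> buildVectorViaShuffle(int X, int Y,
                                       const std::vector<std::size_t> &Mask) {
  return shuffleVectors({X}, {Y}, Mask);
}
```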
|
|
This flag applies to G_PTR_ADD instructions and indicates that the operation
implements an inbounds getelementptr operation, i.e., the pointer operand is in
bounds wrt. the allocated object it is based on, and the arithmetic does not
change that.
It is set when the IRTranslator lowers inbounds GEPs (currently only in some
cases, to be extended with a future PR), and in the
(build|materialize)ObjectPtrOffset functions.
Inbounds information is useful in ISel when we have instructions that perform
address computations whose intermediate steps must be in the same memory region
as the final result. A follow-up patch will start using it for AMDGPU's flat
memory instructions, where the immediate offset must not affect the memory
aperture of the address.
This is analogous to a concurrent effort in SDAG: #131862
(related: #140017, #141725).
For SWDEV-516125.
|
|
These functions are for building G_PTR_ADDs when we know that the base
pointer and the result are both valid pointers into (or just after) the
same object. They are similar to SelectionDAG::getObjectPtrOffset.
This PR also changes call sites of the generic (build|materialize)PtrAdd
functions that implement pointer arithmetic to split large memory
accesses to the new functions. Since memory accesses have to fit into an
object in memory, pointer arithmetic to an offset into a large memory
access also yields an address in that object.
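
The reasoning above can be sketched as follows. `AccessPart` and `splitAccess` are hypothetical names for illustration only; the sketch assumes the whole access lies inside one object.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of one piece of a split memory access.
struct AccessPart {
  std::uint64_t Offset; // byte offset from the original base pointer
  std::uint64_t Size;   // number of bytes this piece covers
};

// Split an access of TotalSize bytes into pieces of at most PartSize bytes.
// Every Offset is smaller than TotalSize, so if the whole access fits in one
// object, Base + Offset also points into that object.
std::vector<AccessPart> splitAccess(std::uint64_t TotalSize,
                                    std::uint64_t PartSize) {
  std::vector<AccessPart> Parts;
  if (PartSize == 0)
    return Parts; // nothing sensible to split into
  for (std::uint64_t Offset = 0; Offset < TotalSize; Offset += PartSize) {
    std::uint64_t Remaining = TotalSize - Offset;
    Parts.push_back({Offset, Remaining < PartSize ? Remaining : PartSize});
  }
  return Parts;
}
```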
Currently, these (build|materialize)ObjectPtrOffset functions only add
"nuw" to the generated G_PTR_ADD, but I intend to introduce an
"inbounds" MIFlag in a later PR (analogous to a concurrent effort in
SDAG: #131862, related: #140017, #141725) that will also be set in the
(build|materialize)ObjectPtrOffset functions.
Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's
call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where
offsets are now folded into scratch instructions, and cases where the
behavior of the check regeneration script changed, resulting, e.g., in
better checks for "nusw G_PTR_ADD" instructions, matched empty lines,
and the use of "CHECK-NEXT" in MIPS tests.
For SWDEV-516125.
|
|
instructions (#137701)
This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum`
instructions.
These mirror the `llvm.maximum.*` and `llvm.minimum.*` intrinsics, but
are atomic and use IEEE 754-2019 handling for NaNs, which is different to
`fmax` and `fmin`. See:
https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic
for more details.
Future changes will allow this LLVM IR to be lowered to specialised
assembler instructions on suitable targets, such as AArch64.
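
A non-atomic reference sketch of the NaN difference. `ieeeMaximum` is a hypothetical helper modelling the IEEE 754-2019 `maximum` operation, and `std::fmax` stands in for the `fmax`-style behavior; neither is the atomic lowering itself.

```cpp
#include <cmath>
#include <limits>

// Hypothetical model of IEEE 754-2019 "maximum":
// a NaN operand propagates, and -0.0 is ordered below +0.0.
double ieeeMaximum(double A, double B) {
  if (std::isnan(A) || std::isnan(B))
    return std::numeric_limits<double>::quiet_NaN();
  if (A == 0.0 && B == 0.0)
    return std::signbit(A) ? B : A; // prefer +0.0 when the signs differ
  return A > B ? A : B;
}

// std::fmax, by contrast, returns the non-NaN operand when exactly one
// operand is NaN.
double fmaxStyle(double A, double B) { return std::fmax(A, B); }
```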
|
|
instructions" (#137657)
Reverts llvm/llvm-project#136759 due to bad interaction with c792b25e4
|
|
(#136759)
This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum`
instructions.
These mirror the `llvm.maximum.*` and `llvm.minimum.*` intrinsics, but
are atomic and use IEEE 754-2019 handling for NaNs, which is different to
`fmax` and `fmin`. See:
https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic
for more details.
Future changes will allow this LLVM IR to be lowered to specialised
assembler instructions on suitable targets, such as AArch64.
|
|
Constant{Int,FP}. (#137319)
|
|
Most places that call Intrinsic::getAttributes() are only interested in
the function attributes, so add a separate function for that.
The motivation for this is that I'd like to add the ability to specify
range attributes on intrinsics, which requires knowing the function
type. This avoids needing to know the type for most attribute queries.
|
|
|
|
Lower G_ instructions that can't be inst-selected with register bank
assignment from AMDGPURegBankSelect based on uniformity analysis.
- Lower instruction to perform it on assigned register bank
- Put uniform value in vgpr because SALU instruction is not available
- Execute divergent instruction in SALU - "waterfall loop"
Given LLTs on all operands after legalizer, some register bank
assignments require lowering while others do not.
Note: cases where all register bank assignments would require lowering
are lowered in legalizer.
AMDGPURegBankLegalize goals:
- Define Rules: when and how to perform lowering
- Goal of defining Rules is to provide high level table-like brief
overview of how to lower generic instructions based on available
target features and uniformity info (uniform vs divergent).
- Fast search of Rules, depends on how complicated Rule.Predicate is
- For some opcodes there would be too many Rules that are essentially
all the same just for different combinations of types and banks.
Write custom function that handles all cases.
- Rules are made from enum IDs that correspond to each operand.
Names of IDs are meant to give brief description what lowering does
for each operand or the whole instruction.
- AMDGPURegBankLegalizeHelper implements lowering algorithms
Since this is the first patch that actually enables -new-reg-bank-select
here is the summary of regression tests that were added earlier:
- if instruction is uniform always select SALU instruction if available
- eliminate back to back vgpr to sgpr to vgpr copies of uniform values
- fast rules: small differences for standard and vector instruction
- enabling Rule based on target feature - salu_float
- how to specify lowering algorithm - vgpr S64 AND to S32
- on G_TRUNC in reg, it is up to user to deal with truncated bits
G_TRUNC in reg is treated as no-op.
- dealing with truncated high bits - ABS S16 to S32
- sgpr S1 phi lowering
- new opcodes for vcc-to-scc and scc-to-vcc copies
- lowering for vgprS1-to-vcc copy (formally this is vgpr-to-vcc G_TRUNC)
- S1 zext and sext lowering to select
- uniform and divergent S1 AND (OR and XOR) lowering - inst-selected into
SALU instruction
- divergent phi with uniform inputs
- divergent instruction with temporal divergent use, source instruction
is defined as uniform(AMDGPURegBankSelect) - missing temporal
divergence lowering
- uniform phi, because of undef incoming, is assigned to vgpr. Will be
fixed in AMDGPURegBankSelect via another fix in machine uniformity
analysis.
|
|
This converts all ptr element shuffle vectors to s64, so that the
existing vector legalization handling can lower them as needed. This
prevents a lot of fallbacks that currently try to generate things like
`<2 x ptr> G_EXT`.
I'm not sure if bitcast/inttoptr/ptrtoint is intended to be necessary
for vectors of pointers, but it uses buildCast for the casts, which now
generates a ptrtoint/inttoptr.
|
|
|
|
aka llvm.stepvector Intrinsic
|
|
|
|
Credits: https://github.com/llvm/llvm-project/pull/111419
Fixes icmp-flags.mir
First attempt: https://github.com/llvm/llvm-project/pull/113090
Revert: https://github.com/llvm/llvm-project/pull/114256
|
|
Reverts llvm/llvm-project#113090
|
|
Credits: https://github.com/llvm/llvm-project/pull/111419
|
|
The implementation was missing the fact that `G_EXTRACT_SUBVECTOR`
destination and source vector can be different types.
Also fix a bug in the MIR builder for `G_EXTRACT_SUBVECTOR` to generate
the correct opcode.
Clarify the G_EXTRACT_SUBVECTOR specification.
|
|
|
|
https://github.com/llvm/llvm-project/pull/83227
|
|
This re-applies #94241 after fixing buildbot failure, see
https://lab.llvm.org/buildbot/#/builders/51/builds/570
According to standard, `constexpr` variables and `const` variables
initialized with constant expressions can be used in lambdas w/o
capturing - see https://en.cppreference.com/w/cpp/language/lambda.
However, MSVC used on buildkite seems to ignore that rule and does not
allow using such uncaptured variables in lambdas: we have "error C3493:
'Mask16' cannot be implicitly captured because no default capture mode
has been specified" - see
https://buildkite.com/llvm-project/github-pull-requests/builds/73238
Explicitly capturing such a variable, however, makes buildbot fail with
"error: lambda capture 'Mask16' is not required to be captured for this
use [-Werror,-Wunused-lambda-capture]" - see
https://lab.llvm.org/buildbot/#/builders/51/builds/570.
Fix both cases by using `0xffff` value directly instead of giving a name
to it.
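
A minimal sketch of the two forms. The function names are hypothetical; only `Mask16` and `0xffff` come from the description above. The first function is accepted by GCC and Clang; per the report above, the MSVC build rejected that pattern.

```cpp
#include <cstdint>

// Standard-conforming form: the constexpr variable is only read for its
// value inside the lambda, so no capture is required. This is the pattern
// that the MSVC build reportedly rejected with error C3493.
std::uint32_t lowHalfNamed(std::uint32_t Value) {
  constexpr std::uint32_t Mask16 = 0xffff;
  auto Apply = [](std::uint32_t V) { return V & Mask16; };
  return Apply(Value);
}

// Workaround used by the fix: write the literal inline, so there is no named
// variable to capture and no unused-capture warning to trigger.
std::uint32_t lowHalfInline(std::uint32_t Value) {
  auto Apply = [](std::uint32_t V) { return V & 0xffff; };
  return Apply(Value);
}
```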
Original PR description below.
Depends on #94240.
Define the following pseudos for lowering ptrauth constants in code:
- non-`extern_weak`:
- no GOT load needed: `MOVaddrPAC` - similar to `MOVaddr`, with added
PAC;
- GOT load needed: `LOADgotPAC` - similar to `LOADgot`, with added PAC;
- `extern_weak`: `LOADauthptrstatic` - similar to `LOADgot`, but use a
special stub slot named `sym$auth_ptr$key$disc` filled by dynamic linker
during relocation resolving instead of a GOT slot.
---------
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
|
|
This reverts #94241.
See buildbot failure
https://lab.llvm.org/buildbot/#/builders/51/builds/570
|
|
Depends on #94240.
Define the following pseudos for lowering ptrauth constants in code:
- non-`extern_weak`:
- no GOT load needed: `MOVaddrPAC` - similar to `MOVaddr`, with added
PAC;
- GOT load needed: `LOADgotPAC` - similar to `LOADgot`, with added PAC;
- `extern_weak`: `LOADauthptrstatic` - similar to `LOADgot`, but use a
special stub slot named `sym$auth_ptr$key$disc` filled by dynamic linker
during relocation resolving instead of a GOT slot.
---------
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
|
|
Vectors are supported for fp operations now, so remove the assert. The
supported type/operation combinations are best left for the verifier.
Avoids regression in future commit that starts treating some vector
cases as legal.
|
|
https://github.com/llvm/llvm-project/pull/85592
https://discourse.llvm.org/t/rfc-add-nowrap-flags-to-trunc/77453
https://github.com/llvm/llvm-project/pull/88609
|
|
Implements the core/target-agnostic components of Memory Model
Relaxation Annotations.
RFC:
https://discourse.llvm.org/t/rfc-mmras-memory-model-relaxation-annotations/76361/5
|
|
G_ICMP for scalable vector types
This patch legalizes G_ZEXT, G_SEXT, and G_ANYEXT. If the type is a
legal mask type, then the instruction is legalized as the element-wise
select, where the condition on the select is the mask typed source
operand, and the true and false values are 1 or -1 (for
zero/any-extension and sign extension) and zero. If the type is a legal integer
or vector integer type, then the instruction is marked as legal.
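
The select-based extension can be modelled element by element. `extendMask` is a hypothetical helper, with 8-bit result elements chosen only for illustration.

```cpp
#include <cstdint>
#include <vector>

// Toy model: extend a vector of 1-bit mask elements by selecting, per
// element, between a "true" constant and zero. The "true" constant is -1 for
// sign extension and 1 for zero/any extension.
std::vector<std::int8_t> extendMask(const std::vector<bool> &Mask,
                                    bool IsSigned) {
  const std::int8_t TrueVal = static_cast<std::int8_t>(IsSigned ? -1 : 1);
  std::vector<std::int8_t> Result;
  Result.reserve(Mask.size());
  for (bool Bit : Mask)
    Result.push_back(Bit ? TrueVal : std::int8_t{0});
  return Result;
}
```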
The legalization of the extends may introduce a G_SPLAT_VECTOR, which
needs to be legalized in this patch for the extend test cases to pass.
A G_SPLAT_VECTOR is legal if the vector type is a legal integer or
floating point vector type and the source operand is sXLen type. This is
because the SelectionDAG patterns only support sXLen typed
ISD::SPLAT_VECTORS, and we'd like to reuse those patterns. A
G_SPLAT_VECTOR is custom legalized if it has a legal s1 element vector
type and s1 scalar operand. It is legalized to G_VMSET_VL or G_VMCLR_VL
if the splat is all ones or all zeros respectively. In the case of a
non-constant mask splat, we legalize by promoting the scalar value to
s8.
In order to get the s8 element vector back into s1 vector, we use a
G_ICMP. In order for the splat vector and extend tests to pass, we also
need to legalize G_ICMP in this patch.
A G_ICMP is legal if the destination type is a legal bool vector and the LHS and
RHS are legal integer vector types.
|
|
|
|
These cases in particular are done as a precommit to support
legalization, regbank selection, and instruction selection for extends,
splat vectors, and integer compares in #85938.
|
|
G_VSCALE should be lowered using VLENB. If the type is not sXLen it
should be lowered using a G_VSCALE on the narrow type and a G_MUL.
regbank select and instruction select are straightforward so we really
only need to add tests to show it works.
|
|
This reverts commit 47681506ded30fada68f180b5e80f740bc76abcd. It is not
consistent with SelectionDAG.
|
|
G_VSCALE should be lowered using VLENB.
|
|
|
|
G_INSERT and G_EXTRACT are not sufficient to represent both
INSERT/EXTRACT on a subregister and INSERT/EXTRACT on a vector.
We would like to be able to INSERT/EXTRACT on vectors in cases that
INSERT/EXTRACT on vector subregisters are not sufficient, so we add
these opcodes.
I tried to do a patch where we treated G_EXTRACT as both
G_EXTRACT_SUBVECTOR and G_EXTRACT_SUBREG, but ran into an infinite loop
at this
[point](https://github.com/llvm/llvm-project/blob/8b5b294ec2cf876bc5eb5bd5fcb56ef487e36d60/llvm/lib/Target/RISCV/RISCVISelLowering.cpp#L9932)
in the SDAG equivalent code.
|
|
|
|
Recommits llvm/llvm-project#80378 which was reverted in
llvm/llvm-project#84330. The problem was that the change in
llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir used
217 as an opcode instead of a regex.
|
|
types" (#84330)
Reverts llvm/llvm-project#80378
causing Buildbot failures that did not show up with check-llvm or CI.
|
|
This patch is stacked on
https://github.com/llvm/llvm-project/pull/80372,
https://github.com/llvm/llvm-project/pull/80307, and
https://github.com/llvm/llvm-project/pull/80306.
ShuffleVector on scalable vector types gets IRTranslate'd to
G_SPLAT_VECTOR since a ShuffleVector that operates on scalable
vectors is a splat vector where the value of the splat vector is the 0th
element of the first operand, because the index mask operand is the
zeroinitializer (undef and poison are treated as zeroinitializer here).
This is analogous to what happens in SelectionDAG for ShuffleVector.
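
A toy model of why a zero-mask shuffle is a splat. The names are hypothetical, and `NumElts` stands in for the element count of a scalable vector, which is only known at run time.

```cpp
#include <cstddef>
#include <vector>

// Shuffle whose mask is all zeros: every result lane reads lane 0 of the
// first operand. Assumes First is non-empty.
std::vector<int> shuffleZeroMask(const std::vector<int> &First,
                                 std::size_t NumElts) {
  std::vector<int> Result;
  Result.reserve(NumElts);
  for (std::size_t I = 0; I < NumElts; ++I)
    Result.push_back(First[0]); // the mask entry is always 0
  return Result;
}

// Splat: broadcast one scalar into every lane.
std::vector<int> splat(int Value, std::size_t NumElts) {
  return std::vector<int>(NumElts, Value);
}
```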
`buildSplatVector` is renamed to `buildBuildVectorSplatVector`. I did not
make this a separate patch because it would cause problems to revert
that change without reverting this change too.
|
|
(#80377)
This patch is stacked on #80372, #80307, and #80306.
|
|
This alters the lowering of G_COPYSIGN to support vector types. The
general idea is that we just lower it to vector operations using and/or
and a mask, which are now converted to a BIF/BIT/BSP.
In the process the existing AArch64LegalizerInfo::legalizeFCopySign can
be removed, relying on expanding the scalar versions to vector instead,
which just needs a small adjustment to allow widening scalars to
vectors.
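
The and/or-with-mask idea can be sketched on a single scalar. `copySignViaMask` is a hypothetical helper and assumes `float` is a 32-bit IEEE 754 value; the actual lowering operates on MIR vector types.

```cpp
#include <cstdint>
#include <cstring>

// Copy the sign of Sgn onto the magnitude of Mag using only AND/OR and a
// sign-bit mask. Assumes float is a 32-bit IEEE 754 value.
float copySignViaMask(float Mag, float Sgn) {
  static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
  std::uint32_t MagBits, SgnBits;
  std::memcpy(&MagBits, &Mag, sizeof MagBits);
  std::memcpy(&SgnBits, &Sgn, sizeof SgnBits);

  const std::uint32_t SignMask = 0x80000000u;
  // Keep everything but the sign from Mag, and only the sign from Sgn.
  const std::uint32_t ResultBits = (MagBits & ~SignMask) | (SgnBits & SignMask);

  float Result;
  std::memcpy(&Result, &ResultBits, sizeof Result);
  return Result;
}
```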
|
|
|
|
Protective measures against
https://github.com/llvm/llvm-project/pull/74502
|
|
|
|
vector types (#70882)
Scalable vector types from LLVM IR can be lowered to scalable vector
types in MIR according to the RISCVAssignFn.
|
|
Patch by: Acim Maravic
Differential Revision: https://reviews.llvm.org/D159515
|
|
Introduced the convergent equivalent of the existing G_INTRINSIC opcodes:
- G_INTRINSIC_CONVERGENT
- G_INTRINSIC_CONVERGENT_W_SIDE_EFFECTS
Out of the targets that currently have some support for GlobalISel, the patch
assumes that the convergent intrinsics are only relevant to SPIRV and AMDGPU.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D154766
|
|
<1 x i8> or <1 x i16> when using GlobalISel.
Code generation for a return instruction of type <1 x i8> or <1 x i16> when using GlobalISel causes an internal compiler crash: "Could not handle ret ty".
Fixes: https://github.com/llvm/llvm-project/issues/58211
Differential Revision: https://reviews.llvm.org/D153300
|
|
For IR like:
```
%alloca = alloca ...
dbg.value(%alloca, !myvar, OP_deref(<other_ops>))
```
GlobalISel lowers it to MIR:
```
%some_reg = G_FRAME_INDEX <stack_slot>
DBG_VALUE %some_reg, !myvar, OP_deref(<other_ops>)
```
In other words, if the value of `!myvar` can be obtained by
dereferencing an alloca, in MIR we say that the _location_ of a variable
is obtained by dereferencing register %some_reg (plus some
`<other_ops>`).
We can instead remove the use of `%some_reg`: the location of `!myvar`
_is_ `<stack_slot>` (plus some `<other_ops>`). This patch implements
this transformation, which improves debug information handling in O0, as
these registers hardly ever survive register allocation.
A note about testing: similar to what was done in D76934
(f24e2e9eebde4b7a1d), this patch exposed a bug in the Builder class when
using `-debug`, where we tried to print an incomplete instruction. The
changes in `MachineIRBuilder.cpp` address that.
Differential Revision: https://reviews.llvm.org/D147536
|
|
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D133340
|