| Age | Commit message (Collapse) | Author |
|
Prior to this change, the data layout calculation would not account for
explicitly set `-mabi=elfv2` on `powerpc64-unknown-linux-gnu`, a target
that defaults to `elfv1`.
This is loosely inspired by the equivalent ARM / RISC-V code.
`make check-llvm` passes fine for me, though AFAICT all the tests
specify the data layout manually so there isn't really a test for this
and I am not really sure what the best way to go about adding one would
be.
Signed-off-by: Jens Reidel <adrian@travitia.xyz>
|
|
Let Clang emit `llvm.tbaa.errno` metadata in order to let LLVM
carry out optimizations around errno-writing libcalls to, as
long as it is proved the involved memory location does not alias
`errno`.
Previous discussion: https://discourse.llvm.org/t/rfc-modelling-errno-memory-effects/82972.
|
|
Define the __dmr2048 type to represent the DMR pair introduced by the
Dense Math Facility on PowerPC, and add three Clang builtins
corresponding to DMF cryptography:
__builtin_mma_dmsha2hash
__builtin_mma_dmsha3hash
__builtin_mma_dmxxshapad
The __dmr2048 type is required for the dmsha3hash crypto builtin, and,
as withother PPC MMA and DMR types, its use is strongly restricted.
|
|
soft float ABI selection was not taking effect on little-endian powerPC
with embedded vectors (e.g. e500v2) leading to errors.
(embedded vectors use "extended" GPRs to store floating-point values,
and this caused issues with variadic arguments assuming dedicated
floating-point registers with hard-float ABI)
|
|
GCC 14 also made this an error by default, so we’re following suit.
Fixes #74605
|
|
Tests exercizing TBAA metadata (both purposefully and not), and
previously generated via UTC, have been regenerated and updated
to version 6.
|
|
Add clang builtins for DMF VSX Vector floats:
```
void __builtin_mma_dmxvf16gerx2 (__dmr1024 *, __vector_pair, vec_t);
void __builtin_mma_dmxvf16gerx2nn (__dmr1024 *, __vector_pair, vec_t);
void __builtin_mma_dmxvf16gerx2np (__dmr1024 *, __vector_pair, vec_t);
void __builtin_mma_dmxvf16gerx2pn (__dmr1024 *, __vector_pair, vec_t);
void __builtin_mma_dmxvf16gerx2pp (__dmr1024 *, __vector_pair, vec_t);
void __builtin_mma_pmdmxvf16gerx2 (__dmr1024 *, __vector_pair, vec_t, uint8, uint4, uint2);
void __builtin_mma_pmdmxvf16gerx2nn (__dmr1024 *, __vector_pair, vec_t, uint8, uint4, uint2);
void __builtin_mma_pmdmxvf16gerx2np (__dmr1024 *, __vector_pair, vec_t, uint8, uint4, uint2);
void __builtin_mma_pmdmxvf16gerx2pn (__dmr1024 *, __vector_pair, vec_t, uint8, uint4, uint2);
void __builtin_mma_pmdmxvf16gerx2pp (__dmr1024 *, __vector_pair, vec_t, uint8, uint4, uint2);
void __builtin_mma_dmxvbf16gerx2 (__dmr1024 *, __vector_pair, vec_t);
void __builtin_mma_dmxvbf16gerx2nn (__dmr1024 *, __vector_pair, vec_t);
void __builtin_mma_dmxvbf16gerx2np (__dmr1024 *, __vector_pair, vec_t);
void __builtin_mma_dmxvbf16gerx2pn (__dmr1024 *, __vector_pair, vec_t);
void __builtin_mma_dmxvbf16gerx2pp (__dmr1024 *, __vector_pair, vec_t);
void __builtin_mma_pmdmxvbf16gerx2 (__dmr1024 *, __vector_pair, vec_t, uint8, uint4, uint2);
void __builtin_mma_pmdmxvbf16gerx2nn (__dmr1024 *, __vector_pair, vec_t, uint8, uint4, uint2);
void __builtin_mma_pmdmxvbf16gerx2np (__dmr1024 *, __vector_pair, vec_t, uint8, uint4, uint2);
void __builtin_mma_pmdmxvbf16gerx2pn (__dmr1024 *, __vector_pair, vec_t, uint8, uint4, uint2);
void __builtin_mma_pmdmxvbf16gerx2pp (__dmr1024 *, __vector_pair, vec_t, uint8, uint4, uint2);
```
|
|
Add support for PPC Dense Math builtins mma_build_dmr and
mma_disassemble_dmr builtins.
|
|
Support the following BCD format conversion builtins for PowerPC.
- `__builtin_bcdcopysign` – Conversion that returns the decimal value of
the first parameter combined with the sign code of the second parameter.
`
- `__builtin_bcdsetsign` – Conversion that sets the sign code of the
input parameter in packed decimal format.
> Note: This built-in function is valid only when all following
conditions are met:
> -qarch is set to utilize POWER9 technology.
> The bcd.h file is included.
## Prototypes
```c
vector unsigned char __builtin_bcdcopysign(vector unsigned char, vector unsigned char);
vector unsigned char __builtin_bcdsetsign(vector unsigned char, unsigned char);
```
## Usage Details
`__builtin_bcdsetsign`: Returns the packed decimal value of the first
parameter combined with the sign code.
The sign code is set according to the following rules:
- If the packed decimal value of the first parameter is positive, the
following rules apply:
- If the second parameter is 0, the sign code is set to 0xC.
- If the second parameter is 1, the sign code is set to 0xF.
- If the packed decimal value of the first parameter is negative, the
sign code is set to 0xD.
> notes:
> The second parameter can only be 0 or 1.
> You can determine whether a packed decimal value is positive or
negative as follows:
> - Packed decimal values with sign codes **0xA, 0xC, 0xE, or 0xF** are
interpreted as positive.
> - Packed decimal values with sign codes **0xB or 0xD** are interpreted
as negative.
---------
Co-authored-by: Aditi-Medhane <aditi.medhane@ibm.com>
|
|
(#151971)
NFC patch to clean up extra lines of code in the file
`llvm/test/CodeGen/PowerPC/check-zero-vector.ll` as the current one has
loop unrolled.
Also removing the file `clang/test/CodeGen/PowerPC/check-zero-vector.c`
as the patch affects only the backend.
Co-authored-by: himadhith <himadhith.v@ibm.com>
|
|
Combining since these are testing the same err message with only
difference being the target cpu.
|
|
Let Clang emit `dead_on_return` attribute on pointer arguments
that are passed indirectly, namely, large aggregates that the
ABI mandates be passed by value; thus, the parameter is destroyed
within the callee. Writes to such arguments are not observable by
the caller after the callee returns.
This should desirably enable further MemCpyOpt/DSE optimizations.
Previous discussion: https://discourse.llvm.org/t/rfc-add-dead-on-return-attribute/86871.
|
|
PPCSubtarget is not always initialized, depending on which passes are
running, and in our downstream fork, -enable-matrix is the default
configuration (regardless of whether matrix intrinsics are present in
the IR), which triggers a fatal error in builtins-ppc-fpconstrained.c.
|
|
Add support for PPC Dense Math basic builtins dmsetdmrz, dmmr, dmxor.
|
|
support for Zero vector comparisons (#147246)
NFC patch to add testcase for locking down the support of Zero vector
comparisons using the `vcmpgtuh (vector compare greater than unsigned
halfword)` instruction.
Currently `vcmpequh (vector compare equal unsigned halfword)` is in use.
---------
Co-authored-by: himadhith <himadhith.v@ibm.com>
Co-authored-by: Tony Varghese <tonypalampalliyil@gmail.com>
|
|
Support the following packed BCD builtins for PowerPC.
```
__builtin_national2packed - Conversion of National format to Packed decimal format.
__builtin_packed2national - Conversion of Packed decimal format to national format.
__builtin_packed2zoned - Conversion of Packed decimal format to Zoned decimal format.
__builtin_zoned2packed - Conversion of Zoned decimal format to Packed decimal format.
```
### Prototypes:
`vector unsigned char __builtin_national2packed(vector unsigned char a,
unsigned char b);`
`vector unsigned char __builtin_packed2zoned(vector unsigned char,
unsigned char);`
`vector unsigned char __builtin_zoned2packed(vector unsigned char,
unsigned char);`
The condition for the 2nd parameter is consistent over all the 3
prototypes (0 or 1 only).
`vector unsigned char __builtin_packed2national(vector unsigned char);`
Co-authored-by: himadhith <himadhith.v@ibm.com>
Co-authored-by: Tony Varghese <tonypalampalliyil@gmail.com>
|
|
(#142480)
Define the __dmr1024 type used to manipulate the new DMR registers
introduced by the Dense Math Facility (DMF) on PowerPC, and add six
Clang builtins that correspond to the integer outer-product accumulate
to ACC PowerPC instructions:
* __builtin_mma_dmxvi8gerx4
* __builtin_mma_pmdmxvi8gerx4
* __builtin_mma_dmxvi8gerx4pp
* __builtin_mma_pmdmxvi8gerx4pp
* __builtin_mma_dmxvi8gerx4spp
* __builtin_mma_pmdmxvi8gerx4spp.
|
|
Fixes https://github.com/llvm/llvm-project/issues/140149
|
|
Update test in prep for IR changes that will be introduced in
https://github.com/llvm/llvm-project/pull/129923
|
|
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.
Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
make it easier to use old IR files and somewhat reduce the test churn in
this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
attribute. The representation in the LLVM IR dialect should be updated
separately.
|
|
If the gep is nusw (usually via inbounds) and the offset is
non-negative, we can infer nuw.
Proof: https://alive2.llvm.org/ce/z/ihztLy
|
|
See
https://discourse.llvm.org/t/cant-cannot-can-not-in-diagnostic-messages/83171
|
|
|
|
This will fix the following failure seeing on z/OS:
```
In file included from clang/test/CodeGen/PowerPC/ppc-xmmintrin.c:31:
In file included from build/lib/clang/20/include/ppc_wrappers/xmmintrin.h:44:
In file included from build/lib/clang/20/include/altivec.h:48:
In file included from build/bin/../include/c++/v1/stddef.h:27:
In file included from build/bin/../include/c++/v1/__config:14:
In file included from build/bin/../include/c++/v1/__configuration/abi.h:15:
In file included from build/bin/../include/c++/v1/__configuration/platform.h:35:
/usr/include/features.h:1:20: error: expected unqualified-id
1 | ??=if ??/
| ^
/usr/include/features.h:2140:20: error: expected unqualified-id
2140 | ??=endif /* __features_h */
| ^
```
Adding `-nostdlibinc` will not use standard system include path and it
will prevent above errors.
|
|
Fixes #107229
|
|
Update mma tests in prep for changes needed in a followup patch for
https://github.com/llvm/llvm-project/issues/107229.
Checks for ``clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma-types.c``
seem to have been manually upated to rename temp variables even though
it says checks was auto generated. Regenerate via script.
Add noopt checks for
``clang/test/CodeGen/PowerPC/builtins-ppc-build-pair-mma.c``.
|
|
Update codegen for func param with transparent_union attr to be that of
the first union member.
This is a followup to #101738 to fix non-ppc codegen and closes #76773.
|
|
Add nuw attribute to inbounds GEPs where the expression used to form the
GEP is an addition of unsigned indices.
Relands #105496, which was reverted because it exposed a miscompilation
arising from #98608. This is now fixed by #106512.
|
|
Reverts llvm/llvm-project#105496
This patch breaks:
https://lab.llvm.org/buildbot/#/builders/25/builds/1952
https://lab.llvm.org/buildbot/#/builders/52/builds/1775
Somehow output is different with sanitizers.
Maybe non-determinism in the code?
|
|
Add nuw attribute to inbounds GEPs where the expression used to form the
GEP is an addition of unsigned indices.
|
|
Update codegen for func param with transparent_union attr to be that of
the first union member.
PPC fix for: https://github.com/llvm/llvm-project/issues/76773
|
|
Generate nuw GEPs for struct member accesses, as inbounds + non-negative
implies nuw.
Regression tests are updated using update scripts where possible, and by
find + replace where not.
|
|
Implement BCD assist builtins for XL and GCC compatibility.
GCC compat:
```
unsigned int __builtin_cdtbcd (unsigned int);
unsigned int __builtin_cbcdtd (unsigned int);
unsigned int __builtin_addg6s (unsigned int, unsigned int);
```
64BIT XL compat:
```
long long __cdtbcd (long long);
long long __cbcdtd (long long);
long long __addg6s (long long source1, long long source2)
```
|
|
In PowerPC ABI, a few initial arguments are passed through registers,
but their places in parameter save area are reserved, arguments passed
by memory goes after the reserved location.
For debugging purpose, we may want to save copy of the pass-by-reg
arguments into correct places on stack. The new option achieves by
adding new function level attribute and make argument lowering part
aware of it.
|
|
musttail is not often possible to be generated on PPC targets as when
calling to a function defined in another module, PPC needs to restore
the TOC pointer. To restore the TOC pointer, compiler needs to emit a
nop after the call to let linker generate codes to restore TOC pointer.
Tail call cannot generate expected call sequence for this case.
To avoid the crash inside the compiler backend, a diagnosis is added in
the frontend.
Fixes #63214
|
|
|
|
'a' is an input/output constraint for restraining assembly variables
to an indexed or indirect address operand. It previously was marked
as supported but would throw an assertion for unknown constraint type
in the back-end when this test case was compiled. This change marks it
as unsupported until we can add full support for address operands
constraining to the compiler code generation.
|
|
ParseFrontendArgs takes the last OPT_Action_Group option. The other
actions are overridden.
|
|
|
|
Also replace aarch64-none-linux-gnu (none can indicate an OS as well) with aarch64
|
|
And replace -emit-llvm -o - with -emit-llvm-only
|
|
These tests just don't check the output written to the current directory. The
current directory may be write protected e.g. in a sandboxed environment.
The Testcases that use -emit-llvm and -verify only care about stdout/stderr
and are in this patch changed to use -emit-llvm-only to avoid writing to an
output file. The verify-inlineasmbr.mir testcase that also only care about
stdout/stderr is in this patch changed to throw away the output file and just
write to /dev/null.
|
|
rldimi is 64-bit instruction, due to backward compatibility, it needs to
be expanded into series of rotate and masking in 32-bit environment. In
the future, we may improve bit permutation selector and remove such
direct codegen.
|
|
Currently, the builtins used for implementing `va_list` handling
unconditionally take their arguments as unqualified `ptr`s i.e. pointers
to AS 0. This does not work for targets where the default AS is not 0 or
AS 0 is not a viable AS (for example, a target might choose 0 to
represent the constant address space). This patch changes the builtins'
signature to take generic `anyptr` args, which corrects this issue. It
is noisy due to the number of tests affected. A test for an upstream
target which does not use 0 as its default AS (SPIRV for HIP device
compilations) is added as well.
|
|
rldimi is 64-bit instruction, so the corresponding builtin should not
be available in 32-bit mode. Rotate amount should be in range and
cases when mask is zero needs special handling.
This change also swaps the first and second operands of rldimi/rlwimi
to match previous behavior. For masks not ending at bit 63-SH,
rotation will be inserted before rldimi.
|
|
This patch enables support that the XL compiler had for AIX under
-qdatalocal/-qdataimported.
|
|
These builtins are already there in Clang, however current codegen may
produce suboptimal results due to their complex behavior. Implement them
as intrinsics to ensure expected instructions are emitted.
|
|
Supports TLS local-dynamic on AIX, generates below sequence of code:
```
.tc foo[TC],foo[TL]@ld # Variable offset, ld relocation specifier
.tc mh[TC],mh[TC]@ml # Module handle for the caller
lwz 3,mh[TC]\(2\) $$ For 64-bit: ld 3,mh[TC]\(2\)
bla .__tls_get_mod # Modifies r0,r3,r4,r5,r11,lr,cr0
#r3 = &TLS for module
lwz 4,foo[TC]\(2\) $$ For 64-bit: ld 4,foo[TC]\(2\)
add 5,3,4 # Compute &foo
.rename mh[TC], "\_$TLSML" # Symbol for the module handle must have the name "_$TLSML"
```
---------
Co-authored-by: tingwang <tingwang@tingwangs-MBP.lan>
Co-authored-by: tingwang <tingwang@tingwangs-MacBook-Pro.local>
|
|
In the beginning, Clang only emitted atomic IR for operations it knew
the
underlying microarch had instructions for, meaning it required
significant
knowledge of the target. Later, the backend acquired the ability to
lower
IR to libcalls. To avoid duplicating logic and improve logic locality,
we'd like to move as much as possible to the backend.
There are many ways to describe this change. For example, this change
reduces the variables Clang uses to decide whether to emit libcalls or
IR, down to only the atomic's size.
|
|
Moved from https://reviews.llvm.org/D126302
The current behaviour with these three options is quite undesirable:
-mno-altivec -mvsx allows VSX to override no Altivec, thereby turning on
both
-msoft-float -maltivec causes a crash if an actual Altivec instruction
is required because soft float turns of Altivec
-msoft-float -mvsx is also accepted with both Altivec and VSX turned off
(potentially causing crashes as above)
This patch diagnoses these impossible combinations in the driver so the
user does not end up with surprises in terms of their options being
ignored or silently overridden.
Fixes https://github.com/llvm/llvm-project/issues/55556
---------
Co-authored-by: Nemanja Ivanovic <nemanja.i.ibm@gmail.com>
|