path: root/clang/lib/CodeGen/CodeGenModule.cpp
Age | Commit message | Author
2024-02-15[AArch64] Add soft-float ABI (#74460)ostannard
This adds support for the AArch64 soft-float ABI. The specification for this ABI was added by https://github.com/ARM-software/abi-aa/pull/232. Because all existing AArch64 hardware has floating-point hardware, we expect this to be a niche option, only used for embedded systems on R-profile systems. We are going to document that SysV-like systems should only ever use the base (hard-float) PCS variant: https://github.com/ARM-software/abi-aa/pull/233. For that reason, I've not added an option to select the ABI independently of the FPU hardware, instead the new ABI is enabled iff the target architecture does not have an FPU. For testing, I have run this through an ABI fuzzer, but since this is the first implementation it can only test for internal consistency (callers and callees agree on the PCS), not for conformance to the ABI spec.
2024-02-13[RISCV] Add canonical ISA string as Module metadata in IR. (#80760)Craig Topper
In an LTO build, we don't set the ELF attributes to indicate what extensions were compiled with. The target CPU/Attrs in RISCVTargetMachine do not get set for an LTO build. Each function gets a target-cpu/feature attribute, but this isn't usable to set ELF attributes since we wouldn't know which function to use. We can't just pick one, since it might have been compiled with an attribute like target_version. This patch adds the ISA as Module metadata so we can retrieve it in the backend. Individual translation units can still be compiled with different strings, so we need to collect the unique set when Modules are merged. The backend will need to combine the unique ISA strings to produce a single value for the ELF attributes. This will be done in a separate patch.
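The "collect the unique set when Modules are merged" step can be pictured with a small sketch. This is not the LLVM implementation: `ModuleInfo` and `mergeInto` are hypothetical names, and a `std::set` stands in for the module metadata.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a module that carries its ISA strings as metadata.
struct ModuleInfo {
  std::set<std::string> ISAStrings; // one entry per distinct ISA string
};

// Merging keeps the union of the ISA strings, so duplicates collapse and the
// backend later sees each distinct string exactly once.
void mergeInto(ModuleInfo &Dst, const ModuleInfo &Src) {
  assert(&Dst != &Src && "merging a module into itself");
  Dst.ISAStrings.insert(Src.ISAStrings.begin(), Src.ISAStrings.end());
}
```

Combining the collected strings into a single ELF attribute value is left out here, as the commit defers it to a separate patch.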
2024-02-09[clang][fmv] Drop .ifunc from target_version's entrypoint's mangling (#81194)Jon Roelofs
Fixes: https://github.com/llvm/llvm-project/issues/81043
2024-02-05[NFC][Clang] Replace Arch with Triplet. (#80465)Dani
2024-02-01[AArch64] Replace LLVM IR function attributes for PSTATE.ZA. (#79166)Sander de Smalen
Since https://github.com/ARM-software/acle/pull/276 the ACLE defines attributes to better describe the use of a given SME state. Previously the attributes merely described the possibility of it being 'shared' or 'preserved', whereas the new attributes have more semantics and also describe how the data flows through the program.
For ZT0 we already had to add new LLVM IR attributes:
* aarch64_new_zt0
* aarch64_in_zt0
* aarch64_out_zt0
* aarch64_inout_zt0
* aarch64_preserves_zt0
We have now done the same for ZA, such that we add:
* aarch64_new_za (previously `aarch64_pstate_za_new`)
* aarch64_in_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_out_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_inout_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_preserves_za (previously `aarch64_pstate_za_shared, aarch64_pstate_za_preserved`)
This explicitly removes 'pstate' from the name, because with SME2 and the new ACLE attributes there is a difference between "sharing ZA" (sharing the ZA matrix register with the caller) and "sharing PSTATE.ZA" (sharing either the ZA or ZT0 register, both part of PSTATE.ZA, with the caller).
2024-01-27[clang, SystemZ] Support -munaligned-symbols (#73511)Jonas Paulsson
When this option is passed to clang, external (and/or weak) symbols are not assumed to have the minimum ABI alignment normally required. Symbols defined locally that are not weak are however still given the minimum alignment. This is implemented by passing a new parameter to getMinGlobalAlign() named HasNonWeakDef that is used to return the right alignment value. This is needed when external symbols created from a linker script may not get the ABI minimum alignment and must therefore be treated as unaligned by the compiler.
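A rough sketch of the decision described above, under simplifying assumptions: `AlignConfig` and `minGlobalAlignBits` are hypothetical names, the alignment values are illustrative, and the real `getMinGlobalAlign()` has a different signature.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the target's alignment settings (values in bits).
struct AlignConfig {
  unsigned MinGlobalAlignBits; // ABI minimum for globals (illustrative value)
  bool UnalignedSymbols;       // true when -munaligned-symbols is in effect
};

// Returns the alignment the compiler may assume for a global symbol.
// HasNonWeakDef is true for symbols defined locally that are not weak.
unsigned minGlobalAlignBits(const AlignConfig &Cfg, unsigned NaturalAlignBits,
                            bool HasNonWeakDef) {
  assert(NaturalAlignBits >= 8 && "alignment is at least one byte");
  // External or weak symbols may come from a linker script, so the ABI
  // minimum is not assumed for them when the option is enabled.
  if (Cfg.UnalignedSymbols && !HasNonWeakDef)
    return NaturalAlignBits;
  return std::max(NaturalAlignBits, Cfg.MinGlobalAlignBits);
}
```

With the option on, only a symbol that has a local non-weak definition is raised to the ABI minimum; everything else keeps the alignment of its type.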
2024-01-23[Clang] Amend SME attributes with support for ZT0. (#77941)Sander de Smalen
This patch builds on top of #76971 and implements support for:
* __arm_new("zt0")
* __arm_in("zt0")
* __arm_out("zt0")
* __arm_inout("zt0")
* __arm_preserves("zt0")
2024-01-22[AArch64][Clang] Fix linker error for function multiversioning (#74358)Dani
AArch64 part of https://github.com/llvm/llvm-project/pull/71706. Default version is now mangled with .default. The resolver for the TargetVersion needs to be emitted from CodeGenModule::EmitMultiVersionFunctionDefinition.
2024-01-19Add a "don't override" mapping for -fvisibility-from-dllstorageclass (#74629)bd1976bris
`-fvisibility-from-dllstorageclass` allows for overriding the visibility of globals from their DLL storage class. The visibility to apply can be customised for the different classes of globals via a set of dependent options that specify the mapping values:
- `-fvisibility-dllexport=<value>`
- `-fvisibility-nodllstorageclass=<value>`
- `-fvisibility-externs-dllimport=<value>`
- `-fvisibility-externs-nodllstorageclass=<value>`
Currently, one of the existing LLVM visibilities, `hidden`, `protected`, `default`, can be used as a mapping value. This change adds a new mapping value: `keep`, which specifies that the visibility should not be overridden for that class of globals. The behaviour of `-fvisibility-from-dllstorageclass` is otherwise unchanged and existing uses of this set of options will be unaffected.
The background to this change is that currently the PS4 and PS5 compilers effectively ignore visibility - dllimport/export is the supported method for export control in C/C++ source code. Now, we would like to support visibility attributes and options in our frontend, in addition to dllimport/export. To support this, we will override the visibility of globals with explicit dllimport/export annotations but use the `keep` setting for globals which do not have an explicit dllimport/export.
There are also some minor improvements to the existing options:
- Make the `LANGOPS` `BENIGN` as they don't involve the AST.
- Correct/clarify the help text for the options.
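The effect of the `keep` mapping value can be sketched as follows. This is only an illustration: `parseMapping` and `applyMapping` are hypothetical helpers, not clang's option handling, and `std::nullopt` is used here to model "do not override".

```cpp
#include <cassert>
#include <optional>
#include <string_view>

enum class Visibility { Default, Hidden, Protected };

// Parses a mapping value. std::nullopt models `keep`, i.e. "leave the
// visibility of this class of globals as it is".
std::optional<Visibility> parseMapping(std::string_view Value) {
  if (Value == "default")
    return Visibility::Default;
  if (Value == "hidden")
    return Visibility::Hidden;
  if (Value == "protected")
    return Visibility::Protected;
  assert(Value == "keep" && "unknown mapping value");
  return std::nullopt;
}

// Applies a mapping value to a global's current visibility.
Visibility applyMapping(Visibility Current, std::string_view Value) {
  return parseMapping(Value).value_or(Current);
}
```

In this model, a global without an explicit dllimport/export annotation keeps whatever visibility its attributes gave it when its class is mapped to `keep`.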
2024-01-16[RISCV] CodeGen of RVE and ilp32e/lp64e ABIs (#76777)Wang Pengcheng
This commit includes the necessary changes to clang and LLVM to support codegen of `RVE` and the `ilp32e`/`lp64e` ABIs.
The differences between `RVE` and `RVI` are:
* `RVE` reduces the integer register count to 16 (x0-x15).
* The ABI should be `ilp32e` for 32 bits and `lp64e` for 64 bits.
`RVE` can be combined with all current standard extensions.
The central changes in the ilp32e/lp64e ABI, compared to ilp32/lp64, are:
* Only 6 integer argument registers (rather than 8).
* Only 2 callee-saved registers (rather than 12).
* A stack alignment of 32 bits (rather than 128 bits).
* ilp32e isn't compatible with the D ISA extension.
If `ilp32e` or `lp64e` is used with an ISA that has any of the registers x16-x31 and f0-f31, then these registers are considered temporaries.
To be compatible with the implementation of ilp32e in GCC, we don't use aligned registers to pass variadic arguments and set stack alignment to 4 bytes for types with length of 2*XLEN.
FastCC is also supported on RVE, while GHC isn't since there is only one available register.
Differential Revision: https://reviews.llvm.org/D70401
2024-01-15[Clang][AArch64] Change SME attributes for shared/new/preserved state. (#76971)Sander de Smalen
This patch replaces the `__arm_new_za`, `__arm_shared_za` and `__arm_preserves_za` attributes in favour of:
* `__arm_new("za")`
* `__arm_in("za")`
* `__arm_out("za")`
* `__arm_inout("za")`
* `__arm_preserves("za")`
As described in https://github.com/ARM-software/acle/pull/276. One change is that `__arm_in/out/inout/preserves(S)` are all mutually exclusive, whereas previously it was fine to write `__arm_shared_za __arm_preserves_za`. This case is now represented with `__arm_in("za")`. The current implementation uses the same LLVM attributes under the hood, since `__arm_in/out/inout` are all variations of "shared ZA", so can use the existing `aarch64_pstate_za_shared` attribute in LLVM. #77941 will add support for the new "zt0" state as introduced with SME2.
2024-01-12[clang] Adjust -mlarge-data-threshold handling (#77958)Arthur Eubanks
Make it apply to x86-64 medium and large code models since that's what the backend does. Limit logic to exclude x86-32. Default to 0, let the driver set it to 65536 for the medium code model if one is not passed. Set it to 0 for the large code model by default to match gcc and since some users make assumptions about the large code model that any small data will break.
2024-01-11[clang][AArch64] Add a -mbranch-protection option to enable GCS (#75486)John Brawn
-mbranch-protection=gcs (enabled by -mbranch-protection=standard) causes generated objects to be marked with the gcs feature. This is done via the guarded-control-stack module flag, in a similar way to branch-target-enforcement and sign-return-address. Enabling GCS causes the GNU_PROPERTY_AARCH64_FEATURE_1_GCS bit to be set on generated objects. No code generation changes are required, as GCS just requires that functions are called using BL and returned from using RET (or other similar variant instructions), which is already the case.
2024-01-06[clang] Add per-global code model attribute (#72078)hev
This patch adds a per-global code model attribute, which can override the target's code model to access global variables. Currently, the code model attribute is only supported on LoongArch. This patch also maps GCC's code model names to LLVM's, which allows for better compatibility between the two compilers. Suggested-by: Arthur Eubanks <aeubanks@google.com> Link: https://discourse.llvm.org/t/how-to-best-implement-code-model-overriding-for-certain-values/71816 Link: https://discourse.llvm.org/t/rfc-add-per-global-code-model-attribute/74944 --------- Signed-off-by: WANG Rui <wangrui@loongson.cn>
2023-12-21Re-land "[AArch64] Codegen support for FEAT_PAuthLR" (#75947)Tomas Matheson
This reverts commit 9f0f5587426a4ff24b240018cf8bf3acc3c566ae. Fix expensive checks failure by properly marking register def for ADR.
2023-12-21Revert "[AArch64] Codegen support for FEAT_PAuthLR"Tomas Matheson
This reverts commit 5992ce90b8c0fac06436c3c86621fbf6d5398ee5. Buildbot failures with expensive checks enabled.
2023-12-21[AArch64] Codegen support for FEAT_PAuthLRTomas Matheson
- Adds a new +pc option to -mbranch-protection that will enable the use of PC as a diversifier in PAC branch protection code.
- When +pauth-lr is enabled (-march=armv9.5a+pauth-lr) in combination with -mbranch-protection=pac-ret+pc, the new 9.5-a instructions (pacibsppc, retaasppc, etc) are used.
Documentation for the relevant instructions can be found here: https://developer.arm.com/documentation/ddi0602/2023-09/Base-Instructions/
Co-authored-by: Lucas Prates <lucas.prates@arm.com>
2023-12-20[clang] Add getClangVendor() and use it in CodeGenModule.cpp (#75935)Dimitry Andric
In 9a38a72f1d482 `ProductId` was assigned from the stringified value of `CLANG_VENDOR`, if that macro was defined. However, `CLANG_VENDOR` is supposed to be a string, as it is defined (optionally) as such in the top-level clang `CMakeLists.txt`. Furthermore, `CLANG_VENDOR` is only passed as a build-time define when compiling `Version.cpp`, so add a `getClangVendor()` function to `Version.h`, and use it in `CodeGenModule.cpp`, instead of relying on the macro. Fixes: 9a38a72f1d482
2023-12-20Revert "[clang] Add getClangVendor() and use it in CodeGenModule.cpp (#75935)"Dimitry Andric
This reverts commit 9055519103eadfba0b48810be926883a71890c55, due to an incorrectly chosen commit message.
2023-12-20[clang] Add getClangVendor() and use it in CodeGenModule.cpp (#75935)Dimitry Andric
In 9a38a72f1d482 `ProductId` was assigned from the stringified value of `CLANG_VENDOR`, if that macro was defined. However, `CLANG_VENDOR` is supposed to be a string, as it is defined (optionally) as such in the top-level clang `CMakeLists.txt`. Move the addition of `-DCLANG_VENDOR` to the compiler flags from `clang/lib/Basic/CMakeLists.txt` to the top-level `CMakeLists.txt`, so it is consistent across the whole clang codebase. Then remove the stringification from `CodeGenModule.cpp`, to make it work correctly. Fixes: 9a38a72f1d482
2023-12-13[clang] Use StringRef::{starts,ends}_with (NFC) (#75149)Kazu Hirata
This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.
2023-12-07[NFC] Remove unneeded nullptr checks after cast<> (#74674)Mike Rice
Since VD is assigned from a cast<VarDecl> it cannot be a nullptr or it would have asserted. Remove the subsequent checks to clear up any misunderstanding.
2023-12-05[Clang] Fix linker error for function multiversioning (#71706)elizabethandrews
Currently target_clones attribute results in a linker error when there are no multi-versioned function declarations in the calling TU. In the calling TU, the call is generated with the ‘normal’ assembly name. This does not match any of the versions or the ifunc, since version mangling includes a .versionstring, and the ifunc includes .ifunc suffix. The linker error is not seen with GCC since the mangling for the ifunc symbol in GCC is the ‘normal’ assembly name for function i.e. no ifunc suffix. This PR removes the .ifunc suffix to match GCC. It also adds alias with the .ifunc suffix so as to ensure backward compatibility. The changes exclude aarch64 target because the mangling for default versions on aarch64 does not include a .default suffix and is the 'normal' assembly name, unlike other targets. It is not clear to me what the correct behavior for this target is. Old Phabricator review - https://reviews.llvm.org/D158666 --------- Co-authored-by: Tom Honermann <tom@honermann.net>
2023-11-30[clang][AArch64] Pass down stack clash protection options to LLVM/Backend (#68993)Momchil Velikov
2023-11-29[clang][CodeGen] Emit annotations for function declarations. (#66716)Brendan Dahl
Previously, annotations were only emitted for function definitions. With this change annotations are also emitted for declarations. Also, emitting function annotations is now deferred until the end so that the most up to date declaration is used which will have any inherited annotations.
2023-11-29[Flang] Add code-object-version option (#72638)Dominik Adamski
Information about code object version can be configured by the user for AMD GPU target and it needs to be placed in LLVM IR generated by Flang. Information about code object version in MLIR generated by the parser can be reused by other tools. There is no need to specify extra flags if we want to invoke MLIR tools (like fir-opt) separately. Changes in comparison to a8ac93: * added information about required targets for test flang/test/Driver/driver-help.f90
2023-11-28Revert "[Flang] Add code-object-version option (#72638)"Dominik Adamski
This commit causes test errors on buildbots. This reverts commit a8ac930b99d93b2a539ada7e566993d148899144.
2023-11-28[Flang] Add code-object-version option (#72638)Dominik Adamski
Information about code object version can be configured by the user for AMD GPU target and it needs to be placed in LLVM IR generated by Flang. Information about code object version in MLIR generated by the parser can be reused by other tools. There is no need to specify extra flags if we want to invoke MLIR tools (like fir-opt) separately.
2023-11-27[SystemZ][z/OS] This change adds support for the PPA2 section in zOS (#68926)Yusra Syeda
This PR adds support for the PPA2 fields. --------- Co-authored-by: Yusra Syeda <yusra.syeda@ibm.com>
2023-11-20[CodeGenModule] Remove no-op ptr-to-ptr bitcasts (NFC)Youngsuk Kim
Opaque ptr cleanup effort (NFC)
2023-11-13-fcoverage-mapping: simplify. NFCFangrui Song
2023-11-09[CUDA][HIP] Make template implicitly host device (#70369)Yaxun (Sam) Liu
Added option -foffload-implicit-host-device-templates which is off by default. When the option is on, template functions and specializations without host/device attributes have implicit host device attributes. They can be overridden by device template functions with the same signature. They are emitted on device side only if they are used on device side. This feature is added as an extension. `__has_extension(cuda_implicit_host_device_templates)` can be used to check whether it is enabled. This is to facilitate using standard C++ headers for device. Fixes: https://github.com/llvm/llvm-project/issues/69956 Fixes: SWDEV-428314
2023-11-09[C++20] [Modules] Allow export from language linkageChuanqi Xu
Close https://github.com/llvm/llvm-project/issues/71347 Previously I misread the concept of module purview. I thought that if a declaration is attached to an unnamed module, it can't be part of the module purview. But after the issue report, I recognized that module purview is more of a concept about locations than about semantics. Concretely, the things in the language linkage after module declarations can be exported. This patch refactors `Module::isModulePurview()` and introduces some possible code cleanups.
2023-11-07[C++20] [Modules] Don't import function bodies from other module units even with optimizations (#71031)Chuanqi Xu
Close https://github.com/llvm/llvm-project/issues/60996. Previously, clang would try to import function bodies from other module units to get as many optimization opportunities as possible. That motivation became the direct cause of the above issue. However, according to the discussion in SG15, importing function bodies from other module units breaks ABI compatibility, which is unwanted. So the original behavior of clang is incorrect. This patch chooses not to import function bodies from other module units in all cases, to follow the expectation. Note that the desired optimized BMI idea is discarded too, since it would still break ABI compatibility if we imported function bodies separately. The release note will be added separately. There is a similar issue for variable definitions. I'll try to handle that in a different commit.
2023-11-05[clang][CodeGenModule] Remove no-op ptr-to-ptr bitcasts (NFC)Youngsuk Kim
Opaque ptr cleanup effort (NFC).
2023-11-02[clang][NFC] Refactor `clang::Linkage` (#71049)Vlad Serebrennikov
This patch introduces a new enumerator `Invalid = 0`, shifting other enumerators by +1. Contrary to how it might sound, this actually affirms the status quo of how this enum is stored in `clang::Decl`:
```
/// If 0, we have not computed the linkage of this declaration.
/// Otherwise, it is the linkage + 1.
mutable unsigned CacheValidAndLinkage : 3;
```
This patch keeps debuggers from misreading the enumerator stored in this bit-field. It also converts `clang::Linkage` to a scoped enum.
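As a minimal sketch of the idea, assuming a cut-down enumerator list: once `Invalid` occupies the value 0, the cached bit-field can hold the enumerator directly and no "+1" adjustment is needed on read or write. `DeclSketch` and its member names are hypothetical, and only `Invalid = 0` is taken from the commit message.

```cpp
#include <cassert>

// Scoped enum with an explicit "not computed" value at 0. The enumerators
// after Invalid are an illustrative subset, not the full clang list.
enum class Linkage : unsigned char { Invalid = 0, None, Internal, External };

class DeclSketch {
  // Holds a Linkage value; 0 (Linkage::Invalid) means "not computed yet".
  mutable unsigned CachedLinkage : 3;

public:
  DeclSketch() : CachedLinkage(0) {}

  bool hasCachedLinkage() const {
    return static_cast<Linkage>(CachedLinkage) != Linkage::Invalid;
  }

  void setCachedLinkage(Linkage L) const {
    assert(L != Linkage::Invalid && "cannot cache an invalid linkage");
    CachedLinkage = static_cast<unsigned>(L);
  }

  Linkage getCachedLinkage() const {
    return static_cast<Linkage>(CachedLinkage);
  }
};
```

A debugger that prints the bit-field as a `Linkage` now shows the stored enumerator itself, which is the misreading the commit sets out to avoid.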
2023-11-01[clang][NFC] Refactor `LinkageSpecDecl::LanguageIDs`Vlad Serebrennikov
This patch converts `LinkageSpecDecl::LanguageIDs` into scoped enum, and moves it to namespace scope, so that it can be forward-declared where required.
2023-11-01[clang][NFC] Refactor `ObjCMethodDecl::ImplementationControl`Vlad Serebrennikov
This patch moves `ObjCMethodDecl::ImplementationControl` to DeclBase.h so that it's complete at the point where the corresponding bit-field is declared. This patch also converts it to a scoped enum `clang::ObjCImplementationControl`.
2023-10-31[StackProtector] Do not emit the stack protector on GPU architectures (#70799)Joseph Huber
Summary: This patch changes the code generation to not emit the stack protector metadata on unsupported architectures. The issue was caused by system toolchains emitting stack protector option by default which would lead to errors when compiling for the GPU. I elected to change the code generation as we may want to update this in the future so we should keep the `clang` Driver code common. Although the user can use some combination of `-Xarch-device -fno-stack-protector` to override this, it is very irritating to do when we shouldn't emit this incompatible IR anyway. Fixes: https://github.com/llvm/llvm-project/issues/65911
2023-10-31[clang][NFC] Refactor `ArrayType::ArraySizeModifier`Vlad Serebrennikov
This patch moves `ArraySizeModifier` before `Type` declaration so that it's complete at `ArrayTypeBitfields` declaration. It's also converted to scoped enum along the way.
2023-10-17[HIP][Clang][CodeGen] Add CodeGen support for `hipstdpar`Alex Voicu
This patch adds the CodeGen changes needed for enabling HIP parallel algorithm offload on AMDGPU targets. This change relaxes restrictions on what gets emitted on the device path, when compiling in `hipstdpar` mode:
1. Unless a function is explicitly marked `__host__`, it will get emitted, whereas before only `__device__` and `__global__` functions would be emitted;
2. Unsupported builtins are ignored as opposed to being marked as an error, as the decision on their validity is deferred to the `hipstdpar` specific code selection pass;
3. We add a `hipstdpar` specific pass to the opt pipeline, independent of optimisation level:
   - When compiling for the host, iff the user requested it via the `--hipstdpar-interpose-alloc` flag, we add a pass which replaces canonical allocation / deallocation functions with accelerator aware equivalents.
A test to validate that unannotated functions get correctly emitted is added as well.
Reviewed by: yaxunl, efriedma
Differential Revision: https://reviews.llvm.org/D155850
2023-10-10[C++20] [Modules] Don't emit function bodies which is noinline and av… (#68501)Chuanqi Xu
…available externally. A workaround for https://github.com/llvm/llvm-project/issues/60996 As the title suggests, we can avoid emitting available externally functions which are already marked as noinline. Such functions should contribute nothing to optimizations. The update for docs will be sent separately if this gets approved.
2023-10-09[OpenMP] Fix setting visibility on declare target variablesJoseph Huber
Summary: A previous patch changed the logic to force external visibility on declare target variables. This is because they need to be exported in the dynamic symbol table to be usable as the standard depicts. However, the logic was always setting the visibility to `protected`, which would override some symbols. For example, when calling `libc` functions for CPU offloading. This patch changes the logic to only fire if the variable has hidden visibility to start with.
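The changed rule amounts to a one-line decision, sketched here with a hypothetical helper (`declareTargetVisibility` is not a clang function, and the enum is a local stand-in):

```cpp
#include <cassert>

enum class Visibility { Default, Hidden, Protected };

// Visibility to use for a declare target variable: only a variable that
// starts out hidden is promoted to protected, so that it still lands in the
// dynamic symbol table. Any other requested visibility is left untouched.
Visibility declareTargetVisibility(Visibility Requested) {
  return Requested == Visibility::Hidden ? Visibility::Protected : Requested;
}
```

Before the fix, the equivalent of this function returned `Protected` unconditionally, which is what overrode symbols such as the `libc` functions mentioned above.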
2023-10-05[OpenMP] Prevent AMDGPU from overriding visibility on DT_nohost variables (#68264)Joseph Huber
Summary: There's some logic in the AMDGPU target that manually resets the requested visibility of certain variables. This was triggering when we set a constant variable in OpenMP. However, we shouldn't do this for OpenMP when the variable has the `nohost` type. That implies that the variable is not visible to the host and therefore does not need to be visible, so we should respect the original value of it.
2023-10-05[clang] Replace uses of Type::getPointerType (NFC)JOE1994
Opaque pointer clean-up effort
2023-10-02-fsanitize=function: fix MSVC hashing to sugared type (#66816)Matheus Izvekov
Hashing the sugared type instead of the canonical type meant that a simple example like this would always fail under MSVC:
```
static auto l() {}
int main() {
  auto a = l;
  a();
}
```
`clang --target=x86_64-pc-windows-msvc -fno-exceptions -fsanitize=function -g -O0 -fuse-ld=lld -o test.exe test.cc` produces:
```
test.cc:4:3: runtime error: call to function l through pointer to incorrect function type 'void (*)()'
```
2023-10-02[C++] Implement "Deducing this" (P0847R7)Corentin Jabot
This patch implements P0847R7 (partially), CWG2561 and CWG2653. Reviewed By: aaron.ballman, #clang-language-wg Differential Revision: https://reviews.llvm.org/D140828
2023-10-02[C++20] [Modules] Fix crash when emitting module inits for duplicated modulesChuanqi Xu
Close https://github.com/llvm/llvm-project/issues/67893 The root cause of the crash is an oversight that we missed the point that the same module can be imported multiple times. And we should use `SmallSetVector` instead of `SmallVector` to filter the case.
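The difference between the two containers can be sketched without LLVM headers. `OrderedSet` below is a simplified stand-in for `llvm::SmallSetVector` (insertion order kept, duplicates dropped), and `modulesToInitialize` is a hypothetical helper, not the function the patch touches.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Minimal insertion-ordered set: a vector for order plus a hash set for
// membership tests.
class OrderedSet {
  std::vector<std::string> Order;
  std::unordered_set<std::string> Seen;

public:
  // Returns true if Name was inserted, false if it was already present.
  bool insert(const std::string &Name) {
    if (!Seen.insert(Name).second)
      return false;
    Order.push_back(Name);
    return true;
  }

  const std::vector<std::string> &items() const { return Order; }
};

// Collects the modules whose initializers must run: each module once, in
// first-import order, even when the same module is imported several times.
std::vector<std::string>
modulesToInitialize(const std::vector<std::string> &Imports) {
  OrderedSet Unique;
  for (const std::string &M : Imports)
    Unique.insert(M);
  return Unique.items();
}
```

A plain vector would keep the repeated import as a second entry, which is the duplicated-module case the commit describes.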
2023-09-30[clang] Remove uses of llvm::Type::getPointerTo() (NFC)JOE1994
* Remove if its sole use is to support an unnecessary ptr-to-ptr bitcast (remove the bitcast as well)
* Replace with use of other APIs.
NFC opaque pointer cleanup effort.
2023-09-28[NFC] [C++20] [Modules] Refactor Module::getGlobalModuleFragment and Module::getPrivateModuleFragmentChuanqi Xu
The original implementation of `Module::getGlobalModuleFragment` and `Module::getPrivateModuleFragment` tried to find the global module fragment and the private module fragment by comparing strings, which smells bad. This patch tries to improve this.