summaryrefslogtreecommitdiff
path: root/clang/lib/CodeGen/CodeGenModule.cpp
AgeCommit message (Collapse)Author
2025-06-05[clang] Simplify device kernel attributes (#137882)Nick Sarnie
We have multiple different attributes in clang representing device kernels for specific targets/languages. Refactor them into one attribute with different spellings to make it more easily scalable for new languages/targets. --------- Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-06-04[HLSL][SPIR-V] Implement vk::ext_builtin_input attribute (#138530)Nathan Gauër
This variable attribute is used in HLSL to add Vulkan specific builtins in a shader. The attribute is documented here: https://github.com/microsoft/hlsl-specs/blob/17727e88fd1cb09013cb3a144110826af05f4dd5/proposals/0011-inline-spirv.md Those variable, even if marked as `static` are externally initialized by the pipeline/driver/GPU. This is handled by moving them to a specific address space `hlsl_input`, also added by this commit. The design for input variables in Clang can be found here: https://github.com/llvm/wg-hlsl/blob/355771361ef69259fef39a65caef8bff9cb4046d/proposals/0019-spirv-input-builtin.md Co-authored-by: Justin Bogner <mail@justinbogner.com>
2025-06-02[Clang][FMV] Stop emitting implicit default version using target_clones. ↵Alexandros Lamprineas
(#141808) With the current behavior the following example yields a linker error: "multiple definition of `foo.default'" // Translation Unit 1 __attribute__((target_clones("dotprod, sve"))) int foo(void) { return 1; } // Translation Unit 2 int foo(void) { return 0; } __attribute__((target_version("dotprod"))) int foo(void); __attribute__((target_version("sve"))) int foo(void); int bar(void) { return foo(); } That is because foo.default is generated twice. As a user I don't find this particularly intuitive. If I wanted the default to be generated in TU1 I'd rather write target_clones("dotprod, sve", "default") explicitly. When changing the code I noticed that the RISC-V target defers the resolver emission when encountering a target_version definition. This seems accidental since it only makes sense for AArch64, where we only emit a resolver once we've processed the entire TU, and only if the default version is present. I've changed this so that RISC-V immediately emmits the resolver. I adjusted the codegen tests since the functions now appear in a different order. Implements https://github.com/ARM-software/acle/pull/377
2025-05-30Revert "Add IR Profile-Guided Optimization (IR PGO) support to the Flang ↵Tarun Prabhu
compiler" (#142159) Reverts llvm/llvm-project#136098
2025-05-30Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler ↵FYK
(#136098) This patch implements IR-based Profile-Guided Optimization support in Flang through the following flags: - `-fprofile-generate` for instrumentation-based profile generation - `-fprofile-use=<dir>/file` for profile-guided optimization Resolves #74216 (implements IR PGO support phase) **Key changes:** - Frontend flag handling aligned with Clang/GCC semantics - Instrumentation hooks into LLVM PGO infrastructure - LIT tests verifying: - Instrumentation metadata generation - Profile loading from specified path - Branch weight attribution (IR checks) **Tests:** - Added gcc-flag-compatibility.f90 test module verifying: - Flag parsing boundary conditions - IR-level profile annotation consistency - Profile input path normalization rules - SPEC2006 benchmark results will be shared in comments For details on LLVM's PGO framework, refer to [Clang PGO Documentation](https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization). This implementation was developed by [XSCC Compiler Team](https://github.com/orgs/OpenXiangShan/teams/xscc). --------- Co-authored-by: ict-ql <168183727+ict-ql@users.noreply.github.com> Co-authored-by: Tom Eccles <t@freedommail.info>
2025-05-27Avoid emitting a linker options section in the compiler if it is empty. ↵dyung
(#139821) Recently in some of our internal testing, we noticed that the compiler was sometimes generating an empty linker.options section which seems unnecessary. This proposed change causes the compiler to simply omit emitting the linker.options section if it is empty.
2025-05-27[HIP] disable sanitizer for `__hip_cuid` (#141581)Yaxun (Sam) Liu
Global variable `__hip_cuid_*` is for identifying purpose and does not need sanitization, therefore disable it for sanitizers.
2025-05-14[Cygwin] Global symbols should be external by default (#139797)Tomohiro Kashiwada
Behaves as same as both of Clang and GCC targetting MinGW. Required for compatibility for Cygwin-GCC. Divided from https://github.com/llvm/llvm-project/pull/138773
2025-05-13Emit nested unused enum types with -fno-eliminate-unused-debug-types (#137818)ykhatav
Unused types are retained in the debug info when -fno-eliminate-unused-debug-types is specified. However, unused nested enums were not being emitted even with this option. This patch fixes the missing emission of unused nested enums with -fno-eliminate-unused-debug-types
2025-05-11[clang] Use StringRef::consume_front (NFC) (#139472)Kazu Hirata
2025-05-09[win][x64] Unwind v2 3/n: Add support for emitting unwind v2 information ↵Daniel Paoliello
(equivalent to MSVC /d2epilogunwind) (#129142) Adds support for emitting Windows x64 Unwind V2 information, includes support `/d2epilogunwind` in clang-cl. Unwind v2 adds information about the epilogs in functions such that the unwinder can unwind even in the middle of an epilog, without having to disassembly the function to see what has or has not been cleaned up. Unwind v2 requires that all epilogs are in "canonical" form: * If there was a stack allocation (fixed or dynamic) in the prolog, then the first instruction in the epilog must be a stack deallocation. * Next, for each `PUSH` in the prolog there must be a corresponding `POP` instruction in exact reverse order. * Finally, the epilog must end with the terminator. This change adds a pass to validate epilogs in modules that have Unwind v2 enabled and, if they pass, emits new pseudo instructions to MC that 1) note that the function is using unwind v2 and 2) mark the start of the epilog (this is either the first `POP` if there is one, otherwise the terminator instruction). If a function does not meet these requirements, it is downgraded to Unwind v1 (i.e., these new pseudo instructions are not emitted). Note that the unwind v2 table only marks the size of the epilog in the "header" unwind code, but it's possible for epilogs to use different terminator instructions thus they are not all the same size. As a work around for this, MC will assume that all terminator instructions are 1-byte long - this still works correctly with the Windows unwinder as it is only using the size to do a range check to see if a thread is in an epilog or not, and since the instruction pointer will never be in the middle of an instruction and the terminator is always at the end of an epilog the range check will function correctly. This does mean, however, that the "at end" optimization (where an epilog unwind code can be elided if the last epilog is at the end of the function) can only be used if the terminator is 1-byte long. One other complication with the implementation is that the unwind table for a function is emitted during streaming, however we can't calculate the distance between an epilog and the end of the function at that time as layout hasn't been completed yet (thus some instructions may be relaxed). To work around this, epilog unwind codes are emitted via a fixup. This also means that we can't pre-emptively downgrade a function to Unwind v1 if one of these offsets is too large, so instead we raise an error (but I've passed through the location information, so the user will know which of their functions is problematic).
2025-05-09clang: Remove dest LangAS argument from performAddrSpaceCast (#138866)Matt Arsenault
It isn't used and is redundant with the result pointer type argument. A more reasonable API would only have LangAS parameters, or IR parameters, not both. Not all values have a meaningful value for this. I'm also not sure why we have this at all, it's not overridden by any targets and further simplification is possible.
2025-05-07[Clang][OpenCL][AMDGPU] OpenCL Kernel stubs should be assigned alwaysinline ↵Aniket Lal
attribute (#137769) OpenCL Kernels body is emitted as stubs and the kernel is emitted as call to respective stub. (https://github.com/llvm/llvm-project/pull/115821). The stub function should be alwaysinlined, since call to stub can cause performance drop. Co-authored-by: anikelal <anikelal@amd.com>
2025-04-30[nfc][clang] Rename function (#137874)Prabhu Rajasekaran
Rename function to meet the coding guidelines.
2025-04-30[clang] Fix UEFI Target info (#127290)Prabhu Rajasekaran
For X64 UEFI targets set appropriate integer type sizes, and relevant ABI information. --------- Co-authored-by: Petr Hosek <phosek@google.com>
2025-04-29[HLSL] Resource initialization by constructors (#135120)Helena Kotas
- Adds resource constructor that takes explicit binding for all resource classes. - Updates implementation of default resource constructor to initialize resource handle to `poison`. - Removes initialization of resource classes from Codegen. - Initialization of `cbuffer` still needs to happen in `CGHLSLRuntime` because it does not have a corresponding resource class type. - Adds `ImplicitCastExpr` for builtin function calls. Sema adds these automatically when a method of a template class is instantiated, but some resource classes like `ByteAddressBuffer` are not templates so they need to have this added explicitly. Design proposal: llvm/wg-hlsl#197 Closes #134154
2025-04-25[NFC] move comment back to where it belongs.erichkeane
It appears that OpenMP changes caused this comment to get moved away from where it was intended, so this patch puts it back next to the multiversioning code, as intended.
2025-04-24[BPF] Fix issues with external declarations of C++ structor decls (#137079)Reid Kleckner
Use GetAddrOfGlobal, which is a more general API that takes a GlobalDecl, and handles declaring C++ destructors and other types in a general way. We can use this to generalize over functions and variable declarations. This fixes issues reported on #130674 by @lexi-nadia .
2025-04-23Revert unintentional diff from cd826d6e840ed33ad88458c862da5f9fcc6e908cReid Kleckner
This is part of a forthcoming fix for issues observed in #91310, and was unintentionally committed as part of the VTT type changes revert
2025-04-23Revert "[Clang,debuginfo] added vtt parameter in destructor DISubroutineType ↵Reid Kleckner
(#130674)" This reverts commit 27c1aa9b9cf9e0b14211758ff8f7d3aaba24ffcf. See comments on PR. After this change, Clang now asserts like this: clang: ../llvm/include/llvm/IR/Metadata.h:1435: const MDOperand &llvm::MDNode::getOperand(unsigned int) const: Assertion `I < getNumOperands() && "Out of range"' failed. ... #8 0x000055f345c4e4cb clang::CodeGen::CGDebugInfo::getOrCreateInstanceMethodType() #9 0x000055f345c5ba4f clang::CodeGen::CGDebugInfo::EmitFunctionDecl() #10 0x000055f345b52519 clang::CodeGen::CodeGenModule::EmitExternalFunctionDeclaration() This is due to pre-existing jankiness in the way BPF emits extra declarations for debug info, but we should rollback and then fix forward.
2025-04-17[SYCL] Basic code generation for SYCL kernel caller offload entry point ↵Tom Honermann
functions. (#133030) A function declared with the `sycl_kernel_entry_point` attribute, sometimes called a SYCL kernel entry point function, specifies a pattern from which the parameters and body of an offload entry point function, sometimes called a SYCL kernel caller function, are derived. SYCL kernel caller functions are emitted during SYCL device compilation. Their parameters and body are derived from the `SYCLKernelCallStmt` statement and `OutlinedFunctionDecl` declaration associated with their corresponding SYCL kernel entry point function. A distinct SYCL kernel caller function is generated for each SYCL kernel entry point function defined as a non-inline function or ODR-used in the translation unit. The name of each SYCL kernel caller function is parameterized by the SYCL kernel name type specified by the `sycl_kernel_entry_point` attribute attached to the corresponding SYCL kernel entry point function. For the moment, the Itanium ABI mangled name for typeinfo data (`_ZTS<type>`) is used to name these functions; a future change will switch to a more appropriate naming scheme. The calling convention used for a SYCL kernel caller function is target dependent. Support for AMDGCN, NVPTX, and SPIR targets is currently provided. These functions are required to observe the language restrictions for SYCL devices as specified by the SYCL 2020 specification; this includes a forward progress guarantee and prohibits recursion. Only SYCL kernel caller functions, functions declared as `SYCL_EXTERNAL`, and functions directly or indirectly referenced from those functions should be emitted during device compilation. Pruning of other declarations has not yet been implemented. --------- Co-authored-by: Elizabeth Andrews <elizabeth.andrews@intel.com>
2025-04-15Introduce -funique-source-file-names flag.Peter Collingbourne
The purpose of this flag is to allow the compiler to assume that each object file passed to the linker has been compiled using a unique source file name. This is useful for reducing link times when doing ThinLTO in combination with whole-program devirtualization or CFI, as it allows modules without exported symbols to be built with ThinLTO. Reviewers: vitalybuka, teresajohnson Reviewed By: teresajohnson Pull Request: https://github.com/llvm/llvm-project/pull/135728
2025-04-14[MS][clang] Revert vector deleting destructors support (#135611)Mariya Podchishchaeva
Finding operator delete[] is still problematic, without it the extension is a security hazard, so reverting until the problem with operator delete[] is figured out. This reverts the following PRs: Reland [MS][clang] Add support for vector deleting destructors (llvm#133451) [MS][clang] Make sure vector deleting dtor calls correct operator delete (llvm#133950) [MS][clang] Fix crash on deletion of array of pointers (llvm#134088) [clang] Do not diagnose unused deleted operator delete[] (llvm#134357) [MS][clang] Error about ambiguous operator delete[] only when required (llvm#135041)
2025-04-08[Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (#115821)Aniket Lal
This feature is currently not supported in the compiler. To facilitate this we emit a stub version of each kernel function body with different name mangling scheme, and replaces the respective kernel call-sites appropriately. Fixes https://github.com/llvm/llvm-project/issues/60313 D120566 was an earlier attempt made to upstream a solution for this issue. --------- Co-authored-by: anikelal <anikelal@amd.com>
2025-03-31Reland [MS][clang] Add support for vector deleting destructors (#133451)Mariya Podchishchaeva
Whereas it is UB in terms of the standard to delete an array of objects via pointer whose static type doesn't match its dynamic type, MSVC supports an extension allowing to do it. Aside from array deletion not working correctly in the mentioned case, currently not having this extension implemented causes clang to generate code that is not compatible with the code generated by MSVC, because clang always puts scalar deleting destructor to the vftable. This PR aims to resolve these problems. It was reverted due to link time errors in chromium with sanitizer coverage enabled, which is fixed by https://github.com/llvm/llvm-project/pull/131929 . The second commit of this PR also contains a fix for a runtime failure in chromium reported in https://github.com/llvm/llvm-project/pull/126240#issuecomment-2730216384 . Fixes https://github.com/llvm/llvm-project/issues/19772
2025-03-30[clang] Use DenseMap::insert_range (NFC) (#133655)Kazu Hirata
2025-03-28[clang][flang][Triple][llvm] Add isOffload function to LangOpts and isGPU ↵Nick Sarnie
function to Triple (#126956) I'm adding support for SPIR-V, so let's consolidate these checks. --------- Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-03-21Reland: [clang] preserve class type sugar when taking pointer to member ↵Matheus Izvekov
(#132401) Original PR: #130537 Originally reverted due to revert of dependent commit. Relanding with no changes. This changes the MemberPointerType representation to use a NestedNameSpecifier instead of a Type to represent the base class. Since the qualifiers are always parsed as nested names, there was an impedance mismatch when converting these back and forth into types, and this led to issues in preserving sugar. The nested names are indeed a better match for these, as the differences which a QualType can represent cannot be expressed syntatically, and they represent the use case more exactly, being either dependent or referring to a CXXRecord, unqualified. This patch also makes the MemberPointerType able to represent sugar for a {up/downcast}cast conversion of the base class, although for now the underlying type is canonical, as preserving the sugar up to that point requires further work. As usual, includes a few drive-by fixes in order to make use of the improvements.
2025-03-20Revert "Reland: [clang] preserve class type sugar when taking pointer to ↵Matheus Izvekov
member" (#132280) Reverts llvm/llvm-project#132234 Needs to be reverted due to dependency. This blocks reverting another PR, see here: https://github.com/llvm/llvm-project/pull/131965#issuecomment-2741619498
2025-03-20[CodeGen] Fix unused variable warning (NFC)Nikita Popov
2025-03-20Reland: [clang] preserve class type sugar when taking pointer to member ↵Matheus Izvekov
(#132234) Original PR: #130537 Reland after updating lldb too. This changes the MemberPointerType representation to use a NestedNameSpecifier instead of a Type to represent the base class. Since the qualifiers are always parsed as nested names, there was an impedance mismatch when converting these back and forth into types, and this led to issues in preserving sugar. The nested names are indeed a better match for these, as the differences which a QualType can represent cannot be expressed syntatically, and they represent the use case more exactly, being either dependent or referring to a CXXRecord, unqualified. This patch also makes the MemberPointerType able to represent sugar for a {up/downcast}cast conversion of the base class, although for now the underlying type is canonical, as preserving the sugar up to that point requires further work. As usual, includes a few drive-by fixes in order to make use of the improvements.
2025-03-20Revert "[clang] improve class type sugar preservation in pointers to ↵Matheus Izvekov
members" (#132215) Reverts llvm/llvm-project#130537 This missed updating lldb, which we didn't notice due to lack of pre-commit CI.
2025-03-20[CodeGen] Don't explicitly set intrinsic attributes (NFCI)Nikita Popov
The intrinsic attributes are automatically set when the function is created, there is no need to assign them explicitly.
2025-03-20[clang] improve class type sugar preservation in pointers to members (#130537)Matheus Izvekov
This changes the MemberPointerType representation to use a NestedNameSpecifier instead of a Type to represent the class. Since the qualifiers are always parsed as nested names, there was an impedance mismatch when converting these back and forth into types, and this led to issues in preserving sugar. The nested names are indeed a better match for these, as the differences which a QualType can represent cannot be expressed syntactically, and it also represents the use case more exactly, being either dependent or referring to a CXXRecord, unqualified. This patch also makes the MemberPointerType able to represent sugar for a {up/downcast}cast conversion of the base class, although for now the underlying type is canonical, as preserving the sugar up to that point requires further work. As usual, includes a few drive-by fixes in order to make use of the improvements, and removing some duplications, for example CheckBaseClassAccess is deduplicated from across SemaAccess and SemaCast.
2025-03-12Revert "[MS][clang] Add support for vector deleting destructors (#126240)"Hans Wennborg
This caused link errors when building with sancov. See comment on the PR. > Whereas it is UB in terms of the standard to delete an array of objects > via pointer whose static type doesn't match its dynamic type, MSVC > supports an extension allowing to do it. > Aside from array deletion not working correctly in the mentioned case, > currently not having this extension implemented causes clang to generate > code that is not compatible with the code generated by MSVC, because > clang always puts scalar deleting destructor to the vftable. This PR > aims to resolve these problems. > > Fixes https://github.com/llvm/llvm-project/issues/19772 This reverts commit d6942d54f677000cf713d2b0eba57b641452beb4.
2025-03-09[Clang][CodeGen] Fix demangler invariant comment assertion (#130522)Aiden Grossman
This patch makes the assertion (that is currently in a comment) that validates that names mangled by clang can be demangled by LLVM actually compile/work. There were some minor issues that needed to be fixed (like starts_with not being available on std::string and needing to call getDecl() on GD), and a logic issue that should be fixed in this patch. This enables just uncommenting the assertion to enable it within the compiler (minus needing to add the header file).
2025-03-07[OpenACC] Ensure decl OpenACC constructs don't crasherichkeane
I initially implemented codegen to be a 'no-op' for these declarations, which I thought was properly implemented. However, when they are a top-level decl, we have a separate switch. This patch makes sure they are properly emitted at top-level as a no-op, and adds a test for both top-level and not top-level.
2025-03-05[HLSL] Fix resource wrapper declaration (#129100)Steven Perron
The resource wrapper should have internal linkage because it contains a handle to the global resource, and it not the actual global. Makeing this changed exposed that we were zeroinitializing the resouce, which is a problem. The handle cannot be zeroinitialized. This is changed to use poison instead. Fixes https://github.com/llvm/llvm-project/issues/122767. --------- Co-authored-by: Helena Kotas <hekotas@microsoft.com>
2025-03-04[MS][clang] Add support for vector deleting destructors (#126240)Mariya Podchishchaeva
Whereas it is UB in terms of the standard to delete an array of objects via pointer whose static type doesn't match its dynamic type, MSVC supports an extension allowing to do it. Aside from array deletion not working correctly in the mentioned case, currently not having this extension implemented causes clang to generate code that is not compatible with the code generated by MSVC, because clang always puts scalar deleting destructor to the vftable. This PR aims to resolve these problems. Fixes https://github.com/llvm/llvm-project/issues/19772
2025-02-27Add clang atomic control options and attribute (#114841)Yaxun (Sam) Liu
Add option and statement attribute for controlling emitting of target-specific metadata to atomicrmw instructions in IR. The RFC for this attribute and option is https://discourse.llvm.org/t/rfc-add-clang-atomic-control-options-and-pragmas/80641, Originally a pragma was proposed, then it was changed to clang attribute. This attribute allows users to specify one, two, or all three options and must be applied to a compound statement. The attribute can also be nested, with inner attributes overriding the options specified by outer attributes or the target's default options. These options will then determine the target-specific metadata added to atomic instructions in the IR. In addition to the attribute, three new compiler options are introduced: `-f[no-]atomic-remote-memory`, `-f[no-]atomic-fine-grained-memory`, `-f[no-]atomic-ignore-denormal-mode`. These compiler options allow users to override the default options through the Clang driver and front end. `-m[no-]unsafe-fp-atomics` is aliased to `-f[no-]ignore-denormal-mode`. In terms of implementation, the atomic attribute is represented in the AST by the existing AttributedStmt, with minimal changes to AST and Sema. During code generation in Clang, the CodeGenModule maintains the current atomic options, which are used to emit the relevant metadata for atomic instructions. RAII is used to manage the saving and restoring of atomic options when entering and exiting nested AttributedStmt.
2025-02-25[HLSL] Implement default constant buffer $Globals (2nd attempt) (#128589)Helena Kotas
All variable declarations in the global scope that are not resources, static or empty are implicitly added to implicit constant buffer `$Globals`. They are created in `hlsl_constant` address space and collected in an implicit `HLSLBufferDecl` node that is added to the AST at the end of the translation unit. Codegen is the same as for explicit constant buffers. Fixes #123801 This is a second attempt to implement this feature. The first attempt had to be reverted because of memory leaks. The problem was adding a `SmallVector` member on `HLSLBufferDecl` node to represent a list of default buffer declarations. When this vector needed to grow, it allocated memory that was never released, because all memory used by AST nodes must be allocated by `ASTContext` allocator and is released all at once. Destructors on AST nodes are never called. It this change the list of default buffer declarations is collected in a `SmallVector` instance on `SemaHLSL`. The `HLSLBufDecl` representing `$Globals` is created at the end of the translation unit when the number of declarations is known, and the list is copied into an array allocated by the `ASTContext` allocator.
2025-02-20Revert "[HLSL] Implement default constant buffer `$Globals`" (#128112)Helena Kotas
Reverts llvm/llvm-project#125807 Reverting this change because of failing tests.
2025-02-20[HLSL] Implement default constant buffer `$Globals` (#125807)Helena Kotas
All variable declarations in the global scope that are not resources, static or empty are implicitly added to implicit constant buffer `$Globals`. They are created in `hlsl_constant` address space and collected in an implicit `HLSLBufferDecl` node that is added to the AST at the end of the translation unit. Codegen is the same as for explicit constant buffers. Fixes #123801
2025-02-10[OpenMP][OpenMPIRBuilder] Add initial changes for SPIR-V target frontend ↵Nick Sarnie
support (#125920) As Intel is working to add support for SPIR-V OpenMP device offloading in upstream clang/liboffload, we need to modify the OpenMP frontend to allow SPIR-V as well as generate valid IR for SPIR-V. For example, we need the frontend to generate code to define and interact with device globals used in the DeviceRTL. This is the beginning of what I expect will be (many) other changes, but let's get started with something simple. --------- Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-02-08[CodeGen] Replace of PointerType::get(Type) with opaque version (NFC) (#124771)Mats Jun Larsen
Follow-up to https://github.com/llvm/llvm-project/issues/123569
2025-02-06[X86] Extend kCFI with a 3-bit arity indicator (#121070)Scott Constable
Kernel Control Flow Integrity (kCFI) is a feature that hardens indirect calls by comparing a 32-bit hash of the function pointer's type against a hash of the target function's type. If the hashes do not match, the kernel may panic (or log the hash check failure, depending on the kernel's configuration). These hashes are computed at compile time by applying the xxHash64 algorithm to each mangled canonical function (or function pointer) type, then truncating the result to 32 bits. This hash is written into each indirect-callable function header by encoding it as the 32-bit immediate operand to a `MOVri` instruction, e.g.: ``` __cfi_foo: nop nop nop nop nop nop nop nop nop nop nop movl $199571451, %eax # hash of foo's type = 0xBE537FB foo: ... ``` This PR extends x86-based kCFI with a 3-bit arity indicator encoded in the `MOVri` instruction's register (reg) field as follows: | Arity Indicator | Description | Encoding in reg field | | --------------- | --------------- | --------------- | | 0 | 0 parameters | EAX | | 1 | 1 parameter in RDI | ECX | | 2 | 2 parameters in RDI and RSI | EDX | | 3 | 3 parameters in RDI, RSI, and RDX | EBX | | 4 | 4 parameters in RDI, RSI, RDX, and RCX | ESP | | 5 | 5 parameters in RDI, RSI, RDX, RCX, and R8 | EBP | | 6 | 6 parameters in RDI, RSI, RDX, RCX, R8, and R9 | ESI | | 7 | At least one parameter may be passed on the stack | EDI | For example, if `foo` takes 3 register arguments and no stack arguments then the `MOVri` instruction in its kCFI header would instead be written as: ``` movl $199571451, %ebx # hash of foo's type = 0xBE537FB ``` This PR will benefit other CFI approaches that build on kCFI, such as FineIBT. For example, this proposed enhancement to FineIBT must be able to infer (at kernel init time) which registers are live at an indirect call target: https://lkml.org/lkml/2024/9/27/982. If the arity bits are available in the kCFI function header, then this information is trivial to infer. Note that there is another existing PR proposal that includes the 3-bit arity within the existing 32-bit immediate field, which introduces different security properties: https://github.com/llvm/llvm-project/pull/117121.
2025-02-04[StrTable] Switch Clang builtins to use string tablesChandler Carruth
This both reapplies #118734, the initial attempt at this, and updates it significantly. First, it uses the newly added `StringTable` abstraction for string tables, and simplifies the construction to build the string table and info arrays separately. This should reduce any `constexpr` compile time memory or CPU cost of the original PR while significantly improving the APIs throughout. It also restructures the builtins to support sharding across several independent tables. This accomplishes two improvements from the original PR: 1) It improves the APIs used significantly. 2) When builtins are defined from different sources (like SVE vs MVE in AArch64), this allows each of them to build their own string table independently rather than having to merge the string tables and info structures. 3) It allows each shard to factor out a common prefix, often cutting the size of the strings needed for the builtins by a factor two. The second point is important both to allow different mechanisms of construction (for example a `.def` file and a tablegen'ed `.inc` file, or different tablegen'ed `.inc files), it also simply reduces the sizes of these tables which is valuable given how large they are in some cases. The third builds on that size reduction. Initially, we use this new sharding rather than merging tables in AArch64, LoongArch, RISCV, and X86. Mostly this helps ensure the system works, as without further changes these still push scaling limits. Subsequent commits will more deeply leverage the new structure, including using the prefix capabilities which cannot be easily factored out here and requires deep changes to the targets.
2025-02-03[clang] Do not emit template parameter objects as COMDATs when they have ↵Owen Anderson
internal linkage. (#125448) Per the ELF spec, section groups may only contain local symbols if those symbols are only referenced from within the section group. [1] In the case of template parameter objects, they can be referenced from outside the group when the type of the object was declared in an anonymous namespace. In that case, we can't place the object in a COMDAT. This matches GCC's linkage behavior on the test input. [1]: https://www.sco.com/developers/gabi/latest/ch4.sheader.html#section_groups
2025-01-30[clang][llvm][aarch64][win] Add a clang flag and module attribute for import ↵Daniel Paoliello
call optimization, and remove LLVM flag (#122831) Switches import call optimization from being enabled by an LLVM flag to instead using a module attribute, and creates a new Clang flag that will set that attribute. This addresses the concern raised in the original PR: <https://github.com/llvm/llvm-project/pull/121516#discussion_r1911763991> This change also only creates the Called Global info if the module attribute is present, addressing this concern: <https://github.com/llvm/llvm-project/pull/122762#pullrequestreview-2547595934>
2025-01-29[Clang][P1061] Add stuctured binding packs (#121417)Jason Rice
This is an implementation of P1061 Structure Bindings Introduce a Pack without the ability to use packs outside of templates. There is a couple of ways the AST could have been sliced so let me know what you think. The only part of this change that I am unsure of is the serialization/deserialization stuff. I followed the implementation of other Exprs, but I do not really know how it is tested. Thank you for your time considering this. --------- Co-authored-by: Yanzuo Liu <zwuis@outlook.com>