llvm-project.git/llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp, branch main

AMDGPU: Add instruction flags when lowering ctor/dtor (#111652)

2024-10-09T14:03:35+00:00

These should be well behaved address computations.

AMDGPU: Use pointer types more consistently (#111651)

2024-10-09T13:23:50+00:00

This was using addrspace 0 and 1 pointers interchangeably. This works
out since they happen to be the same size, but consistently query
or use the correct one instead.

AMDGPU: Avoid using hardcoded address space number

2024-10-09T09:15:35+00:00

[llvm] Replace calls to Type::getPointerTo (NFC)

2023-11-30T19:18:51+00:00

Clean-up towards removing method Type::getPointerTo.

[AMDGPU] Call the `FINI_ARRAY` destructors in the correct order (#71815)

2023-11-10T17:01:02+00:00

Summary:
The AMDGPU backend uses the linker-provided INIT_ARRAY and FINI_ARRAY
sections to call all the global constructors in a single kernel.
Previously this mistakenly used the same iteration logic for both
arrays. The destructors stored in FINI_ARRAY are stored in the same
order as the ones in the INIT_ARRAY section, so we need to traverse it
in reverse order.

Relanding after the revert in fe7b5e2cfcf6848287010291081f85fa1f6bb2ef
using the IR builder interface instead of ConstantExpr.
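
The traversal order described in this message can be sketched in plain
host C++. This is only an illustration, not the IR the pass emits: the
names `Callback`, `call_log`, `run_init_array`, `run_fini_array`, and the
`ctor_*` / `dtor_*` functions are hypothetical, and ordinary arrays stand
in for the linker-provided INIT_ARRAY and FINI_ARRAY bounds.

```cpp
#include <string>
#include <vector>

// Hypothetical callback type; the real entries are pointers stored in the
// INIT_ARRAY / FINI_ARRAY sections.
using Callback = void (*)();

// Records the order in which callbacks run (for illustration only).
inline std::vector<std::string> &call_log() {
  static std::vector<std::string> log;
  return log;
}

// Constructors are called front to back.
inline void run_init_array(const Callback *begin, const Callback *end) {
  for (const Callback *it = begin; it != end; ++it)
    (*it)();
}

// Destructors are stored in the same order as the constructors, so the
// array is walked back to front.
inline void run_fini_array(const Callback *begin, const Callback *end) {
  for (const Callback *it = end; it != begin;)
    (*--it)();
}

// Hypothetical constructors and destructors used to observe the order.
inline void ctor_a() { call_log().push_back("ctor_a"); }
inline void ctor_b() { call_log().push_back("ctor_b"); }
inline void dtor_a() { call_log().push_back("dtor_a"); }
inline void dtor_b() { call_log().push_back("dtor_b"); }
```

Running `run_init_array` over `{ctor_a, ctor_b}` and then `run_fini_array`
over `{dtor_a, dtor_b}` calls the destructors in the opposite order to
the constructors, which is the behavior this change is after.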

Revert "[AMDGPU] Call the `FINI_ARRAY` destructors in the correct order (#71815)"

2023-11-10T16:01:06+00:00

This reverts commit c1d5865a313d0a8a254b37c852bdd444453c0f73.

Introduces a new use of ConstantExpr::getAShr().

[AMDGPU] Call the `FINI_ARRAY` destructors in the correct order (#71815)

2023-11-10T15:34:04+00:00

Summary:
The AMDGPU backend uses the linker-provided INIT_ARRAY and FINI_ARRAY
sections to call all the global constructors in a single kernel.
Previously this mistakenly used the same iteration logic for both
arrays. The destructors stored in FINI_ARRAY are stored in the same
order as the ones in the INIT_ARRAY section, so we need to traverse it
in reverse order.

[AMDGPU] Add attribute to AMDGPU ctor / dtor to indicate single threadedness

2023-05-24T12:24:17+00:00

We only expect these ctor / dtor functions to be called with a single
thread. Add the appropriate attributes to indicate this to the backend.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D151153

[AMDGPU] Rewrite device ctor / dtor handling to use .init / .fini sections

2023-05-19T21:22:01+00:00

Currently, AMDGPU has special handling for constructors and destructors.
We manually emit a kernel that calls the functions listed in the global
constructor / destructor list. This currently has two main problems. The
first is that we do not respect the priority and simply call them in any
order. The second is that we redefine the symbol unconditionally, which
could have a different definition, meaning we cannot merge any code
with a constructor post-codegen. This patch changes the handling to
instead use the standard support for traversing the `.init_array` and
`.fini_array` sections the compiler creates. This allows us to emit a
single kernel with `odr` semantics, so even if we emit this multiple
times they will be merged into a single kernel.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D150675

[AMDGPU] Place global constructors in .init_array and .fini_array

2023-04-29T13:40:19+00:00

For the GPU, we emit external kernels that call the initializers and
constructors. However, if we had a persistent kernel, like the `_start`
kernel in the `libc` project, we could call the constructors the
standard way. This patch adds new global variables containing
pointers to the constructors to be called. If these are placed in the
`.init_array` and `.fini_array` sections, then the backend will handle
them specially. The linker will then provide the `__init_array_*` and
`__fini_array_*` symbols to traverse them. An implementation would look
like this.

```
#include <cstdint>

// Linker-provided bounds of the .init_array and .fini_array sections.
extern uintptr_t __init_array_start[];
extern uintptr_t __init_array_end[];
extern uintptr_t __fini_array_start[];
extern uintptr_t __fini_array_end[];

using InitCallback = void(int, char **, char **);
using FiniCallback = void(void);

extern "C" [[gnu::visibility("protected"), clang::amdgpu_kernel]] void
_start(int argc, char **argv, char **envp) {
  // Call every constructor stored in .init_array.
  uint64_t init_array_size = __init_array_end - __init_array_start;
  for (uint64_t i = 0; i < init_array_size; ++i)
    reinterpret_cast<InitCallback *>(__init_array_start[i])(argc, argv, envp);
  // Call every destructor stored in .fini_array.
  uint64_t fini_array_size = __fini_array_end - __fini_array_start;
  for (uint64_t i = 0; i < fini_array_size; ++i)
    reinterpret_cast<FiniCallback *>(__fini_array_start[i])();
}
```

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D149340