llvm-project.git/llvm/lib/Target/AMDGPU/AMDGPUArgumentUsageInfo.cpp, branch users/koachan/spr/main.sparcias-enable-parseforallfeatures-in-matchoperandparserimpl

AMDGPU: Add plumbing for private segment size argument (#96445)

2024-06-25T14:20:51+00:00

The actual size of scratch/private is determined at dispatch time, so
add more plumbing to request it. This will be used in a subsequent change.

[AMDGPU] Add DAG ISel support for preloaded kernel arguments

2023-09-25T16:32:59+00:00

This patch adds the DAG isel changes for kernel argument preloading.
These changes are not usable with older firmware but subsequent patches
in the series will make the codegen backwards compatible. This patch
should only be submitted alongside that subsequent patch.

Preloading begins at the first kernel argument and continues for the
number of arguments indicated by the CL flag
amdgpu-kernarg-preload-count.

Aggregates and arguments passed by-ref are not supported.
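
As a rough illustration of the rule above, the following sketch counts how
many leading arguments would be preloaded for a requested count. The KernArg
struct and countPreloadable are hypothetical names, not code from the patch,
and the sketch assumes the preload sequence ends at the first unsupported
argument.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified description of one kernel argument.
struct KernArg {
  bool IsAggregate; // struct or array argument
  bool IsByRef;     // argument passed by reference
};

// Returns how many leading arguments would be preloaded.
// Assumption: preloading is a contiguous prefix that stops at the requested
// count or at the first aggregate/by-ref argument, whichever comes first.
std::size_t countPreloadable(const std::vector<KernArg> &Args,
                             std::size_t RequestedCount) {
  std::size_t N = 0;
  for (const KernArg &A : Args) {
    if (N == RequestedCount)
      break;
    if (A.IsAggregate || A.IsByRef)
      break;
    ++N;
  }
  return N;
}
```

With two scalar arguments followed by an aggregate and a requested count of
three, this sketch reports two preloaded arguments.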

Special care is needed for the alignment of the kernarg segment, as well
as for the alignment of addressable SGPR tuples when we cannot directly
use the misaligned large tuples that the arguments are loaded into.
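
The alignment concern can be sketched with a little arithmetic. The helper
names below are hypothetical, and two things are assumptions made for
illustration rather than statements taken from the patch: that each SGPR
holds one 32-bit dword of consecutive kernarg bytes, and that 64-bit tuples
must start on an even SGPR while wider tuples must start on a multiple of
four.

```cpp
#include <cassert>
#include <cstdint>

// Round Offset up to the next multiple of Align (Align is a power of two).
constexpr std::uint64_t alignUp(std::uint64_t Offset, std::uint64_t Align) {
  return (Offset + Align - 1) & ~(Align - 1);
}

// Assumption: kernarg bytes map to consecutive 32-bit SGPRs starting at
// FirstPreloadSGPR.
constexpr unsigned sgprForOffset(unsigned FirstPreloadSGPR,
                                 std::uint64_t ByteOffset) {
  return FirstPreloadSGPR + static_cast<unsigned>(ByteOffset / 4);
}

// Assumption: a 2-dword tuple needs an even first SGPR, wider tuples need a
// first SGPR that is a multiple of four, single registers need nothing.
constexpr bool isAddressableTuple(unsigned FirstSGPR, unsigned NumDwords) {
  unsigned Required = NumDwords == 1 ? 1 : (NumDwords == 2 ? 2 : 4);
  return FirstSGPR % Required == 0;
}
```

Under these assumptions, a 64-bit argument whose first dword lands in an odd
SGPR cannot be used directly as a register pair and would have to be copied
or split first.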

Reviewed By: bcahoon

Differential Revision: https://reviews.llvm.org/D158579

[AMDGPU] Stop using make_pair and make_tuple. NFC.

2022-12-14T13:22:26+00:00

C++17 allows us to call the pair and tuple constructors directly instead
of the helper functions make_pair and make_tuple.
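
A minimal illustration of the substitution, relying on C++17 class template
argument deduction. The function names are made up for this example and do
not come from the patch.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <utility>

// Pre-C++17 style: the helper function deduces the element types.
std::pair<int, std::string> withHelper() {
  return std::make_pair(1, std::string("sgpr"));
}

// C++17 style: the constructor deduces the element types itself.
std::pair<int, std::string> withConstructor() {
  return std::pair(1, std::string("sgpr"));
}

// The same applies to std::tuple.
std::tuple<int, int, bool> tupleWithConstructor() {
  return std::tuple(4, 5, true);
}
```

Both pair-returning functions produce equal values, which is why the change
is NFC.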

Differential Revision: https://reviews.llvm.org/D139828

[amdgpu] Implement lds kernel id intrinsic

2022-07-19T16:46:19+00:00

Implement an intrinsic for use in lowering LDS variables to different
addresses from different kernels. This will allow kernels that cannot
reach an LDS variable to avoid wasting space for it.

There are a number of implicit arguments already accessed by intrinsics,
so this implementation closely follows the existing handling. It is slightly
novel in that this SGPR is written by the kernel prologue.

In the general case, variables must be placed at different addresses so
that they can be compactly allocated, so an indirect function call needs
some means of determining where a given variable was allocated. It is
sufficient to claim an arbitrary SGPR into which the kernel writes an
integer (in this implementation, based on metadata associated with that
kernel) and which is then passed on to indirect call sites; that integer
determines the variable address.

The intent is to emit a __const array of LDS addresses and index into it.
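
The table lookup described here might look roughly like the sketch below.
The table contents, the NotAllocated sentinel, and lookupLDSAddress are
hypothetical; the sketch only assumes that the kernel id indexes a row and
the variable indexes a column.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t NumKernels = 2;
constexpr std::size_t NumVariables = 3;
// Hypothetical marker for "this kernel cannot reach this variable".
constexpr std::uint32_t NotAllocated = 0xFFFFFFFFu;

// Hypothetical constant table: row = kernel id, column = LDS variable,
// entry = byte address of the variable in that kernel's LDS allocation.
constexpr std::array<std::array<std::uint32_t, NumVariables>, NumKernels>
    LDSAddressTable = {{
        {{0, 16, NotAllocated}}, // kernel 0 reserves no space for variable 2
        {{0, NotAllocated, 4}},  // kernel 1 reserves no space for variable 1
    }};

// KernelId stands for the integer the kernel wrote into the claimed SGPR.
constexpr std::uint32_t lookupLDSAddress(std::uint32_t KernelId,
                                         std::size_t VarIndex) {
  return LDSAddressTable[KernelId][VarIndex];
}
```

A function reached through an indirect call can then resolve a variable's
address from the kernel id alone, and the same variable may sit at different
addresses in different kernels.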

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D125060

AMDGPU: Remove fixed function ABI option

2021-12-11T00:41:19+00:00

[GlobalISel] NFC: Change LLT::vector to take ElementCount.

2021-06-24T10:26:12+00:00

This also adds new interfaces for the fixed- and scalable case:
* LLT::fixed_vector
* LLT::scalable_vector

The strategy for migrating to the new interfaces was as follows:
* If the new LLT is a (modified) clone of another LLT, taking the
  same number of elements, then use LLT::vector(OtherTy.getElementCount())
  or if the number of elements is halved/doubled, it uses .divideCoefficientBy(2)
  or operator*. That is because there is no reason to specifically restrict
  the types to 'fixed_vector'.
* If the algorithm works on the number of elements (as unsigned), then
  just use fixed_vector. This will need to be fixed up in the future when
  modifying the algorithm to also work for scalable vectors, and will
  then need additional tests to confirm the behaviour works the same for
  scalable vectors.
* If the test used the '/*Scalable=*/true' flag of LLT::vector, then
  this is replaced by LLT::scalable_vector.
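
The migration strategy above can be illustrated with simplified stand-in
types. Count and VecTy below are NOT the LLVM classes; they are hypothetical
models that only mimic the method names mentioned in this message, so the
example builds without LLVM headers.

```cpp
#include <cassert>

// Simplified stand-in for an element count (not llvm::ElementCount).
struct Count {
  unsigned MinElts; // known minimum number of elements
  bool Scalable;    // true if the real count is a runtime multiple
  constexpr Count divideCoefficientBy(unsigned D) const {
    return {MinElts / D, Scalable};
  }
  constexpr Count operator*(unsigned F) const {
    return {MinElts * F, Scalable};
  }
};

// Simplified stand-in for a vector type description (not llvm::LLT).
struct VecTy {
  Count EC;
  unsigned ScalarBits;
  static constexpr VecTy vector(Count EC, unsigned Bits) { return {EC, Bits}; }
  static constexpr VecTy fixed_vector(unsigned N, unsigned Bits) {
    return {{N, false}, Bits};
  }
  static constexpr VecTy scalable_vector(unsigned N, unsigned Bits) {
    return {{N, true}, Bits};
  }
  constexpr Count getElementCount() const { return EC; }
};

// First bullet: a modified clone goes through the element count, so the
// result stays scalable if the input was scalable.
constexpr VecTy halveElements(VecTy Ty) {
  return VecTy::vector(Ty.getElementCount().divideCoefficientBy(2),
                       Ty.ScalarBits);
}

// Second bullet: an algorithm working on a plain unsigned count commits to
// the fixed form.
constexpr VecTy makeFromUnsigned(unsigned NumElts, unsigned Bits) {
  return VecTy::fixed_vector(NumElts, Bits);
}
```

The point of the first case is that scalability is carried along without the
caller having to choose between the fixed and scalable constructors.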

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D104451

[NFC][AMDGPU] Reduce include files dependency.

2021-01-07T19:22:05+00:00

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813

AMDGPU/GlobalISel: Add types to special inputs

2020-07-06T21:00:55+00:00

When passing special ABI inputs, we have no existing context for the
type to use.

AMDGPU: Fix fixed ABI SGPR arguments

2020-07-06T13:01:18+00:00

The default constructor wasn't setting isSet on the ArgDescriptor, so
while these had the value set, they were treated as missing. This only
ended up mattering in the indirect call case (and for regular calls in
GlobalISel, which currently doesn't have a way to support the variable
ABI).
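
The class of bug described here can be sketched with a simplified stand-in.
ArgDesc below is a hypothetical model, not the LLVM class: a descriptor that
holds a register but whose "set" flag was never raised looks exactly like a
missing argument to callers that test the flag.

```cpp
#include <cassert>

// Simplified stand-in for a descriptor of where a special input lives.
class ArgDesc {
  unsigned Reg = 0;
  bool IsSet = false;

public:
  // A default-constructed descriptor represents a missing argument.
  ArgDesc() = default;

  // Builds a descriptor for a register. Omitting the IsSet assignment would
  // reproduce the kind of bug described: a value is present, but the
  // descriptor still tests as missing.
  static ArgDesc createRegister(unsigned R) {
    ArgDesc D;
    D.Reg = R;
    D.IsSet = true;
    return D;
  }

  explicit operator bool() const { return IsSet; }

  unsigned getRegister() const {
    assert(IsSet && "querying a missing argument");
    return Reg;
  }
};
```

Callers that guard on the boolean conversion before reading the register are
the ones that silently skip an argument when the flag is wrong.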

AMDGPU: Add flag to use fixed function ABI

2020-03-13T20:27:05+00:00

Pass all arguments to every function, rather than only passing the
minimum set of inputs needed for the call graph.