<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/lib/Target/AMDGPU/AMDGPUArgumentUsageInfo.cpp, branch users/koachan/spr/main.sparcias-enable-parseforallfeatures-in-matchoperandparserimpl</title>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>AMDGPU: Add plumbing for private segment size argument (#96445)</title>
<updated>2024-06-25T14:20:51+00:00</updated>
<author>
<name>Nicolai Hähnle</name>
<email>nicolai.haehnle@amd.com</email>
</author>
<published>2024-06-25T14:20:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=7e9b49f6b86c8616e6211ec02dbccc3ebb615e79'/>
<id>7e9b49f6b86c8616e6211ec02dbccc3ebb615e79</id>
<content type='text'>
The actual size of scratch/private is determined at dispatch time, so
add more plumbing to request it. Will be used in subsequent change.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The actual size of scratch/private is determined at dispatch time, so
add more plumbing to request it. Will be used in subsequent change.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Add DAG ISel support for preloaded kernel arguments</title>
<updated>2023-09-25T16:32:59+00:00</updated>
<author>
<name>Austin Kerbow</name>
<email>Austin.Kerbow@amd.com</email>
</author>
<published>2023-08-02T00:51:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=0455596e1e7fecb6c76de9eba4e2ffc8772eadc2'/>
<id>0455596e1e7fecb6c76de9eba4e2ffc8772eadc2</id>
<content type='text'>
This patch adds the DAG isel changes for kernel argument preloading.
These changes are not usable with older firmware but subsequent patches
in the series will make the codegen backwards compatible. This patch
should only be submitted alongside that subsequent patch.

Preloading starts at the first kernel argument and covers the number
of arguments indicated by the CL flag
amdgpu-kernarg-preload-count.

Aggregates and arguments passed by-ref are not supported.

Special care is needed for the alignment of the kernarg segment, as
well as for the alignment of addressable SGPR tuples when we cannot
directly use the misaligned large tuples that the arguments are
loaded into.

Reviewed By: bcahoon

Differential Revision: https://reviews.llvm.org/D158579
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds the DAG isel changes for kernel argument preloading.
These changes are not usable with older firmware but subsequent patches
in the series will make the codegen backwards compatible. This patch
should only be submitted alongside that subsequent patch.

Preloading starts at the first kernel argument and covers the number
of arguments indicated by the CL flag
amdgpu-kernarg-preload-count.

Aggregates and arguments passed by-ref are not supported.

Special care is needed for the alignment of the kernarg segment, as
well as for the alignment of addressable SGPR tuples when we cannot
directly use the misaligned large tuples that the arguments are
loaded into.

Reviewed By: bcahoon

Differential Revision: https://reviews.llvm.org/D158579
</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Stop using make_pair and make_tuple. NFC.</title>
<updated>2022-12-14T13:22:26+00:00</updated>
<author>
<name>Jay Foad</name>
<email>jay.foad@amd.com</email>
</author>
<published>2022-12-12T10:58:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=6443c0ee02c7785bc917fcf508f4cc7ded38487a'/>
<id>6443c0ee02c7785bc917fcf508f4cc7ded38487a</id>
<content type='text'>
C++17 allows us to call the pair and tuple constructors directly instead
of the helper functions make_pair and make_tuple.

Differential Revision: https://reviews.llvm.org/D139828
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
C++17 allows us to call the pair and tuple constructors directly instead
of the helper functions make_pair and make_tuple.

Differential Revision: https://reviews.llvm.org/D139828
</pre>
</div>
</content>
</entry>
<entry>
<title>[amdgpu] Implement lds kernel id intrinsic</title>
<updated>2022-07-19T16:46:19+00:00</updated>
<author>
<name>Jon Chesterfield</name>
<email>jonathanchesterfield@gmail.com</email>
</author>
<published>2022-07-19T16:46:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=3a20597776a5d2920e511d81653b4d2b6ca0c855'/>
<id>3a20597776a5d2920e511d81653b4d2b6ca0c855</id>
<content type='text'>
Implement an intrinsic for use in lowering LDS variables to different
addresses from different kernels. This will allow kernels that cannot
reach an LDS variable to avoid wasting space for it.

There are a number of implicit arguments already accessed by intrinsics,
so this implementation closely follows the existing handling. It is slightly
novel in that this SGPR is written by the kernel prologue.

It is necessary in the general case to put variables at different addresses
such that they can be compactly allocated and thus necessary for an
indirect function call to have some means of determining where a
given variable was allocated. Claiming an arbitrary SGPR into which
an integer can be written by the kernel, in this implementation based
on metadata associated with that kernel, which is then passed on to
indirect call sites is sufficient to determine the variable address.

The intent is to emit a __const array of LDS addresses and index into it.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D125060
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Implement an intrinsic for use in lowering LDS variables to different
addresses from different kernels. This will allow kernels that cannot
reach an LDS variable to avoid wasting space for it.

There are a number of implicit arguments already accessed by intrinsics,
so this implementation closely follows the existing handling. It is slightly
novel in that this SGPR is written by the kernel prologue.

It is necessary in the general case to put variables at different addresses
such that they can be compactly allocated and thus necessary for an
indirect function call to have some means of determining where a
given variable was allocated. Claiming an arbitrary SGPR into which
an integer can be written by the kernel, in this implementation based
on metadata associated with that kernel, which is then passed on to
indirect call sites is sufficient to determine the variable address.

The intent is to emit a __const array of LDS addresses and index into it.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D125060
</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Remove fixed function ABI option</title>
<updated>2021-12-11T00:41:19+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2021-08-14T19:52:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=06b90175e7e3ff2c9298ad3a15a00c1f04ae7029'/>
<id>06b90175e7e3ff2c9298ad3a15a00c1f04ae7029</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[GlobalISel] NFC: Change LLT::vector to take ElementCount.</title>
<updated>2021-06-24T10:26:12+00:00</updated>
<author>
<name>Sander de Smalen</name>
<email>sander.desmalen@arm.com</email>
</author>
<published>2021-06-24T08:58:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=d5e14ba88cbf353236faa45caf626c2a30a1cb0c'/>
<id>d5e14ba88cbf353236faa45caf626c2a30a1cb0c</id>
<content type='text'>
This also adds new interfaces for the fixed- and scalable case:
* LLT::fixed_vector
* LLT::scalable_vector

The strategy for migrating to the new interfaces was as follows:
* If the new LLT is a (modified) clone of another LLT, taking the
  same number of elements, then use LLT::vector(OtherTy.getElementCount())
  or if the number of elements is halved/doubled, it uses .divideCoefficientBy(2)
  or operator*. That is because there is no reason to specifically restrict
  the types to 'fixed_vector'.
* If the algorithm works on the number of elements (as unsigned), then
  just use fixed_vector. This will need to be fixed up in the future when
  modifying the algorithm to also work for scalable vectors, and will
  then need additional tests to confirm the behaviour works the same for
  scalable vectors.
* If the test used the '/*Scalable=*/true' flag of LLT::vector, then
  this is replaced by LLT::scalable_vector.

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D104451
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This also adds new interfaces for the fixed- and scalable case:
* LLT::fixed_vector
* LLT::scalable_vector

The strategy for migrating to the new interfaces was as follows:
* If the new LLT is a (modified) clone of another LLT, taking the
  same number of elements, then use LLT::vector(OtherTy.getElementCount())
  or if the number of elements is halved/doubled, it uses .divideCoefficientBy(2)
  or operator*. That is because there is no reason to specifically restrict
  the types to 'fixed_vector'.
* If the algorithm works on the number of elements (as unsigned), then
  just use fixed_vector. This will need to be fixed up in the future when
  modifying the algorithm to also work for scalable vectors, and will
  then need additional tests to confirm the behaviour works the same for
  scalable vectors.
* If the test used the '/*Scalable=*/true' flag of LLT::vector, then
  this is replaced by LLT::scalable_vector.

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D104451
</pre>
</div>
</content>
</entry>
<entry>
<title>[NFC][AMDGPU] Reduce include files dependency.</title>
<updated>2021-01-07T19:22:05+00:00</updated>
<author>
<name>dfukalov</name>
<email>daniil.fukalov@amd.com</email>
</author>
<published>2020-12-25T15:52:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=6a87e9b08bf093ba3ccba8650b89f4d337c497f4'/>
<id>6a87e9b08bf093ba3ccba8650b89f4d337c497f4</id>
<content type='text'>
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813
</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU/GlobalISel: Add types to special inputs</title>
<updated>2020-07-06T21:00:55+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2020-07-05T17:17:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=f25d020c2ec7cb1971fa56b99381d416799d8145'/>
<id>f25d020c2ec7cb1971fa56b99381d416799d8145</id>
<content type='text'>
When passing special ABI inputs, we have no existing context for the
type to use.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When passing special ABI inputs, we have no existing context for the
type to use.
</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Fix fixed ABI SGPR arguments</title>
<updated>2020-07-06T13:01:18+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2020-07-05T17:55:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=7b76a5c8a2a66684bffb19b37e851ebd39519541'/>
<id>7b76a5c8a2a66684bffb19b37e851ebd39519541</id>
<content type='text'>
The default constructor wasn't setting isSet on the ArgDescriptor, so
while these had the value set, they were treated as missing. This only
ended up mattering in the indirect call case (and for regular calls in
GlobalISel, which currently doesn't have a way to support the variable
ABI).
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The default constructor wasn't setting isSet on the ArgDescriptor, so
while these had the value set, they were treated as missing. This only
ended up mattering in the indirect call case (and for regular calls in
GlobalISel, which currently doesn't have a way to support the variable
ABI).
</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Add flag to use fixed function ABI</title>
<updated>2020-03-13T20:27:05+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2020-03-11T20:13:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=015b640be4c6ce35969be5c38d628feeadb48634'/>
<id>015b640be4c6ce35969be5c38d628feeadb48634</id>
<content type='text'>
Pass all arguments to every function, rather than only passing the
minimum set of inputs needed for the call graph.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pass all arguments to every function, rather than only passing the
minimum set of inputs needed for the call graph.
</pre>
</div>
</content>
</entry>
</feed>
