<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/lib/Target/AMDGPU/AMDGPU.h, branch users/pcc/spr/elf-add-preferred-function-alignment-flag</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>AMDGPU: Add pass to replace constant materialize with AV pseudos (#149292)</title>
<updated>2025-07-18T08:15:38+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2025-07-18T08:15:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=8f3e78f9715cb7085d03686c7bd72e20ce248b04'/>
<id>8f3e78f9715cb7085d03686c7bd72e20ce248b04</id>
<content type='text'>
If we have a v_mov_b32 or v_accvgpr_write_b32 with an inline immediate,
replace it with a pseudo which writes to the combined AV_* class. This
relaxes the operand constraints, which will allow the allocator to
inflate the register class to AV_* to potentially avoid spilling.

The allocator does not know how to replace an instruction to enable
the change of register class. I originally tried to do this by changing
all of the places we introduce v_mov_b32 with immediate, but it's along
tail of niche cases that require manual updating. Plus we can restrict
this to only run on functions where we know we will be allocating AGPRs.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we have a v_mov_b32 or v_accvgpr_write_b32 with an inline immediate,
replace it with a pseudo which writes to the combined AV_* class. This
relaxes the operand constraints, which will allow the allocator to
inflate the register class to AV_* to potentially avoid spilling.

The allocator does not know how to replace an instruction to enable
the change of register class. I originally tried to do this by changing
all of the places we introduce v_mov_b32 with immediate, but it's along
tail of niche cases that require manual updating. Plus we can restrict
this to only run on functions where we know we will be allocating AGPRs.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU][NewPM] Port "AMDGPUResourceUsageAnalysis" to NPM (#130959)</title>
<updated>2025-07-10T08:05:43+00:00</updated>
<author>
<name>Vikram Hegde</name>
<email>115221833+vikramRH@users.noreply.github.com</email>
</author>
<published>2025-07-10T08:05:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=1ccd7793247139e55aec986e6d86c50d97f9a755'/>
<id>1ccd7793247139e55aec986e6d86c50d97f9a755</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Introduce a pass to replace VGPR MFMAs with AGPR (#145024)</title>
<updated>2025-06-27T12:05:03+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2025-06-27T12:05:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=c8ea114741cecf9c812b5e90eaa28919328df650'/>
<id>c8ea114741cecf9c812b5e90eaa28919328df650</id>
<content type='text'>
In gfx90a-gfx950, it's possible to emit MFMAs which use AGPRs or VGPRs
for vdst and src2. We do not want to do use the AGPR form, unless
required by register pressure as it requires cross bank register
copies from most other instructions. Currently we select the AGPR
or VGPR version depending on a crude heuristic for whether it's possible
AGPRs will be required. We really need the register allocation to
be complete to make a good decision, which is what this pass is for.
    
This adds the pass, but does not yet remove the selection patterns
for AGPRs. This is a WIP, and NFC-ish. It should be a no-op on any
currently selected code. It also does not yet trigger on the real
examples of interest, which require handling batches of MFMAs at
once.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In gfx90a-gfx950, it's possible to emit MFMAs which use AGPRs or VGPRs
for vdst and src2. We do not want to do use the AGPR form, unless
required by register pressure as it requires cross bank register
copies from most other instructions. Currently we select the AGPR
or VGPR version depending on a crude heuristic for whether it's possible
AGPRs will be required. We really need the register allocation to
be complete to make a good decision, which is what this pass is for.
    
This adds the pass, but does not yet remove the selection patterns
for AGPRs. This is a WIP, and NFC-ish. It should be a no-op on any
currently selected code. It also does not yet trigger on the real
examples of interest, which require handling batches of MFMAs at
once.</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Remove legacy pass manager version of AMDGPUAttributor (#145262)</title>
<updated>2025-06-23T07:49:46+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2025-06-23T07:49:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=2dcf436340849d757432dc76ac46bf537e10fd8c'/>
<id>2dcf436340849d757432dc76ac46bf537e10fd8c</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Remove legacy pass manager version of AMDGPUUnifyMetadata (#144985)</title>
<updated>2025-06-20T07:44:06+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2025-06-20T07:44:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=c361bffa50f1ed790c393ffbab39c2e07dfcb242'/>
<id>c361bffa50f1ed790c393ffbab39c2e07dfcb242</id>
<content type='text'>
This is only run in the new pass manager now.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is only run in the new pass manager now.</pre>
</div>
</content>
</entry>
<entry>
<title>AMDGPU: Remove legacy PM version of AMDGPUPromoteAllocaToVector (#144986)</title>
<updated>2025-06-20T07:43:39+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2025-06-20T07:43:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=1cae21da47b1f53c3946534b12507a035fb283d2'/>
<id>1cae21da47b1f53c3946534b12507a035fb283d2</id>
<content type='text'>
This is only run in the middle end with the new pass manager now,
so garbage collect the old PM version.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is only run in the middle end with the new pass manager now,
so garbage collect the old PM version.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Move kernarg preload logic to separate pass (#130434)</title>
<updated>2025-05-12T04:18:11+00:00</updated>
<author>
<name>Austin Kerbow</name>
<email>Austin.Kerbow@amd.com</email>
</author>
<published>2025-05-12T04:18:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=2c9a46cce3ba32f36fcaa127d57006db00726a8a'/>
<id>2c9a46cce3ba32f36fcaa127d57006db00726a8a</id>
<content type='text'>
Moves kernarg preload logic to its own module pass. Cloned function
declarations are removed when preloading hidden arguments. The inreg
attribute is now added in this pass instead of AMDGPUAttributor. The
rest of the logic is copied from AMDGPULowerKernelArguments which now
only check whether an arguments is marked inreg to avoid replacing
direct uses of preloaded arguments. This change requires test updates to
remove inreg from lit tests with kernels that don't actually want
preloading.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Moves kernarg preload logic to its own module pass. Cloned function
declarations are removed when preloading hidden arguments. The inreg
attribute is now added in this pass instead of AMDGPUAttributor. The
rest of the logic is copied from AMDGPULowerKernelArguments which now
only check whether an arguments is marked inreg to avoid replacing
direct uses of preloaded arguments. This change requires test updates to
remove inreg from lit tests with kernels that don't actually want
preloading.</pre>
</div>
</content>
</entry>
<entry>
<title>Register assembly printer passes (#138348)</title>
<updated>2025-05-07T01:01:17+00:00</updated>
<author>
<name>Matthias Braun</name>
<email>matze@braunis.de</email>
</author>
<published>2025-05-07T01:01:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=675cb706411ced3172bd21def5b38f5ee7cca308'/>
<id>675cb706411ced3172bd21def5b38f5ee7cca308</id>
<content type='text'>
Register assembly printer passes in the pass registry.

This makes it possible to use `llc -start-before=&lt;target&gt;-asm-printer ...` in tests.

Adds a `char &amp;ID` parameter to the AssemblyPrinter constructor to allow
targets to use the `INITIALIZE_PASS` macros and register the pass in the
pass registry. This currently has a default parameter so it won't break
any targets that have not been updated.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Register assembly printer passes in the pass registry.

This makes it possible to use `llc -start-before=&lt;target&gt;-asm-printer ...` in tests.

Adds a `char &amp;ID` parameter to the AssemblyPrinter constructor to allow
targets to use the `INITIALIZE_PASS` macros and register the pass in the
pass registry. This currently has a default parameter so it won't break
any targets that have not been updated.</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU][Attributor] Add `ThinOrFullLTOPhase` as an argument (#123994)</title>
<updated>2025-05-02T15:33:56+00:00</updated>
<author>
<name>Shilei Tian</name>
<email>i@tianshilei.me</email>
</author>
<published>2025-05-02T15:33:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=d6dbe7799e638702309e23fc7b73d4be2231e2ac'/>
<id>d6dbe7799e638702309e23fc7b73d4be2231e2ac</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[AMDGPU] Avoid referring to specific number of address spaces. NFC. (#137842)</title>
<updated>2025-04-29T21:33:53+00:00</updated>
<author>
<name>Jay Foad</name>
<email>jay.foad@amd.com</email>
</author>
<published>2025-04-29T21:33:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=6a16da75d1d52de2211993e25a6485281defe097'/>
<id>6a16da75d1d52de2211993e25a6485281defe097</id>
<content type='text'>
This just avoids some hard coded numbers that would have to be updated
every time we add an address space.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This just avoids some hard coded numbers that would have to be updated
every time we add an address space.</pre>
</div>
</content>
</entry>
</feed>
