llvm-project.git/llvm/test/CodeGen/AMDGPU/function-resource-usage.ll, branch main

Revert "[AMDGPU] Skip register uses in AMDGPUResourceUsageAnalysis (#… (#144039)

2025-06-13T10:48:24+00:00

…133242)"

This reverts commit 130080fab11cde5efcb338b77f5c3b31097df6e6 because it
causes issues in testcases similar to coalescer_remat.ll [1], i.e. when
we use a VGPR tuple but only write to its lower parts. The high VGPRs
would then not be included in the vgpr_count, and accessing them would
be an out of bounds violation.

[1]
https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/AMDGPU/coalescer_remat.ll
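
The failure mode can be illustrated with a toy model. This is not the
analysis pass itself: it assumes a hypothetical representation in which a
VGPR tuple is a run of consecutive register indices and a register count
is "highest index + 1". The register numbers are invented for the example.

```python
# Toy model (hypothetical, not LLVM code): a VGPR tuple is a range of
# consecutive register indices, and a count is "highest index + 1".

def count_from(indices):
    """Return the register count implied by a list of register indices."""
    return max(indices) + 1 if indices else 0

# A 4-wide tuple covering v4..v7 where only the two low registers are written.
tuple_regs = list(range(4, 8))
written = [4, 5]

defs_only_count = count_from(written)    # only sees v4 and v5
needed_count = count_from(tuple_regs)    # the tuple reaches up to v7

print(defs_only_count, needed_count)
```

With these made-up numbers the count derived from writes alone stops short
of the tuple's high registers, which is the gap the revert message
describes.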

[AMDGPU] Flatten recursive register resource info propagation (#142766)

2025-06-12T13:35:28+00:00

In #112251 I had mentioned I'd follow up with flattening of recursion
for register resource info propagation.

Prior to this patch, a recursive call makes the analysis take the
module-scope worst-case function register use (this was the behaviour
even before AMDGPUMCResourceInfo). With this patch, when a cycle is
detected, it attempts a simple cycle-avoiding DFS to find the worst-case
constant within the cycle and in whatever the cycle propagates. In other
words, it looks for the cycle-scope worst case rather than the
module-scope worst case.
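
As a rough illustration of the difference between the two scopes, the
sketch below walks a small call graph with a DFS that skips already
visited functions, so a cycle cannot make it loop. It is only a toy under
stated assumptions: the call graph, the function names, and the register
counts are all hypothetical, and the real patch works on the compiler's
own data structures rather than on dictionaries.

```python
# Hypothetical call graph and per-function VGPR counts (invented names).
CALLS = {
    "kernel": ["a"],
    "a": ["b"],
    "b": ["a", "leaf"],   # a and b call each other, forming a cycle
    "leaf": [],
    "unrelated": [],      # never reached from the cycle
}
LOCAL_VGPRS = {"kernel": 8, "a": 16, "b": 24, "leaf": 12, "unrelated": 200}

def module_worst_case():
    """Worst case over every function in the module."""
    return max(LOCAL_VGPRS.values())

def reachable_worst_case(root):
    """Worst case over functions reachable from root, via an iterative DFS.

    The visited set keeps the walk from revisiting functions in a cycle.
    """
    visited = set()
    stack = [root]
    worst = 0
    while stack:
        fn = stack.pop()
        if fn in visited:
            continue
        visited.add(fn)
        worst = max(worst, LOCAL_VGPRS[fn])
        stack.extend(CALLS[fn])
    return worst

print(module_worst_case(), reachable_worst_case("a"))
```

In this toy module the unrelated function dominates the module-scope
value, while the walk that starts inside the cycle only sees the cycle
members and what they call.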

[AMDGPU] Skip register uses in AMDGPUResourceUsageAnalysis (#133242)

2025-06-03T09:20:48+00:00

Don't count register uses when determining the maximum number of
registers used by a function. Count only the defs. This is really an
underestimate of the true register usage, but in practice that's not
a problem because if a function uses a register, then it has either
defined it earlier, or some other function that executed before has
defined it.

In particular, the register counts are used:
1. When launching an entry function - in which case we're safe because
   the register counts of the entry function will include the register
   counts of all callees.
2. At function boundaries in dynamic VGPR mode. In this case it's safe
   because whenever we set the new VGPR allocation we take into account
   the outgoing_vgpr_count set by the middle-end.

The main advantage of doing this is that the artificial VGPR arguments
used only for preserving the inactive lanes when using the
llvm.amdgcn.init.whole.wave intrinsic are no longer counted. This
enables us to allocate only the registers we need in dynamic VGPR mode.
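
A minimal sketch of the counting rule, assuming a hypothetical instruction
encoding of `(defs, uses)` pairs holding VGPR indices. It is not the code
of AMDGPUResourceUsageAnalysis; it only contrasts counting defs with
counting defs and uses on an invented function body.

```python
# Each instruction is a (defs, uses) pair of hypothetical VGPR indices.
def vgpr_count(instructions, include_uses):
    """Return highest referenced register index + 1 (0 if none)."""
    highest = -1
    for defs, uses in instructions:
        regs = list(defs) + (list(uses) if include_uses else [])
        for reg in regs:
            highest = max(highest, reg)
    return highest + 1

# v0..v2 are defined and then used. v9 stands in for an argument that the
# function only reads and never defines.
body = [
    ([0], []),
    ([1], [0]),
    ([2], [0, 1]),
    ([], [9]),
]

print(vgpr_count(body, include_uses=False), vgpr_count(body, include_uses=True))
```

Here the register that is only read raises the count when uses are
included and is ignored when only defs are counted.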

---------

Co-authored-by: Thomas Symalla <5754458+tsymalla@users.noreply.github.com>

[MC] Simplify MCBinaryExpr/MCUnaryExpr printing by reducing parentheses (#133674)

2025-03-31T05:03:14+00:00

The existing pretty printer generates excessive parentheses for
MCBinaryExpr expressions. This update removes unnecessary parentheses
of MCBinaryExpr with +/- operators and MCUnaryExpr.
Since relocatable expressions only use + and -, this change improves
readability in most cases.

Examples:

- (SymA - SymB) + C now prints as SymA - SymB + C.
  This updates the output of -fexperimental-relative-c++-abi-vtables for
  AArch64 and x86 to `.long _ZN1B3fooEv@PLT-_ZTV1B-8`
- expr + (MCTargetExpr) now prints as expr + MCTargetExpr, with this
  change primarily affecting AMDGPUMCExpr.
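
The parenthesization rule can be sketched with a tiny expression printer.
This is an illustration only, not the MCExpr printer: the `Sym`, `Bin`,
and `render` names are hypothetical, the spacing follows the prose
examples above, and keeping parentheses around a binary right-hand
operand is an assumption made here to preserve meaning.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Sym:
    name: str

@dataclass
class Bin:
    op: str        # "+" or "-"
    lhs: "Expr"
    rhs: "Expr"

Expr = Union[Sym, Bin]

def render(expr, wrap_if_binary=False):
    """Print an expression, omitting parentheses that + and - do not need."""
    if isinstance(expr, Sym):
        return expr.name
    # + and - associate to the left, so the left operand is printed bare.
    # A binary right operand is wrapped (an assumption of this sketch).
    text = f"{render(expr.lhs)} {expr.op} {render(expr.rhs, wrap_if_binary=True)}"
    return f"({text})" if wrap_if_binary else text

left_nested = Bin("+", Bin("-", Sym("SymA"), Sym("SymB")), Sym("C"))
right_nested = Bin("-", Sym("A"), Bin("+", Sym("B"), Sym("C")))

print(render(left_nested))
print(render(right_nested))
```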

[AMDGPU] Change SGPR layout to striped caller/callee saved (#127353)

2025-03-08T14:28:20+00:00

This PR updates the SGPR layout to a striped caller/callee-saved design,
similar to the VGPR layout.

To ensure that s30-s31 (return address), s32 (stack pointer), s33 (frame
pointer), and s34 (base pointer) remain callee-saved, the striped layout
starts from s40, with a stripe width of 8. The last stripe is 10 wide
instead of 8 to avoid ending with a 2-wide stripe.

Fixes #113782.
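
The stripe arithmetic can be sketched as follows. The start register (s40)
and the stripe width (8) come from the description above; the total of 106
SGPRs is an assumption chosen here because it reproduces a 10-wide last
stripe, and the `stripes` helper is hypothetical. Which stripes are
caller-saved and which are callee-saved is not modelled.

```python
def stripes(first=40, total=106, width=8):
    """Split registers [first, total) into stripes of the given width.

    A tail narrower than a full stripe is folded into the previous stripe.
    Returns a list of inclusive (start, end) register index pairs.
    """
    bounds = []
    start = first
    while start < total:
        end = min(start + width, total)
        if bounds and end - start < width:
            # Short tail: widen the previous stripe instead of adding one.
            prev_start, _ = bounds.pop()
            bounds.append((prev_start, end - 1))
        else:
            bounds.append((start, end - 1))
        start = end
    return bounds

layout = stripes()
for lo, hi in layout:
    print(f"s{lo}-s{hi} ({hi - lo + 1} wide)")
```

Under the assumed total, a plain split would leave two registers over, and
folding them into the preceding stripe gives the 10-wide final stripe.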

[AMDGPU] Newly added test modified for recent SGPR use change (#116427)

2024-11-15T19:51:58+00:00

Mistimed rebase for #112251: it added new tests that did not yet account
for the changes introduced in #112403.

Reapply [AMDGPU] Avoid resource propagation for recursion through multiple functions (#112251)

2024-11-15T18:40:05+00:00

I was wrong in the last patch. I viewed the `Visited` set purely as a
possible recursion deterrent, assuming that functions calling a callee
multiple times are handled elsewhere. This doesn't consider cases where a
function is called multiple times by different callers that are still
part of the same call graph. The new test shows the aforementioned case.
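
The distinction drawn above, between real recursion and a callee shared by
several callers, can be illustrated on a diamond-shaped call graph. This
is a generic sketch, not the patch's code: the graph, the function names,
and both helpers are hypothetical. The first helper treats any second
visit as suspicious; the second flags a function only when it is reached
again while still on the current call path.

```python
# Diamond: "top" reaches "shared" through two different callers.
CALLS = {
    "top": ["left", "right"],
    "left": ["shared"],
    "right": ["shared"],
    "shared": [],
}

def revisited_functions(root):
    """Flag every function reached a second time, whoever the caller is."""
    visited, flagged = set(), set()

    def walk(fn):
        if fn in visited:
            flagged.add(fn)
            return
        visited.add(fn)
        for callee in CALLS[fn]:
            walk(callee)

    walk(root)
    return flagged

def recursive_functions(root):
    """Flag only functions reached again while still on the current path."""
    on_path, done, flagged = [], set(), set()

    def walk(fn):
        if fn in on_path:
            flagged.add(fn)
            return
        if fn in done:
            return
        on_path.append(fn)
        for callee in CALLS[fn]:
            walk(callee)
        on_path.pop()
        done.add(fn)

    walk(root)
    return flagged

print(revisited_functions("top"), recursive_functions("top"))
```

On the diamond, the visited-set check flags the shared callee even though
nothing recurses, while the path-based check flags nothing.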

Reapplies #111004, fixes #115562.

Reapply "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"

2024-11-09T01:21:16+00:00

This reverts commit ca33649abe5fad93c57afef54e43ed9b3249cd86.

Revert "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)"

2024-11-08T21:36:35+00:00

This reverts commit e215a1e27d84adad2635a52393621eb4fa439dc9 as it broke both
HIP and OpenMP buildbots.

[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)

2024-11-08T18:05:35+00:00