llvm-project.git/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp, branch main

[AMDGPU] Make use of getFunction and getMF. NFC. (#167872)

2025-11-14T11:00:57+00:00

[AMDGPUPromoteAlloca][NFC] Avoid unnecessary APInt/int64_t conversions (#157864)

2025-09-12T07:51:55+00:00

Follow-up to #157682

[AMDGPU] Generate canonical additions in AMDGPUPromoteAlloca (#157810)

2025-09-10T12:46:46+00:00

When we know that one operand of an addition is a constant, we might as
well put it on the right-hand side and avoid the work to canonicalize it
in a later pass.

[AMDGPU] Treat GEP offsets as signed in AMDGPUPromoteAlloca (#157682)

2025-09-10T09:32:14+00:00

AMDGPUPromoteAlloca can transform i32 GEP offsets that operate on
allocas into i64 extractelement indices. Before this patch, negative GEP
offsets would be zero-extended, leading to wrong extractelement indices
with values around (2**32-1).

This fixes failing LlvmLibcCharacterConverterUTF32To8Test tests for
AMDGPU.
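
The difference between the two widenings can be illustrated with a small
standalone C++ sketch. The helper names are hypothetical and the code does
not use LLVM's APIs; it only mirrors the arithmetic described above for an
i32 offset of -1.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not the code used by the pass.

// Widen a 32-bit offset to 64 bits by zero-extension (the old behaviour
// described above): the bit pattern is reinterpreted as unsigned first.
constexpr int64_t widenZeroExtended(int32_t offset) {
  return static_cast<int64_t>(static_cast<uint32_t>(offset));
}

// Widen a 32-bit offset to 64 bits by sign-extension, preserving its value.
constexpr int64_t widenSignExtended(int32_t offset) {
  return static_cast<int64_t>(offset);
}

// An offset of -1 becomes 2**32-1 when zero-extended ...
static_assert(widenZeroExtended(-1) == 4294967295LL, "zext(-1) is 2**32-1");
// ... but stays -1 when sign-extended.
static_assert(widenSignExtended(-1) == -1, "sext(-1) is -1");
// Non-negative offsets are unaffected by the choice.
static_assert(widenZeroExtended(4) == widenSignExtended(4), "same for 4");
```

Only negative offsets are affected, which is consistent with the bug
showing up as indices near 2**32-1.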

[AMDGPU] AMDGPUPromoteAlloca: increase default max-regs to 32 (#155076)

2025-08-26T00:30:16+00:00

Increase promote-alloca-to-vector-max-regs to 32 from 16.
This restores default promotion of 16 x double which was disabled by
#127973.

Fixes SWDEV-525817.
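
As a rough sanity check of the numbers, the following C++ sketch computes
how many registers a vector would occupy. It assumes the limit is counted in
32-bit registers, which is an assumption consistent with the figures in this
message rather than something stated here; `registersNeeded` is a
hypothetical helper, not part of the pass.

```cpp
// Hypothetical helper: number of 32-bit registers needed to hold a vector
// of numElements elements, each elementBits wide (rounded up).
constexpr unsigned registersNeeded(unsigned numElements, unsigned elementBits) {
  return (numElements * elementBits + 31) / 32;
}

// 16 x double is 16 * 64 = 1024 bits, i.e. 32 registers of 32 bits each.
static_assert(registersNeeded(16, 64) == 32, "16 x double needs 32 registers");
// That exceeds a limit of 16 but fits within a limit of 32.
static_assert(registersNeeded(16, 64) > 16, "does not fit in 16 registers");
static_assert(registersNeeded(16, 64) <= 32, "fits in 32 registers");
```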

[AMDGPU] Replace dynamic VGPR feature with attribute (#133444)

2025-06-24T09:09:36+00:00

Use a function attribute (amdgpu-dynamic-vgpr) instead of a subtarget
feature, as requested in #130030.

AMDGPU: Remove legacy PM version of AMDGPUPromoteAllocaToVector (#144986)

2025-06-20T07:43:39+00:00

This is only run in the middle end with the new pass manager now,
so garbage collect the old PM version.

Revert "[AMDGPU] Extended vector promotion to aggregate types." (#144366)

2025-06-16T15:06:18+00:00

Reverts llvm/llvm-project#143784

Patch fails some internal tests. Will investigate more thoroughly before
attempting to remerge.

[AMDGPU] Extended vector promotion to aggregate types. (#143784)

2025-06-13T18:22:21+00:00

Extends the `amdgpu-promote-alloca-to-vector` pass to also promote
aggregate types whose elements are all the same type to vector
registers.

The motivation for this extension was to account for IR generated by the
frontend containing several singleton struct types containing vectors or
vector-like elements, though the implementation is strictly more
general.

[AMDGPU] Promote nestedGEP allocas to vectors (#141199)

2025-06-02T08:20:14+00:00

Supports the `nestedGEP` pattern that appears when an alloca is first
indexed as an array element and then shifted with a byte-offset GEP:

```llvm
  %SortedFragments = alloca [10 x <2 x i32>], addrspace(5), align 8
  %row  = getelementptr [10 x <2 x i32>], ptr addrspace(5) %SortedFragments, i32 0, i32 %j
  %elt1 = getelementptr i8, ptr addrspace(5) %row, i32 4
  %val  = load i32, ptr addrspace(5) %elt1
```

The pass folds the two levels of addressing into a single vector lane
index and keeps the whole object in VGPRs:

```llvm
  %vec  = freeze <20 x i32> poison              ; alloca promoted to <20 x i32>
  %idx0 = mul i32 %j, 2                         ; j * 2
  %idx  = add i32 %idx0, 1                      ; j * 2 + 1
  %val  = extractelement <20 x i32> %vec, i32 %idx
```

This eliminates the scratch read.
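
The index arithmetic in the example can be sketched in standalone C++. This
is an illustration for the `[10 x <2 x i32>]` case shown above only; the
function name and constants are hypothetical and the pass itself operates on
LLVM IR, not on code like this.

```cpp
// Layout constants taken from the example: each array row is <2 x i32>,
// and each i32 element is 4 bytes wide.
constexpr unsigned kElementsPerRow = 2;
constexpr unsigned kElementBytes = 4;

// Hypothetical helper: fold the row index j and the trailing byte offset
// into a single lane index of the flattened <20 x i32> vector.
constexpr unsigned laneIndex(unsigned j, unsigned byteOffset) {
  return j * kElementsPerRow + byteOffset / kElementBytes;
}

// Byte offset 4 selects the second element of row j, i.e. lane j * 2 + 1.
static_assert(laneIndex(0, 4) == 1, "row 0, second element");
static_assert(laneIndex(3, 4) == 7, "row 3, second element");
// The last row's second element is the last lane of <20 x i32>.
static_assert(laneIndex(9, 4) == 19, "row 9, second element");
```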