llvm-project.git/llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp, branch users/mingmingl-llvm/samplefdo-profile-format

[AMDGPU] upstream barrier count reporting part1 (#154409)

2025-08-19T23:42:31+00:00

[AMDGPU] Remove unused includes (NFC) (#116154)

2024-11-14T05:10:03+00:00

Identified with misc-include-cleaner.

Remove unused variable to fix '[AMDGPU] modify named barrier builtins and intrinsics (#114550)'

2024-11-06T20:49:39+00:00

https://github.com/llvm/llvm-project/pull/114550 caused a buildbot breakage (https://lab.llvm.org/buildbot/#/builders/66/builds/5853) because of an unused variable. This patch attempts to fix forward:

/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp:106:24: error: variable 'TTy' set but not used [-Werror,-Wunused-but-set-variable]
106 |     if (TargetExtType *TTy = AMDGPU::isNamedBarrier(GV)) {
    |                        ^
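
The usual forward fix for this kind of warning is to test the call result directly instead of binding it to a name that is never read. A minimal sketch follows, assuming hypothetical stand-in types (`FakeType`, `FakeGlobal`, `isNamedBarrierSketch`) in place of `TargetExtType`, the global variable, and `AMDGPU::isNamedBarrier`; it is not the actual patch.

```cpp
#include <cassert>

// Hypothetical stand-ins for TargetExtType and the global variable type.
struct FakeType {};
struct FakeGlobal {
  FakeType *BarrierTy = nullptr;
};

// Hypothetical analogue of AMDGPU::isNamedBarrier: returns the barrier type,
// or nullptr when the global is not a named barrier.
inline FakeType *isNamedBarrierSketch(const FakeGlobal &G) {
  return G.BarrierTy;
}

// Before: `if (FakeType *TTy = isNamedBarrierSketch(*G)) { ... }` sets TTy
// without ever reading it, which -Wunused-but-set-variable rejects under
// -Werror. After: the result is tested without being named.
inline bool needsBarrierHandling(const FakeGlobal *G) {
  assert(G && "expected a non-null global");
  if (isNamedBarrierSketch(*G))
    return true;
  return false;
}
```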

[AMDGPU] modify named barrier builtins and intrinsics (#114550)

2024-11-06T18:37:22+00:00

Use a local pointer type to represent the named barrier in builtins and
intrinsics. This makes the definitions more user-friendly because users
do not need to worry about the hardware ID assignment. This approach is
also closer to other popular GPU programming languages.
Named barriers should be represented as global variables of addrspace(3)
in LLVM IR. The compiler assigns special LDS offsets to those variables
during the AMDGPULowerModuleLDS pass. Those addresses are converted to
hardware barrier IDs during instruction selection. The rest of the
instruction-selection changes are primarily due to the
intrinsic-definition changes.
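
The two-stage scheme described above (assign a special offset in the lowering pass, recover the hardware ID at instruction selection) can be pictured as an encode/decode pair. The sketch below is only an illustration: the base, stride, and barrier count are made-up constants, and the function names are hypothetical; none of them are taken from the LLVM sources.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative constants only; the real encoding is not given in this log.
constexpr uint32_t kBarrierBase = 0x1000; // hypothetical start of the special range
constexpr uint32_t kBarrierStride = 16;   // hypothetical bytes per barrier slot
constexpr uint32_t kMaxBarriers = 16;     // hypothetical number of slots

// Lowering-pass side: choose the special LDS offset for the Index-th barrier.
inline uint32_t assignBarrierOffset(uint32_t Index) {
  assert(Index < kMaxBarriers && "too many named barriers");
  return kBarrierBase + Index * kBarrierStride;
}

// Instruction-selection side: turn the address back into a barrier ID.
inline uint32_t barrierIdFromOffset(uint32_t Offset) {
  assert(Offset >= kBarrierBase && "not a barrier address");
  assert((Offset - kBarrierBase) % kBarrierStride == 0 && "misaligned slot");
  return (Offset - kBarrierBase) / kBarrierStride;
}
```

The point of the design is that user code only ever sees a pointer; the numeric ID exists solely inside the compiler.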

[AMDGPU] Qualify auto. NFC. (#110878)

2024-10-03T12:07:54+00:00

Generated automatically with:
$ clang-tidy -fix -checks=-*,llvm-qualified-auto $(find lib/Target/AMDGPU/ -type f)
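
For readers unfamiliar with the check, llvm-qualified-auto rewrites `auto` to `auto *` (or `const auto *`) when the deduced type is a pointer, so the pointer is visible at the declaration. A small self-contained illustration, with a hypothetical function name that is not from the AMDGPU sources:

```cpp
#include <cassert>
#include <vector>

// Sums the integers that the pointers refer to.
inline int sumThroughPointers(const std::vector<int *> &Ptrs) {
  int Sum = 0;
  // Before the check: `for (auto P : Ptrs)` hides that P is a pointer.
  // After the check: `auto *` spells the pointer out.
  for (auto *P : Ptrs) {
    assert(P && "null entry in pointer list");
    Sum += *P;
  }
  return Sum;
}
```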

[AMDGPU] Use member initializers. NFC.

2024-07-16T14:29:10+00:00

[AMDGPU] Add dynamic LDS size implicit kernel argument to CO-v5 (#65273)

2024-01-04T13:35:12+00:00

The "hidden_dynamic_lds_size" argument is added in the reserved section
at offset 120 of the implicit argument layout.
Add an "isDynamicLDSUsed" flag to AMDGPUMachineFunction to identify
whether a function uses dynamic LDS.

The hidden argument is added in the following cases:

- An LDS global is used in the kernel.
- The kernel calls a function which uses an LDS global.
- An LDS pointer is passed as an argument to the kernel itself.
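
The three cases above amount to a simple disjunction. The following sketch assumes a hypothetical `KernelSummary` struct and helper names; only the offset 120 comes from the commit message, and the real logic lives in the backend rather than in a struct like this.

```cpp
#include <cassert>

// Hypothetical per-kernel facts, one field per case listed above.
struct KernelSummary {
  bool UsesLDSGlobal;       // an LDS global is used in the kernel
  bool CalleeUsesLDSGlobal; // the kernel calls a function that uses one
  bool HasLDSPointerArg;    // an LDS pointer is passed as a kernel argument
};

// Offset of the argument in the implicit argument layout (from the log).
constexpr unsigned kHiddenDynamicLDSSizeOffset = 120;

// True when any of the three cases applies.
inline bool needsHiddenDynamicLDSSize(const KernelSummary &K) {
  return K.UsesLDSGlobal || K.CalleeUsesLDSGlobal || K.HasLDSPointerArg;
}

// Returns the argument offset; only meaningful when the argument is emitted.
inline unsigned hiddenDynamicLDSSizeOffset(const KernelSummary &K) {
  assert(needsHiddenDynamicLDSSize(K) && "argument is not emitted");
  return kHiddenDynamicLDSSizeOffset;
}
```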

[AMDGPU] Add IsChainFunction to the MachineFunctionInfo

2023-08-21T10:37:32+00:00

This will represent functions with the amdgpu_cs_chain or
amdgpu_cs_chain_preserve calling conventions.
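
As a rough picture of what such a flag records, the sketch below derives it from the calling convention at construction time. The enum and struct are hypothetical stand-ins, not the LLVM `CallingConv` values or the real MachineFunctionInfo class.

```cpp
#include <cassert>

// Hypothetical subset of calling conventions, named after the ones in the log.
enum class CallConvSketch { C, AMDGPU_Kernel, AMDGPU_CS_Chain, AMDGPU_CS_ChainPreserve };

// True for the two chain calling conventions.
constexpr bool isChainCCSketch(CallConvSketch CC) {
  return CC == CallConvSketch::AMDGPU_CS_Chain ||
         CC == CallConvSketch::AMDGPU_CS_ChainPreserve;
}

// Hypothetical function-info object caching the answer once per function.
struct MachineFunctionInfoSketch {
  bool IsChainFunction;
  explicit MachineFunctionInfoSketch(CallConvSketch CC)
      : IsChainFunction(isChainCCSketch(CC)) {}
};

// Usage example: a kernel is not a chain function, a chain function is.
inline void chainFlagUsageExample() {
  assert(!MachineFunctionInfoSketch(CallConvSketch::AMDGPU_Kernel).IsChainFunction);
  assert(MachineFunctionInfoSketch(CallConvSketch::AMDGPU_CS_Chain).IsChainFunction);
}
```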

Differential Revision: https://reviews.llvm.org/D156410

[amdgpu] Accept an optional max to amdgpu-lds-size attribute for use in PromoteAlloca

2023-07-15T20:37:21+00:00

[amdgpu][lds] Remove recalculation of LDS frame from backend

2023-07-13T22:54:38+00:00

Do the LDS frame calculation once, in the IR pass, instead of repeating the work in the backend.

Prior to this patch:
The IR lowering pass sets up a per-kernel LDS frame and annotates the variables with absolute_symbol
metadata so that the assembler can build lookup tables out of it. There is a fragile association between
kernel functions and named structs which is used to recompute the frame layout in the backend, with
fatal_errors catching inconsistencies in the second calculation.

After this patch:
The IR lowering pass additionally sets a frame size attribute on kernels. The backend uses the same
absolute_symbol metadata that the assembler uses to place objects within that frame size.
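
The "after" state can be sketched as follows: the backend no longer lays anything out, it only reads the recorded address of each variable and checks it against the recorded frame size. The types and function names are hypothetical, and the real code reads these values from IR attributes and metadata rather than from plain structs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical view of one LDS variable as annotated by the IR pass.
struct LDSVarSketch {
  uint32_t AbsoluteAddress; // taken from the absolute_symbol metadata
  uint32_t Size;            // allocation size in bytes
};

// True when every variable lies inside [0, FrameSize).
inline bool fitsInFrame(uint32_t FrameSize, const std::vector<LDSVarSketch> &Vars) {
  for (const LDSVarSketch &V : Vars)
    if (uint64_t(V.AbsoluteAddress) + V.Size > FrameSize)
      return false;
  return true;
}

// Backend side: reuse the recorded address instead of recomputing a layout.
inline uint32_t placeVariable(uint32_t FrameSize, const LDSVarSketch &V) {
  assert(uint64_t(V.AbsoluteAddress) + V.Size <= FrameSize &&
         "variable lies outside the LDS frame");
  return V.AbsoluteAddress;
}
```

Because there is only one layout calculation, there is nothing for a second calculation to disagree with.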

Deleted the now dead allocation code from the backend. Left for a later cleanup:
- enabling lowering for anonymous functions
- removing the elide-module-lds attribute (test churn, it's not used by llc any more)
- adjusting the dynamic alignment check to not use symbol names

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D155190