diff options
| author | Jon Chesterfield <jon@spectralcompute.co.uk> | 2025-10-02 21:15:48 +0100 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2025-10-02 16:15:48 -0400 |
| commit | 0ebd4334021e7579bfba7a92b692e0e4ece56cb9 (patch) | |
| tree | 7c5f85510715a7298321dd14f52472e84e8dff4e /llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp | |
| parent | a2b6602aa46827cd0a3a379980dfbe1688f4049e (diff) | |
[AMDGPU] Be less optimistic when allocating module scope lds (#161464)
Make the test for when additional variables can be added to the struct
allocated at address zero more stringent. Previously, variables can be
added to it (for faster access) even when that increases the lds
requested by a kernel. This corrects that oversight.
Test case diff shows the change from all variables being allocated into
the module lds to only some being, in particular the introduction of
uses of the offset table and that some kernels now use less lds than
before.
Alternative to PR 160181
Diffstat (limited to 'llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp')
| -rw-r--r-- | llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp | 5 |
1 files changed, 4 insertions, 1 deletions
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp b/llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp index f01d5f672682..6efa78ef902c 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp @@ -608,6 +608,8 @@ public: ? LDSToKernelsThatNeedToAccessItIndirectly[HybridModuleRoot] : EmptySet; + const size_t HybridModuleRootKernelsSize = HybridModuleRootKernels.size(); + for (auto &K : LDSToKernelsThatNeedToAccessItIndirectly) { // Each iteration of this loop assigns exactly one global variable to // exactly one of the implementation strategies. @@ -647,7 +649,8 @@ public: ModuleScopeVariables.insert(GV); } else if (K.second.size() == 1) { KernelAccessVariables.insert(GV); - } else if (set_is_subset(K.second, HybridModuleRootKernels)) { + } else if (K.second.size() == HybridModuleRootKernelsSize && + set_is_subset(K.second, HybridModuleRootKernels)) { ModuleScopeVariables.insert(GV); } else { TableLookupVariables.insert(GV); |
