llvm-project.git

Age	Commit message	Author
2025-11-14	[OpenMP][Flang] Emit default declare mappers implicitly for derived types ↵	Akash Banerjee
	(#140562) This patch adds support to emit default declare mappers for implicit mapping of derived types when not supplied by user. This especially helps tackle mapping of allocatables of derived types.
2025-11-10	[PGO][Offload] Fix missing names bug in GPU PGO (#166444)	Ethan Luis McDonough
	After #163011 was merged, the tests in [`offload/test/offloading/gpupgo`](https://github.com/llvm/llvm-project/compare/main...EthanLuisMcDonough:llvm-project:gpupgo-names-fix-pr?expand=1#diff-f769f6cebd25fa527bd1c1150cc64eb585c41cb8a8b325c2bc80c690e47506a1) broke because the offload plugins were no longer able to find `__llvm_prf_nm`. This pull request explicitly makes `__llvm_prf_nm` visible to the host on GPU targets and reverses the changes made in f7e9968a5ba99521e6e51161f789f0cc1745193f.
2025-11-06	[OpenMP] Fix tests relying on the heap size variable	Joseph Huber
	Summary: I made that an unimplemented error, but forgot that it was used for this environment variable.
2025-11-06	[Offload] Remove handling for device memory pool (#163629)	Joseph Huber
	Summary: This was a lot of code that was only used for upstream LLVM builds of AMDGPU offloading. We have a generic and fast `malloc` in `libc` now so just use that. Simplifies code, can be added back if we start providing alternate forms but I don't think there's a single use-case that would justify it yet.
2025-10-31	[MLIR][OpenMP] Fix and simplify bounds offset calculation for 1-D GEP ↵	agozillon
	offsets (#165486) Currently this is being calculated incorrectly and will result in incorrect index offsets in more complicated array slices. This PR tries to address it by refactoring and changing the calculation to be more correct.
2025-10-16	[Offload] XFAIL pgo tests until resolved (#163722)	Jan Patrick Lehr
	While people look into it, xfail the tests.
2025-10-15	[OpenMP] Disable a few more tests to get the bot green (#163614)	Joseph Huber

2025-10-15	[OpenMP] Add test to print interop identifiers (#161434)	Jan Patrick Lehr
	The test covers some of the identifier symbols in the interop runtime. This test, for now, is to guard against complete breakage, which was the result of the other `interop.c` test not being enabled on AMD and thus, not caught by our buildbots.
2025-10-09	[Flang][OpenMP] Defer descriptor mapping for assumed dummy argument types ↵	agozillon
	(#154349) This PR adds deferral of descriptor maps until they are necessary for assumed dummy argument types. The intent is to avoid a problem where a user can inadvertently map a temporary local descriptor to device without their knowledge and proceed to never unmap it. This temporary local descriptor remains lodged in OpenMP device memory and the next time another variable or descriptor residing in the same stack address is mapped we incur a runtime OpenMP map error as we try to remap the same address. This fix was discussed with the OpenMP committee and applies to OpenMP 5.2 and below; future versions of OpenMP can avoid this issue via the attach semantics added to the specification.
2025-10-02	[Flang][OpenMP] Implicitly map nested allocatable components in derived ↵	Akash Banerjee
	types (#160766) This PR adds support for nested derived types and their mappers to the MapInfoFinalization pass. - Generalize MapInfoFinalization to add child maps for arbitrarily nested allocatables when a derived object is mapped via declare mapper. - Traverse HLFIR designates rooted at the target block arg and build full coordinate_of chains; append members with correct membersIndex. This fixes #156461.
2025-09-29	[OpenMP] Mark problematic tests as XFAIL / UNSUPPORTED (#161267)	Joseph Huber
	Summary: Several of these tests have been failing for literal years. Ideally we make efforts to fix this, but keeping these broken has had serious consequences on our testing infrastructure where failures are the norm so almost all test failures are disregarded. I made a tracking issue for the ones that have been disabled. https://github.com/llvm/llvm-project/issues/161265
2025-09-29	[OpenMP][Flang] Fix no-loop test (#161162)	Dominik Adamski
	Fortran no-loop test is supported only for GPU.
2025-09-26	[Flang][OpenMP] Enable no-loop kernels (#155818)	Dominik Adamski
	Enable the generation of no-loop kernels for Fortran OpenMP code. target teams distribute parallel do pragmas can be promoted to no-loop kernels if the user adds the -fopenmp-assume-teams-oversubscription and -fopenmp-assume-threads-oversubscription flags. If the OpenMP kernel contains reduction or num_teams clauses, it is not promoted to no-loop mode. The global OpenMP device RTL oversubscription flags no longer force no-loop code generation for Fortran.
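The promotion rule described above can be summarised as a small predicate. This is only a sketch of the stated conditions, not the compiler's actual logic: the `kernel_info` struct and `can_promote_to_no_loop` are hypothetical names introduced for illustration.

```c
#include <stdbool.h>

/* Hypothetical summary of the properties named in the commit message. */
struct kernel_info {
    bool assume_teams_oversubscription;   /* -fopenmp-assume-teams-oversubscription given */
    bool assume_threads_oversubscription; /* -fopenmp-assume-threads-oversubscription given */
    bool has_reduction_clause;            /* kernel contains a reduction clause */
    bool has_num_teams_clause;            /* kernel contains a num_teams clause */
};

/* Returns true when, per the description, the kernel may become a no-loop kernel. */
bool can_promote_to_no_loop(const struct kernel_info *k) {
    /* Both oversubscription flags must be supplied by the user. */
    if (!k->assume_teams_oversubscription || !k->assume_threads_oversubscription)
        return false;
    /* A reduction or num_teams clause blocks the promotion. */
    return !k->has_reduction_clause && !k->has_num_teams_clause;
}
```

For example, a kernel built with both flags and no reduction or num_teams clause passes the check, while adding a reduction clause makes it fail.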
2025-09-25	Revert "[Flang][OpenMP] Implicitly map nested allocatable components in ↵	Akash Banerjee
	derived types" (#160759) Reverts llvm/llvm-project#160116
2025-09-24	[Flang][OpenMP] Implicitly map nested allocatable components in derived ↵	Akash Banerjee
	types (#160116) This PR adds support for nested derived types and their mappers to the MapInfoFinalization pass. - Generalize MapInfoFinalization to add child maps for arbitrarily nested allocatables when a derived object is mapped via declare mapper. - Traverse HLFIR designates rooted at the target block arg and build full coordinate_of chains; append members with correct membersIndex. This fixes #156461.
2025-09-19	[OpenMP][MLIR] Preserve to/from flags in mapper base entry for mappers (#159799)	Akash Banerjee
	With declare mapper, the parent base entry was emitted as `TARGET_PARAM` only. The mapper received a map-type without `to/from`, causing components to degrade to `alloc`-only (no copies), breaking allocatable payload mapping. This PR preserves the map-type bits from the parent. This fixes #156466.
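The effect on the map-type bits can be sketched as follows. The flag values are assumed to mirror the offload runtime's map-type enumeration and should be treated as illustrative; `base_entry_before` and `base_entry_after` are hypothetical helpers, and the sketch carries over only the `to/from` bits that the message calls out.

```c
#include <stdint.h>

/* Assumed flag values, for illustration only. */
enum {
    MAP_TO           = 0x001,
    MAP_FROM         = 0x002,
    MAP_TARGET_PARAM = 0x020
};

/* Old behaviour as described: the base entry is TARGET_PARAM only. */
uint64_t base_entry_before(uint64_t parent_bits) {
    (void)parent_bits; /* the parent's to/from bits were dropped */
    return MAP_TARGET_PARAM;
}

/* New behaviour as described: the parent's to/from bits are preserved. */
uint64_t base_entry_after(uint64_t parent_bits) {
    return MAP_TARGET_PARAM | (parent_bits & (MAP_TO | MAP_FROM));
}
```

With a `tofrom` parent, the first helper yields a value with no copy bits (the `alloc`-only degradation), whereas the second keeps both.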
2025-09-17	[NFC][flang][do concurent] Add saxpy offload tests for OpenMP mapping (#155993)	Kareem Ergawy
	Adds end-to-end tests for `do concurrent` offloading to the device. PR stack: - https://github.com/llvm/llvm-project/pull/155754 - https://github.com/llvm/llvm-project/pull/155987 - https://github.com/llvm/llvm-project/pull/155992 - https://github.com/llvm/llvm-project/pull/155993 ◀️ - https://github.com/llvm/llvm-project/pull/157638 - https://github.com/llvm/llvm-project/pull/156610 - https://github.com/llvm/llvm-project/pull/156837
2025-09-16	[OpenMP] Fix force-usm test after #157182 (#159095)	Jan Patrick Lehr
	The refactoring led to an additional data transfer. This changes the assumed transfers in the check-strings to work with that changed behavior.
2025-09-09	[Flang][OpenMP] Fix mapping of character type with LEN > 1 specified (#154172)	agozillon
	Currently, there are a number of issues with mapping characters with LENs specified (strings effectively). They're represented as a char type in FIR with a len parameter, and then later on they're expanded into an array of characters when we're translating to the LLVM dialect. However, we don't generate bounds for these at lowering. The fix in this PR is to generate bounds from the LEN parameter and attach them to the map on lowering from FIR to the LLVM dialect when we encounter this type.
2025-09-03	[Offload][OpenMP] Enable more tests on AMDGPU (#156626)	Jan Patrick Lehr
	(Re)enables a couple of tests that were disabled on AMDGPU for some reason. Pass for me locally.
2025-08-18	Fix test added in 1fd1d634630754cc9b9c4b5526961d5856f64ff9	Akash Banerjee

2025-08-15	[MLIR][OpenMP] Add a new AutomapToTargetData conversion pass in FIR (#153048)	Akash Banerjee
	Add a new AutomapToTargetData pass. This gathers the declare target enter variables which have the AUTOMAP modifier and adds omp.declare_target_enter/exit mapping directives for fir.alloca and fir.free operations on the AUTOMAP enabled variables. Automap Ref: OpenMP 6.0 section 7.9.7.
2025-08-12	Revert "[MLIR][OpenMP] Add a new AutomapToTargetData conversion pass in FIR ↵	Akash Banerjee
	(#153048)" This reverts commit 4e6d510eb3ec5b5e5ea234756ea1f0b283feee4a.
2025-08-12	[MLIR][OpenMP] Add a new AutomapToTargetData conversion pass in FIR (#153048)	Akash Banerjee
	Add a new AutomapToTargetData pass. This gathers the declare target enter variables which have the AUTOMAP modifier and adds omp.declare_target_enter/exit mapping directives for fir.alloca and fir.free operations on the AUTOMAP enabled variables. Automap Ref: OpenMP 6.0 section 7.9.7.
2025-08-12	[Clang][OpenMP] Non-contiguous strided update (#144635)	Amit Tiwari
	This patch handles the strided update in the `#pragma omp target update from(data[a:b:c])` directive where 'c' represents the strided access leading to non-contiguous update in the `data` array when the offloaded execution returns the control back to host from device using the `from` clause. Issue: Clang CodeGen where info is generated for the particular `MapType` (to, from, etc), it was failing to detect the strided access. Because of this, the `MapType` bits were incorrect when passed to runtime. This led to incorrect execution (contiguous) in the libomptarget runtime code. Added a minimal testcase that verifies the working of the patch.
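To make the element selection concrete, here is a host-only emulation of which elements a section `data[lower:length:stride]` touches during an update `from` the device. It assumes the OpenMP array-section convention of lower bound, length, stride; `strided_update_from` is a hypothetical helper and performs no real offloading.

```c
#include <stddef.h>

/*
 * Emulates, on the host, the effect of
 *   #pragma omp target update from(data[lower:length:stride])
 * by copying only the selected elements from a "device" buffer into the
 * same positions of the host buffer. Returns the number of elements copied.
 */
size_t strided_update_from(int *host, const int *device,
                           size_t lower, size_t length, size_t stride) {
    size_t copied = 0;
    for (size_t k = 0; k < length; ++k) {
        size_t i = lower + k * stride; /* index of the k-th selected element */
        host[i] = device[i];
        ++copied;
    }
    return copied;
}
```

With `lower = 0`, `length = 4`, `stride = 2` on an eight-element array, only indices 0, 2, 4 and 6 are refreshed; a contiguous transfer, as in the bug described above, would also overwrite the elements in between.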
2025-08-11	Revert "[MLIR][OpenMP] Add a new AutomapToTargetData conversion pass in FIR ↵	Akash Banerjee
	(#151989)" This reverts commit 5a5e8ba0c388d57aecb359ed67919cda429fc7b1.
2025-08-11	[MLIR][OpenMP] Add a new AutomapToTargetData conversion pass in FIR (#151989)	Akash Banerjee
	Add a new `AutomapToTargetData` pass. This gathers the declare target enter variables which have the `AUTOMAP` modifier and adds `omp.declare_target_enter/exit` mapping directives for `fir.allocmem` and `fir.freemem` operations on the `AUTOMAP` enabled variables. Automap Ref: OpenMP 6.0 section 7.9.7.
2025-07-25	[Offload] Fix typo in shared_lib_fp_mapping.c	Aiden Grossman
	Made a typo in 963259ef6be4871e5252ff3ac9df737af5d2b4cb because I cannot run tests and also did not review it. This should fix it...
2025-07-25	[Offload] Remove uses of %T from lit tests (#150721)	Aiden Grossman
	This patch removes all the instances of %T from offload/ (only one test contained this construction). %T has been deprecated for ~7 years and is not recommended as it does not use a unique directory per test. Switch to using %t to ensure we use a unique dir per test and so that we can eventually remove %T. I did not actually test this. A couple feeble attempts at building/running the offload tests just leave me with a ton of test failures. Given how small this is I'm reasonably sure it works though.
2025-07-25	[Flang][OpenMP] Appropriately emit present/load/store in all cases in ↵	agozillon
	MapInfoFinalization (#150311) Currently, we return early whenever we've already generated an allocation for intermediate descriptor variables (required in certain cases when we can't directly access the base address of a passed-in descriptor function argument due to HLFIR/FIR restrictions). This, unfortunately, skips over the presence check and load/store required to set the intermediate descriptor allocation's values/data. This is fine in most cases, but if a function happens to have a series of branches with separate target regions capturing the same input argument, we'd emit the present/load/store into the first branch with the first target inside of it, the secondary (or any preceding) branches would not have the present/load/store, this would lead to the subsequent mapped values in that branch being empty and then leading to a memory access violation on device. The fix for the moment is to emit a present/load/store at the relevant location of every target utilising the input argument, this likely will also lead to fixing possible issues with the input argument being manipulated in between target regions (primarily resizing, the data should remain the same as we're just copying an address around, in theory at least). There's possible optimizations/simplifications to emit fewer load/stores such as by raising the load/store out of the branches when we can, but I'm inclined to leave this sort of optimization to lower level passes such as an LLVM pass (which very possibly already covers it).
2025-07-09	[libomptarget] fix test offloading/disable_default_device.c	Ye Luo
	Fixes the incorrect lit command line introduced in 536ba87726d8dea862d964678dbb761ca32e21fb
2025-06-30	[libomptarget] Add a test for OMP_TARGET_OFFLOAD=disabled (#146385)	Ye Luo
	closes https://github.com/llvm/llvm-project/issues/144786
2025-06-12	[Offload][PGO] Fix new GPU PGO tests (#143645)	Ethan Luis McDonough
	`pgo_atomic_teams.c` and `pgo_atomic_threads.c` currently are set to run on NVPTX despite the changes for that target not being upstreamed yet. This patch also replaces instances of `llvm-profdata` with `%profdata` in those tests.
2025-05-20	[OpenMP][GPU][FIX] Enable generic barriers in single threaded contexts (#140786)	Johannes Doerfert
	The generic GPU barrier implementation checked if it was the main thread in generic mode to identify single threaded regions. This doesn't work since inside of a non-active (=sequential) parallel, that thread becomes the main thread of a team, and is not the main thread in generic mode. At least that is the implementation of the APIs today. To identify single threaded regions we now check the team size explicitly. This exposed three other issues; one is, for now, expected and not a bug, the second one is a bug and has a FIXME in the single_threaded_for_barrier_hang_1.c file, and the final one is also benign as described in the end. The non-bug issue comes up if we ever initialize a thread state. Afterwards we will never run any region in parallel. This is a little conservative, but I guess thread states are really bad for performance anyway. The bug comes up if we optimize single_threaded_for_barrier_hang_1 and execute it in Generic-SPMD mode. For some reason we lose all the updates to b. This looks very much like a compiler bug, but could also be another logic issue in the runtime. Needs to be investigated. Issue number 3 comes up if we have nested parallels inside of a target region. The clang SPMD-check logic gets confused, determines SPMD (which is fine) but picks an unreasonable thread count. This is all benign, I think, just weird: ``` #pragma omp target teams #pragma omp parallel num_threads(64) #pragma omp parallel num_threads(10) {} ``` Was launched with 10 threads, not 64.
2025-05-14	[PGO][Offload] Update PGO GPU tests (#132262)	Ethan Luis McDonough

2025-05-12	[Flang][OpenMP] Initial defaultmap implementation (#135226)	agozillon
	This aims to implement most of the initial arguments for defaultmap aside from firstprivate and none, and some of the more recent OpenMP 6 additions which will come in subsequent updates (with the OpenMP 6 variants needing parsing/semantic support first).
2025-05-09	[Flang][OpenMP] Generate correct present checks for implicit maps of ↵	agozillon
	optional allocatables (#138210) Currently, we do not generate the appropriate checks to check if an optional allocatable argument is present before accessing relevant components of it, in particular when creating bounds, we must generate a presence check and we must make sure we do not generate/keep a load external to the presence check by utilising the raw address rather than the regular address of the info data structure. Similarly in cases for optional allocatables we must treat them like non-allocatable arguments and generate an intermediate allocation that we can have as a location in memory that we can access later in the lowering without causing segfaults when we perform "mapping" on it, even if the end result is an empty allocatable (basically, we shouldn't explode if someone tries to map a non-present optional, similar to C++ when mapping null data).
2025-04-23	[Offload] Fix handling of 'bare' mode when environment missing (#136794)	Joseph Huber
	Summary: We treated the missing kernel environment as a unique mode, but it was kind of this random bool that was doing the same thing and it explicitly expects the kernel environment to be zero. It broke after the previous change since it used to default to SPMD and didn't handle zero in any of the other cases despite being used. This fixes that and queries for it without needing to consume an error.
2025-04-14	[Flang][OpenMP][MLIR] Check for presence of Box type before emitting store ↵	agozillon
	in MapInfoFinalization pass (#135477) Currently we don't check for the presence of descriptor/BoxTypes before emitting stores which lower to memcpys, the issue with this is that users can have optional arguments, where they don't provide an input, making the argument effectively null. This can still be mapped and this causes issues at the moment as we'll emit a memcpy for function arguments to store to a local variable for certain edge cases, when we perform this memcpy on a null input, we cause a segfault at runtime. The fix to this is to simply create a branch around the store that checks if the data we're copying from is actually present. If it is, we proceed with the store, if it isn't we skip it.
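The guarded store can be pictured with a C analogy. This is not the code the pass emits, only a sketch of the control flow it describes; `struct descriptor` is a hypothetical stand-in for a Fortran descriptor and `store_if_present` is an invented name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for a Fortran descriptor. */
struct descriptor {
    void  *base_addr;
    size_t elem_len;
};

/*
 * Copies the incoming descriptor into a local one only when the optional
 * argument is present (non-null). Returns 1 if the copy happened, 0 if the
 * store was skipped.
 */
int store_if_present(struct descriptor *local, const struct descriptor *arg) {
    if (arg == NULL)
        return 0;                      /* absent optional: skip the memcpy */
    memcpy(local, arg, sizeof *local); /* present: safe to copy */
    return 1;
}
```

Without the null check, the `memcpy` from an absent argument would be the runtime segfault the message describes.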
2025-04-09	[PGO][Offload] Use %profdata in PGO tests (#135015)	Joel E. Denny
	So that the wrong llvm-profdata is not picked up from PATH.
2025-04-07	[MLIR][OpenMP] Add codegen for teams reductions (#133310)	Jan Leyonberg
	This patch adds the lowering of teams reductions from the omp dialect to LLVM-IR. Some minor cleanup was done in clang to remove an unused parameter.
2025-03-28	[PGO][Offload] Disable PGO on NVPTX (#133522)	Ethan Luis McDonough

2025-03-28	[offload] Remove bad assert in StaticLoopChunker::Distribute (#132705)	macurtis-amd
	When building with asserts enabled, this can actually cause strange miscompilations because an incorrect llvm.assume is generated at the point of the assertion.
2025-03-19	[PGO][Offload] Allow PGO flags to be used on GPU targets (#94268)	Ethan Luis McDonough
	This pull request is the third part of an ongoing effort to extend PGO instrumentation to GPU device code and depends on https://github.com/llvm/llvm-project/pull/93365. This PR makes the following changes: - Allows PGO flags to be supplied to GPU targets - Pulls version global from device - Modifies `__llvm_write_custom_profile` and `lprofWriteDataImpl` to allow the PGO version to be overridden
2025-03-12	[flang][OpenMP] Map ByRef if size/alignment exceed that of a pointer (#130832)	Krzysztof Parzyszek
	Improve the check for whether a type can be passed by copy. Currently, passing by copy is done via the OMP_MAP_LITERAL mapping, which can only transfer as much data as can be contained in a pointer representation.
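A check of this shape might look like the following in C. It is a sketch under the assumption that "can be contained in a pointer representation" means both size and alignment are no larger than those of `void *`; `fits_in_pointer` is a hypothetical name, not a function from Flang.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true when a value of the given size and alignment could be passed
 * by copy inside a pointer-sized slot; otherwise it would be mapped by
 * reference.
 */
bool fits_in_pointer(size_t size, size_t align) {
    return size <= sizeof(void *) && align <= _Alignof(void *);
}
```

Under this sketch an `int` qualifies for by-copy passing on common platforms, while a value twice the size of a pointer, or one with stricter alignment than a pointer, does not.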
2025-03-10	[flang][OpenMP] Implement HAS_DEVICE_ADDR clause (#128568)	Krzysztof Parzyszek
	The HAS_DEVICE_ADDR indicates that the object(s) listed exists at an address that is a valid device address. Specifically, `has_device_addr(x)` means that (in C/C++ terms) `&x` is a device address. When entering a target region, `x` does not need to be allocated on the device, or have its contents copied over (in the absence of additional mapping clauses). Passing its address verbatim to the region for use is sufficient, and is the intended goal of the clause. Some Fortran objects use descriptors in their in-memory representation. If `x` had a descriptor, both the descriptor and the contents of `x` would be located in the device memory. However, the descriptors are managed by the compiler, and can be regenerated at various points as needed. The address of the effective descriptor may change, hence it's not safe to pass the address of the descriptor to the target region. Instead, the descriptor itself is always copied, but for objects like `x`, no further mapping takes place (as this keeps the storage pointer in the descriptor unchanged). --------- Co-authored-by: Sergio Afonso <safonsof@amd.com>
2025-03-07	[Flang][OpenMP][MLIR] Implement close, present and ompx_hold modifiers for ↵	agozillon
	Flang maps (#129586) This PR adds an initial implementation for the map modifiers close, present and ompx_hold, primarily just required adding the appropriate map type flags to the map type bits. In the case of ompx_hold it required adding the map type to the OpenMP dialect. Close has a bit of a problem when utilised with the ALWAYS map type on descriptors, so it is likely we'll have to make sure close and always are not applied to the descriptor simultaneously in the future when we apply always to the descriptors to facilitate movement of descriptor information to device for consistency, however, we may find an alternative to this with further investigation. For the moment, it is a TODO/Note to keep track of it.
2025-02-18	[MLIR][OpenMP] Add LLVM translation support for OpenMP UserDefinedMappers ↵	Akash Banerjee
	(#124746) This patch adds OpenMPToLLVMIRTranslation support for the OpenMP Declare Mapper directive. Since both MLIR and Clang now support custom mappers, I've changed the respective function params to no longer be optional as well. Depends on #121005
2025-02-14	Reapply "[LinkerWrapper] Clean up options after proper forwarding" (#126495)	Joseph Huber
	Summary: The test failed because it no longer passed Rpass by default without LTO. I think that's desirable as it matches the standard behavior. This reverts commit 6fd99de31864a5ef84ae8613b3a9034e05293461.
2025-02-12	[PGO][Offload] Fix pgo1.c (#126864)	Ethan Luis McDonough
	pgo1.c had outdated test checks