llvm-project.git

Age	Commit message	Author
2025-11-14	[mlir] Remove filtering of deprecated rocm-agent-enumerator value gfx000 (#166634)	alessandra simmons
	Getting a gfx000 result from the `rocm-agent-enumerator` command was deprecated beginning with the release of ROCm 7, but the MLIR build system still filters it from results when looking for ROCm agents. This PR removes that filtering. There are a few other uses of "gfx000" in MLIR source, but those are used as default options for running some passes, and, to my understanding, have a semantically different meaning from the dummy result returned from `rocm-agent-enumerator` and don't need to be changed.
2025-11-13	[mlir] Fix build after #167848 (#167855)	Matthias Springer
	Fix build after #167848.
2025-11-13	[mlir] Add FP software implementation lowering pass: `arith-to-apfloat` (#167848)	Matthias Springer
	Reland pass and fix linker errors.
	Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
2025-11-12	Revert "Reland yet again: [mlir] Add FP software implementation lowering pass: `arith-to-apfloat`" (#167834)	Maksim Levental
	Reverts llvm/llvm-project#167608. Broken builder: https://lab.llvm.org/buildbot/#/builders/52/builds/12781
2025-11-12	Reland yet again: [mlir] Add FP software implementation lowering pass: `arith-to-apfloat` (#167608)	Maksim Levental
	Fix both the symbol visibility issue in the mlir_apfloat_wrappers lib and the linkage issue in ArithToAPFloat.
2025-11-11	Revert "Reapply "Reapply "[mlir] Add FP software implementation lowering pass: `arith-to-apfloat` (#166618)" (#167431)"" (#167549)	Maksim Levental
	Reverts llvm/llvm-project#167436 to fix sanitizers.
2025-11-11	Reapply "Reapply "[mlir] Add FP software implementation lowering pass: `arith-to-apfloat` (#166618)" (#167431)" (#167436)	Maksim Levental
	Reland https://github.com/llvm/llvm-project/pull/166618, fixing missing-symbol issues by explicitly loading `--shared-libs=%mlir_apfloat_wrappers` as well as `--shared-libs=%mlir_c_runner_utils`.
2025-11-10	Revert "Reapply "[mlir] Add FP software implementation lowering pass: `arith-to-apfloat` (#166618)" (#167431)" (#167435)	Maksim Levental
	This reverts commit 0e639ae6e030ade849fa7a09cb7dc40b42f25373.
2025-11-10	Reapply "[mlir] Add FP software implementation lowering pass: `arith-to-apfloat` (#166618)" (#167431)	Maksim Levental
	Reland https://github.com/llvm/llvm-project/pull/166618 with MLIRFuncUtils linked in.
2025-11-10	Revert "[mlir] Add FP software implementation lowering pass: `arith-to-apfloat` (#166618)" (#167429)	Maksim Levental
	This reverts commit 222f4e494a0cd9515c242fd083c2776772734385.
2025-11-10	[mlir] Add FP software implementation lowering pass: `arith-to-apfloat` (#166618)	Maksim Levental
	This commit adds a new pass that lowers floating-point `arith` operations to calls into the execution engine runtime library. Currently supported operations: `addf`, `subf`, `mulf`, `divf`, `remf`. All floating-point types that have an APFloat semantics are supported. This includes low-precision floating-point types such as `f4E2M1FN` that cannot execute natively on CPUs.
	This commit also improves the `vector.print` lowering pattern to call into the runtime library for floating-point types that are not supported by LLVM. This is necessary to write a meaningful integration test.
	As an example of how it works:
	```mlir
	func.func @full_example() {
	  %a = arith.constant 1.4 : f8E4M3FN
	  %b = func.call @foo() : () -> (f8E4M3FN)
	  %c = arith.addf %a, %b : f8E4M3FN
	  vector.print %c : f8E4M3FN
	  return
	}
	```
	gets transformed to
	```mlir
	func.func private @__mlir_apfloat_add(i32, i64, i64) -> i64
	func.func @full_example() {
	  %cst = arith.constant 1.375000e+00 : f8E4M3FN
	  %0 = call @foo() : () -> f8E4M3FN
	  // bitcast operand A to integer of equal width
	  %1 = arith.bitcast %cst : f8E4M3FN to i8
	  // zext A to i64
	  %2 = arith.extui %1 : i8 to i64
	  // same for operand B
	  %3 = arith.bitcast %0 : f8E4M3FN to i8
	  %4 = arith.extui %3 : i8 to i64
	  // get the llvm::fltSemantics(f8E4M3FN) as an enum
	  %c10_i32 = arith.constant 10 : i32
	  // call the impl against APFloat in mlir_apfloat_wrappers
	  %5 = call @__mlir_apfloat_add(%c10_i32, %2, %4) : (i32, i64, i64) -> i64
	  // "cast" back to the original fp type
	  %6 = arith.trunci %5 : i64 to i8
	  %7 = arith.bitcast %6 : i8 to f8E4M3FN
	  vector.print %7 : f8E4M3FN
	  return
	}
	```
	Note that `llvm::fltSemantics(f8E4M3FN)` is emitted by the pattern each time an `arith` op is transformed, thereby making the call to `__mlir_apfloat_add` correct (i.e., no name mangling on type necessary).
	RFC: https://discourse.llvm.org/t/rfc-software-implementation-for-unsupported-fp-types-in-convert-arith-to-llvm/88785
	Co-authored-by: Matthias Springer <me@m-sp.org>
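The calling convention described in this commit (operands passed as zero-extended bit patterns together with a semantics selector, result returned as raw bits) can be sketched in plain C++. This is only an illustration under stated assumptions: `sketch_apfloat_add`, `kSketchF32`, and the two bit-cast helpers are hypothetical names, the sketch handles a single 32-bit format through the host `float`, and the real `__mlir_apfloat_add` in `mlir_apfloat_wrappers` works through `APFloat` instead.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical selector value. The real pass emits an enum derived from
// llvm::fltSemantics; its numbering is not reproduced here.
constexpr int32_t kSketchF32 = 0;

// Reinterpret the low 32 bits of a zero-extended operand as a float.
static float bitsToFloat(uint64_t bits) {
  uint32_t low = static_cast<uint32_t>(bits);
  float value;
  std::memcpy(&value, &low, sizeof(value));
  return value;
}

// Reinterpret a float as its bit pattern, zero-extended to 64 bits.
static uint64_t floatToBits(float value) {
  uint32_t low;
  std::memcpy(&low, &value, sizeof(low));
  return static_cast<uint64_t>(low);
}

// Same shape as the wrapper signature in the commit: (i32, i64, i64) -> i64.
// Only the hypothetical kSketchF32 selector is handled; any other selector
// returns 0 in this sketch.
uint64_t sketch_apfloat_add(int32_t semantics, uint64_t a, uint64_t b) {
  if (semantics != kSketchF32)
    return 0;
  return floatToBits(bitsToFloat(a) + bitsToFloat(b));
}
```

Because the element type travels as a runtime argument rather than in the function name, one entry point can serve every supported format, which is the point the commit makes about not needing name mangling.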
2025-10-27	[MLIR][ExecutionEngine] don't dump decls (#164478)	Maksim Levental
	Currently ExecutionEngine tries to dump all functions declared in the module, even those which are "external" (i.e., linked/loaded at runtime). E.g.
	```mlir
	func.func private @printF32(f32)
	func.func @supported_arg_types(%arg0: i32, %arg1: f32) {
	  call @printF32(%arg1) : (f32) -> ()
	  return
	}
	```
	fails with
	```
	Could not compile printF32:
	  Symbols not found: [ __mlir_printF32 ]
	Program aborted due to an unhandled Error:
	Symbols not found: [ __mlir_printF32 ]
	```
	even though `printF32` can be provided at final build time (i.e., when the object file is linked to some executable or shlib), e.g., if our own `libmlir_c_runner_utils` is linked. So just skip functions which have no bodies during dump (i.e., are decls without defns).
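The fix amounts to filtering out body-less declarations before the dump. A minimal sketch of that filter follows; `SketchFunction` and `namesToDump` are hypothetical stand-ins and do not mirror the actual ExecutionEngine code.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a function in a module. `hasBody` is false for
// external declarations such as `func.func private @printF32(f32)`.
struct SketchFunction {
  std::string name;
  bool hasBody;
};

// Collect only the functions that have definitions, so symbols that are
// resolved later (at link or load time) are never looked up during the dump.
std::vector<std::string>
namesToDump(const std::vector<SketchFunction> &functions) {
  std::vector<std::string> result;
  for (const SketchFunction &fn : functions) {
    if (!fn.hasBody)
      continue; // Declaration without a definition: skip it.
    result.push_back(fn.name);
  }
  return result;
}
```

With the module from the commit message, only `supported_arg_types` would be selected, and `printF32` is left for the final link to resolve.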
2025-10-23	[MLIR][ExecutionEngine] don't leak -Wweak-vtables (#164498)	Maksim Levental
	I'm not 100% sure what this is used for in this lib, but the compile flag leaks out and prevents (in certain compile scenarios) linking `mlir_c_runner_utils`.
2025-09-30	[MLIR] Apply clang-tidy fixes for misc-use-internal-linkage in LevelZeroRuntimeWrappers.cpp (NFC)	Mehdi Amini
2025-09-29	[MLIR] Apply clang-tidy fixes for misc-use-internal-linkage in JitRunner.cpp (NFC)	Mehdi Amini
2025-08-27	[MLIR] Apply clang-tidy fixes for misc-use-internal-linkage in VulkanRuntime.cpp (NFC)	Mehdi Amini
2025-08-17	[MLIR] Split ExecutionEngine Initialization out of ctor into an explicit method call (#153524)	Shenghang Tsai
	Retry landing https://github.com/llvm/llvm-project/pull/153373
	## Major changes from previous attempt
	- remove the test in CAPI because no existing tests in CAPI deal with sanitizer exemptions
	- update `mlir/docs/Dialects/GPU.md` to reflect the new behavior: load GPU binary in global ctors, instead of loading them at call site.
	- skip the test on Aarch64 since we have an issue with initialization there
	Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2025-08-13	Revert "[MLIR] Split ExecutionEngine Initialization out of ctor into an explicit method call" (#153477)	Mehdi Amini
	Reverts llvm/llvm-project#153373. Sanitizer bot is broken.
2025-08-13	[MLIR] Split ExecutionEngine Initialization out of ctor into an explicit method call (#153373)	Shenghang Tsai
	This PR introduces a mechanism to defer JIT engine initialization, enabling registration of required symbols before global constructor execution.
	## Problem
	Modules containing `gpu.module` generate global constructors (e.g., kernel load/unload) that execute during engine creation. This can force premature symbol resolution, causing failures when:
	- Symbols are registered via `mlirExecutionEngineRegisterSymbol` after creation
	- Global constructors exist (even if not directly using unresolved symbols, e.g., an external function declaration)
	- GPU modules introduce mandatory binary loading logic
	## Usage
	```c
	// Create engine without initialization
	MlirExecutionEngine jit = mlirExecutionEngineCreate(...);
	// Register required symbols
	mlirExecutionEngineRegisterSymbol(jit, ...);
	// Explicitly initialize (runs global constructors)
	mlirExecutionEngineInitialize(jit);
	```
	Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
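The ordering problem this commit addresses can be modelled with a toy class. The sketch below is not the MLIR API: `SketchEngine` and its members are hypothetical and exist only to show why deferring constructor execution lets symbols be registered first.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy engine that models deferred initialization only.
class SketchEngine {
public:
  // Construction records which symbols the global constructors need,
  // but does not run anything yet.
  explicit SketchEngine(std::vector<std::string> requiredCtorSymbols)
      : ctorSymbols(std::move(requiredCtorSymbols)) {}

  // Symbols may be registered at any point before initialize().
  void registerSymbol(const std::string &name, std::function<void()> fn) {
    symbols[name] = std::move(fn);
  }

  // Runs the global constructors exactly once. Returns false, and runs
  // nothing, if any required symbol is still unresolved.
  bool initialize() {
    if (initialized)
      return true;
    for (const std::string &name : ctorSymbols)
      if (symbols.find(name) == symbols.end())
        return false;
    for (const std::string &name : ctorSymbols)
      symbols[name]();
    initialized = true;
    return true;
  }

private:
  std::vector<std::string> ctorSymbols;
  std::map<std::string, std::function<void()>> symbols;
  bool initialized = false;
};
```

If the constructors ran inside the `SketchEngine` constructor, a later `registerSymbol` call would arrive too late, which is the failure mode listed under "Problem" above.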
2025-08-06	[mlir][ExecutionEngine] Add LevelZeroRuntimeWrapper. (#151038)	Md Abdullah Shahneous Bari
	Adds LevelZeroRuntime wrapper and tests.
	Co-authored-by: Artem Kroviakov <artem.kroviakov@intel.com>
	Co-authored-by: Nishant Patel <nishant.b.patel@intel.com>
2025-07-09	[MLIR][AArch64] Change some tests to ensure SVE vector length is the same throughout the function (#147506)	Momchil Velikov
	This change only applies to functions that can be reasonably expected to use SVE registers. Modifying vector length in the middle of a function might cause incorrect stack deallocation if there are callee-saved SVE registers or incorrect access to SVE stack slots. Addresses (non-issue) https://github.com/llvm/llvm-project/issues/143670
2025-06-06	Pass memory buffer to RuntimeDyld::MemoryManager factory (#142930)	Karlo Basioli
	`RTDyldObjectLinkingLayer` is currently creating a memory manager without any parameters. In this PR I am passing the MemoryBuffer that will be emitted to the MemoryManager so that the user can use it to configure the behaviour of the MemoryManager.
2025-05-28	[mlir] SYCL runtime wrapper: add memcpy support. (#141647)	Sang Ik Lee

2025-05-15	[mlir-runner] Check entry function does not expect arguments (#136825)	Longsheng Mou
	This PR fixes a crash when the entry function has inputs. Fixes #136143.
2025-05-12	[NFC][MLIR] Add {} for `else` when `if` body has {} (#139422)	Rahul Joshi

2025-04-22	[mlir][gpu] Change GPU modules to globals (#135478)	Christian Sigg
	Load/unload GPU modules in global ctors/dtors instead of each time when launching a kernel. Loading GPU modules is a heavy-weight operation and synchronizes the GPU context. Now that the modules are loaded ahead of time, asynchronously launched kernels can run concurrently, see https://discourse.llvm.org/t/how-to-lower-the-combination-of-async-gpu-ops-in-gpu-dialect. The implementations of `embedBinary()` and `launchKernel()` use slightly different mechanics at the moment but I prefer to not change the latter more than necessary as part of this PR. I will prepare a follow-up NFC for `launchKernel()` to align them again.
2025-03-06	[IR] Store Triple in Module (NFC) (#129868)	Nikita Popov
	The module currently stores the target triple as a string. This means that any code that wants to actually use the triple first has to instantiate a Triple, which is somewhat expensive. The change in #121652 caused a moderate compile-time regression due to this. While it would be easy enough to work around, I think that architecturally, it makes more sense to store the parsed Triple in the module, so that it can always be directly queried. For this change, I've opted not to add any magic conversions between std::string and Triple for backwards-compatibility purposes, and instead write out needed Triple()s or str()s explicitly. This is because I think a decent number of them should be changed to work on Triple as well, to avoid unnecessary conversions back and forth. The only interesting part in this patch is that the default triple is Triple("") instead of Triple() to preserve existing behavior. The former defaults to using the ELF object format instead of unknown object format. We should fix that as well.
2025-03-06	Re-apply "[ORC] Remove the Triple argument from LLJITBuilder::..." with fixes.	Lang Hames
	This re-applies f905bf3e1ef860c4d6fe67fb64901b6bbe698a91, which was reverted in c861c1a046eb8c1e546a8767e0010904a3c8c385 due to compiler errors, with a fix for MLIR.
2025-01-24	[mlir] Rename mlir-cpu-runner to mlir-runner (#123776)	Andrea Faulds
	With the removal of mlir-vulkan-runner (as part of #73457) in e7e3c45bc70904e24e2b3221ac8521e67eb84668, mlir-cpu-runner is now the only runner for all CPU and GPU targets, and the "cpu" name has been misleading for some time already. This commit renames it to mlir-runner.
2025-01-22	Reapply "[mlir] Link libraries that aren't included in libMLIR to libMLIR" (#123910)	Michał Górny
	Use `mlir_target_link_libraries()` to link dependencies of libraries that are not included in libMLIR, to ensure that they link to the dylib when they are used in Flang. Otherwise, they implicitly pull in all their static dependencies, effectively causing Flang binaries to simultaneously link to the dylib and to static libraries, which is never a good idea. I have only covered the libraries that are used by Flang. If you wish, I can extend this approach to all non-libMLIR libraries in MLIR, making MLIR itself also link to the dylib consistently. [v3 with more `-DBUILD_SHARED_LIBS=ON` fixes]
2025-01-22	Revert "[mlir] Link libraries that aren't included in libMLIR to libMLIR (#123781)"	Michał Górny
	This reverts commit 4c6242ebf50dde0597df2bace49d534b61122496. More BUILD_SHARED_LIBS=ON regressions, sigh.
2025-01-22	[mlir] Link libraries that aren't included in libMLIR to libMLIR (#123781)	Michał Górny
	Use `mlir_target_link_libraries()` to link dependencies of libraries that are not included in libMLIR, to ensure that they link to the dylib when they are used in Flang. Otherwise, they implicitly pull in all their static dependencies, effectively causing Flang binaries to simultaneously link to the dylib and to static libraries, which is never a good idea. I have only covered the libraries that are used by Flang. If you wish, I can extend this approach to all non-libMLIR libraries in MLIR, making MLIR itself also link to the dylib consistently. [v2 with fixed `-DBUILD_SHARED_LIBS=ON` build]
2025-01-21	[mlir] Remove mlir-vulkan-runner and GPUToVulkan conversion passes (#123750)	Andrea Faulds
	This follows up on 733be4ed7dcf976719f424c0cb81b77a14f91f5a, which made mlir-vulkan-runner and its associated passes redundant, and completes the main goal of #73457. The mlir-vulkan-runner tests become part of the integration test suite, and the Vulkan runner runtime components become part of ExecutionEngine, just as was done when removing other target-specific runners.
2025-01-20	Revert "[mlir] Link libraries that aren't included in libMLIR to libMLIR (#123477)"	Michał Górny
	This reverts commit af6616676fb7f9dd4898290ea684ee0c90f1701d. It broke builds with `-DBUILD_SHARED_LIBS=ON`.
2025-01-20	[mlir] Link libraries that aren't included in libMLIR to libMLIR (#123477)	Michał Górny
	Use `mlir_target_link_libraries()` to link dependencies of libraries that are not included in libMLIR, to ensure that they link to the dylib when they are used in Flang. Otherwise, they implicitly pull in all their static dependencies, effectively causing Flang binaries to simultaneously link to the dylib and to static libraries, which is never a good idea. I have only covered the libraries that are used by Flang. If you wish, I can extend this approach to all non-libMLIR libraries in MLIR, making MLIR itself also link to the dylib consistently.
2024-11-08	[mlir] Remove the mlir-spirv-cpu-runner (move to mlir-cpu-runner) (#114563)	Andrea Faulds
	This commit builds on and completes the work done in 9f6c632ecda08bfff76b798c46d5d7cfde57b5e9 to eliminate the need for a separate mlir-spirv-cpu-runner binary. Since the MLIR processing is already done outside this runner, the only real difference between it and the mlir-cpu-runner is the final linking step between the nested LLVM IR modules. By moving this step into mlir-cpu-runner behind a new command-line flag (`--link-nested-modules`), this commit is able to completely remove the runner component of the mlir-spirv-cpu-runner. The runtime libraries and the tests are moved and renamed to fit into the Execution Engine and Integration tests, following the model of the similar migration done for the CUDA Runner in D97463.
2024-10-09	[MLIR] Don't build MLIRExecutionEngineShared on Windows (#109524)	Zentrik
	This disables the build of `MLIRExecutionEngineShared` on Windows because it causes linkage issues there for currently unknown reasons. Related issue: https://github.com/llvm/llvm-project/issues/106859.
2024-09-20	[MLIR][AMDGPU] Add ability to do 16-bit Memset with HIP APIs (#108587)	Umang Yadav
	CC: @krzysz00 @manupak
2024-09-16	[mlir] Tidy uses of llvm::raw_string_ostream (NFC)	JOE1994
	As specified in the docs, 1) raw_string_ostream is always unbuffered and 2) the underlying buffer may be used directly ( 65b13610a5226b84889b923bae884ba395ad084d for further reference ) * Don't call raw_string_ostream::flush(), which is essentially a no-op. * Avoid unneeded calls to raw_string_ostream::str(), to avoid excess indirection.
2024-07-17	[mlir][gpu] Use alloc OP's `host_shared` in cuda runtime (#99035)	Guray Ozen

2024-03-30	[MLIR][ExecutionEngine] Introduce shared library (#87067)	Christian Ulmann
	This commit introduces a shared library for the MLIR execution engine. This library is only built when `LLVM_BUILD_LLVM_DYLIB` is set. Having such a library allows downstream users to depend on the execution engine without giving up dynamic linkage. This is especially important for CPU runner-style tools, as they link against large parts of MLIR and LLVM. It is alternatively possible to modify the `MLIRExecutionEngine` target when `LLVM_BUILD_LLVM_DYLIB` is set, to avoid duplicated libraries.
2024-03-29	[mlir][sparse] provide an AoS "view" into sparse runtime support lib (#87116)	Aart Bik
	Note that even though the sparse runtime support lib always uses SoA storage for COO storage (and provides correct codegen by means of views into this storage), in some rare cases we need the true physical AoS storage as a coordinate buffer. This PR provides that functionality by means of a (costly) coordinate buffer call. Since this is currently only used for testing/debugging by means of the sparse_tensor.print method, this solution is acceptable. If we ever want a performant version of this, we should truly support AoS storage of COO in addition to the SoA used right now.
2024-03-28	[mlir] Make the print function in CRunnerUtil platform agnostic (#86767)	Kai Sasaki
	The platform running on Apple Silicon does not seem to support the negative nan. It causes the test failure where we explicitly specify the negative nan bit pattern and check the output printed by the CRunnerUtil function. We can make the print function in the utility platform agnostic by using the standard library functions (i.e. `std::isnan` and `std::signbit`) so that we can run the test across platforms that do not support the negative bit pattern. I have added two test cases that would fail in the Apple Silicon platform without print function changes.
	```
	$ uname -a
	Darwin Kernel Version 23.3.0: Wed Dec 20 21:30:44 PST 2023; root:xnu-10002.81.5~7/RELEASE_ARM64_T6000 arm64
	```
	See: https://discourse.llvm.org/t/test-failure-of-sparse-sign-test-in-apple-silicon/77876/3
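A minimal sketch of the approach named in this commit, deciding the NaN spelling with `std::isnan` and `std::signbit` instead of relying on how the platform's `printf` renders the sign. The helper name `formatFloatSketch` and the exact output strings `"nan"` and `"-nan"` are assumptions for illustration; the actual CRunnerUtils code is not reproduced here.

```cpp
#include <cmath>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a float, spelling NaN values explicitly so the
// result does not depend on the platform's printf implementation.
std::string formatFloatSketch(float value) {
  if (std::isnan(value))
    return std::signbit(value) ? "-nan" : "nan";
  char buffer[64];
  std::snprintf(buffer, sizeof(buffer), "%g", value);
  return std::string(buffer);
}
```

Non-NaN values still go through `%g`, so only the case that differed between platforms is handled by hand.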
2024-03-20	[MLIR][CUDA] Use _alloca instead of alloca on Windows (#85853)	Justin Holewinski
	MSVC/Windows does not support `alloca()`; instead it defines `_alloca()` in `malloc.h`.
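One common way to express this kind of platform switch is a small macro, sketched below. `SKETCH_ALLOCA` and `sumViaStackCopy` are hypothetical names, and the sketch assumes a platform that provides `<alloca.h>` when `_WIN32` is not defined; it is not the code from the CUDA runtime wrappers.

```cpp
#include <cstddef>
#include <cstring>

#ifdef _WIN32
#include <malloc.h>
// MSVC spells the stack allocation function with a leading underscore.
#define SKETCH_ALLOCA(size) _alloca(size)
#else
#include <alloca.h>
#define SKETCH_ALLOCA(size) alloca(size)
#endif

// Copies `size` bytes (size > 0) into a stack-allocated scratch buffer and
// sums them. The buffer is released automatically when the function returns.
unsigned sumViaStackCopy(const unsigned char *data, std::size_t size) {
  unsigned char *scratch =
      static_cast<unsigned char *>(SKETCH_ALLOCA(size));
  std::memcpy(scratch, data, size);
  unsigned total = 0;
  for (std::size_t i = 0; i < size; ++i)
    total += scratch[i];
  return total;
}
```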
2024-03-19	[mlir][sparse] Fix the calling convention of __truncsfbf2 on windows x64	Benjamin Kramer
	It also wants us to return the value in XMM0.
2024-03-18	[mlir][nvgpu] Support strided memref when creating TMA descriptor (#85652)	Guray Ozen

2024-03-14	[mlir][sparse] refactoring sparse runtime lib into less paths (#85332)	Aart Bik
	Two constructors could be easily refactored into one after a lot of previous deprecated code has been removed.
2024-03-05	Rename llvm::ThreadPool -> llvm::DefaultThreadPool (NFC) (#83702)	Mehdi Amini
	The base class llvm::ThreadPoolInterface will be renamed llvm::ThreadPool in a subsequent commit. This is a breaking change: clients who used to create a ThreadPool must now create a DefaultThreadPool instead.
2024-03-05	Use the new ThreadPoolInterface base class instead of the concrete implementation (NFC) (#84056)	Mehdi Amini
2024-02-23	[mlir][sparse] remove very thin header file from sparse runtime support (#82820)	Aart Bik