llvm-project.git

Age	Commit message	Author
2025-11-20	[LoopPeel] Fix BFI when peeling last iteration without guard (#168250)	Joel E. Denny
	LoopPeel sometimes proves that, when reached, the original loop always executes at least two iterations. LoopPeel then unconditionally executes both the remaining loop's initial iteration and the peeled final iteration. But that increases the latter's frequency above its frequency in the original loop. To maintain the total frequency, this patch compensates by decreasing the remaining loop's latch probability. This is another step in issue #135812 and was discussed at <https://github.com/llvm/llvm-project/pull/166858#discussion_r2528968542>.
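The compensation described above can be illustrated with a little arithmetic. The sketch below is not the patch's code: it assumes a simple geometric model in which a loop that is reached always runs once and then repeats with a fixed latch probability, and the helper names are made up for this illustration.

```python
import math


def body_frequency(latch_prob: float) -> float:
    """Expected iterations of a loop that runs once and then repeats with
    probability latch_prob (assumed geometric model, not LLVM code)."""
    if not 0.0 <= latch_prob < 1.0:
        raise ValueError("latch probability must be in [0, 1)")
    return 1.0 / (1.0 - latch_prob)


def compensated_latch_prob(orig_latch_prob: float) -> float:
    """Hypothetical helper: latch probability for the remaining loop such that
    its body frequency plus the one peeled final iteration equals the
    original loop's body frequency."""
    orig_freq = body_frequency(orig_latch_prob)
    if orig_freq < 2.0:
        raise ValueError("model assumes at least two expected iterations")
    remaining_freq = orig_freq - 1.0  # the peeled iteration accounts for 1
    return 1.0 - 1.0 / remaining_freq


p = 0.9                              # original latch probability (example value)
p_new = compensated_latch_prob(p)    # lower than p
total = body_frequency(p_new) + 1.0  # remaining loop + peeled iteration
```

Under this model the remaining loop's latch probability has to drop below the original one, otherwise the unconditionally executed peeled iteration would push the total above the original frequency.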
2025-11-04	[LoopUnroll] Fix division by zero (#166258)	Joel E. Denny
	PR #159163's probability computation for epilogue loops does not handle the possibility of an original loop probability of one. Runtime loop unrolling does not make sense for such an infinite loop, and a division by zero results. This patch works around that case. Issue #165998.
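A minimal sketch of the failure mode, assuming the usual relation between a latch probability and an expected trip count. The function name and the choice of returning `None` are assumptions for illustration; the commit does not say how the patch works around the case.

```python
from typing import Optional


def estimated_trip_count(latch_prob: float) -> Optional[float]:
    """Expected trip count 1 / (1 - p) for latch probability p.

    Returns None when p is one, since the loop is then modeled as infinite
    and the division would be by zero (hypothetical handling).
    """
    if latch_prob >= 1.0:
        return None
    return 1.0 / (1.0 - latch_prob)
```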
2025-10-31	[LoopUnroll] Fix block frequencies for epilogue (#159163)	Joel E. Denny
	As another step in issue #135812, this patch fixes block frequencies for partial loop unrolling with an epilogue remainder loop. It does not fully handle the case when the epilogue loop itself is unrolled. That will be handled in the next patch. For the guard and latch of each of the unrolled loop and epilogue loop, this patch sets branch weights derived directly from the original loop latch branch weights. The total frequency of the original loop body, summed across all its occurrences in the unrolled loop and epilogue loop, is the same as in the original loop. This patch also sets `llvm.loop.estimated_trip_count` for the epilogue loop instead of relying on the epilogue's latch branch weights to imply it. This patch fixes branch weights in tests that PR #157754 adversely affected.
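The frequency invariant stated above can be checked for a single execution: however the original iterations are split between the unrolled loop and the epilogue loop, the original body runs the same number of times. The sketch assumes an unroll count N where each unrolled iteration contains N copies of the body; the function names are illustrative only.

```python
def split_iterations(trip_count: int, unroll_count: int) -> tuple[int, int]:
    """Return (unrolled-loop iterations, epilogue-loop iterations)."""
    return divmod(trip_count, unroll_count)


def body_executions(trip_count: int, unroll_count: int) -> int:
    """Total executions of the original loop body across both loops."""
    unrolled_iters, epilogue_iters = split_iterations(trip_count, unroll_count)
    # Each unrolled iteration runs unroll_count copies of the original body;
    # each epilogue iteration runs exactly one.
    return unrolled_iters * unroll_count + epilogue_iters
```

Branch weights that preserve this identity in expectation keep the summed block frequency of the body equal to its frequency in the original loop.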
2025-10-30	[LoopUnroll][NFCI] Clean up remainder followup metadata handling (#165272)	Joel E. Denny
	Followup metadata for remainder loops is handled by two implementations, both added by 7244852557ca6: 1. `tryToUnrollLoop` in `LoopUnrollPass.cpp`. 2. `CloneLoopBlocks` in `LoopUnrollRuntime.cpp`. As far as I can tell, 2 is useless: I added `assert(!NewLoopID)` for the `NewLoopID` returned by the `makeFollowupLoopID` call, and it never fails throughout check-all for my build. Moreover, if 2 were useful, it appears it would have a bug caused by 7cd826a321d9. That commit skips adding loop metadata to a new remainder loop if the remainder loop itself is to be completely unrolled because it will then no longer be a loop. However, that commit incorrectly assumes that `UnrollRemainder` dictates complete unrolling of a remainder loop, and thus it skips adding loop metadata even if the remainder loop will be only partially unrolled. To avoid further confusion here, this patch removes 2. check-all continues to pass for my build. If 2 actually is useful, please advise so we can create a test that covers that usage. Near 2, this patch retains the `UnrollRemainder` guard on the `setLoopAlreadyUnrolled` call, which adds `llvm.loop.unroll.disable` to the remainder loop. That behavior exists both before and after 7cd826a321d9. The logic appears to be that remainder loop unrolling (whether complete or partial) is opt-in. That is, unless `UnrollRemainder` is true, `UnrollRuntimeLoopRemainder` skips running remainder loop unrolling, and `llvm.loop.unroll.disable` suppresses any later attempt at it. This patch also extends testing of remainder loop followup metadata to be sure remainder loop partial unrolling is handled correctly by 1.
2025-10-07	[LoopUnroll] Skip remainder loop guard if skip unrolled loop (#156549)	Joel E. Denny
	The original loop (OL) that serves as input to LoopUnroll has basic blocks that are arranged as follows:
	```
	OLPreHeader
	OLHeader <-.
	...        |
	OLLatch ---'
	OLExit
	```
	In this depiction, every block has an implicit edge to the next block below, so any explicit edge indicates a conditional branch. Given OL and unroll count N, LoopUnroll sometimes creates an unrolled loop (UL) with a remainder loop (RL) epilogue arranged like this:
	```
	,-- ULGuard
	|   ULPreHeader
	|   ULHeader <-.
	|   ...        |
	|   ULLatch ---'
	|   ULExit
	`-> RLGuard -----.
	    RLPreHeader  |
	,-> RLHeader     |
	|   ...          |
	`-- RLLatch      |
	    RLExit       |
	    OLExit <-----'
	```
	Each UL iteration executes N OL iterations, but each RL iteration executes 1 OL iteration. ULGuard or RLGuard checks whether the first iteration of UL or RL should execute, respectively. If so, ULLatch or RLLatch checks whether to execute each subsequent iteration. Once reached, OL always executes its first iteration but not necessarily the next N-1 iterations. Thus, ULGuard is always required before the first UL iteration. However, when control flows from ULGuard directly to RLGuard, the first OL iteration has yet to execute, so RLGuard is then redundant before the first RL iteration. Thus, this patch makes the following changes:
	- Adjust ULGuard to branch to RLPreHeader instead of RLGuard, thus eliminating RLGuard's unnecessary branch instruction for that path.
	- Eliminate the creation of RLGuard phi node poison values. Without this patch, RLGuard has such a phi node for each value that is defined by any OL iteration and used in OLExit. The poison value is required where ULGuard is the predecessor. The poison value indicates that control flow from ULGuard to RLGuard to Exit has no counterpart in OL because the first OL iteration must execute either in UL or RL.
	- Simplify the CFG by not splitting ULExit and RLGuard because, without the ULGuard predecessor, the single block can now be a dedicated UL exit.
	- To RLPreHeader, add an `llvm.assume` call that asserts the RL trip count is non-zero. Without this patch, RLPreHeader is reachable only when RLGuard guarantees that assertion is true. With this patch, RLGuard guarantees it only when RLGuard is the predecessor, and the OL structure guarantees it when ULGuard is the predecessor. If RL itself is unrolled later, this guarantee somehow prevents ScalarEvolution from giving up when trying to compute a maximum trip count for RL. That maximum trip count enables the branch instruction in the final unrolled instance of RLLatch to be eliminated. Without the `llvm.assume` call, some existing unroll tests start to fail because that instruction is not eliminated.
	The original motivation for this patch is to facilitate later patches that fix LoopUnroll's computation of branch weights so that they maintain the block frequency of OL's body (see #135812). Specifically, this patch ensures RLGuard's branch weights do not affect RL's contribution to the block frequency of OL's body in the case that ULGuard skips UL.
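The control flow after this patch can be emulated in a few lines. This is only a model of the block structure described above, not LoopUnroll's code: the function names are invented, the trip count is assumed to be known and at least one, and the `assert` stands in for the `llvm.assume` call in RLPreHeader.

```python
def run_original(trip_count: int) -> list[int]:
    """OL: once reached, the first iteration always executes."""
    trace, i = [], 0
    while True:
        trace.append(i)          # OLHeader ... OLLatch
        i += 1
        if i >= trip_count:      # OLLatch exits to OLExit
            return trace


def run_unrolled(trip_count: int, n: int) -> list[int]:
    """UL with an RL epilogue, where ULGuard branches straight to RLPreHeader."""
    if trip_count < 1:
        raise ValueError("OL always executes at least one iteration")
    trace, i = [], 0
    remainder = trip_count % n
    if trip_count >= n:                      # ULGuard: is a full UL iteration possible?
        while True:                          # ULHeader ... ULLatch
            for _ in range(n):               # N copies of the OL body
                trace.append(i)
                i += 1
            if i >= trip_count - remainder:  # ULLatch
                break
        if remainder == 0:                   # RLGuard, reached only via ULExit
            return trace                     # OLExit
    # RLPreHeader: reached from RLGuard (remainder != 0) or directly from
    # ULGuard (trip_count < n, so remainder == trip_count >= 1).
    assert remainder != 0                    # models the llvm.assume call
    done = 0
    while True:                              # RLHeader ... RLLatch
        trace.append(i)
        i += 1
        done += 1
        if done >= remainder:                # RLLatch
            return trace                     # RLExit -> OLExit
```

For example, `run_unrolled(3, 4)` skips UL entirely and runs three RL iterations without consulting RLGuard, which is the path the patch shortens.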
2025-04-04	[LoopUnroll] UnrollRuntimeMultiExit takes precedence over TTI. (#134259)	Florian Hahn
	Update UnrollRuntimeLoopRemainder to always give priority to the UnrollRuntimeMultiExit option, if provided. After ad9da92cf6f7357 (https://github.com/llvm/llvm-project/pull/124462), we would ignore the option if the backend indicates multi-exit is profitable. This means it cannot be used to disable runtime unrolling. To be consistent with canProfitablyRuntimeUnrollMultiExitLoop, always respect the option. This surfaced while discussing https://github.com/llvm/llvm-project/pull/131998. PR: https://github.com/llvm/llvm-project/pull/134259
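The precedence rule can be sketched as a small decision function. The names and the fallback are assumptions for illustration: the real code consults further heuristics that are omitted here.

```python
from typing import Optional


def allow_multi_exit_runtime_unroll(option: Optional[bool],
                                    backend_says_profitable: bool) -> bool:
    """Hypothetical model of the decision order.

    option is None when the command-line option was not provided.
    """
    if option is not None:
        # An explicitly provided option always wins, so it can also be
        # used to disable multi-exit runtime unrolling.
        return option
    # Assumed fallback: defer to the backend (other heuristics omitted).
    return backend_says_profitable
```

With this ordering, `allow_multi_exit_runtime_unroll(False, True)` is `False`, which is the case the commit says was previously impossible to express.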
2025-01-27	[LoopUnroll] Add RuntimeUnrollMultiExit to loop unroll options (NFC) (#124462)	Florian Hahn
	Add an extra knob to RuntimeUnrollMultiExit to let backends control whether to allow multi-exit unrolling on a per-loop basis. This gives backends more fine-grained control on deciding if multi-exit unrolling is profitable for a given loop and uarch. Similar to 4226e0a0c75. PR: https://github.com/llvm/llvm-project/pull/124462
2024-12-02	[TTI] Add SCEVExpansionBudget to loop unrolling options. (#118316)	Florian Hahn
	Add an extra knob to UnrollingPreferences to let backends control the maximum budget for SCEV expansions. This gives backends more fine-grained control on the cost of the runtime checks for runtime unrolling. PR: https://github.com/llvm/llvm-project/pull/118316
2024-11-04	[Utils] Remove unused includes (NFC) (#114748)	Kazu Hirata
	Identified with misc-include-cleaner.
2024-09-23	[Loops] Use forgetLcssaPhiWithNewPredecessor() in more places	Nikita Popov
	Use the more aggressive invalidation method in a number of places that add incoming values to lcssa phi nodes. It is likely that it's possible to construct cases with incorrect SCEV preservation similar to https://github.com/llvm/llvm-project/issues/109333 for these.
2024-08-03	[Transforms] Construct SmallVector with ArrayRef (NFC) (#101851)	Kazu Hirata

2024-06-27	[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)	Nikita Popov
	This is a helper to avoid writing `getModule()->getDataLayout()`. I regularly try to use this method only to remember it doesn't exist... `getModule()->getDataLayout()` is also a common (the most common?) reason why code has to include the Module.h header.
2024-06-25	[LoopUnroll] Use poison instead of undef for another preheader value	Nikita Popov

2024-06-25	[LoopUnroll] Use poison instead of undef for preheader value	Nikita Popov

2024-06-13	[llvm-project] Fix typo "seperate" (#95373)	Jay Foad

2024-06-06	[LoopUnroll] Consider convergence control tokens when unrolling (#91715)	Sameer Sahasrabuddhe
	- There is no restriction on a loop with controlled convergent operations when the relevant tokens are defined and used within the loop.
	- When a token defined outside a loop is used inside (also called a loop convergence heart), unrolling is allowed only in the absence of remainder or runtime checks.
	- When a token defined inside a loop is used outside, such a loop is said to be "extended". This loop can only be unrolled by also duplicating the extended part lying outside the loop. Such unrolling is disabled for now.
	- Clean up loop hearts: When unrolling a loop with a heart, duplicating the heart will introduce multiple static uses of a convergence control token in a cycle that does not contain its definition. This violates the static rules for tokens, and needs to be cleaned up into a single occurrence of the intrinsic.
	- Spell out the initializer for UnrollLoopOptions to improve readability.
	Original implementation [D85605] by Nicolai Haehnle <nicolai.haehnle@amd.com>.
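The first three rules above amount to a small decision table. The sketch below encodes them with made-up names (`TokenUse`, `may_unroll`); it is a reading aid under those assumptions and does not mirror the actual implementation.

```python
from enum import Enum, auto


class TokenUse(Enum):
    """How convergence control tokens relate to the loop (hypothetical names)."""
    INTERNAL = auto()   # tokens defined and used within the loop
    HEART = auto()      # token defined outside the loop, used inside
    EXTENDED = auto()   # token defined inside the loop, used outside


def may_unroll(use: TokenUse, needs_remainder_or_runtime_checks: bool) -> bool:
    """Return whether unrolling is permitted under the rules listed above."""
    if use is TokenUse.EXTENDED:
        return False    # such unrolling is disabled for now
    if use is TokenUse.HEART:
        return not needs_remainder_or_runtime_checks
    return True         # no restriction
```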
2024-05-08	[RemoveDIs] Change remapDbgVariableRecord to remapDbgRecord (#91456)	Harald van Dijk
	We need to remap any DbgRecord, not just DbgVariableRecords. This is the followup to #91447. Co-authored-by: PietroGhg <pietro.ghiglio@codeplay.com>
2024-03-19	[RemoveDIs][NFC] Rename DPValue -> DbgVariableRecord (#85216)	Stephen Tozer
	This is the major rename patch that prior patches have built towards. The DPValue class is being renamed to DbgVariableRecord, which reflects the updated terminology for the "final" implementation of the RemoveDI feature. This is a pure string substitution + clang-format patch. The only manual component of this patch was determining where to perform these string substitutions: `DPValue` and `DPV` are almost exclusively used for DbgRecords, except for: - llvm/lib/target, where 'DP' is used to mean double-precision, and so appears as part of .td files and in variable names. NB: There is a single existing use of `DPValue` here that refers to debug info, which I've manually updated. - llvm/tools/gold, where 'LDPV' is used as a prefix for symbol visibility enums. Outside of these places, I've applied several basic string substitutions, with the intent that they only affect DbgRecord-related identifiers; I've checked them as I went through to verify this, with reasonable confidence that there are no unintended changes that slipped through the cracks. The substitutions applied are all case-sensitive, and are applied in the order shown: ``` DPValue -> DbgVariableRecord DPVal -> DbgVarRec DPV -> DVR ``` Following the previous rename patches, it should be the case that there are no instances of any of these strings that are meant to refer to the general case of DbgRecords, or anything other than the DPValue class. The idea behind this patch is therefore that pure string substitution is correct in all cases as long as these assumptions hold.
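The reason the substitutions are applied in the order shown can be demonstrated directly: each pattern is a prefix of the one before it, so applying the shortest first would corrupt the longer names. The sample identifiers below are made up for the demonstration.

```python
# Case-sensitive substitutions, in the order given in the commit message.
SUBSTITUTIONS = [
    ("DPValue", "DbgVariableRecord"),
    ("DPVal", "DbgVarRec"),
    ("DPV", "DVR"),
]


def rename(text: str, substitutions=SUBSTITUTIONS) -> str:
    """Apply each plain string substitution in sequence."""
    for old, new in substitutions:
        text = text.replace(old, new)
    return text


sample = "DPValue *DPVal = DPV;"
in_order = rename(sample)
wrong_order = rename(sample, list(reversed(SUBSTITUTIONS)))
# An unrelated identifier that merely contains "DPV" is rewritten too,
# which is why some directories had to be handled by hand.
collateral = rename("LDPV_EXAMPLE")
```

Here `in_order` is `DbgVariableRecord *DbgVarRec = DVR;`, while `wrong_order` is `DVRalue *DVRal = DVR;`.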
2024-03-12	[RemoveDIs][NFC] Rename common interface functions for DPValues->DbgRecords (#84793)	Stephen Tozer
	As part of the effort to rename the DbgRecord classes, this patch renames the widely-used functions that operate on DbgRecords but refer to DbgValues or DPValues in their names to refer to DbgRecords instead; all such functions are defined in one of `BasicBlock.h`, `Instruction.h`, and `DebugProgramInstruction.h`. This patch explicitly does not change the names of any comments or variables, except for where they use the exact name of one of the renamed functions. The reason for this is reviewability; this patch can be trivially examined to determine that the only changes are direct string substitutions and any results from clang-format responding to the changed line lengths. Future patches will cover renaming variables and comments, and then renaming the classes themselves.
2024-02-01	[LoopUnroll] Fix missing sign extension	Nikita Popov
	For integers larger than 64-bit, this would zero-extend a -1 value, instead of sign-extending it. Fixes https://github.com/llvm/llvm-project/issues/80289.
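A small model of the bug class, using Python integers as bit patterns. The helper names are invented; the point is only that widening a 64-bit -1 by zero-extension does not yield -1 at the wider width.

```python
def zext(value: int, from_bits: int) -> int:
    """Zero-extend: keep the low from_bits bits, upper bits become 0."""
    return value & ((1 << from_bits) - 1)


def sext(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend: replicate the sign bit into the upper bits."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):      # sign bit set
        value -= 1 << from_bits       # reinterpret as negative
    return value & ((1 << to_bits) - 1)


minus_one_64 = (1 << 64) - 1          # bit pattern of -1 in 64 bits
wrong = zext(minus_one_64, 64)        # 0x0000...FFFF...: a large positive value
right = sext(minus_one_64, 64, 128)   # all 128 bits set: -1 in 128 bits
```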
2023-11-24	[DebugInfo][RemoveDIs] Support cloning and remapping DPValues (#72546)	Jeremy Morse
	This patch adds support for CloneBasicBlock duplicating the DPValues attached to instructions, and adds facilities to remap them into their new context. The plumbing to achieve this is fairly straightforward and mechanical. I've also added illustrative uses to LoopUnrollRuntime, SimpleLoopUnswitch and SimplifyCFG. The former only updates for the epilogue right now so I've added CHECK lines just for the end of an unrolled loop (further updates coming later). SimpleLoopUnswitch had no debug-info tests so I've added a new one. The two modified parts of SimplifyCFG are covered by the two modified SimplifyCFG tests. These are scenarios where we have to do extra cloning for copying of DPValues because they're no longer instructions, and remap them too.
2023-09-11	LoopUnrollRuntime: Add weights to all branches	Matthias Braun
	Make sure every conditional branch constructed by `LoopUnrollRuntime` code sets branch weights.
	- Add new 1:127 weights for the conditional jumps checking whether the whole (unrolled) loop should be skipped in the generated prolog or epilog code.
	- Remove `updateLatchBranchWeightsForRemainderLoop` function and just add weights immediately when constructing the relevant branches. This leads to simpler code and makes the code more obvious as every call to `CreateCondBr` now has a `BranchWeights` parameter.
	- Rework formula for epilogue latch weights, to assume equal distribution of remainders and remove `assert` (as I was able to reach this code when forcing small unroll factors on the commandline).
	Differential Revision: https://reviews.llvm.org/D158642
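For readers unfamiliar with branch weights, a pair of weights translates into edge probabilities by normalizing against their sum. The helper below is an illustration with an invented name; it does not say which successor of the guard carries which weight.

```python
from fractions import Fraction


def weights_to_probabilities(weights: tuple[int, ...]) -> tuple[Fraction, ...]:
    """Normalize integer branch weights into exact edge probabilities."""
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be non-zero")
    return tuple(Fraction(w, total) for w in weights)


unlikely, likely = weights_to_probabilities((1, 127))
```

So a 1:127 pair marks one edge as taken roughly once in 128 executions of the branch.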
2023-09-11	[NFC][RemoveDIs] Prefer iterator-insertion over instructions	Jeremy Morse
	Continuing the patch series to get rid of debug intrinsics [0], instruction insertion needs to be done with iterators rather than instruction pointers, so that we can communicate information in the iterator class. This patch adds an iterator-taking insertBefore method and converts various call sites to take iterators. These are all sites where such debug-info needs to be preserved so that a stage2 clang can be built identically; it's likely that many more will need to be changed in the future. At this stage, this is just changing the spelling of a few operations, which will eventually become significant once the debug-info bearing iterator is used. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939 Differential Revision: https://reviews.llvm.org/D152537
2023-06-19	[LoopUnrollRuntime] Allow indirect transition to deopt non-latch exit blocks	Yevgeny Rouban
	Relax condition on runtime trip count unrolling loops with 1 non-latch exit that leads to a deopt block. There are cases when the deopt blocks are common exits for different loops. LoopSimplify pass splits such edges to the common deopting blocks to make sure that all exit nodes of the loop only have predecessors that are inside of the loop (See simplifyOneLoop()). This breaks the current condition for unrolling. This patch allows the split transitive blocks that still lead to the deopting blocks. Differential Revision: https://reviews.llvm.org/D152639
2022-12-16	[Transforms,InstCombine] std::optional::value => operator*/operator->	Fangrui Song
	value() has undesired exception checking semantics and calls __throw_bad_optional_access in libc++. Moreover, the API is unavailable without _LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
2022-12-14	[NFC] Cleanup: Replace Function::getBasicBlockList().splice() with Function::splice()	Vasileios Porpodas
	This is part of a series of patches that aim at making Function::getBasicBlockList() private. Differential Revision: https://reviews.llvm.org/D139984
2022-12-12	Transforms/Utils: llvm::Optional => std::optional	Fangrui Song

2022-08-07	[Transforms] Fix comment typos (NFC)	Kazu Hirata

2022-08-03	[llvm][NFC] Refactor code to use ProfDataUtils	Paul Kirth
	In this patch we replace common code patterns with the use of utility functions for dealing with profiling metadata. There should be no change in functionality, as the existing checks should be preserved in all cases. Reviewed By: bogner, davidxl Differential Revision: https://reviews.llvm.org/D128860
2022-07-27	Revert "[llvm][NFC] Refactor code to use ProfDataUtils"	Paul Kirth
	This reverts commit 300c9a78819b4608b96bb26f9320bea6b8a0c4d0. We will reland once these issues are ironed out.
2022-07-27	[llvm][NFC] Refactor code to use ProfDataUtils	Paul Kirth
	In this patch we replace common code patterns with the use of utility functions for dealing with profiling metadata. There should be no change in functionality, as the existing checks should be preserved in all cases. Reviewed By: bogner, davidxl Differential Revision: https://reviews.llvm.org/D128860
2022-07-13	[llvm] Use value instead of getValue (NFC)	Kazu Hirata

2022-06-29	[LoopUnrollRuntime] Invalidate SCEV for exit phi in ConnectProlog.	Florian Hahn
	ConnectProlog adds new incoming values to exit phi nodes which can change the SCEV for the phi after 20d798bd47ec51. Fix is analogous to cfc741bc0e029. Fixes #56286.
2022-06-29	[UnrollRuntime] Invalidate SCEVs for modified phis in ConnectEpilog.	Florian Hahn
	ConnectEpilog adds new incoming values to exit phi nodes which can change the SCEV for the phi after 20d798bd47ec51. Fix is analogous to cfc741bc0e029. Fixes #56282.
2022-06-25	[llvm] Don't use Optional::hasValue (NFC)	Kazu Hirata
	This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.
2022-06-25	Revert "Don't use Optional::hasValue (NFC)"	Kazu Hirata
	This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.
2022-06-25	Don't use Optional::hasValue (NFC)	Kazu Hirata

2022-06-09	[NFC] format InstructionSimplify & lowerCaseFunctionNames	Simon Moll
	Clang-format InstructionSimplify and convert all "FunctionName"s to "functionName". This patch does touch a lot of files but gets done with the cleanup of InstructionSimplify in one commit. This is the alternative to the less invasive clang-format only patch: D126783 Reviewed By: spatel, rengolin Differential Revision: https://reviews.llvm.org/D126889
2022-05-24	[LoopUnroll] Freeze tripcount rather than condition	Nikita Popov
	This is a followup to D125754. We introduce two branches, one before the unrolled loop and one before the epilogue (and similar for the prologue case). The previous patch only froze the condition on the first branch. Rather than independently freezing the second condition, this patch instead freezes TripCount and bases BECount on it. These are the two quantities involved in the conditions, and this ensures that both work on a consistent, non-poisonous trip count. Differential Revision: https://reviews.llvm.org/D125896
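The design choice can be illustrated with a toy model of `freeze`: each freeze of a poison value may pick a different arbitrary value, so two independently frozen conditions need not agree, whereas freezing the trip count once and deriving everything from it keeps both guards consistent. Everything below is a simplified model with invented names and simplified guard conditions, not the pass's code.

```python
import itertools

POISON = object()                  # stand-in for an LLVM poison value
_arbitrary = itertools.count(7)    # toy source of "arbitrary" values


def freeze(value):
    """Toy freeze: a non-poison value passes through; poison becomes an
    arbitrary value that may differ between separate freezes."""
    if value is POISON:
        return next(_arbitrary)
    return value


def guards_from_frozen_trip_count(trip_count, unroll_count: int):
    """Freeze the trip count once and base both guard conditions on it."""
    frozen_tc = freeze(trip_count)
    be_count = frozen_tc - 1                     # backedge-taken count
    skip_unrolled = be_count < unroll_count - 1  # simplified first guard
    run_remainder = frozen_tc % unroll_count != 0  # simplified second guard
    return frozen_tc, skip_unrolled, run_remainder
```

Because both conditions read the same frozen value, a skipped unrolled loop always implies a non-empty remainder in this model, even when the input is poison.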
2022-05-18	[LoopUnroll] Avoid branch on poison for runtime unroll with multiple exits	Nikita Popov
	When performing runtime unrolling with multiple exits, one of the earlier (non-latch) exits may exit the loop on the first iteration, such that we never branch on the latch exit condition. As such, we need to freeze the condition of the new branch that is introduced before the loop, as it now executes unconditionally. Differential Revision: https://reviews.llvm.org/D125754
2022-03-01	Cleanup includes: TransformsUtils	serge-sans-paille
	Estimation on the impact on preprocessor output: before: 1065307662 after: 1064800684 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120741
2021-12-01	[LoopUnrollRuntime] Remove unnecessary pointer BECount check (NFC)	Nikita Popov
	BECounts are guaranteed to be integers nowadays.
2021-11-15	[unroll-runtime] Relax two profitability limitations on multi-exit unrolling	Philip Reames
	This change is mostly about getting rid of some "uninteresting" cases in a follow on deeper heuristic change. If anyone sees actually interesting code differences out of this, please let me know. I'm not expecting this to have much impact at all. Case 1 - With the single deoptimize non-latch exit, we can't have two exiting blocks sharing an exit block. We can only hit this with a poorly documented debug flag. Case 2 - Why should we treat epilog cases differently from prolog cases? Or to say it differently, why should starting with a constant control whether a multiple exit loop gets unrolled? Sorry for the lack of tests here. These are both exceedingly narrow cases in practice, and after a while trying, I couldn't come up with a test which did anything "useful" as opposed to simply exercise a random combination of force flags. Note that the legality cases for each are already exercised with force flags.
2021-11-15	[runtime-unroll] Inline canSafelyUnrollMultiExitLoop [NFC]	Philip Reames
	All of the interesting logic from this routine has been removed, inline the single check into the sole non-assert caller. The assert use has little value with the restructured code and is simply dropped.
2021-11-15	[runtime-unroll] Restructure if-clause to improve readability [NFC]	Philip Reames

2021-11-12	[runtime-unroll] Use incrementing IVs instead of decrementing ones	Philip Reames
	This is one of those wonderful "in theory X doesn't matter, but in practice it does" changes. In this particular case, we shift the IVs inserted by the runtime unroller to clamp iteration count of the loops* from decrementing to incrementing. Why does this matter? A couple of reasons:
	* SCEV doesn't have a native subtract node. Instead, all subtracts (A - B) are represented as A + -1 * B, and any flags invalidated by this are dropped. As a result, SCEV is slightly less good at reasoning about edge cases involving decrementing addrecs than incrementing ones. (You can see this in the inferred flags in some of the test cases.)
	* Other parts of the optimizer produce incrementing IVs, and they're common in idiomatic source language. We do have support for reversing IVs, but in general if we produce one of each, the pair will persist surprisingly far through the optimizer before being coalesced. (You can see this looking at nearby phis in the test cases.) Note that if the hardware prefers decrementing (i.e. zero tested) loops, LSR should convert back immediately before codegen.
	* Mostly irrelevant detail: The main loop of the prolog case is handled independently and will simply use the original IV with a changed start value. We could in theory use this scheme for all iteration clamping, but that's a larger and more invasive change.
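The two IV shapes are equivalent in what they compute, which is why the change is about analyzability rather than behavior. The sketch below models both clamping schemes for a loop that is known to run at least once; the function names are illustrative.

```python
def iterations_with_decrementing_iv(limit: int) -> int:
    """Count down from limit and exit when the IV reaches zero."""
    iv, executed = limit, 0
    while True:
        executed += 1        # loop body
        iv -= 1
        if iv == 0:
            return executed


def iterations_with_incrementing_iv(limit: int) -> int:
    """Count up from zero and exit when the IV reaches limit."""
    iv, executed = 0, 0
    while True:
        executed += 1        # loop body
        iv += 1
        if iv == limit:
            return executed
```

Both run the body `limit` times; only the incrementing form avoids expressing the step as an addition of a negated value.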
2021-10-14	[llvm] Use llvm::is_contained (NFC)	Kazu Hirata

2021-09-13	[Utils] Use make_early_inc_range (NFC)	Kazu Hirata

2021-09-02	[runtimeunroll] Support epilogue unrolling with a parent loop	Philip Reames
	This patch adds support for unrolling inner loops using epilogue unrolling. The basic issue is that the original latch exit block of the inner loop could be outside the outer loop. When we clone the inner loop and split the latch exit, the cloned blocks need to be in the outer loop. Differential Revision: https://reviews.llvm.org/D108476
2021-09-02	[runtimeunroll] Under EXPENSIVE_CHECKS, validate loop info	Philip Reames
	Requested in review comment on D108476