llvm-project.git

Age	Commit message	Author
2025-11-11	MachineCombiner: Partially fix losing subregister indexes (#141661)	Matt Arsenault
	This fixes verifier errors in this test after earlier passes start introducing more subregister uses. This probably isn't adequately tested but I know nothing about this pass.
2025-11-10	CodeGen: Remove TRI arguments from stack load/store hooks (#158240)	Matt Arsenault
	This is directly available in TargetInstrInfo
2025-11-10	CodeGen: Remove TRI argument from reMaterialize (#158229)	Matt Arsenault

2025-11-10	CodeGen: Remove TRI argument from getRegClass (#158225)	Matt Arsenault
	TargetInstrInfo now directly holds a reference to TargetRegisterInfo and does not need TRI passed in anywhere.
2025-11-10	CodeGen: Keep reference to TargetRegisterInfo in TargetInstrInfo (#158224)	Matt Arsenault
	Both conceptually belong to the same subtarget, so it should not be necessary to pass in the context TargetRegisterInfo to any TargetInstrInfo member. Add this reference so those superfluous arguments can be removed. Most targets placed their TargetRegisterInfo as a member in TargetInstrInfo. A few had this owned by the TargetSubtargetInfo, so unify all targets to look the same.
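The ownership change described in this commit can be pictured with a minimal sketch. ToyRegisterInfo, ToyInstrInfo and all of their members are hypothetical stand-ins invented for illustration, not the LLVM classes; the sketch only shows how holding a reference makes a separate register-info argument unnecessary.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a per-subtarget register description.
struct ToyRegisterInfo {
  std::vector<std::string> ClassNames;
  const std::string &getRegClassName(unsigned ID) const {
    return ClassNames.at(ID);
  }
};

// Hypothetical stand-in for the instruction-info object. It keeps a
// reference to the register info of the same subtarget.
class ToyInstrInfo {
  const ToyRegisterInfo &TRI;

public:
  explicit ToyInstrInfo(const ToyRegisterInfo &RI) : TRI(RI) {}

  const ToyRegisterInfo &getRegisterInfo() const { return TRI; }

  // An older shape would have been
  //   regClassName(unsigned ID, const ToyRegisterInfo *TRI)
  // With the stored reference the extra argument is superfluous.
  const std::string &regClassName(unsigned ID) const {
    return TRI.getRegClassName(ID);
  }
};
```

Callers then only need the instruction-info object, which is the simplification the follow-up commits above rely on.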
2025-10-04	[RegAlloc] Remove default restriction on non-trivial rematerialization (#159211)	Luke Lau
	In the register allocator we define non-trivial rematerialization as the rematerialization of an instruction with virtual register uses. We have been able to perform non-trivial rematerialization for a while, but it has been prevented by default unless specifically overridden by the target in `TargetInstrInfo::isReMaterializableImpl`. The original reasoning for this, given by the comment in the default implementation, is that we might increase a live range of the virtual register, but we don't actually do this. LiveRangeEdit::allUsesAvailableAt makes sure that we only rematerialize instructions whose virtual registers are already live at the use sites. https://reviews.llvm.org/D106408 had originally tried to remove this restriction but it was reverted after some performance regressions were reported. We think it is likely that the regressions were caused by the fact that the old isTriviallyReMaterializable API sometimes returned true for non-trivial rematerializations. However https://github.com/llvm/llvm-project/pull/160377 recently split the API out into a separate non-trivial and trivial version and updated the call-sites accordingly, and https://github.com/llvm/llvm-project/pull/160709 and #159180 fixed heuristics which weren't accounting for the difference between non-trivial and trivial. With these fixes in place, this patch proposes to again allow non-trivial rematerialization by default, which significantly reduces spills and reloads across various targets. For llvm-test-suite built with -O3 -flto, we get the following geomean reduction in reloads: arm64-apple-darwin: 11.6%; riscv64-linux-gnu: 8.1%; x86_64-linux-gnu: 6.5%.
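The safety argument in this commit (rematerialize only when every virtual register the instruction reads is already live at the new position) can be sketched with a toy liveness model. ToyInterval, ToyInstr and allUsesLiveAt are hypothetical names for illustration; this is not the LiveRangeEdit::allUsesAvailableAt implementation.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical half-open live interval [Start, End), in instruction slots.
struct ToyInterval {
  unsigned Start;
  unsigned End;
};

// Hypothetical instruction: only its virtual register uses matter here.
struct ToyInstr {
  std::vector<unsigned> VRegUses;
};

// Returns true if every virtual register read by MI is live at slot UseIdx,
// so re-executing MI there would not extend any live range.
bool allUsesLiveAt(const ToyInstr &MI,
                   const std::map<unsigned, ToyInterval> &Live,
                   unsigned UseIdx) {
  for (unsigned VReg : MI.VRegUses) {
    auto It = Live.find(VReg);
    if (It == Live.end())
      return false; // No liveness information: be conservative.
    if (UseIdx < It->second.Start || UseIdx >= It->second.End)
      return false; // The register is not live at the use site.
  }
  return true;
}
```

An instruction with no virtual register uses (the trivial case) passes the check at any slot, while a non-trivial one passes only inside the live ranges of its operands.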
2025-09-23	[CodeGen] Rename isReallyTriviallyReMaterializable [nfc]	Philip Reames
	.. to isReMaterializableImpl. The "Really" naming has always been awkward, and we're working towards removing the "Trivial" part now, so go ahead and remove both pieces in a single rename. Note that this doesn't change any aspect of the current implementation; we still "mostly" only return instructions which are trivial (meaning no virtual register uses), but some targets do lie about that today.
2025-09-19	CodeGen: Add RegisterClass by HwMode (#158269)	Matt Arsenault
	This is a generalization of the LookupPtrRegClass mechanism. AMDGPU has several use cases for swapping the register class of instruction operands based on the subtarget, but none of them really fit into the box of being pointer-like. The current system requires manual management of an arbitrary integer ID. For the AMDGPU use case, this would end up being around 40 new entries to manage. This just introduces the base infrastructure. I have ports of all the target specific usage of PointerLikeRegClass ready.
2025-09-12	CodeGen: Remove MachineFunction argument from getRegClass (#158188)	Matt Arsenault
	This is a low level utility to parse the MCInstrInfo and should not depend on the state of the function.
2025-09-12	CodeGen: Remove MachineFunction argument from getPointerRegClass (#158185)	Matt Arsenault
	getPointerRegClass is a layering violation. Its primary purpose is to determine how to interpret an MCInstrDesc's operand RegClass fields. This should be context free, and only depend on the subtarget. The model of this is also wrong, since this should be an instruction / operand specific property, not a global pointer class. Remove the function argument to help stage removal of this hook and avoid introducing any new obstacles to replacing it. The remaining uses of the function were to get the subtarget, which TargetRegisterInfo already belongs to. A few targets needed new subtarget derived properties copied there.
2025-09-05	[TargetInstrInfo][AArch64] Don't assume register came from operand 0 in canCombine (#157210)	Craig Topper
	We already have the register number from the user operand. Use it instead of assuming it must be operand 0 of the producing instruction. Fixes #157118
2025-08-31	[CodeGen] Drop disjoint flag when reassociating (#156218)	Philip Reames
	This fixes a latent miscompile. To understand why the flag can't be preserved, consider the case where a0=0, a1=0, a2=-1, and s3=-1.
2025-08-04	[CodeGen] Remove an unnecessary cast (NFC) (#151901)	Kazu Hirata
	getOpcode() already returns unsigned.
2025-07-31	MachineInstrBuilder: Introduce copyMIMetadata() function.	Peter Collingbourne
	This reduces the amount of boilerplate required when adding a new field to MIMetadata and reduces the chance of bugs like the one I fixed in TargetInstrInfo::reassociateOps. Reviewers: arsenm, nikic Reviewed By: nikic Pull Request: https://github.com/llvm/llvm-project/pull/133535
2025-07-17	[TII] Do not fold undef copies (#147392)	Jeffrey Byrnes
	RegAllocBase::cleanupFailedVReg hacks up the state of the liveness in order to facilitate producing valid IR. During this process, we may end up producing undef copies. If the destination of these copies is a spill candidate, we will attempt to fold the source register when issuing the spill. The undef of the source is not propagated to storeRegToStackSlot, thus we end up dropping the undef, issuing a spill, and producing an illegal liveness state. This checks for undef copies, and, if found, inserts a kill instead of a spill.
2025-07-11	[CodeGen] Do not use substituteRegister to update implicit def (#148068)	Peiming Liu
	It seems `substituteRegister` checks `FromReg == ToReg` instead of `TRI->isSubRegisterEq`. This PR simply reverts the original PR (https://github.com/llvm/llvm-project/pull/131361) to its initial implementation (without using `substituteRegister`). Not sure whether it is the desired fix (and I am by no means an expert on the LLVM backend), but it does fix a numeric error on our internal workload. Original author: @sdesmalen-arm
2025-07-10	[CodeGen] commuteInstruction should update implicit-def (#131361)	Sander de Smalen
	When the RegisterCoalescer adds an implicit-def when coalescing a SUBREG_TO_REG (#123632), this causes issues when removing other COPY nodes by commuting the instruction because it doesn't take the implicit-def into consideration. This PR fixes that.
2025-05-22	[LLVM][CodeGen] Add convenience accessors for MachineFunctionProperties (#140002)	Rahul Joshi
	Add per-property has<Prop>/set<Prop>/reset<Prop> functions to MachineFunctionProperties.
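As a rough illustration of the has<Prop>/set<Prop>/reset<Prop> pattern, the sketch below wraps a bitset with one accessor triple per property. ToyFunctionProperties and its three properties are assumptions made for this example; the real class has its own property list and may generate these accessors differently.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical property set, not the LLVM MachineFunctionProperties class.
class ToyFunctionProperties {
public:
  enum class Property : std::size_t { IsSSA, NoPHIs, NoVRegs, Count };

  // Generic, enum-keyed accessors.
  bool hasProperty(Property P) const { return Bits.test(index(P)); }
  ToyFunctionProperties &set(Property P) {
    Bits.set(index(P));
    return *this;
  }
  ToyFunctionProperties &reset(Property P) {
    Bits.reset(index(P));
    return *this;
  }

  // Per-property convenience accessors that forward to the generic ones.
  bool hasIsSSA() const { return hasProperty(Property::IsSSA); }
  ToyFunctionProperties &setIsSSA() { return set(Property::IsSSA); }
  ToyFunctionProperties &resetIsSSA() { return reset(Property::IsSSA); }

  bool hasNoPHIs() const { return hasProperty(Property::NoPHIs); }
  ToyFunctionProperties &setNoPHIs() { return set(Property::NoPHIs); }
  ToyFunctionProperties &resetNoPHIs() { return reset(Property::NoPHIs); }

  bool hasNoVRegs() const { return hasProperty(Property::NoVRegs); }
  ToyFunctionProperties &setNoVRegs() { return set(Property::NoVRegs); }
  ToyFunctionProperties &resetNoVRegs() { return reset(Property::NoVRegs); }

private:
  static std::size_t index(Property P) { return static_cast<std::size_t>(P); }
  std::bitset<static_cast<std::size_t>(Property::Count)> Bits;
};
```

Because the setters return the object itself, calls can be chained, for example `Props.setIsSSA().setNoPHIs()`.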
2025-04-27	[llvm] Use range constructors of *Set (NFC) (#137552)	Kazu Hirata

2025-04-18	[Analysis] Remove implicit LocationSize conversion from uint64_t (#133342)	Philip Reames
	This change removes the uint64_t constructor on LocationSize preventing implicit conversion, and fixes up the using APIs to adapt to the change. Note that I'm adding a couple of explicit conversion points on routines where passing in a fixed offset as an integer seems likely to have well understood semantics. We had an unfortunate case which arose if you tried to pass a TypeSize value to a parameter of LocationSize type. We'd find the implicit conversion path through TypeSize -> uint64_t -> LocationSize which works just fine for fixed values, but loses information and fails assertions if the TypeSize was scalable. This change breaks the first link in that implicit conversion chain since that seemed to be the easier one.
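The hazard can be reproduced with toy types. In the sketch below, ToyTypeSize, LooseLocationSize and StrictLocationSize are hypothetical and much simpler than the LLVM classes; the sketch assumes the chain is reached through direct initialization and models the fix as a non-public integer constructor plus a named factory.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical size that may be scalable. The conversion operator silently
// drops the Scalable bit.
struct ToyTypeSize {
  std::uint64_t MinValue;
  bool Scalable;
  operator std::uint64_t() const { return MinValue; }
};

// Shape with a public integer constructor: a ToyTypeSize can be used to
// build one via ToyTypeSize -> uint64_t -> LooseLocationSize.
struct LooseLocationSize {
  std::uint64_t Value;
  LooseLocationSize(std::uint64_t V) : Value(V) {}
};

// Shape without a public integer constructor: integers must go through an
// explicitly named conversion point.
class StrictLocationSize {
  std::uint64_t Value;
  explicit StrictLocationSize(std::uint64_t V) : Value(V) {}

public:
  static StrictLocationSize precise(std::uint64_t V) {
    return StrictLocationSize(V);
  }
  std::uint64_t getValue() const { return Value; }
};

static_assert(std::is_constructible<LooseLocationSize, ToyTypeSize>::value,
              "the lossy conversion chain is available");
static_assert(!std::is_constructible<StrictLocationSize, ToyTypeSize>::value,
              "the chain is cut once the integer constructor is not public");
```

With the strict shape, a caller holding a scalable size gets a compile error instead of a value that has quietly lost its scalable flag.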
2025-03-26	[CodeGen] Provide a target independent default for optimizeLoadInst [NFC]	Philip Reames
	This just moves the x86 implementation into generic code since it appears to be suitable for any target. The heart of this transform is inside foldMemoryOperand so other targets won't actually kick in until they implement said API. This just removes one piece to implement in the process of enabling foldMemoryOperand.
2025-03-25	[Machine-Combiner] Add a pass to reassociate chains of accumulation instructions into a tree (#132728)	Jonathan Cohen
	This pass is designed to increase ILP by performing accumulation into multiple registers. It currently supports only the S/UABAL accumulation instruction, but can be extended to support additional instructions. Reland of #126060 which was reverted due to a conflict with #131272.
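The idea of accumulating into multiple registers can be shown at source level. The sketch below is only an analogy in plain C++ (the pass itself works on machine instructions); absDiff, sumAbsDiffChain and sumAbsDiffTree are hypothetical helpers. The chain version has one long dependency through a single accumulator, while the tree version keeps two independent partial sums and combines them once.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Absolute difference of two bytes, widened to 32 bits.
static std::uint32_t absDiff(std::uint8_t X, std::uint8_t Y) {
  return X > Y ? std::uint32_t(X - Y) : std::uint32_t(Y - X);
}

// Serial chain: every addition depends on the previous accumulator value.
std::uint32_t sumAbsDiffChain(const Bytes &A, const Bytes &B) {
  std::size_t N = std::min(A.size(), B.size());
  std::uint32_t Acc = 0;
  for (std::size_t I = 0; I < N; ++I)
    Acc += absDiff(A[I], B[I]);
  return Acc;
}

// Tree form: two independent accumulators, combined at the end. Unsigned
// addition is associative and commutative, so the result is unchanged.
std::uint32_t sumAbsDiffTree(const Bytes &A, const Bytes &B) {
  std::size_t N = std::min(A.size(), B.size());
  std::uint32_t Acc0 = 0;
  std::uint32_t Acc1 = 0;
  std::size_t I = 0;
  for (; I + 1 < N; I += 2) {
    Acc0 += absDiff(A[I], B[I]);
    Acc1 += absDiff(A[I + 1], B[I + 1]);
  }
  if (I < N)
    Acc0 += absDiff(A[I], B[I]); // Odd element left over.
  return Acc0 + Acc1;
}
```

Both functions return the same value for any input; only the shape of the dependency graph differs.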
2025-03-23	Revert "[AArch64][MachineCombiner] Recombine long chains of accumulation instructions into a tree to increase ILP (#126060) (#132607)	Jonathan Cohen
	This reverts commit c4caf949aa934a219e84d4ba0530bd535e698cdb.
2025-03-23	[AArch64][MachineCombiner] Recombine long chains of accumulation instructions into a tree to increase ILP (#126060)	Jonathan Cohen
	This pattern shows up often in media libraries. The optimization should only kick in for O3. Currently only supports a single family of accumulation instructions, but can easily be expanded to support additional instructions in the future.
2025-03-13	[MachineCombiner][Targets] Use Register in TII genAlternativeCodeSequence interface. NFC (#131272)	Craig Topper
2025-01-13	[aarch64][win] Update Called Globals info when updating Call Site info (#122762)	Daniel Paoliello
	Fixes the "use after poison" issue introduced by #121516 (see <https://github.com/llvm/llvm-project/pull/121516#issuecomment-2585912395>). The root cause of this issue is that #121516 introduced "Called Global" information for call instructions modeling how "Call Site" info is stored in the machine function, HOWEVER it didn't copy the copy/move/erase operations for call site information. The fix is to rename and update the existing copy/move/erase functions so they also take care of Called Global info.
2025-01-11	[AMDGPU] Add target hook to isGlobalMemoryObject (#112781)	Austin Kerbow
	We want special handling for IGLP instructions in the scheduler but they should still be treated like they have side effects by other passes. Add a target hook to the ScheduleDAGInstrs DAG builder so that we have more control over this.
2024-08-27	[TII][RISCV] Add renamable bit to copyPhysReg (#91179)	Piyou Chen
	The renamable flag is useful during MachineCopyPropagation, but it is dropped after lowerCopy in some cases. This patch introduces extra arguments to pass the renamable flag to copyPhysReg.
2024-07-24	MachineOutliner: Use PM to query MachineModuleInfo (#99688)	Matt Arsenault
	Avoid getting this from the MachineFunction
2024-07-01	[llvm][CodeGen] Avoid 'raw_string_ostream::str' (NFC) (#97318)	Youngsuk Kim
	Since `raw_string_ostream` doesn't own the string buffer, it is desirable (in terms of memory safety) for users to directly reference the string buffer rather than use `raw_string_ostream::str()`. Work towards TODO comment to remove `raw_string_ostream::str()`.
2024-04-23	[CodeGen][TII] Allow reassociation on custom operand indices (#88306)	Min-Yih Hsu
	This opens the door to reusing reassociation optimizations on target-specific binary operations with non-standard operand lists. This is effectively an NFC.
2024-04-11	[clang][llvm] Remove "implicit-section-name" attribute (#87906)	Arthur Eubanks
	D33412/D33413 introduced this to support a clang pragma to set section names for a symbol depending on if it would be placed in bss/data/rodata/text, which may not be known until the backend. However, for text we know that only functions will go there, so just directly set the section in clang instead of going through a completely separate attribute. Autoupgrade the "implicit-section-name" attribute to directly setting the section on a Function.
2024-04-11	[MachineCombiner][NFC] Split target-dependent patterns	Pengcheng Wang
	We split target-dependent MachineCombiner patterns into their target folder. This makes MachineCombiner much more target-independent. Reviewers: davemgreen, asavonic, rotateright, RKSimon, lukel97, LuoYuanke, topperc, mshockwave, asi-sc Reviewed By: topperc, mshockwave Pull Request: https://github.com/llvm/llvm-project/pull/87991
2024-03-17	[CodeGen] Use LocationSize for MMO getSize (#84751)	David Green
	This is part of #70452 that changes the type used for the external interface of MMO to LocationSize as opposed to uint64_t. This means the constructors take LocationSize, and convert ~UINT64_C(0) to LocationSize::beforeOrAfter(). The getSize methods return a LocationSize. This allows us to be more precise with unknown sizes, not accidentally treating them as unsigned values, and in the future should allow us to add proper scalable vector support but none of that is included in this patch. It should mostly be an NFC. Global ISel is still expected to use the underlying LLT as it needs, and is not expected to see unknown sizes for generic operations. Most of the changes are hopefully fairly mechanical, adding a lot of getValue() calls and protecting them with hasValue() where needed.
2024-03-06	[Codegen] Make Width in getMemOperandsWithOffsetWidth a LocationSize. (#83875)	David Green
	This is another part of #70452 which makes getMemOperandsWithOffsetWidth use a LocationSize for Width, as opposed to the unsigned it currently uses. The advantages on its own are not super high if getMemOperandsWithOffsetWidth usually uses known sizes, but if the values can come from an MMO it can help be more accurate in case they are Unknown (and in the future, scalable).
2023-12-05	TargetInstrInfo, TargetSchedule: fix non-NFC parts of 9468de4 (#74338)	Ramkumar Ramachandra
	Follow up on a post-commit review of 9468de4 (TargetInstrInfo: make getOperandLatency return optional (NFC)) by Bjorn Pettersson to fix a couple of things that are not NFC: - std::optional<T>::operator<= returns true if the first operand is a std::nullopt and second operand is T. Fix a couple of places where we assumed it would return false. - In TargetSchedule, computeInstrCost could take another codepath, returning InstrLatency instead of DefaultDefLatency. Fix one instance not accounting for this behavior.
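The first point is standard library behavior and is easy to check in isolation. In the sketch below, atMostNaive and atMostChecked are hypothetical helper names; the comparison semantics are those of std::optional.

```cpp
#include <cassert>
#include <optional>

// Comparison as written: an empty optional compares less than any value,
// so this returns true when Latency is std::nullopt.
bool atMostNaive(std::optional<unsigned> Latency, unsigned Limit) {
  return Latency <= Limit;
}

// Explicit handling: a missing latency never satisfies the bound.
bool atMostChecked(std::optional<unsigned> Latency, unsigned Limit) {
  return Latency.has_value() && *Latency <= Limit;
}
```

The two helpers agree whenever a latency is present and differ only for std::nullopt, which is the case code ported from a -1 sentinel has to re-examine.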
2023-12-04	[TargetInstrInfo] update INLINEASM memoperands once (#74135)	Nick Desaulniers
	In commit b05335989239 ("[X86InstrInfo] support memfold on spillable inline asm (#70832)"), I had a last minute fix to update the memoperands. I originally did this in the parent foldInlineAsmMemOperand call, updated the mir test via update_mir_test_checks.py, but then decided to move it to the child call of foldInlineAsmMemOperand. But I forgot to rerun update_mir_test_checks.py. That last minute change caused the same memoperand to be added twice when recursion occurred (for tied operands). I happened to get lucky that trailing content omitted from the CHECK line doesn't result in test failure. But rerunning update_mir_test_checks.py on the mir test added in that commit produces updated output. This is resulting in updates to the test that: 1. conflate additions to the test in child commits with simply updating the test as it should have been when first committed. 2. look wrong because the same memoperand is specified twice (we don't deduplicate memoperands when added). Example: INLINEASM ... :: (load (s32) from %stack.0) (load (s32) from %stack.0) Fix the bug, so that in child commits, we don't have additional unrelated test changes (which would be wrong anyways) from simply running update_mir_test_checks.py. Link: #20571
2023-12-01	Fix MSVC signed/unsigned mismatch warning. NFC.	Simon Pilgrim

2023-12-01	TargetInstrInfo: make getOperandLatency return optional (NFC) (#73769)	Ramkumar Ramachandra
	getOperandLatency has the following behavior: it returns -1 as a special value, negative numbers other than -1 on some target-specific overrides, or a valid non-negative latency. This behavior can be surprising, as some callers do arithmetic on these negative values. Change the interface of getOperandLatency to return a std::optional<unsigned> to prevent surprises in callers. While at it, change the interface of getInstrLatency to return unsigned instead of int. This change was inspired by a refactoring in TargetSchedModel::computeOperandLatency.
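A small sketch of the interface shape, under the assumption of a toy latency table: ToyLatencyTable, toyOperandLatency and toyLatencyOrDefault are hypothetical and unrelated to the real scheduling model. Returning std::optional<unsigned> forces callers to decide what "no latency known" means instead of doing arithmetic on a negative sentinel.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <utility>

// Hypothetical table keyed by (defining opcode, using opcode).
using ToyLatencyTable = std::map<std::pair<unsigned, unsigned>, unsigned>;

// Returns the latency if the table has an entry, std::nullopt otherwise.
std::optional<unsigned> toyOperandLatency(const ToyLatencyTable &Table,
                                          unsigned DefOpc, unsigned UseOpc) {
  auto It = Table.find(std::make_pair(DefOpc, UseOpc));
  if (It == Table.end())
    return std::nullopt;
  return It->second;
}

// A caller that needs a number has to choose the fallback explicitly.
unsigned toyLatencyOrDefault(const ToyLatencyTable &Table, unsigned DefOpc,
                             unsigned UseOpc, unsigned Default) {
  return toyOperandLatency(Table, DefOpc, UseOpc).value_or(Default);
}
```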
2023-11-29	[X86InstrInfo] support memfold on spillable inline asm (#70832)	Nick Desaulniers
	This enables -regalloc=greedy to memfold spillable inline asm MachineOperands. Because no instruction selection framework marks MachineOperands as spillable, no language frontend can observe functional changes from this patch. That will change once instruction selection frameworks are updated. Link: https://github.com/llvm/llvm-project/issues/20571
2023-11-22	[AArch64] Use the same fast math preservation for MachineCombiner reassociation as X86/PowerPC/RISCV. (#72820)	Craig Topper
	Don't blindly copy the original flags from the pre-reassociated instructions. This copied the integer poison flags which are not safe to preserve after reassociation. For the FP flags, I think we should only keep the intersection of the flags. Override setSpecialOperandAttr to do this. Fixes #72777.
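The flag policy described here (keep only the FP flags both originals share, drop integer poison flags) amounts to a masked intersection. The sketch below uses a hypothetical ToyFlag encoding; the bit values and names are assumptions for illustration, not the MachineInstr flag layout.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits.
enum ToyFlag : std::uint16_t {
  FmReassoc = 1u << 0,  // FP: reassociation allowed
  FmNsz = 1u << 1,      // FP: no signed zeros
  FmContract = 1u << 2, // FP: contraction allowed
  NoSWrap = 1u << 3,    // integer poison flag
  NoUWrap = 1u << 4     // integer poison flag
};

constexpr std::uint16_t FPFlagMask = FmReassoc | FmNsz | FmContract;

// Flags for an instruction produced by reassociating two originals with
// flags A and B: intersect the FP flags, never carry over poison flags.
constexpr std::uint16_t reassociatedFlags(std::uint16_t A, std::uint16_t B) {
  return static_cast<std::uint16_t>(A & B & FPFlagMask);
}
```

Even when both inputs carry NoSWrap, the result does not, since the reassociated intermediate values may wrap where the originals could not.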
2023-11-21	reapply "[TargetInstrInfo] enable foldMemoryOperand for InlineAsm (#70743)" (#72910)	Nick Desaulniers
	This reverts commit 42204c94ba9fcb0b4b1335e648ce140a3eef8a9d. It was accidentally backed out. #20571 #70743
2023-11-19	Revert "[TargetInstrInfo] enable foldMemoryOperand for InlineAsm (#70743)"	Bill Wendling
	This reverts commit 99ee2db198d86f685bcb07a1495a7115ffc31d7e. It's causing ICEs in the ARM tests. See the comment here: https://github.com/llvm/llvm-project/commit/99ee2db198d86f685bcb07a1495a7115ffc31d7e
2023-11-17	[TargetInstrInfo] enable foldMemoryOperand for InlineAsm (#70743)	Nick Desaulniers
	foldMemoryOperand looks at pairs of instructions (generally a load to virt reg then use of the virtreg, or def of a virtreg then a store) and attempts to combine them. This can reduce register pressure. A prior commit added the ability to mark such a MachineOperand as foldable. In terms of INLINEASM, this means that "rm" was used (rather than just "r") to denote that the INLINEASM may use a memory operand rather than a register operand. This effectively undoes decisions made by the instruction selection framework. Callers will be added in the register allocation frameworks. This has been tested with all of the above (which will come as follow up patches). Thanks to @topperc who suggested this at last year's LLVM US Dev Meeting and @qcolombet who confirmed this was the right approach. Link: https://github.com/llvm/llvm-project/issues/20571
2023-11-03	[InlineAsm] Steal a bit to denote a register is foldable (#70738)	Nick Desaulniers
	When using the inline asm constraint string "rm" (or "g"), we generally would like the compiler to choose "r", but it is permitted to choose "m" if there's register pressure. This is distinct from "r" in which the register is not permitted to be spilled to the stack. The decision of which to use must be made at some point. Currently, the instruction selection frameworks (ISELs) make the choice, and the register allocators had better be able to handle the result. Steal a bit from Storage when using register operands to disambiguate between the two cases. Add helpers/getters/setters, and print in MIR when such a register is foldable. The getter will later be used by the register allocation frameworks (and asserted by the ISELs) while the setters will be used by the instruction selection frameworks. Link: https://github.com/llvm/llvm-project/issues/20571
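"Stealing a bit" can be illustrated with a packed flag word. ToyAsmFlag and its field layout (3 bits of kind, 13 bits of operand count, top bit for foldable) are assumptions made for this sketch and do not reproduce the InlineAsm flag encoding.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout of a 32-bit operand flag word.
constexpr std::uint32_t ToyKindMask = 0x7;        // bits 0-2
constexpr std::uint32_t ToyNumOpsShift = 3;       // bits 3-15
constexpr std::uint32_t ToyNumOpsMask = 0x1FFF;
constexpr std::uint32_t ToyFoldableBit = 1u << 31; // the stolen bit

class ToyAsmFlag {
  std::uint32_t Storage;

public:
  ToyAsmFlag(unsigned Kind, unsigned NumOps)
      : Storage((Kind & ToyKindMask) |
                ((NumOps & ToyNumOpsMask) << ToyNumOpsShift)) {}

  unsigned getKind() const { return Storage & ToyKindMask; }
  unsigned getNumOperands() const {
    return (Storage >> ToyNumOpsShift) & ToyNumOpsMask;
  }

  // Getter and setter for the stolen bit; the other fields are untouched.
  bool isFoldable() const { return (Storage & ToyFoldableBit) != 0; }
  void setFoldable(bool Foldable) {
    if (Foldable)
      Storage |= ToyFoldableBit;
    else
      Storage &= ~ToyFoldableBit;
  }
};
```

The point of the helpers is that producers and consumers of the flag never manipulate the raw word directly, so the bit can be added without disturbing existing fields.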
2023-10-27	[BasicBlockSections] Apply path cloning with -basic-block-sections. (#68860)	Rahman Lavaee
	https://github.com/llvm/llvm-project/commit/28b912687900bc0a67cd61c374fce296b09963c4 introduced the path cloning format in the basic-block-sections profile. This PR validates and applies path clonings. A path cloning is valid if all of these conditions hold: 1. All bb ids in the path are mapped to existing blocks. 2. Each two consecutive bb ids in the path have a successor relationship in the CFG. 3. The path does not include a block with indirect branches, except possibly as the last block. Applying a path cloning involves cloning all blocks in the path (except the first one) and setting up their branches. Once all clonings are applied, the cluster information is used to guide block layout in the modified function.
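The three validity conditions can be sketched as a single pass over the path. ToyBlock, ToyCFG and isValidClonePath are hypothetical names, and the CFG is modeled as a plain map from block id to successors; this is not the code from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Hypothetical block description keyed by bb id.
struct ToyBlock {
  std::set<unsigned> Succs;
  bool HasIndirectBranch = false;
};
using ToyCFG = std::map<unsigned, ToyBlock>;

// Checks the three conditions listed above. An empty path is treated as
// trivially valid in this sketch.
bool isValidClonePath(const ToyCFG &CFG, const std::vector<unsigned> &Path) {
  for (std::size_t I = 0; I < Path.size(); ++I) {
    auto It = CFG.find(Path[I]);
    if (It == CFG.end())
      return false; // 1. Every bb id must map to an existing block.
    bool IsLast = (I + 1 == Path.size());
    if (!IsLast && It->second.Succs.count(Path[I + 1]) == 0)
      return false; // 2. Consecutive ids must be connected by a CFG edge.
    if (!IsLast && It->second.HasIndirectBranch)
      return false; // 3. Indirect branches are allowed only in the last block.
  }
  return true;
}
```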
2023-09-13	reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66264)	Nick Desaulniers
	reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66003) This reverts commit ee643b706be2b6bef9980b25cc9cc988dab94bb5. Fix up build failures in targets I missed in #66003 Kept as 3 commits for reviewers to see better what's changed. Will squash when merging. - reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66003) - fix all the targets I missed in #66003 - fix off by one found by llvm/test/CodeGen/SystemZ/inline-asm-addr.ll
2023-09-13	Revert "[InlineAsm] wrap ConstraintCode in enum class NFC (#66003)"	Reid Kleckner
	This reverts commit 2ca4d136124d151216aac77a0403dcb5c5835bcd. Also revert the followup, "[InlineAsm] fix botched merge conflict resolution" This reverts commit 8b9bf3a9f715ee5dce96eb1194441850c3663da1. There were SystemZ and Mips build errors, too many to fix forward.
2023-09-13	[InlineAsm] wrap ConstraintCode in enum class NFC (#66003)	Nick Desaulniers
	Similar to commit 2fad6e69851e ("[InlineAsm] wrap Kind in enum class NFC") Fix the TODOs added in commit 93bd428742f9 ("[InlineAsm] refactor InlineAsm class NFC (#65649)")
2023-09-11	[InlineAsm] refactor InlineAsm class NFC (#65649)	Nick Desaulniers
	I would like to steal one of these bits to denote whether a kind may be spilled by the register allocator or not, but I'm afraid to touch any of this code using bitwise operations. Make flags a first-class type using bitfields, rather than launder data around via `unsigned`.