llvm-project.git

Age	Commit message	Author
2025-11-10	CodeGen: Remove TRI arguments from stack load/store hooks (#158240)	Matt Arsenault
	This is directly available in TargetInstrInfo
2025-11-10	CodeGen: Keep reference to TargetRegisterInfo in TargetInstrInfo (#158224)	Matt Arsenault
	Both conceptually belong to the same subtarget, so it should not be necessary to pass in the context TargetRegisterInfo to any TargetInstrInfo member. Add this reference so those superfluous arguments can be removed. Most targets placed their TargetRegisterInfo as a member in TargetInstrInfo. A few had this owned by the TargetSubtargetInfo, so unify all targets to look the same.
2025-09-23	[CodeGen] Rename isReallyTriviallyReMaterializable [nfc]	Philip Reames
	.. to isReMaterializableImpl. The "Really" naming has always been awkward, and we're working towards removing the "Trivial" part now, so go ahead and remove both pieces in a single rename. Note that this doesn't change any aspect of the current implementation; we still "mostly" only return instructions which are trivial (meaning no virtual register uses), but some targets do lie about that today.
2025-09-19	PPC: Replace PointerLikeRegClass with RegClassByHwMode (#158777)	Matt Arsenault

2025-09-08	CodeGen: Pass SubtargetInfo to TargetGenInstrInfo constructors (#157337)	Matt Arsenault
	This will make it possible for tablegen to make subtarget dependent decisions without adding new arguments to every target. --------- Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
2025-09-06	PPC: Fix missing const on TargetInstrInfo's subtarget reference (#157201)	Matt Arsenault

2025-08-27	[PowerPC] Add DMR and WACC COPY support (#149129)	Maryam Moghadas
	This patch updates PPCInstrInfo::copyPhysReg to support DMR and WACC register classes and extends the PPCVSXCopy pass to handle specific WACC copy patterns.
2025-06-18	[PowerPC] Add code to spill and restore DMRp registers (#142443)	Lei Huang

2025-06-02	[PowerPC] Spill and restore DMR register (#141530)	Lei Huang
	Add spilling and restoring of DMR registers.
2025-05-26	[PowerPC] Update DMF VSX ACC data transfer instructions (#138897)	Lei Huang
	For cpu=future, acc registers no longer overlap VSRs and are prefixed with `dm`. The original xxmfacc/xxmtacc instructions are now extended mnemonics for their dm* equivalents.
2025-04-18	[Analysis] Remove implicit LocationSize conversion from uint64_t (#133342)	Philip Reames
	This change removes the uint64_t constructor on LocationSize, preventing implicit conversion, and updates the APIs that use it to adapt to the change. Note that I'm adding a couple of explicit conversion points on routines where passing in a fixed offset as an integer seems likely to have well understood semantics. We had an unfortunate case which arose if you tried to pass a TypeSize value to a parameter of LocationSize type. We'd find the implicit conversion path through TypeSize -> uint64_t -> LocationSize, which works just fine for fixed values, but loses information and fails assertions if the TypeSize was scalable. This change breaks the first link in that implicit conversion chain since that seemed to be the easier one.
2025-04-03	[PowerPC] Deprecate uses of ISD::ADDC/ISD::ADDE/ISD::SUBC/ISD::SUBE (#133155)	zhijian lin
	ISD::ADDC, ISD::ADDE, ISD::SUBC and ISD::SUBE are being deprecated; use ISD::UADDO_CARRY and ISD::USUBO_CARRY instead. This patch lowers UADDO, UADDO_CARRY, USUBO and USUBO_CARRY.
2025-03-13	[MachineCombiner][Targets] Use Register in TII genAlternativeCodeSequence ↵	Craig Topper
	interface. NFC (#131272)
2025-02-27	[PowerPC] Avoid repeated hash lookups (NFC) (#129193)	Kazu Hirata

2025-02-24	[CodeGen] Change copyPhysReg interface to use Register instead of ↵	Craig Topper
	MCRegister. (#128473) NVPTX, SPIRV, and WebAssembly pass virtual registers to this function since they don't perform register allocation. We need to use Register to avoid a virtual register being converted to MCRegister by the caller.
2025-02-20	Revert "[CodeGen] Remove static member function Register::isVirtualRegister. ↵	Christopher Di Bella
	NFC (#127968)" This reverts commit ff99af7ea03b3be46bec7203bd2b74048d29a52a.
2025-02-20	[CodeGen] Remove static member function Register::isVirtualRegister. NFC ↵	Craig Topper
	(#127968) Use nonstatic member instead. This requires explicit conversions, but many will go away as we continue converting unsigned to Register. In a few places where it was simple, I changed unsigned to Register.
2025-02-19	Revert "[PowerPC] Deprecate uses of ISD::ADDC/ISD::ADDE/ISD::SUBC/ISD::SUBE ↵	David Tenty
	(#116984)" This reverts commit 7763119c6eb0976e4836f81c9876c49a36d46d73 (leaving the modifications from 03cb46d248b08).
2025-02-13	[PowerPC] Deprecate uses of ISD::ADDC/ISD::ADDE/ISD::SUBC/ISD::SUBE (#116984)	zhijian lin
	ISD::ADDC, ISD::ADDE, ISD::SUBC and ISD::SUBE are being deprecated; use ISD::UADDO_CARRY and ISD::USUBO_CARRY instead. This patch lowers UADDO, UADDO_CARRY, USUBO and USUBO_CARRY.
2025-01-23	[llvm][CodeGen] Fix the issue caused by live interval checking in window ↵	Hua Tian
	scheduler (#123184) In some corner cases, the cloned MI still retains an old slot index, which leads to the compiler crashing. This patch updates the slot index map before deleting the recycled MI. https://github.com/llvm/llvm-project/issues/123165
2025-01-22	[llvm] Pass MachineInstr flags to storeRegToStackSlot/loadRegFromStackSlot ↵	Venkata Ramanaiah Nalamothu
	(NFC) (#120622) This patch is in preparation to enable setting the MachineInstr::MIFlag flags, i.e. FrameSetup/FrameDestroy, on callee saved register spill/reload instructions in prologue/epilogue. This eventually helps in setting the prologue_end and epilogue_begin markers more accurately. The DWARF Spec in "6.4 Call Frame Information" says: The code that allocates space on the call frame stack and performs the save operation is called the subroutine’s prologue, and the code that performs the restore operation and deallocates the frame is called its epilogue. which means the callee saved register spills and reloads are part of prologue (a.k.a frame setup) and epilogue (a.k.a frame destruction), respectively. And, IIUC, LLVM backend uses FrameSetup/FrameDestroy flags to identify instructions that are part of call frame setup and destruction. In the trunk, while most targets consistently set FrameSetup/FrameDestroy on save/restore call frame information (CFI) instructions of callee saved registers, they do not consistently set those flags on the actual callee saved register spill/reload instructions. I believe this patch provides a clean mechanism to set FrameSetup/FrameDestroy flags on the actual callee saved register spill/reload instructions as needed. And, by having default argument of MachineInstr::NoFlags for Flags, this patch is a NFC. With this patch, the targets have to just pass FrameSetup/FrameDestroy flag to the storeRegToStackSlot/loadRegFromStackSlot calls from the target derived spillCalleeSavedRegisters and restoreCalleeSavedRegisters to set those flags on callee saved register spill/reload instructions. Also, this patch makes it very easy to set the source line information on callee saved register spill/reload instructions which is needed by the DwarfDebug.cpp implementation to set prologue_end and epilogue_begin markers more accurately. 
	As per the DwarfDebug.cpp implementation: prologue_end is the first known non-DBG_VALUE and non-FrameSetup location that marks the beginning of the function body; epilogue_begin is the first FrameDestroy location that has been seen in the epilogue basic block. With this patch, the targets just have to do the following to set the source line information on callee saved register spill/reload instructions, without hampering LLVM's efforts to avoid adding source line information on the artificial code generated by the compiler. <Foo>InstrInfo::storeRegToStackSlot() { ... DebugLoc DL = Flags & MachineInstr::FrameSetup ? DebugLoc() : MBB.findDebugLoc(I); ... } <Foo>InstrInfo::loadRegFromStackSlot() { ... DebugLoc DL = Flags & MachineInstr::FrameDestroy ? MBB.findDebugLoc(I) : DebugLoc(); ... } While I understand this patch would break out-of-tree backend builds, I think it is in the right direction. One immediate use case that can benefit from this patch is fixing #120553, which becomes simpler.
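The DebugLoc selection described in the message above can be sketched in isolation. This is a minimal illustration, not LLVM code: `MiniMIFlag`, `MiniDebugLoc`, `spillDebugLoc`, `reloadDebugLoc` and `demoDebugLocSelection` are hypothetical stand-ins for `MachineInstr::MIFlag`, `DebugLoc` and the bodies of the target's `storeRegToStackSlot`/`loadRegFromStackSlot`; only the two conditional expressions are taken from the commit message.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of MachineInstr::MIFlag (bit values are assumptions).
enum MiniMIFlag : unsigned {
  NoFlags = 0,
  FrameSetup = 1u << 0,
  FrameDestroy = 1u << 1
};

// Hypothetical miniature of DebugLoc: an empty string means "no location".
struct MiniDebugLoc {
  std::string Where;
  bool isValid() const { return !Where.empty(); }
};

// Mirrors: Flags & FrameSetup ? DebugLoc() : MBB.findDebugLoc(I)
// A spill flagged FrameSetup gets an empty location; any other spill
// takes the location found at the insertion point.
MiniDebugLoc spillDebugLoc(unsigned Flags, const MiniDebugLoc &Found) {
  if (Flags & FrameSetup)
    return MiniDebugLoc{};
  return Found;
}

// Mirrors: Flags & FrameDestroy ? MBB.findDebugLoc(I) : DebugLoc()
// A reload flagged FrameDestroy takes the location found at the insertion
// point; any other reload gets an empty location.
MiniDebugLoc reloadDebugLoc(unsigned Flags, const MiniDebugLoc &Found) {
  if (Flags & FrameDestroy)
    return Found;
  return MiniDebugLoc{};
}

// Small usage example.
void demoDebugLocSelection() {
  MiniDebugLoc Here{"foo.c:3"};
  assert(!spillDebugLoc(FrameSetup, Here).isValid());
  assert(reloadDebugLoc(FrameDestroy, Here).isValid());
}
```

Because `Flags` defaults to `MachineInstr::NoFlags` in the real hooks, callers that do not pass a flag take the `NoFlags` branch of each helper.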
2025-01-03	[PowerPC] Use `RegisterClassInfo::getRegPressureSetLimit` (#120383)	Pengcheng Wang
	`RegisterClassInfo::getRegPressureSetLimit` is a wrapper around `TargetRegisterInfo::getRegPressureSetLimit` with some logic to adjust the limit by removing reserved registers. It seems that we shouldn't use `TargetRegisterInfo::getRegPressureSetLimit` directly, as the comment "This limit must be adjusted dynamically for reserved registers" says. Separated from https://github.com/llvm/llvm-project/pull/118787
2024-11-14	[PowerPC] Remove unused includes (NFC) (#116163)	Kazu Hirata
	Identified with misc-include-cleaner.
2024-11-12	[llvm] Remove redundant control flow statements (NFC) (#115831)	Kazu Hirata
	Identified with readability-redundant-control-flow.
2024-10-31	Promote 32bit pseudo instr that infer extsw removal to 64bit in ↵	zhijian lin
	PPCMIPeephole (#85451) Fixes: https://github.com/llvm/llvm-project/issues/71030 The bug only happens in 64-bit mode when spills are involved. Since we don't know when the spill will happen, all instructions in the chain used to deduce sign extension for eliminating 'extsw' need to be promoted to 64-bit pseudo instructions. The following instructions will be promoted in PPCMIPeephole: EXTSH, LHA, ISEL to EXTSH8, LHA8, ISEL8
2024-10-17	[PowerPC][ISelLowering] Support -mstack-protector-guard=tls (#110928)	Keith Packard
	Add support for using a thread-local variable with a specified offset for holding the stack guard canary value. This supports both 32- and 64-bit PowerPC targets. This mirrors changes from #108942 but targeting PowerPC instead of RISCV. Because both of these PRs modify the same driver functions, this series is stacked on top of the RISC-V one. --------- Signed-off-by: Keith Packard <keithp@keithp.com>
2024-08-27	[TII][RISCV] Add renamable bit to copyPhysReg (#91179)	Piyou Chen
	The renamable flag is useful during MachineCopyPropagation, but it is dropped after lowerCopy in some cases. This patch introduces extra arguments to pass the renamable flag to copyPhysReg.
2024-07-23	[PowerPC] Add support for -mcpu=pwr11 / -mtune=pwr11 (#99511)	azhan92
	This PR adds support for -mcpu=pwr11/power11 and -mtune=pwr11/power11 in clang and llvm.
2024-07-13	[Target] Use range-based for loops (NFC) (#98705)	Kazu Hirata

2024-07-03	[PowerPC] refactor CPU info in PPCTargetParser.def, NFC	Chen Zheng
	CPU features will be done in follow up patches.
2024-06-28	[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)	Nikita Popov
	Similar to https://github.com/llvm/llvm-project/pull/96902, this adds `getDataLayout()` helpers to Function and GlobalValue, replacing the current `getParent()->getDataLayout()` pattern.
2024-05-21	[PowerPC][AIX] 64-bit large code-model support for toc-data (#90619)	Zaara Syeda
	This patch adds support for toc-data for 64-bit large code-model on AIX. The sequence ADDIStocHA8/ADDItocL8 is used to access the data directly from the TOC. When emitting the instruction ADDIStocHA8, we check if the symbol has toc-data attribute before creating a toc entry for it. When emitting the instruction ADDItocL8, we use the LA8 instruction to load the address.
2024-04-24	[CodeGen] Make the parameter TRI required in some functions. (#85968)	Xu Zhang
	Fixes #82659. There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411. Following @RKSimon's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have been updated accordingly to ensure no additional impact. After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
2024-04-17	[PowerPC] 32-bit large code-model support for toc-data (#85129)	Zaara Syeda
	This patch adds the pseudo op ADDItocL for 32-bit large code-model support for toc-data.
2024-04-11	[MachineCombiner][NFC] Split target-dependent patterns	Pengcheng Wang
	We split target-dependent MachineCombiner patterns into their target folder. This makes MachineCombiner much more target-independent. Reviewers: davemgreen, asavonic, rotateright, RKSimon, lukel97, LuoYuanke, topperc, mshockwave, asi-sc Reviewed By: topperc, mshockwave Pull Request: https://github.com/llvm/llvm-project/pull/87991
2024-03-13	[PowerPC][NFC] Rename ADDItocL to match the 64-bit naming convention (#85099)	Zaara Syeda
	In preparation for adding a similar instruction for the large code model on AIX for 32-bit, rename the existing 64-bit ADDItocL instruction to ADDItocL8 to match the naming convention of other instructions with 32-bit and 64-bit variants.
2024-03-06	[Codegen] Make Width in getMemOperandsWithOffsetWidth a LocationSize. (#83875)	David Green
	This is another part of #70452 which makes getMemOperandsWithOffsetWidth use a LocationSize for Width, as opposed to the unsigned it currently uses. The advantage on its own is not great if getMemOperandsWithOffsetWidth usually uses known sizes, but if the values can come from an MMO it can help be more accurate when they are Unknown (and, in the future, scalable).
2024-03-01	[PowerPC] Support local-dynamic TLS relocation on AIX (#66316)	Felix (Ting Wang)
	Supports TLS local-dynamic on AIX by generating the following code sequence:
	```
	.tc foo[TC],foo[TL]@ld # Variable offset, ld relocation specifier
	.tc mh[TC],mh[TC]@ml # Module handle for the caller
	lwz 3,mh[TC](2) $$ For 64-bit: ld 3,mh[TC](2)
	bla .__tls_get_mod # Modifies r0,r3,r4,r5,r11,lr,cr0
	#r3 = &TLS for module
	lwz 4,foo[TC](2) $$ For 64-bit: ld 4,foo[TC](2)
	add 5,3,4 # Compute &foo
	.rename mh[TC], "_$TLSML" # Symbol for the module handle must have the name "_$TLSML"
	```
	--------- Co-authored-by: tingwang <tingwang@tingwangs-MBP.lan> Co-authored-by: tingwang <tingwang@tingwangs-MacBook-Pro.local>
2024-02-01	[TTI] Use Register in isLoadFromStackSlot and isStoreToStackSlot [nfc] (#80339)	Philip Reames

2024-01-26	[PowerPC][X86] Make cpu id builtins target independent and lower for PPC ↵	Nemanja Ivanovic
	(#68919) Make __builtin_cpu_{init|supports|is} target independent and provide an opt-in query for targets that want to support it. Each target is still responsible for their specific lowering/code-gen. Also provide code-gen for PowerPC. I originally proposed this in https://reviews.llvm.org/D152914 and this addresses the comments I received there. --------- Co-authored-by: Nemanja Ivanovic <nemanjaivanovic@nemanjas-air.kpn> Co-authored-by: Nemanja Ivanovic <nemanja@synopsys.com>
2024-01-26	[NFC] Rename TargetInstrInfo::FoldImmediate to ↵	Shengchen Kan
	TargetInstrInfo::foldImmediate and simplify implementation for X86
2024-01-18	[NFC][PowerPC] remove the redundant spill related flags setting	Chen Zheng

2024-01-15	[PowerPC] Implement fence builtin (#76495)	Qiu Chaofan

2023-12-20	[PowerPC] Use 'sync; ld; cmp; bc; isync' for atomic load seq-cst on 32-bit ↵	Kai Luo
	platform (#75905) In theory, `cmp; bc; isync` performs better than `lwsync`. The 64-bit platform already uses it; now implement it for the 32-bit platform.
2023-12-18	[PowerPC] Let base implementation decide if MI is rematerizable by default ↵	Kai Luo
	(#75772) If MI is not a PPC-specific instruction, let the base implementation decide whether MI is rematerializable. This can fix the failure in #75570 after #75271.
2023-12-07	[PowerPC] redesign the target flags (#69695)	Chen Zheng
	12 bits are not enough for PPC's target-specific flags. With 8 bits for the bitmask flags and 4 bits for the direct mask, PPC can have at most 16 direct masks and 8 bitmask flags in total, which is not enough; see the issue in https://github.com/llvm/llvm-project/pull/66316 Redesign how the PPC target sets its target-specific flags. With this patch, all PPC target flags are direct flags; PPC no longer has any bitmask flags. This aligns PPC with targets such as X86, which also have many target-specific flags. The patch also fixes a bug involving the flags `MO_TLSGDM_FLAG` and `MO_LO`: they had the same value, and the test case changes in this PR show the bug.
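The bit-budget arithmetic in the message above can be checked with a small C++ sketch. Only the widths (12 bits, split into 8 + 4) come from the commit message; the constant and function names are hypothetical, and treating the whole 12-bit field as a single direct field is an assumption made for illustration.

```cpp
#include <cassert>

// Widths taken from the commit message; names are hypothetical.
constexpr unsigned TotalBits = 12;
constexpr unsigned DirectBits = 4;
constexpr unsigned BitmaskBits = TotalBits - DirectBits; // 8

// A direct field of N bits encodes 2^N mutually exclusive values.
constexpr unsigned directValues(unsigned Bits) { return 1u << Bits; }

// A bitmask field of N bits encodes N independent on/off flags.
constexpr unsigned bitmaskFlags(unsigned Bits) { return Bits; }

static_assert(directValues(DirectBits) == 16, "4 direct bits give 16 values");
static_assert(bitmaskFlags(BitmaskBits) == 8, "8 bitmask bits give 8 flags");
// Assumption: if all 12 bits form one direct field, 4096 values fit.
static_assert(directValues(TotalBits) == 4096, "12 direct bits give 4096 values");

// Runtime spot check of the same numbers.
void demoFlagBudget() {
  assert(directValues(DirectBits) + bitmaskFlags(BitmaskBits) == 24);
}
```

The trade-off is that direct values are mutually exclusive, so any combination that used to be expressed by OR-ing a bitmask flag onto a direct flag would need its own enumerator.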
2023-12-06	[MachineScheduler][NFCI] Add Offset and OffsetIsScalable args to ↵	Alex Bradbury
	shouldClusterMemOps (#73778) These are picked up from getMemOperandsWithOffsetWidth but weren't then being passed through to shouldClusterMemOps, which forces backends to collect the information again if they want to use the kind of heuristics typically used for the similar shouldScheduleLoadsNear function (e.g. checking the offset is within 1 cache line). This patch just adds the parameters, but doesn't attempt to use them. There is potential to use them in the current PPC and AArch64 shouldClusterMemOps implementation, and I intend to use the offset in the heuristic for RISC-V. I've left these for future patches in the interest of being as incremental as possible. As noted in the review and in an inline FIXME, an ElementCount-style abstraction may later be used to condense these two parameters to one argument. ElementCount isn't quite suitable as it doesn't support negative offsets.
2023-12-01	TargetInstrInfo: make getOperandLatency return optional (NFC) (#73769)	Ramkumar Ramachandra
	getOperandLatency has the following behavior: it returns -1 as a special value, negative numbers other than -1 on some target-specific overrides, or a valid non-negative latency. This behavior can be surprising, as some callers do arithmetic on these negative values. Change the interface of getOperandLatency to return a std::optional<unsigned> to prevent surprises in callers. While at it, change the interface of getInstrLatency to return unsigned instead of int. This change was inspired by a refactoring in TargetSchedModel::computeOperandLatency.
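The hazard described above, and how an optional return type avoids it, can be sketched as follows. This is an illustration under stated assumptions: `operandLatencyOld`, `operandLatencyNew`, `adjustedOld`, `adjustedNew` and `demoLatency` are hypothetical functions, not the real `TargetInstrInfo` signatures; only the "-1 sentinel versus `std::optional<unsigned>`" contrast comes from the commit message.

```cpp
#include <cassert>
#include <optional>

// Old style (hypothetical): -1 is a sentinel meaning "latency unknown".
int operandLatencyOld(bool Known, unsigned Cycles) {
  return Known ? static_cast<int>(Cycles) : -1;
}

// New style (hypothetical): absence is encoded in the return type.
std::optional<unsigned> operandLatencyNew(bool Known, unsigned Cycles) {
  if (!Known)
    return std::nullopt;
  return Cycles;
}

// Surprise-prone caller: the sentinel silently takes part in arithmetic.
int adjustedOld(bool Known, unsigned Cycles, int Adjust) {
  return operandLatencyOld(Known, Cycles) + Adjust;
}

// With std::optional the caller must pick a fallback before doing arithmetic.
unsigned adjustedNew(bool Known, unsigned Cycles, unsigned Adjust,
                     unsigned Default) {
  return operandLatencyNew(Known, Cycles).value_or(Default) + Adjust;
}

// Small usage example.
void demoLatency() {
  // Unknown latency plus 2 yields 1 in the old style: a plausible-looking
  // but meaningless number.
  assert(adjustedOld(false, 0, 2) == 1);
  // The new style forces an explicit default (here 1) for the unknown case.
  assert(adjustedNew(false, 0, 2, 1) == 3);
}
```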
2023-11-29	[NFC][MachineScheduler] Rename NumLoads parameter of shouldClusterMemOps to ↵	Alex Bradbury
	ClusterSize (#73757) As the same hook is called for both load and store clustering, NumLoads is a misleading name. Use ClusterSize instead.
2023-11-22	[AArch64] Use the same fast math preservation for MachineCombiner ↵	Craig Topper
	reassociation as X86/PowerPC/RISCV. (#72820) Don't blindly copy the original flags from the pre-reassociated instructions. This copied the integer poison flags, which are not safe to preserve after reassociation. For the FP flags, I think we should only keep the intersection of the flags. Override setSpecialOperandAttr to do this. Fixes #72777.
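The flag handling described above amounts to intersecting the two original flag sets and keeping only the fast-math bits. A minimal sketch, assuming hypothetical bit values: the enumerator names echo `MachineInstr::MIFlag`, but `MiniFlag`, `FastMathMask`, `reassociatedFlags` and `demoReassociatedFlags` are illustrative and are not the actual `setSpecialOperandAttr` override.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits; the values are assumptions for illustration.
enum MiniFlag : uint32_t {
  NoFlags = 0,
  FmNoNans = 1u << 0,
  FmNoInfs = 1u << 1,
  FmReassoc = 1u << 2,
  NoSWrap = 1u << 3, // integer poison-generating flag
  NoUWrap = 1u << 4  // integer poison-generating flag
};

constexpr uint32_t FastMathMask = FmNoNans | FmNoInfs | FmReassoc;

// Flags for an instruction produced by reassociating two originals:
// keep only fast-math flags that both originals carried, and drop the
// integer wrap flags entirely.
constexpr uint32_t reassociatedFlags(uint32_t A, uint32_t B) {
  return (A & B) & FastMathMask;
}

// Small usage example.
void demoReassociatedFlags() {
  uint32_t Root = FmNoNans | FmReassoc | NoSWrap;
  uint32_t Prev = FmReassoc | NoSWrap;
  // Only FmReassoc is common to both and a fast-math flag.
  assert(reassociatedFlags(Root, Prev) == FmReassoc);
}
```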