path: root/llvm/lib/Transforms/Utils/PromoteMemoryToRegister.cpp
Age  Commit message  Author
2025-09-10  [DebugInfo][Mem2Reg] Assign uninitialized values with annotated locs (#157716)  Stephen Tozer
In PromoteMem2Reg, we perform a DFS over the CFG and track, for each alloca, its incoming value and the associated incoming DebugLoc, both of which are taken from stores to that alloca; these values and DebugLocs are propagated to PHI nodes when new blocks are reached. If no store instruction has been seen along an incoming edge, we propagate an UndefValue and an empty DebugLoc to the PHI. This is a perfectly valid occurrence, and leaving the PHI without a real source location is the correct course of action. We should therefore pass an annotated DebugLoc rather than a plain empty one, so that DebugLoc coverage tracking correctly does not expect a valid DebugLoc to be present. We generally mark allocas as having CompilerGenerated locations, so I've chosen to use the same annotation to represent the uninitialized value of that alloca. This change is NFC outside of DebugLoc coverage tracking builds.
2025-08-18  [llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068)  Kazu Hirata
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer element types: template <typename PointeeType, unsigned N> class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N> {}; We only have 140 instances that rely on this "redirection", with the vast majority of them under llvm/. Since relying on the redirection doesn't improve readability, this patch replaces SmallSet with SmallPtrSet for pointer element types.
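The "redirection" mentioned above is a partial specialization on pointer element types. A minimal sketch of that pattern, using only the standard library, might look like the following. `ToySet` and `ToyPtrSet` are hypothetical stand-ins and not the real `llvm::SmallSet` / `llvm::SmallPtrSet`; the `N` parameter is carried along but unused here.

```cpp
#include <cassert>
#include <set>
#include <type_traits>
#include <unordered_set>

// Hypothetical stand-in for a pointer-optimized set (not llvm::SmallPtrSet).
template <typename PtrT, unsigned N> class ToyPtrSet {
  std::unordered_set<PtrT> Storage;

public:
  bool insert(PtrT P) { return Storage.insert(P).second; }
  bool contains(PtrT P) const { return Storage.count(P) != 0; }
};

// Hypothetical stand-in for the general-purpose set (not llvm::SmallSet).
template <typename T, unsigned N> class ToySet {
  std::set<T> Storage;

public:
  bool insert(const T &V) { return Storage.insert(V).second; }
  bool contains(const T &V) const { return Storage.count(V) != 0; }
};

// The "redirect": for pointer element types, ToySet is just a ToyPtrSet.
template <typename PointeeType, unsigned N>
class ToySet<PointeeType *, N> : public ToyPtrSet<PointeeType *, N> {};

static_assert(std::is_base_of<ToyPtrSet<int *, 4>, ToySet<int *, 4>>::value,
              "pointer element types pick the specialization");
static_assert(!std::is_base_of<ToyPtrSet<int, 4>, ToySet<int, 4>>::value,
              "non-pointer element types use the primary template");

// Both spellings behave the same for pointers, which is why naming the
// pointer set directly loses nothing.
inline bool sameBehaviour() {
  int X = 0, Y = 0;
  ToySet<int *, 4> ViaRedirect;
  ToyPtrSet<int *, 4> Direct;
  const bool First = ViaRedirect.insert(&X) == Direct.insert(&X);
  const bool Duplicate = ViaRedirect.insert(&X) == Direct.insert(&X);
  const bool Missing = ViaRedirect.contains(&Y) == Direct.contains(&Y);
  assert(ViaRedirect.contains(&X));
  return First && Duplicate && Missing;
}
```

Since the specialization adds no members of its own, spelling the pointer set directly at the use site makes the type that is actually instantiated visible to the reader.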
2025-07-21  [DebugInfo] Remove intrinsic-flavours of findDbgUsers (#149816)  Jeremy Morse
This is one of the last remaining debug-intrinsic-specific code paths, and one of the last pieces of cross-LLVM infrastructure to do with debug intrinsics.
2025-07-18  [DebugInfo] Shave even more users of DbgVariableIntrinsic from LLVM (#149136)  Jeremy Morse
At this stage I'm just opportunistically deleting any code using debug-intrinsic types, largely adjacent to calls to findDbgUsers. I'll get to deleting that in probably one or two more commits.
2025-07-16  [DebugInfo] Strip more debug-intrinsic code from local utils (#149037)  Jeremy Morse
SROA and a few other facilities use generic-lambdas and some overloaded functions to deal with both intrinsics and debug-records at the same time. As part of stripping out intrinsic support, delete a swathe of this code from things in the Utils directory. This is a large diff, but is mostly about removing functions that were duplicated during the migration to debug records. I've taken a few opportunities to replace comments about "intrinsics" with "records", and replace generic lambdas with plain lambdas (I believe this makes it more readable). All of this is chipping away at intrinsic-specific code until we get to removing parts of findDbgUsers, which is the final boss -- we can't remove that until almost everything else is gone.
2025-06-03  [PromoteMem2Reg] Optimize memory usage in PromoteMem2Reg (#142474)  Vitaly Buka
When a BasicBlock has a large number of allocas and successors, we had to copy the entire IncomingVals and IncomingLocs vectors for each successor. Also, updates to IncomingVals and IncomingLocs are infrequent (only a Load/Store into an alloca affects the arrays). Given the nature of DFS traversal, instead of copying the entire vector, we can keep track of the changes and undo all changes done by successors. Fixes #142461. On the IR attached to issue #142461, RSS drops from 35Gb to 1.8Gb, while compile time is not affected on average: https://llvm-compile-time-tracker.com/compare.php?from=2e98ed8caa0b47ee79af4ad24b5436a89fe49dfa&to=effac6d1fd600e544f8bc21382c7e541973b1378&stat=instructions:u
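The "track the changes and undo them" idea can be illustrated with a small toy, sketched below under loud assumptions: `ToyBlock`, `ToyRenamer` and the integer "values" are hypothetical, the traversal is recursive rather than worklist-based, and nothing here reflects the actual data structures in PromoteMemoryToRegister.cpp. The per-block snapshot in `SeenAtExit` exists only so the effect can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy CFG node: a list of (slot, value) "stores" and successor indices.
struct ToyBlock {
  std::vector<std::pair<std::size_t, int>> Stores;
  std::vector<std::size_t> Succs;
};

// Toy DFS that keeps ONE shared incoming-value vector plus an undo log,
// instead of handing every successor its own copy of the vector.
class ToyRenamer {
  std::vector<ToyBlock> Blocks;
  std::vector<int> Incoming;                         // one entry per slot
  std::vector<std::pair<std::size_t, int>> UndoLog;  // (slot, old value)
  std::vector<bool> Visited;

public:
  // Snapshot of Incoming after each block's stores (demonstration only).
  std::vector<std::vector<int>> SeenAtExit;

  ToyRenamer(std::vector<ToyBlock> CFG, std::size_t NumSlots)
      : Blocks(std::move(CFG)), Incoming(NumSlots, -1),
        Visited(Blocks.size(), false), SeenAtExit(Blocks.size()) {}

  void visit(std::size_t BB) {
    if (Visited[BB])
      return;
    Visited[BB] = true;

    const std::size_t Mark = UndoLog.size();
    for (const auto &Store : Blocks[BB].Stores) {
      UndoLog.emplace_back(Store.first, Incoming[Store.first]);
      Incoming[Store.first] = Store.second;
    }
    SeenAtExit[BB] = Incoming;

    for (std::size_t Succ : Blocks[BB].Succs)
      visit(Succ);

    // Roll back everything this block changed before returning to the
    // parent, so sibling blocks see the parent's state again.
    assert(UndoLog.size() >= Mark && "successors must restore their changes");
    while (UndoLog.size() > Mark) {
      Incoming[UndoLog.back().first] = UndoLog.back().second;
      UndoLog.pop_back();
    }
  }

  const std::vector<int> &incoming() const { return Incoming; }
};
```

In this toy, the cost per block is proportional to the number of stores it contains rather than to the number of tracked slots, which is the trade the commit message describes for blocks with many allocas and many successors.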
2025-06-03  [NFC][PromoteMem2Reg] Move IncomingVals, IncomingLocs, Worklist into class (#142468)  Vitaly Buka
They are all DFS state, like `Visited`. `Visited` is already a class member, so this makes things more consistent and leaves fewer parameters to pass around. By itself, the patch has little value, but it simplifies #142474. For #142461
2025-06-03  [NFCI][PromoteMem2Reg] Don't handle the first successor out of order (#142464)  Vitaly Buka
Just for consistency, to avoid confusing conditions. `reverse` helps to avoid test updates, as nothing changes for successor counts <= 2. For #142461
2025-06-03  [NFC] Remove goto in PromoteMem2Reg::RenamePass (#142454)  Vitaly Buka
'goto' is essentially a shortcut for push/pop on the worklist. It can be expensive if we copy vectors, but if we move them, it should not be an issue. Without 'goto' it's easier to reason about the code, since `PromoteMem2Reg::RenamePass` processes exactly one edge at a time. The first successor is still processed out of order; I keep that just to make this patch pure NFC, and I'll remove it in follow-up patches. For #142461
2025-05-10  [Utils] Use range-based for loops (NFC) (#139426)  Kazu Hirata
2025-03-27  [Transforms] Use range constructors of *Set (NFC) (#133203)  Kazu Hirata
2025-02-13  [reland][DebugInfo] Update DIBuilder insertion to take InsertPosition (#126967)  Harald van Dijk
After #124287 updated several functions to return iterators rather than Instruction *, it was no longer straightforward to pass their result to DIBuilder. This commit updates DIBuilder methods to accept an InsertPosition instead, so that they can be called with an iterator (preferred), with an Instruction * (which triggers a deprecation warning), or with a BasicBlock *. This commit also updates the existing calls to the DIBuilder methods to pass in iterators. As a special exception, DIBuilder::insertDeclare() keeps a separate overload accepting a BasicBlock *InsertAtEnd. This is because, despite the name, this method does not insert at the end of the block, so it cannot be handled implicitly by using InsertPosition.
2025-02-12  Revert "[DebugInfo] Update DIBuilder insertion to take InsertPosition (#126059)"  Harald van Dijk
This reverts commit 3ec9f7494b31f2fe51d5ed0e07adcf4b7199def6.
2025-02-12  [DebugInfo] Update DIBuilder insertion to take InsertPosition (#126059)  Harald van Dijk
After #124287 updated several functions to return iterators rather than Instruction *, it was no longer straightforward to pass their result to DIBuilder. This commit updates DIBuilder methods to accept an InsertPosition instead, so that they can be called with an iterator (preferred), with an Instruction * (which triggers a deprecation warning), or with a BasicBlock *. This commit also updates the existing calls to the DIBuilder methods to pass in iterators.
2025-01-24  [NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)  Jeremy Morse
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago, but to ensure some type safety we'd like to have all calls to moveBefore use iterators. This patch adds a (guaranteed dereferenceable) iterator-taking moveBefore, and changes a bunch of call-sites where it's obviously safe to switch by just calling getIterator() on an instruction pointer. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer insertBefore, but not before adding concise documentation of what considerations are needed (very few).
2024-10-11  [NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752)  Rahul Joshi
Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation for adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e., just lookup, no creation).
2024-08-29  [ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149)  Stephen Tozer
This patch is part of a set of patches that add an `-fextend-lifetimes` flag to clang, which extends the lifetimes of local variables and parameters for improved debuggability. In addition to that flag, the patch series adds a pragma to selectively disable `-fextend-lifetimes`, and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes` for this pointers only. All changes and tests in these patches were written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer) has handled review and merging. The extend lifetimes flag is intended to eventually be set on by `-Og`, as discussed in the RFC here: https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850 This patch implements a new intrinsic instruction in LLVM, `llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand and has no effect other than "using" its operand, to ensure that its operand remains live until after the fake use. This patch does not emit fake uses anywhere; the next patch in this sequence causes them to be emitted from the clang frontend, such that for each variable (or this) a fake.use operand is inserted at the end of that variable's scope, using that variable's value. This patch covers everything post-frontend, which is largely just the basic plumbing for a new intrinsic/instruction, along with a few steps to preserve the fake uses through optimizations (such as moving them ahead of a tail call or translating them through SROA). Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
2024-08-21  Handle #dbg_values in SROA. (#94070)  Shubham Sandeep Rastogi
This patch properly handles #dbg_values in SROA by making sure that any #dbg_values get moved to before a store just like #dbg_declares do, or the #dbg_value is correctly updated with the right alloca after an aggregate alloca is broken up. The issue stems from swift where #dbg_values are emitted and not dbg.declares, the SROA pass doesn't handle the #dbg_values correctly and it causes them to all have undefs If we look at this simple-ish testcase (This is all I could reduce it down to, and I am still relatively bad at writing llvm IR by hand so I apologize in advance): ``` %T4main1TV13TangentVectorV = type <{ %T4main1UV13TangentVectorV, [7 x i8], %T4main1UV13TangentVectorV }> %T4main1UV13TangentVectorV = type <{ %T1M1SVySfG, [7 x i8], %T4main1VV13TangentVectorV }> %T1M1SVySfG = type <{ ptr, %Ts4Int8V }> %Ts4Int8V = type <{ i8 }> %T4main1VV13TangentVectorV = type <{ %T1M1SVySfG }> define hidden swiftcc void @"$s4main1TV13TangentVectorV1poiyA2E_AEtFZ"(ptr noalias nocapture sret(%T4main1TV13TangentVectorV) %0, ptr noalias nocapture dereferenceable(57) %1, ptr noalias nocapture dereferenceable(57) %2) #0 !dbg !44 { entry: %3 = alloca %T4main1VV13TangentVectorV %4 = alloca %T4main1UV13TangentVectorV %5 = alloca %T4main1VV13TangentVectorV %6 = alloca %T4main1UV13TangentVectorV %7 = alloca %T4main1VV13TangentVectorV %8 = alloca %T4main1UV13TangentVectorV %9 = alloca %T4main1VV13TangentVectorV %10 = alloca %T4main1UV13TangentVectorV call void @llvm.lifetime.start.p0(i64 9, ptr %3) call void @llvm.lifetime.start.p0(i64 25, ptr %4) call void @llvm.lifetime.start.p0(i64 9, ptr %5) call void @llvm.lifetime.start.p0(i64 25, ptr %6) call void @llvm.lifetime.start.p0(i64 9, ptr %7) call void @llvm.lifetime.start.p0(i64 25, ptr %8) call void @llvm.lifetime.start.p0(i64 9, ptr %9) call void @llvm.lifetime.start.p0(i64 25, ptr %10) %.u1 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %1, i32 0, i32 0 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %4, ptr align 8 %.u1, i64 
25, i1 false) %.u11 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %2, i32 0, i32 0 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %6, ptr align 8 %.u11, i64 25, i1 false) call void @llvm.dbg.value(metadata ptr %4, metadata !62, metadata !DIExpression(DW_OP_deref)), !dbg !75 %.s = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %4, i32 0, i32 0 %.s.c = getelementptr inbounds %T1M1SVySfG, ptr %.s, i32 0, i32 0 %11 = load ptr, ptr %.s.c %.s.b = getelementptr inbounds %T1M1SVySfG, ptr %.s, i32 0, i32 1 %.s.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s.b, i32 0, i32 0 %12 = load i8, ptr %.s.b._value %.s2 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %6, i32 0, i32 0 %.s2.c = getelementptr inbounds %T1M1SVySfG, ptr %.s2, i32 0, i32 0 %13 = load ptr, ptr %.s2.c %.s2.b = getelementptr inbounds %T1M1SVySfG, ptr %.s2, i32 0, i32 1 %.s2.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s2.b, i32 0, i32 0 %14 = load i8, ptr %.s2.b._value %.v = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %4, i32 0, i32 2 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %3, ptr align 8 %.v, i64 9, i1 false) %.v3 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %6, i32 0, i32 2 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %5, ptr align 8 %.v3, i64 9, i1 false) %.s4 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %3, i32 0, i32 0 %.s4.c = getelementptr inbounds %T1M1SVySfG, ptr %.s4, i32 0, i32 0 %18 = load ptr, ptr %.s4.c %.s5 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %5, i32 0, i32 0 %.s5.c = getelementptr inbounds %T1M1SVySfG, ptr %.s5, i32 0, i32 0 %20 = load ptr, ptr %.s5.c %.u2 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %1, i32 0, i32 2 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %8, ptr align 8 %.u2, i64 25, i1 false) %.u26 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %2, i32 0, i32 2 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %10, ptr align 8 %.u26, i64 25, i1 false) %.s7 = getelementptr 
inbounds %T4main1UV13TangentVectorV, ptr %8, i32 0, i32 0 %.s7.c = getelementptr inbounds %T1M1SVySfG, ptr %.s7, i32 0, i32 0 %25 = load ptr, ptr %.s7.c %.s7.b = getelementptr inbounds %T1M1SVySfG, ptr %.s7, i32 0, i32 1 %.s7.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s7.b, i32 0, i32 0 %26 = load i8, ptr %.s7.b._value %.s8 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %10, i32 0, i32 0 %.s8.c = getelementptr inbounds %T1M1SVySfG, ptr %.s8, i32 0, i32 0 %27 = load ptr, ptr %.s8.c %.s8.b = getelementptr inbounds %T1M1SVySfG, ptr %.s8, i32 0, i32 1 %.s8.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s8.b, i32 0, i32 0 %28 = load i8, ptr %.s8.b._value %.v9 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %8, i32 0, i32 2 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %7, ptr align 8 %.v9, i64 9, i1 false) %.v10 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %10, i32 0, i32 2 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %9, ptr align 8 %.v10, i64 9, i1 false) %.s11 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %7, i32 0, i32 0 %.s11.c = getelementptr inbounds %T1M1SVySfG, ptr %.s11, i32 0, i32 0 %32 = load ptr, ptr %.s11.c %.s12 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %9, i32 0, i32 0 %.s12.c = getelementptr inbounds %T1M1SVySfG, ptr %.s12, i32 0, i32 0 %34 = load ptr, ptr %.s12.c call void @llvm.lifetime.end.p0(i64 25, ptr %10) call void @llvm.lifetime.end.p0(i64 9, ptr %9) call void @llvm.lifetime.end.p0(i64 25, ptr %8) call void @llvm.lifetime.end.p0(i64 9, ptr %7) call void @llvm.lifetime.end.p0(i64 25, ptr %6) call void @llvm.lifetime.end.p0(i64 9, ptr %5) call void @llvm.lifetime.end.p0(i64 25, ptr %4) call void @llvm.lifetime.end.p0(i64 9, ptr %3) ret void } !llvm.module.flags = !{!0, !1, !2, !3, !4, !6, !7, !8, !9, !10, !11, !12, !13, !14, !15} !swift.module.flags = !{!33} !llvm.linker.options = !{!34, !35, !36, !37, !38, !39, !40, !41, !42, !43} !0 = !{i32 2, !"SDK Version", [2 x i32] [i32 14, 
i32 4]} !1 = !{i32 1, !"Objective-C Version", i32 2} !2 = !{i32 1, !"Objective-C Image Info Version", i32 0} !3 = !{i32 1, !"Objective-C Image Info Section", !"__DATA, no_dead_strip"} !4 = !{i32 1, !"Objective-C Garbage Collection", i8 0} !6 = !{i32 7, !"Dwarf Version", i32 4} !7 = !{i32 2, !"Debug Info Version", i32 3} !8 = !{i32 1, !"wchar_size", i32 4} !9 = !{i32 8, !"PIC Level", i32 2} !10 = !{i32 7, !"uwtable", i32 1} !11 = !{i32 7, !"frame-pointer", i32 1} !12 = !{i32 1, !"Swift Version", i32 7} !13 = !{i32 1, !"Swift ABI Version", i32 7} !14 = !{i32 1, !"Swift Major Version", i8 6} !15 = !{i32 1, !"Swift Minor Version", i8 0} !16 = distinct !DICompileUnit(language: DW_LANG_Swift, file: !17, imports: !18, sdk: "MacOSX14.4.sdk") !17 = !DIFile(filename: "/Users/emilpedersen/swift2/swift/test/IRGen/debug_scope_distinct.swift", directory: "/Users/emilpedersen/swift2") !18 = !{!19, !21, !23, !25, !27, !29, !31} !19 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !20, file: !17) !20 = !DIModule(scope: null, name: "main", includePath: "/Users/emilpedersen/swift2/swift/test/IRGen") !21 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !22, file: !17) !22 = !DIModule(scope: null, name: "Swift", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/Swift.swiftmodule/arm64-apple-macos.swiftmodule") !23 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !24, line: 60) !24 = !DIModule(scope: null, name: "_Differentiation", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/_Differentiation.swiftmodule/arm64-apple-macos.swiftmodule") !25 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !26, line: 61) !26 = !DIModule(scope: null, name: "M", includePath: 
"/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/test-macosx-arm64/IRGen/Output/debug_scope_distinct.swift.tmp/M.swiftmodule") !27 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !28, file: !17) !28 = !DIModule(scope: null, name: "_StringProcessing", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/_StringProcessing.swiftmodule/arm64-apple-macos.swiftmodule") !29 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !30, file: !17) !30 = !DIModule(scope: null, name: "_SwiftConcurrencyShims", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/shims") !31 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !32, file: !17) !32 = !DIModule(scope: null, name: "_Concurrency", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/_Concurrency.swiftmodule/arm64-apple-macos.swiftmodule") !33 = !{i1 false} !34 = !{!"-lswiftCore"} !35 = !{!"-lswift_StringProcessing"} !36 = !{!"-lswift_Differentiation"} !37 = !{!"-lswiftDarwin"} !38 = !{!"-lswift_Concurrency"} !39 = !{!"-lswiftSwiftOnoneSupport"} !40 = !{!"-lobjc"} !41 = !{!"-lswiftCompatibilityConcurrency"} !42 = !{!"-lswiftCompatibility56"} !43 = !{!"-lswiftCompatibilityPacks"} !44 = distinct !DISubprogram( unit: !16, declaration: !52, retainedNodes: !53) !45 = !DIFile(filename: "<compiler-generated>", directory: "/") !46 = !DICompositeType(tag: DW_TAG_structure_type, scope: !47, elements: !48, identifier: "$s4main1TV13TangentVectorVD") !47 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1TVD") !48 = !{} !49 = !DISubroutineType(types: !50) !50 = !{!51} !51 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1TV13TangentVectorVXMtD") !52 = !DISubprogram( 
file: !45, type: !49, spFlags: DISPFlagOptimized) !53 = !{!54, !56, !57} !54 = !DILocalVariable( scope: !44, type: !55, flags: DIFlagArtificial) !55 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !46) !56 = !DILocalVariable( scope: !44, flags: DIFlagArtificial) !57 = !DILocalVariable( scope: !44, type: !58, flags: DIFlagArtificial) !58 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !51) !62 = !DILocalVariable( scope: !63, type: !72, flags: DIFlagArtificial) !63 = distinct !DISubprogram( type: !66, unit: !16, declaration: !69, retainedNodes: !70) !64 = !DICompositeType(tag: DW_TAG_structure_type, scope: !65, identifier: "$s4main1UV13TangentVectorVD") !65 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1UVD") !66 = !DISubroutineType(types: !67) !67 = !{!68} !68 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1UV13TangentVectorVXMtD") !69 = !DISubprogram( spFlags: DISPFlagOptimized) !70 = !{!71, !73} !71 = !DILocalVariable( scope: !63, flags: DIFlagArtificial) !72 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !64) !73 = !DILocalVariable( scope: !63, type: !74, flags: DIFlagArtificial) !74 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !68) !75 = !DILocation( scope: !63, inlinedAt: !76) !76 = distinct !DILocation( scope: !44) ``` if we run ` opt -S -passes=sroa file.ll -o -` With this patch we will see ``` %.sroa.5.sroa.021 = alloca [7 x i8], align 8 tail call void @llvm.dbg.value(metadata ptr %.sroa.5.sroa.021, metadata !59, metadata !DIExpression(DW_OP_deref, DW_OP_LLVM_fragment, 72, 56)), !dbg !72 %.sroa.5.sroa.014 = alloca [7 x i8], align 8 ``` Without this patch we will see: ``` %.sroa.5.sroa.021 = alloca [7 x i8], align 8 %.sroa.5.sroa.014 = alloca [7 x i8], align 8 ``` Thus this patch ensures that llvm.dbg.values that use allocas that are broken up still have the correct metadata and debug information is preserved This is part of a stack of patches and is preceded by: 
https://github.com/llvm/llvm-project/pull/94068
2024-08-08  [DebugInfo][RemoveDIs] Use iterators to insert everywhere (#102003)  Jeremy Morse
These are the final few places in LLVM where we use instruction pointers to identify the position that we're inserting something. We're trying to get away from that with a view to deprecating those methods, thus use iterators in all these places. I believe they're all debug-info safe. The sketchiest part is the ExtractValueInst copy constructor, where we cast nullptr to a BasicBlock pointer, so that we take the non-default insert-into-no-block path for instruction insertion, instead of the default nullptr-instruction path for UnaryInstruction. Such a hack is necessary until we get rid of the instruction constructor entirely.
2024-08-01  [Mem2Reg] Replace block maps with block numbers (#101391)  Alexis Engelke
Very minor performance improvement.
2024-07-05  [Mem2Reg] Always allow single-store optimization for dominating stores  Nikita Popov
In #97711 the single-store optimization was disabled for the case where the value is potentially poison, as this may produce incorrect results for loads of uninitialized memory. However, this resulted in compile-time regressions. Address these by still allowing the single-store optimization to occur in cases where the store dominates the load, as we know that such a load will always read initialized memory.
2024-07-04  [Mem2Reg] Don't use single store optimization for potentially poison value (#97711)  Nikita Popov
If there is a single store, then loads must either load the stored value or uninitialized memory (undef). If the stored value may be poison, then replacing an uninitialized memory load with it would be incorrect. Fall back to the generic code in that case. This PR only fixes the case where there is a literal poison store -- the case where the value is non-trivially poison will still get miscompiled by phi simplification later, see #96631. Fixes https://github.com/llvm/llvm-project/issues/97702.
2024-07-02  [SROA] Avoid expensive isComplete() call (NFC)  Nikita Popov
https://github.com/llvm/llvm-project/pull/83381 introduced a call to PHINode::isComplete() in Mem2Reg, which is O(n^2) in the number of predecessors, resulting in pathological compile-time regressions for cases with many predecessors. Remove the isComplete() check and instead cache the attribute lookup, to only perform it once per function. Actually setting the FMF flag is cheap.
2024-07-02  [SROA] Propagate no-signed-zeros(nsz) fast-math flag on the phi node using function attribute (#83381)  Yashwant Singh
It's expected that the sequence `return X > 0.0 ? X : -X`, compiled with -Ofast, produces the fabs intrinsic. However, at this point, LLVM is unable to do so. The above sequence goes through the following transformation during the pass pipeline: 1) The SROA pass generates the phi node. Here, it does not infer the fast-math flags on the phi node, unlike what the clang frontend typically does. 2) The phi node eventually gets translated into a select instruction. Because of the missing no-signed-zeros (nsz) fast-math flag on the select instruction, the InstCombine pass fails to fold the sequence into the fabs intrinsic. This patch, as a part of SROA, tries to propagate the nsz fast-math flag on the phi node using the function attribute, enabling this folding. Closes #51601 Co-authored-by: Sushant Gokhale <sgokhale@nvidia.com>
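Why the fold to fabs depends on nsz can be seen at the source level. The sketch below assumes IEEE-754 doubles and a build without fast-math options; `selectAbs`, `matchesFabs` and `demoSignedZero` are names made up for this illustration. It shows that the select pattern and `std::fabs` differ only in the sign of the result for a `+0.0` input.

```cpp
#include <cassert>
#include <cmath>

// The source-level pattern from the commit message.
inline double selectAbs(double X) { return X > 0.0 ? X : -X; }

// True when the select pattern and std::fabs produce the same value with
// the same sign bit for X.
inline bool matchesFabs(double X) {
  const double S = selectAbs(X);
  const double F = std::fabs(X);
  return S == F && std::signbit(S) == std::signbit(F);
}

inline void demoSignedZero() {
  assert(matchesFabs(3.5) && matchesFabs(-3.5)); // ordinary inputs agree
  assert(matchesFabs(-0.0)); // -(-0.0) is +0.0, the same as fabs
  assert(!matchesFabs(0.0)); // -(+0.0) is -0.0, but fabs gives +0.0
}
```

Without a promise that the sign of zero does not matter, replacing the select with fabs would change the result for `+0.0`, so the flag has to survive on the phi and the select for the fold to be legal.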
2024-06-28  [IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)  Nikita Popov
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds `getDataLayout()` helpers to Function and GlobalValue, replacing the current `getParent()->getDataLayout()` pattern.
2024-06-25  [Mem2Reg] Generate non-terminator unreachable for !noundef undef (#96639)  Nikita Popov
When performing a load from uninitialized memory using !noundef, insert a non-terminator unreachable instruction, which will be converted to a proper unreachable by SimplifyCFG. This way we retain the fact that UB occurred on this code path.
2024-04-16  [ValueTracking] Restore isKnownNonZero parameter order. (#88873)  Harald van Dijk
Prior to #85863, the required parameters of llvm::isKnownNonZero were Value and DataLayout. After, they are Value, Depth, and SimplifyQuery, where SimplifyQuery is implicitly constructible from DataLayout. The change to move Depth before SimplifyQuery needed callers to be updated unnecessarily, and as commented in #85863, we actually want Depth to be after SimplifyQuery anyway so that it can be defaulted and the caller does not need to specify it.
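The argument for putting Depth last can be sketched with toy types. Everything below is hypothetical (`ToyDataLayout`, `ToySimplifyQuery`, `toyIsKnownNonZero`, `ToyMaxDepth`) and only mirrors the shape of the signature described above, not the real ValueTracking API: an implicitly constructible query type followed by a defaulted depth lets older two-argument call sites keep compiling.

```cpp
#include <cassert>

// Hypothetical stand-ins; not the real llvm::DataLayout / llvm::SimplifyQuery.
struct ToyDataLayout {};

struct ToySimplifyQuery {
  const ToyDataLayout &DL;
  // Deliberately non-explicit, so a ToyDataLayout converts implicitly.
  ToySimplifyQuery(const ToyDataLayout &DL) : DL(DL) {}
};

// Hypothetical recursion limit for this toy.
constexpr unsigned ToyMaxDepth = 6;

// Depth comes last and is defaulted, so most callers never spell it.
inline bool toyIsKnownNonZero(int V, const ToySimplifyQuery &Q,
                              unsigned Depth = 0) {
  (void)Q; // the toy does not consult the query
  if (Depth >= ToyMaxDepth)
    return false; // give up: "not known", rather than "known zero"
  return V != 0;
}

inline void demoCallSites() {
  ToyDataLayout DL;
  assert(toyIsKnownNonZero(42, DL));  // (Value, DataLayout) still compiles
  assert(!toyIsKnownNonZero(0, DL));
  // Only recursive callers need to pass a depth explicitly.
  assert(!toyIsKnownNonZero(42, ToySimplifyQuery(DL), ToyMaxDepth));
}
```

With Depth in the middle, every one of those two-argument call sites would have needed an extra argument, which is the unnecessary churn the commit message refers to.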
2024-04-12  [ValueTracking] Convert `isKnownNonZero` to use SimplifyQuery (#85863)  Yingwei Zheng
This patch converts `isKnownNonZero` to use SimplifyQuery. Then we can use the context information from `DomCondCache`. Fixes https://github.com/llvm/llvm-project/issues/85823. Alive2: https://alive2.llvm.org/ce/z/QUvHVj
2024-03-19  [RemoveDIs][NFC] Rename DPValue -> DbgVariableRecord (#85216)  Stephen Tozer
This is the major rename patch that prior patches have built towards. The DPValue class is being renamed to DbgVariableRecord, which reflects the updated terminology for the "final" implementation of the RemoveDI feature. This is a pure string substitution + clang-format patch. The only manual component of this patch was determining where to perform these string substitutions: `DPValue` and `DPV` are almost exclusively used for DbgRecords, *except* for: - llvm/lib/target, where 'DP' is used to mean double-precision, and so appears as part of .td files and in variable names. NB: There is a single existing use of `DPValue` here that refers to debug info, which I've manually updated. - llvm/tools/gold, where 'LDPV' is used as a prefix for symbol visibility enums. Outside of these places, I've applied several basic string substitutions, with the intent that they only affect DbgRecord-related identifiers; I've checked them as I went through to verify this, with reasonable confidence that there are no unintended changes that slipped through the cracks. The substitutions applied are all case-sensitive, and are applied in the order shown: ``` DPValue -> DbgVariableRecord DPVal -> DbgVarRec DPV -> DVR ``` Following the previous rename patches, it should be the case that there are no instances of any of these strings that are meant to refer to the general case of DbgRecords, or anything other than the DPValue class. The idea behind this patch is therefore that pure string substitution is correct in all cases as long as these assumptions hold.
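The order of the three substitutions matters because each pattern is a prefix of the one before it. A small sketch of ordered, case-sensitive replacement is given below; `replaceAll` and `applyInOrder` are helper names invented for this illustration and say nothing about how the actual rename was scripted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Rule = std::pair<std::string, std::string>; // (from, to)

// Replace every occurrence of From in Text, scanning left to right and
// never rescanning text that was just inserted.
inline std::string replaceAll(std::string Text, const std::string &From,
                              const std::string &To) {
  if (From.empty())
    return Text;
  for (std::size_t Pos = Text.find(From); Pos != std::string::npos;
       Pos = Text.find(From, Pos + To.size()))
    Text.replace(Pos, From.size(), To);
  return Text;
}

// Apply the rules one after another, in the order given.
inline std::string applyInOrder(std::string Text,
                                const std::vector<Rule> &Rules) {
  for (const Rule &R : Rules)
    Text = replaceAll(std::move(Text), R.first, R.second);
  return Text;
}

inline void demoRenameOrder() {
  const std::vector<Rule> AsListed = {{"DPValue", "DbgVariableRecord"},
                                      {"DPVal", "DbgVarRec"},
                                      {"DPV", "DVR"}};
  const std::vector<Rule> Reversed(AsListed.rbegin(), AsListed.rend());
  const std::string Input = "DPValue *DPV; DPVals";

  // Longest pattern first: every identifier gets its intended new name.
  assert(applyInOrder(Input, AsListed) ==
         "DbgVariableRecord *DVR; DbgVarRecs");
  // Shortest pattern first: "DPV" eats the prefix of the longer names.
  assert(applyInOrder(Input, Reversed) == "DVRalue *DVR; DVRals");
}
```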
2024-03-12  [RemoveDIs] Update DIBuilder to conditionally insert DbgRecords (#84739)  Orlando Cazalet-Hyams
Have DIBuilder conditionally insert either debug intrinsics or DbgRecord depending on the module's IsNewDbgInfoFormat flag. The insertion methods now return a `DbgInstPtr` (a `PointerUnion<Instruction *, DbgRecord *>`). Add a unittest for both modes (I couldn't find an existing test testing insertion behaviours specifically). This patch changes the existing assumption that DbgRecords are only ever inserted if there's an instruction to insert-before because clang currently inserts debug intrinsics while CodeGening (like any other instruction) meaning it'll try inserting to the end of a block without a terminator. We already have machinery in place to maintain the DbgRecords when a terminator is removed - these become "trailing DbgRecords" which are re-attached when a new instruction is inserted. All I've done is allow this state to occur while inserting DbgRecords too, i.e., it's not only removing terminators that causes this valid transient state, but inserting DbgRecords into incomplete blocks too. The C API will be updated in follow up patches. --- Note: this doesn't mean clang is emitting DbgRecords yet, because the modules it creates are still always in the old debug mode. That will come in a future patch.
2024-01-23  [RemoveDIs][DebugInfo] Handle DPVAssign in most transforms (#78986)  Stephen Tozer
This patch trivially updates various opt passes to handle DPVAssigns. In all cases, this means some combination of generifying existing code to handle DPValues and DbgAssignIntrinsics, iterating over DPValues where previously we did not, or duplicating code for DbgAssignIntrinsics to the equivalent DPValue function (in inlining and salvageDebugInfo).
2023-11-30  [DebugInfo][RemoveDIs] Handle DPValues at remaining dbg.value using sites (#73788)  Jeremy Morse
This patch updates the last few places in LLVM using findDbgValues that don't also collect and handle DPValue objects. The changes are mostly in instcombine and mem2reg, and are largely mechanical: they call existing utilities on collections of DPValues instead of just DbgValuesInsts. A variety of tests have had RemoveDIs RUN lines added to them to cover these behaviours. We have some technical debt, in that the instcombine sinking code for DPValues is not implemented yet, so I've left FIXME stubs indicating that we intend to cover tests with RemoveDIs but haven't yet.
2023-09-11  [NFC][RemoveDIs] Prefer iterator-insertion over instructions  Jeremy Morse
Continuing the patch series to get rid of debug intrinsics [0], instruction insertion needs to be done with iterators rather than instruction pointers, so that we can communicate information in the iterator class. This patch adds an iterator-taking insertBefore method and converts various call sites to take iterators. These are all sites where such debug-info needs to be preserved so that a stage2 clang can be built identically; it's likely that many more will need to be changed in the future. At this stage, this is just changing the spelling of a few operations, which will eventually become significant once the debug-info bearing iterator is used. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939 Differential Revision: https://reviews.llvm.org/D152537
2023-06-10  PromoteMem2Reg: use poison instead of undef as placeholder in phi entries from unreachable predecessors [NFC]  Nuno Lopes
2023-03-22  [Assignment Tracking] Fix mem2reg misidentifying unlinked stores  OCHyams
updateForDeletedStore updates the assignment tracking debug info for a store that is about to be deleted by mem2reg. For each variable backed by the target alloca, if a dbg.assign exists it is kept (well - it's downgraded to a dbg.value). A dbg.value is inserted if there's not a linked dbg.assign for a variable which is backed by the target alloca. This patch fixes a bug whereby a store with a linked dbg.assign that describes a fragment different to the one linked to the alloca was not counted for the variable, leading to both keeping the dbg.assign (downgrading it) and inserting a new dbg.value. Reviewed By: StephenTozer Differential Revision: https://reviews.llvm.org/D146299
2023-03-21[Assignment Tracking] Downgrade dbg.assigns to dbg.values in mem2regOCHyams
For fully promoted variables dbg.assigns and dbg.values convey the same information and can be used interchangeably. This patch converts dbg.assigns to dbg.values for variables promoted by mem2reg. This reduces resource usage by reducing the amount of unnecessary function local metadata. The compile time tracker reports that CTMark projects build with LTO-O3-g with 0.4% fewer instructions retired and peak memory usage is reduced by 2.2%. Reviewed By: jryans Differential Revision: https://reviews.llvm.org/D145511
2023-02-10[Assignment Tracking][mem2reg] Remove overly defensive assertOCHyams
The assert fires if a store to an alloca with no linked dbg.assigns has linked dbg.assigns. This can happen in the wild due to optimisations dropping the alloca's debug info so we shouldn't assert against it. Reviewed By: jryans Differential Revision: https://reviews.llvm.org/D143153
2023-01-20[Mem2Reg] Only convert !nonnull to assume if !noundef presentNikita Popov
After D141386 !nonnull violation returns poison rather than resulting in immediate undefined behavior. However, converting it into an assume would result in IUB. As such, we can only perform this transform if !noundef is also present.
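The rule described above can be sketched as a small standalone predicate. This is a toy model and not the code in PromoteMemoryToRegister.cpp: the `LoadMetadata` struct and `shouldEmitNonNullAssume` function are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the metadata attached to a load that mem2reg
// is about to remove. It does not correspond to a real LLVM class.
struct LoadMetadata {
  bool HasNonNull;  // models !nonnull on the load
  bool HasNoUndef;  // models !noundef on the load
};

// Returns true when replacing the load's !nonnull metadata with an assume
// would be sound. A !nonnull violation on its own only yields poison, but a
// violated assume is immediate undefined behavior, so the conversion is
// allowed only when !noundef is present as well.
inline bool shouldEmitNonNullAssume(const LoadMetadata &MD) {
  return MD.HasNonNull && MD.HasNoUndef;
}
```

Under this model, a load carrying only !nonnull is promoted without an assume, which drops information but never introduces new undefined behavior.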
2023-01-12[Mem2Reg] Extract code for converting !nonnull to assume (NFC)Nikita Popov
2022-11-23[NFC] Replaced BB->getInstList().{erase(),pop_front(),pop_back()} with eraseFromParent().Vasileios Porpodas
Differential Revision: https://reviews.llvm.org/D138617
2022-11-15[Assignment Tracking][12/*] Account for assignment tracking in mem2regOCHyams
The Assignment Tracking debug-info feature is outlined in this RFC: https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir The changes for assignment tracking in mem2reg don't require much of a deviation from existing behaviour. dbg.assign intrinsics linked to an alloca are treated much in the same way as dbg.declare users of an alloca, except that we don't insert dbg.value intrinsics to describe assignments when there is already a dbg.assign intrinsic present, e.g. one linked to a store that is going to be removed. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D133295
2022-07-12[Mem2Reg] Consistently preserve nonnull assume for uninit loadNikita Popov
When performing a !nonnull load from uninitialized memory, we should preserve the nonnull assume just like in all other cases. We already do this correctly in the generic mem2reg code, but don't handle this case when using the optimized single-block implementation. Make sure that the optimized implementation exhibits the same behavior as the generic implementation.
2022-06-09[NFC] format InstructionSimplify & lowerCaseFunctionNamesSimon Moll
Clang-format InstructionSimplify and convert all "FunctionName"s to "functionName". This patch does touch a lot of files but gets done with the cleanup of InstructionSimplify in one commit. This is the alternative to the less invasive clang-format only patch: D126783 Reviewed By: spatel, rengolin Differential Revision: https://reviews.llvm.org/D126889
2022-04-25[NFC] Rename Instrinsic to IntrinsicDavid Green
2022-03-01Cleanup includes: TransformsUtilsserge-sans-paille
Estimation on the impact on preprocessor output: before: 1065307662 after: 1064800684 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120741
2022-02-08[Mem2Reg] Check that load type matches alloca typeNikita Popov
Alloca promotion can only deal with cases where the load/store types match the alloca type (it explicitly does not support bitcasted load/stores). With opaque pointers this is no longer enforced through the pointer type, so add an explicit check.
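A minimal sketch of the explicit check described above, written as a toy model rather than against the LLVM API. The `MemAccess` struct, the `Opcode` enum, and `isPromotable` are hypothetical names, types are represented as plain strings, and the other promotability conditions (such as rejecting volatile accesses) are deliberately left out.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a direct load or store user of an alloca.
enum class Opcode { Load, Store };

struct MemAccess {
  Opcode Op;
  std::string AccessType;  // type loaded or stored, e.g. "i32"
};

// With opaque pointers the pointer type no longer carries the pointee type,
// so each access type has to be compared with the allocated type directly.
// Any mismatch makes the alloca non-promotable in this model.
inline bool isPromotable(const std::string &AllocatedType,
                         const std::vector<MemAccess> &Users) {
  for (const MemAccess &U : Users)
    if (U.AccessType != AllocatedType)
      return false;
  return true;
}
```

In this model an `i32` alloca that is only loaded and stored as `i32` is promotable, while a single `i8` load of the same alloca blocks promotion.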
2022-02-02Cleanup header dependencies in LLVMCoreserge-sans-paille
Based on the output of include-what-you-use. This is a big chunk of changes. It is very likely to break downstream code unless they took a lot of care in avoiding hidden header dependencies, something the LLVM codebase doesn't do that well :-/ I've tried to summarize the biggest changes below:
- llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h
- llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h
- llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h
- llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h
- llvm/IR/LegacyPassManager.h no longer includes llvm/Pass.h
- llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h
- llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h
And the usual count of preprocessed lines:
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 6400831
after: 6189948
200k fewer lines to process is not that bad ;-)
Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D118652
2021-09-08[SROA] Support opaque pointersNikita Popov
Make the following changes in order to support opaque pointers in SROA:
* Generate i8 GEPs for opaque pointers.
* Explicitly enforce that promotable allocas only have stores of the alloca type -- previously this was implicitly enforced.
* Replace a check for pointer element type with load/store type.
Differential Revision: https://reviews.llvm.org/D109259
2021-06-21[Mem2Reg] Use poison for unreachable casesNikita Popov
Use poison instead of undef for cases dealing with unreachable code. This still leaves the more interesting case of "load from uninitialized memory" as undef.
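The distinction drawn in this message can be summarised as a small lookup. This is a hedged sketch with hypothetical names (`Reason`, `Placeholder`, `placeholderFor`); it models the policy as stated at the time of this commit and does not use the LLVM API.

```cpp
// Hypothetical classification of why mem2reg has no real value to use.
enum class Reason {
  UnreachableCode,    // the value is only needed in code that cannot execute
  UninitializedLoad,  // a load that is not preceded by any store
};

// The placeholder constants the pass can fall back to.
enum class Placeholder { Poison, Undef };

// Policy described by the commit message: unreachable cases get poison,
// while a load from uninitialized memory is still modelled as undef.
inline Placeholder placeholderFor(Reason R) {
  return R == Reason::UnreachableCode ? Placeholder::Poison
                                      : Placeholder::Undef;
}
```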
2021-04-06Add a subclass of IntrinsicInst for llvm.assume [nfc]Philip Reames
Add the subclass, update a few places which check for the intrinsic to use idiomatic dyn_cast, and update the public interface of AssumptionCache to use the new class. A follow up change will do the same for the newer assumption query/bundle mechanisms.