| Age | Commit message (Collapse) | Author |
|
In PromoteMem2Reg, we perform a DFS over the CFG and track, for each
alloca, its incoming value and its associated incoming DebugLoc, both of
which are taken from stores to that alloca; these values and DebugLocs
are propagated to PHI nodes when new blocks are reached. In the event
that for one incoming edge no store instruction has been seen, we
propagate an UndefValue and an empty DebugLoc to the PHI.
This is a perfectly valid occurrence, and assigning an empty DebugLoc to
the PHI is the correct course of action; therefore, we should pass an
annotated DebugLoc instead, so that in DebugLoc coverage tracking we
correctly do not expect a valid DebugLoc to be present; we generally
mark allocas as having CompilerGenerated locations, so I've chosen to
use the same annotation to represent the uninitialized value of that
alloca.
This change is NFC outside of DebugLoc coverage tracking builds.
|
|
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:
template <typename PointeeType, unsigned N>
class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N>
{};
We only have 140 instances that rely on this "redirection", with the
vast majority of them under llvm/. Since relying on the redirection
doesn't improve readability, this patch replaces SmallSet with
SmallPtrSet for pointer element types.
|
|
This is one of the final remaining debug-intrinsic specific codepaths
out there, and pieces of cross-LLVM infrastructure to do with debug
intrinsics.
|
|
At this stage I'm just opportunistically deleting any code using
debug-intrinsic types, largely adjacent to calls to findDbgUsers. I'll
get to deleting that in probably one or more two commits.
|
|
SROA and a few other facilities use generic-lambdas and some overloaded
functions to deal with both intrinsics and debug-records at the same time.
As part of stripping out intrinsic support, delete a swathe of this code
from things in the Utils directory.
This is a large diff, but is mostly about removing functions that were
duplicated during the migration to debug records. I've taken a few
opportunities to replace comments about "intrinsics" with "records",
and replace generic lambdas with plain lambdas (I believe this makes
it more readable).
All of this is chipping away at intrinsic-specific code until we get to
removing parts of findDbgUsers, which is the final boss -- we can't
remove that until almost everything else is gone.
|
|
When BasicBlock has a large number of allocas, and
successors, we had to copy entire IncomingVals and
IncomingLocs vectors for successors.
Also updates to IncomingVals and
IncomingLocs are infrequent (only Load/Store into
alloca affect arrays).
Given the nature of DFS traversal, instead of copying
the entire vector, we can keep track of the changes
and undo all changes done by successors.
Fixes #142461
On the attached to issue #142461 IR RSS drops from 35Gb to 1.8Gb.
But it does not affect compile time on average
https://llvm-compile-time-tracker.com/compare.php?from=2e98ed8caa0b47ee79af4ad24b5436a89fe49dfa&to=effac6d1fd600e544f8bc21382c7e541973b1378&stat=instructions:u
|
|
(#142468)
They are all DFS state related, as `Visited`. But `Visited` is already a
class member, so we make things more consistent and less
parameters to pass around.
By itself, the patch has little value, but it simplifies stuff in the
#142474.
For #142461
|
|
Just for consistency, to avoid confusing conditions.
`reverse` helps to avoid tests updates as nothing is
changing for for successors count <=2.
For #142461
|
|
'goto' is essentially a shortcut for push/pop for worklist.
It can be expensive if we copy vectors, but if we move them, it
should not be an issue.
Without 'goto' it's easier to reason about the
code, when `PromoteMem2Reg::RenamePass` processes
exactly one edge at a time.
There is out of order processing of the first
successor, I keep it just to make this patch pure NFC. I'll
remove this in follow up patches.
For #142461
|
|
|
|
|
|
After #124287 updated several functions to return iterators rather than
Instruction *, it was no longer straightforward to pass their result to
DIBuilder. This commit updates DIBuilder methods to accept an
InsertPosition instead, so that they can be called with an iterator
(preferred), or with a deprecation warning an Instruction *, or a
BasicBlock *. This commit also updates the existing calls to the
DIBuilder methods to pass in iterators.
As a special exception, DIBuilder::insertDeclare() keeps a separate
overload accepting a BasicBlock *InsertAtEnd. This is because despite
the name, this method does not insert at the end of the block, therefore
this cannot be handled implicitly by using InsertPosition.
|
|
This reverts commit 3ec9f7494b31f2fe51d5ed0e07adcf4b7199def6.
|
|
After #124287 updated several functions to return iterators rather than
Instruction *, it was no longer straightforward to pass their result to
DIBuilder. This commit updates DIBuilder methods to accept an
InsertPosition instead, so that they can be called with an iterator
(preferred), or with a deprecation warning an Instruction *, or a
BasicBlock *. This commit also updates the existing calls to the
DIBuilder methods to pass in iterators.
|
|
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to moveBefore use iterators.
This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
|
|
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
|
|
This patch is part of a set of patches that add an `-fextend-lifetimes`
flag to clang, which extends the lifetimes of local variables and
parameters for improved debuggability. In addition to that flag, the
patch series adds a pragma to selectively disable `-fextend-lifetimes`,
and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes`
for this pointers only. All changes and tests in these patches were
written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer)
has handled review and merging. The extend lifetimes flag is intended to
eventually be set on by `-Og`, as discussed in the RFC
here:
https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850
This patch implements a new intrinsic instruction in LLVM,
`llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand
and has no effect other than "using" its operand, to ensure that its
operand remains live until after the fake use. This patch does not emit
fake uses anywhere; the next patch in this sequence causes them to be
emitted from the clang frontend, such that for each variable (or this) a
fake.use operand is inserted at the end of that variable's scope, using
that variable's value. This patch covers everything post-frontend, which
is largely just the basic plumbing for a new intrinsic/instruction,
along with a few steps to preserve the fake uses through optimizations
(such as moving them ahead of a tail call or translating them through
SROA).
Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
|
|
This patch properly handles #dbg_values in SROA by making sure that any
#dbg_values get moved to before a store just like #dbg_declares do, or
the #dbg_value is correctly updated with the right alloca after an
aggregate alloca is broken up.
The issue stems from swift where #dbg_values are emitted and not
dbg.declares, the SROA pass doesn't handle the #dbg_values correctly and
it causes them to all have undefs
If we look at this simple-ish testcase (This is all I could reduce it
down to, and I am still relatively bad at writing llvm IR by hand so I
apologize in advance):
```
%T4main1TV13TangentVectorV = type <{ %T4main1UV13TangentVectorV, [7 x i8], %T4main1UV13TangentVectorV }>
%T4main1UV13TangentVectorV = type <{ %T1M1SVySfG, [7 x i8], %T4main1VV13TangentVectorV }>
%T1M1SVySfG = type <{ ptr, %Ts4Int8V }>
%Ts4Int8V = type <{ i8 }>
%T4main1VV13TangentVectorV = type <{ %T1M1SVySfG }>
define hidden swiftcc void @"$s4main1TV13TangentVectorV1poiyA2E_AEtFZ"(ptr noalias nocapture sret(%T4main1TV13TangentVectorV) %0, ptr noalias nocapture dereferenceable(57) %1, ptr noalias nocapture dereferenceable(57) %2) #0 !dbg !44 {
entry:
%3 = alloca %T4main1VV13TangentVectorV
%4 = alloca %T4main1UV13TangentVectorV
%5 = alloca %T4main1VV13TangentVectorV
%6 = alloca %T4main1UV13TangentVectorV
%7 = alloca %T4main1VV13TangentVectorV
%8 = alloca %T4main1UV13TangentVectorV
%9 = alloca %T4main1VV13TangentVectorV
%10 = alloca %T4main1UV13TangentVectorV
call void @llvm.lifetime.start.p0(i64 9, ptr %3)
call void @llvm.lifetime.start.p0(i64 25, ptr %4)
call void @llvm.lifetime.start.p0(i64 9, ptr %5)
call void @llvm.lifetime.start.p0(i64 25, ptr %6)
call void @llvm.lifetime.start.p0(i64 9, ptr %7)
call void @llvm.lifetime.start.p0(i64 25, ptr %8)
call void @llvm.lifetime.start.p0(i64 9, ptr %9)
call void @llvm.lifetime.start.p0(i64 25, ptr %10)
%.u1 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %1, i32 0, i32 0
call void @llvm.memcpy.p0.p0.i64(ptr align 8 %4, ptr align 8 %.u1, i64 25, i1 false)
%.u11 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %2, i32 0, i32 0
call void @llvm.memcpy.p0.p0.i64(ptr align 8 %6, ptr align 8 %.u11, i64 25, i1 false)
call void @llvm.dbg.value(metadata ptr %4, metadata !62, metadata !DIExpression(DW_OP_deref)), !dbg !75
%.s = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %4, i32 0, i32 0
%.s.c = getelementptr inbounds %T1M1SVySfG, ptr %.s, i32 0, i32 0
%11 = load ptr, ptr %.s.c
%.s.b = getelementptr inbounds %T1M1SVySfG, ptr %.s, i32 0, i32 1
%.s.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s.b, i32 0, i32 0
%12 = load i8, ptr %.s.b._value
%.s2 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %6, i32 0, i32 0
%.s2.c = getelementptr inbounds %T1M1SVySfG, ptr %.s2, i32 0, i32 0
%13 = load ptr, ptr %.s2.c
%.s2.b = getelementptr inbounds %T1M1SVySfG, ptr %.s2, i32 0, i32 1
%.s2.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s2.b, i32 0, i32 0
%14 = load i8, ptr %.s2.b._value
%.v = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %4, i32 0, i32 2
call void @llvm.memcpy.p0.p0.i64(ptr align 8 %3, ptr align 8 %.v, i64 9, i1 false)
%.v3 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %6, i32 0, i32 2
call void @llvm.memcpy.p0.p0.i64(ptr align 8 %5, ptr align 8 %.v3, i64 9, i1 false)
%.s4 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %3, i32 0, i32 0
%.s4.c = getelementptr inbounds %T1M1SVySfG, ptr %.s4, i32 0, i32 0
%18 = load ptr, ptr %.s4.c
%.s5 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %5, i32 0, i32 0
%.s5.c = getelementptr inbounds %T1M1SVySfG, ptr %.s5, i32 0, i32 0
%20 = load ptr, ptr %.s5.c
%.u2 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %1, i32 0, i32 2
call void @llvm.memcpy.p0.p0.i64(ptr align 8 %8, ptr align 8 %.u2, i64 25, i1 false)
%.u26 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %2, i32 0, i32 2
call void @llvm.memcpy.p0.p0.i64(ptr align 8 %10, ptr align 8 %.u26, i64 25, i1 false)
%.s7 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %8, i32 0, i32 0
%.s7.c = getelementptr inbounds %T1M1SVySfG, ptr %.s7, i32 0, i32 0
%25 = load ptr, ptr %.s7.c
%.s7.b = getelementptr inbounds %T1M1SVySfG, ptr %.s7, i32 0, i32 1
%.s7.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s7.b, i32 0, i32 0
%26 = load i8, ptr %.s7.b._value
%.s8 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %10, i32 0, i32 0
%.s8.c = getelementptr inbounds %T1M1SVySfG, ptr %.s8, i32 0, i32 0
%27 = load ptr, ptr %.s8.c
%.s8.b = getelementptr inbounds %T1M1SVySfG, ptr %.s8, i32 0, i32 1
%.s8.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s8.b, i32 0, i32 0
%28 = load i8, ptr %.s8.b._value
%.v9 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %8, i32 0, i32 2
call void @llvm.memcpy.p0.p0.i64(ptr align 8 %7, ptr align 8 %.v9, i64 9, i1 false)
%.v10 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %10, i32 0, i32 2
call void @llvm.memcpy.p0.p0.i64(ptr align 8 %9, ptr align 8 %.v10, i64 9, i1 false)
%.s11 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %7, i32 0, i32 0
%.s11.c = getelementptr inbounds %T1M1SVySfG, ptr %.s11, i32 0, i32 0
%32 = load ptr, ptr %.s11.c
%.s12 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %9, i32 0, i32 0
%.s12.c = getelementptr inbounds %T1M1SVySfG, ptr %.s12, i32 0, i32 0
%34 = load ptr, ptr %.s12.c
call void @llvm.lifetime.end.p0(i64 25, ptr %10)
call void @llvm.lifetime.end.p0(i64 9, ptr %9)
call void @llvm.lifetime.end.p0(i64 25, ptr %8)
call void @llvm.lifetime.end.p0(i64 9, ptr %7)
call void @llvm.lifetime.end.p0(i64 25, ptr %6)
call void @llvm.lifetime.end.p0(i64 9, ptr %5)
call void @llvm.lifetime.end.p0(i64 25, ptr %4)
call void @llvm.lifetime.end.p0(i64 9, ptr %3)
ret void
}
!llvm.module.flags = !{!0, !1, !2, !3, !4, !6, !7, !8, !9, !10, !11, !12, !13, !14, !15}
!swift.module.flags = !{!33}
!llvm.linker.options = !{!34, !35, !36, !37, !38, !39, !40, !41, !42, !43}
!0 = !{i32 2, !"SDK Version", [2 x i32] [i32 14, i32 4]}
!1 = !{i32 1, !"Objective-C Version", i32 2}
!2 = !{i32 1, !"Objective-C Image Info Version", i32 0}
!3 = !{i32 1, !"Objective-C Image Info Section", !"__DATA, no_dead_strip"}
!4 = !{i32 1, !"Objective-C Garbage Collection", i8 0}
!6 = !{i32 7, !"Dwarf Version", i32 4}
!7 = !{i32 2, !"Debug Info Version", i32 3}
!8 = !{i32 1, !"wchar_size", i32 4}
!9 = !{i32 8, !"PIC Level", i32 2}
!10 = !{i32 7, !"uwtable", i32 1}
!11 = !{i32 7, !"frame-pointer", i32 1}
!12 = !{i32 1, !"Swift Version", i32 7}
!13 = !{i32 1, !"Swift ABI Version", i32 7}
!14 = !{i32 1, !"Swift Major Version", i8 6}
!15 = !{i32 1, !"Swift Minor Version", i8 0}
!16 = distinct !DICompileUnit(language: DW_LANG_Swift, file: !17, imports: !18, sdk: "MacOSX14.4.sdk")
!17 = !DIFile(filename: "/Users/emilpedersen/swift2/swift/test/IRGen/debug_scope_distinct.swift", directory: "/Users/emilpedersen/swift2")
!18 = !{!19, !21, !23, !25, !27, !29, !31}
!19 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !20, file: !17)
!20 = !DIModule(scope: null, name: "main", includePath: "/Users/emilpedersen/swift2/swift/test/IRGen")
!21 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !22, file: !17)
!22 = !DIModule(scope: null, name: "Swift", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/Swift.swiftmodule/arm64-apple-macos.swiftmodule")
!23 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !24, line: 60)
!24 = !DIModule(scope: null, name: "_Differentiation", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/_Differentiation.swiftmodule/arm64-apple-macos.swiftmodule")
!25 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !26, line: 61)
!26 = !DIModule(scope: null, name: "M", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/test-macosx-arm64/IRGen/Output/debug_scope_distinct.swift.tmp/M.swiftmodule")
!27 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !28, file: !17)
!28 = !DIModule(scope: null, name: "_StringProcessing", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/_StringProcessing.swiftmodule/arm64-apple-macos.swiftmodule")
!29 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !30, file: !17)
!30 = !DIModule(scope: null, name: "_SwiftConcurrencyShims", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/shims")
!31 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !32, file: !17)
!32 = !DIModule(scope: null, name: "_Concurrency", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/_Concurrency.swiftmodule/arm64-apple-macos.swiftmodule")
!33 = !{i1 false}
!34 = !{!"-lswiftCore"}
!35 = !{!"-lswift_StringProcessing"}
!36 = !{!"-lswift_Differentiation"}
!37 = !{!"-lswiftDarwin"}
!38 = !{!"-lswift_Concurrency"}
!39 = !{!"-lswiftSwiftOnoneSupport"}
!40 = !{!"-lobjc"}
!41 = !{!"-lswiftCompatibilityConcurrency"}
!42 = !{!"-lswiftCompatibility56"}
!43 = !{!"-lswiftCompatibilityPacks"}
!44 = distinct !DISubprogram( unit: !16, declaration: !52, retainedNodes: !53)
!45 = !DIFile(filename: "<compiler-generated>", directory: "/")
!46 = !DICompositeType(tag: DW_TAG_structure_type, scope: !47, elements: !48, identifier: "$s4main1TV13TangentVectorVD")
!47 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1TVD")
!48 = !{}
!49 = !DISubroutineType(types: !50)
!50 = !{!51}
!51 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1TV13TangentVectorVXMtD")
!52 = !DISubprogram( file: !45, type: !49, spFlags: DISPFlagOptimized)
!53 = !{!54, !56, !57}
!54 = !DILocalVariable( scope: !44, type: !55, flags: DIFlagArtificial)
!55 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !46)
!56 = !DILocalVariable( scope: !44, flags: DIFlagArtificial)
!57 = !DILocalVariable( scope: !44, type: !58, flags: DIFlagArtificial)
!58 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !51)
!62 = !DILocalVariable( scope: !63, type: !72, flags: DIFlagArtificial)
!63 = distinct !DISubprogram( type: !66, unit: !16, declaration: !69, retainedNodes: !70)
!64 = !DICompositeType(tag: DW_TAG_structure_type, scope: !65, identifier: "$s4main1UV13TangentVectorVD")
!65 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1UVD")
!66 = !DISubroutineType(types: !67)
!67 = !{!68}
!68 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1UV13TangentVectorVXMtD")
!69 = !DISubprogram( spFlags: DISPFlagOptimized)
!70 = !{!71, !73}
!71 = !DILocalVariable( scope: !63, flags: DIFlagArtificial)
!72 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !64)
!73 = !DILocalVariable( scope: !63, type: !74, flags: DIFlagArtificial)
!74 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !68)
!75 = !DILocation( scope: !63, inlinedAt: !76)
!76 = distinct !DILocation( scope: !44)
```
if we run
` opt -S -passes=sroa file.ll -o -`
With this patch we will see
```
%.sroa.5.sroa.021 = alloca [7 x i8], align 8
tail call void @llvm.dbg.value(metadata ptr %.sroa.5.sroa.021, metadata !59, metadata !DIExpression(DW_OP_deref, DW_OP_LLVM_fragment, 72, 56)), !dbg !72
%.sroa.5.sroa.014 = alloca [7 x i8], align 8
```
Without this patch we will see:
```
%.sroa.5.sroa.021 = alloca [7 x i8], align 8
%.sroa.5.sroa.014 = alloca [7 x i8], align 8
```
Thus this patch ensures that llvm.dbg.values that use allocas that are broken up still have the correct metadata and debug information is preserved
This is part of a stack of patches and is preceded by: https://github.com/llvm/llvm-project/pull/94068
|
|
These are the final few places in LLVM where we use instruction pointers
to identify the position that we're inserting something. We're trying to
get away from that with a view to deprecating those methods, thus use
iterators in all these places. I believe they're all debug-info safe.
The sketchiest part is the ExtractValueInst copy constructor, where we
cast nullptr to a BasicBlock pointer, so that we take the non-default
insert-into-no-block path for instruction insertion, instead of the
default nullptr-instruction path for UnaryInstruction. Such a hack is
necessary until we get rid of the instruction constructor entirely.
|
|
Very minor performance improvement.
|
|
In #97711 the single-store optimization was disabled for the case
where the value is potentially poison, as this may produce incorrect
results for loads of uninitialized memory.
However, this resulted in compile-time regressions. Address these
by still allowing the single-store optimization to occur in cases
where the store dominates the load, as we know that such a load
will always read initialized memory.
|
|
(#97711)
If there is a single store, then loads must either load the stored value
or uninitialized memory (undef). If the stored value may be poison, then
replacing an uninitialized memory load with it would be incorrect. Fall
back to the generic code in that case.
This PR only fixes the case where there is a literal poison store -- the
case where the value is non-trivially poison will still get miscompiled
by phi simplification later, see #96631.
Fixes https://github.com/llvm/llvm-project/issues/97702.
|
|
https://github.com/llvm/llvm-project/pull/83381 introduced a call
to PHINode::isComplete() in Mem2Reg, which is O(n^2) in the number
of predecessors, resulting in pathological compile-time regressions
for cases with many predecessors.
Remove the isComplete() check and instead cache the attribute
lookup, to only perform it once per function. Actually setting
the FMF flag is cheap.
|
|
function attribute (#83381)
Its expected that the sequence `return X > 0.0 ? X : -X`, compiled with
-Ofast, produces fabs intrinsic. However, at this point, LLVM is unable
to do so.
The above sequence goes through the following transformation during the
pass pipeline:
1) SROA pass generates the phi node. Here, it does not infer the
fast-math flags on the phi node unlike clang frontend typically does.
2) Phi node eventually gets translated into select instruction.
Because of missing no-signed-zeros(nsz) fast-math flag on the select
instruction, InstCombine pass fails to fold the sequence into fabs
intrinsic.
This patch, as a part of SROA, tries to propagate nsz fast-math flag on
the phi node using function attribute enabling this folding.
Closes #51601
Co-authored-by: Sushant Gokhale <sgokhale@nvidia.com>
|
|
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
|
|
When performing a load from uninitialized memory using !noundef, insert
a non-terminator unreachable instruction, which will be converted to a
proper unreachable by SimplifyCFG. This way we retain the fact that UB
occurred on this code path.
|
|
Prior to #85863, the required parameters of llvm::isKnownNonZero were
Value and DataLayout. After, they are Value, Depth, and SimplifyQuery,
where SimplifyQuery is implicitly constructible from DataLayout. The
change to move Depth before SimplifyQuery needed callers to be updated
unnecessarily, and as commented in #85863, we actually want Depth to be
after SimplifyQuery anyway so that it can be defaulted and the caller
does not need to specify it.
|
|
This patch converts `isKnownNonZero` to use SimplifyQuery. Then we can
use the context information from `DomCondCache`.
Fixes https://github.com/llvm/llvm-project/issues/85823.
Alive2: https://alive2.llvm.org/ce/z/QUvHVj
|
|
This is the major rename patch that prior patches have built towards.
The DPValue class is being renamed to DbgVariableRecord, which reflects
the updated terminology for the "final" implementation of the RemoveDI
feature. This is a pure string substitution + clang-format patch. The
only manual component of this patch was determining where to perform
these string substitutions: `DPValue` and `DPV` are almost exclusively
used for DbgRecords, *except* for:
- llvm/lib/target, where 'DP' is used to mean double-precision, and so
appears as part of .td files and in variable names. NB: There is a
single existing use of `DPValue` here that refers to debug info, which
I've manually updated.
- llvm/tools/gold, where 'LDPV' is used as a prefix for symbol
visibility enums.
Outside of these places, I've applied several basic string
substitutions, with the intent that they only affect DbgRecord-related
identifiers; I've checked them as I went through to verify this, with
reasonable confidence that there are no unintended changes that slipped
through the cracks. The substitutions applied are all case-sensitive,
and are applied in the order shown:
```
DPValue -> DbgVariableRecord
DPVal -> DbgVarRec
DPV -> DVR
```
Following the previous rename patches, it should be the case that there
are no instances of any of these strings that are meant to refer to the
general case of DbgRecords, or anything other than the DPValue class.
The idea behind this patch is therefore that pure string substitution is
correct in all cases as long as these assumptions hold.
|
|
Have DIBuilder conditionally insert either debug intrinsics or DbgRecord
depending on the module's IsNewDbgInfoFormat flag. The insertion methods
now return a `DbgInstPtr` (a `PointerUnion<Instruction *, DbgRecord
*>`).
Add a unittest for both modes (I couldn't find an existing test testing
insertion behaviours specifically).
This patch changes the existing assumption that DbgRecords are only ever
inserted if there's an instruction to insert-before because clang
currently inserts debug intrinsics while CodeGening (like any other
instruction) meaning it'll try inserting to the end of a block without a
terminator. We already have machinery in place to maintain the
DbgRecords when a terminator is removed - these become "trailing
DbgRecords" which are re-attached when a new instruction is inserted.
All I've done is allow this state to occur while inserting DbgRecords
too, i.e., it's not only removing terminators that causes this valid
transient state, but inserting DbgRecords into incomplete blocks too.
The C API will be updated in follow up patches.
---
Note: this doesn't mean clang is emitting DbgRecords yet, because the
modules it creates are still always in the old debug mode. That will
come in a future patch.
|
|
This patch trivially updates various opt passes to handle DPVAssigns. In
all cases, this means some combination of generifying existing code to
handle DPValues and DbgAssignIntrinsics, iterating over DPValues where
previously we did not, or duplicating code for DbgAssignIntrinsics to
the equivalent DPValue function (in inlining and salvageDebugInfo).
|
|
(#73788)
This patch updates the last few places in LLVM using findDbgValues that
don't also collect and handle DPValue objects. This largely involves
instcombine and mem2reg changes, and are largely mechanical, calling
existing utilities on collections of DPValues instead of just
DbgValuesInsts.
A variety of tests have had RemoveDIs RUN lines added to them to cover
these behaviours. We have some technical debt of the instcombine sinking
code for DPValues not being implemented yet, so I've left FIXME stubs
indicating that we intend to cover tests with RemoveDIs but haven't yet.
|
|
Continuing the patch series to get rid of debug intrinsics [0], instruction
insertion needs to be done with iterators rather than instruction pointers,
so that we can communicate information in the iterator class. This patch
adds an iterator-taking insertBefore method and converts various call sites
to take iterators. These are all sites where such debug-info needs to be
preserved so that a stage2 clang can be built identically; it's likely that
many more will need to be changed in the future.
At this stage, this is just changing the spelling of a few operations,
which will eventually become signifiant once the debug-info bearing
iterator is used.
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
Differential Revision: https://reviews.llvm.org/D152537
|
|
from unreachable predecessors [NFC]
|
|
updateForDeletedStore updates the assignment tracking debug info for a store
that is about to be deleted by mem2reg. For each variable backed by the target
alloca, if a dbg.assign exists it is kept (well - it's downgraded to a
dbg.value). A dbg.value is inserted if there's not a linked dbg.assign for a
variable which is backed by the target alloca. This patch fixes a bug whereby a
store with a linked dbg.assign that describes a fragment different to the one
linked to the alloca was not counted for the variable, leading to both keeping
the dbg.assign (downgrading it) and inserting a new dbg.value.
Reviewed By: StephenTozer
Differential Revision: https://reviews.llvm.org/D146299
|
|
For fully promoted variables dbg.assigns and dbg.values convey the same
information and can be used interchangeably. This patch converts dbg.assigns to
dbg.values for variables promoted by mem2reg. This reduces resource usage by
reducing the amount of unnecessary function local metadata. The compile time
tracker reports that CTMark projects build with LTO-O3-g with 0.4% fewer
instructions retired and peak memory usage is reduced by 2.2%.
Reviewed By: jryans
Differential Revision: https://reviews.llvm.org/D145511
|
|
The assert fires if a store to an alloca with no linked dbg.assigns has linked
dbg.assigns. This can happen in the wild due to optimisations dropping the
alloca's debug info so we shouldn't assert against it.
Reviewed By: jryans
Differential Revision: https://reviews.llvm.org/D143153
|
|
After D141386 !nonnull violation returns poison rather than
resulting in immediate undefined behavior. However, converting
it into an assume would result in IUB. As such, we can only
perform this transform if !noundef is also present.
|
|
|
|
eraseFromParent().
Differential Revision: https://reviews.llvm.org/D138617
|
|
The Assignment Tracking debug-info feature is outlined in this RFC:
https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir
The changes for assignment tracking in mem2reg don't require much of a
deviation from existing behaviour. dbg.assign intrinsics linked to an alloca
are treated much in the same way as dbg.declare users of an alloca, except that
we don't insert dbg.value intrinsics to describe assignments when there is
already a dbg.assign intrinsic present, e.g. one linked to a store that is
going to be removed.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D133295
|
|
When performing a !nonnull load from uninitialized memory, we
should preserve the nonnull assume just like in all other cases.
We already do this correctly in the generic mem2reg code, but
don't handle this case when using the optimized single-block
implementation.
Make sure that the optimized implementation exhibits the same
behavior as the generic implementation.
|
|
Clang-format InstructionSimplify and convert all "FunctionName"s to
"functionName". This patch does touch a lot of files but gets done with
the cleanup of InstructionSimplify in one commit.
This is the alternative to the less invasive clang-format only patch: D126783
Reviewed By: spatel, rengolin
Differential Revision: https://reviews.llvm.org/D126889
|
|
|
|
Estimation on the impact on preprocessor output:
before: 1065307662
after: 1064800684
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120741
|
|
Alloca promotion can only deal with cases where the load/store
types match the alloca type (it explicitly does not support
bitcasted load/stores).
With opaque pointers this is no longer enforced through the pointer
type, so add an explicit check.
|
|
Based on the output of include-what-you-use.
This is a big chunk of changes. It is very likely to break downstream code
unless they took a lot of care in avoiding hidden ehader dependencies, something
the LLVM codebase doesn't do that well :-/
I've tried to summarize the biggest change below:
- llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h
- llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h
- llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h
- llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h
- llvm/IR/LegacyPassManager.h no longer include llvm/Pass.h
- llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h
- llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h
And the usual count of preprocessed lines:
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 6400831
after: 6189948
200k lines less to process is no that bad ;-)
Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D118652
|
|
Make the following changes in order to support opaque pointers in SROA:
* Generate i8 GEPs for opaque pointers.
* Explicitly enforce that promotable allocas only have stores of
the alloca type -- previously this was implicitly enforced.
* Replace a check for pointer element type with load/store type.
Differential Revision: https://reviews.llvm.org/D109259
|
|
Use poison instead of undef for cases dealing with unreachable
code. This still leaves the more interesting case of "load from
uninitialized memory" as undef.
|
|
Add the subclass, update a few places which check for the intrinsic to use idiomatic dyn_cast, and update the public interface of AssumptionCache to use the new class. A follow up change will do the same for the newer assumption query/bundle mechanisms.
|