llvm-project.git/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp, branch main

[ThinLTO][WPD] LICM a loop invariant check (#164862)

2025-10-23T19:02:27+00:00

Move a loop invariant check out of the innermost loop. I measured a
small but consistent thin link speedup from this change for a large
target (0.75%).
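
The shape of this kind of change can be sketched as below. This is a minimal illustration of hoisting a loop-invariant check, not the code from the patch: `Slot`, `Target`, `Exported`, and `Eligible` are hypothetical names chosen for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; none of these names come from WholeProgramDevirt.cpp.
struct Target {
  bool Eligible;
};
struct Slot {
  bool Exported; // does not change while the inner loop runs
  std::vector<Target> Targets;
};

// Before: the invariant S.Exported is re-tested for every target.
std::size_t countBefore(const std::vector<Slot> &Slots) {
  std::size_t N = 0;
  for (const Slot &S : Slots)
    for (const Target &T : S.Targets)
      if (S.Exported && T.Eligible)
        ++N;
  return N;
}

// After: the invariant test is hoisted, so the inner loop is skipped
// entirely for slots that fail it.
std::size_t countAfter(const std::vector<Slot> &Slots) {
  std::size_t N = 0;
  for (const Slot &S : Slots) {
    if (!S.Exported)
      continue;
    for (const Target &T : S.Targets)
      if (T.Eligible)
        ++N;
  }
  return N;
}
```

Both functions return the same count; the second simply does less work per inner iteration, which is consistent with a small but measurable speedup.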

[ThinLTO][WPD] Simplify check for local summary for efficiency (NFCI) (#164859)

2025-10-23T19:02:05+00:00

Use the new HasLocal flag to avoid looking through all summaries to see
if there is a local copy.

[WPD] Reduce ThinLTO link time by avoiding unnecessary summary analysis (#164046)

2025-10-23T14:16:45+00:00

We were scanning through every single definition of a vtable across all
translation units, which is unnecessary in most cases.

If this is a local, we want to make sure there isn't another local with
the same GUID due to it having the same relative path. However, we were
always scanning through every single summary in all cases.

We can now check the new HasLocal flag added in PR164647 ahead of the
loop, instead of checking on every iteration.

This cut down a large thin link by around 6%, which was over half the
time it spent in WPD.

Note that we previously took the last conforming vtable summary, and now
we use the first. This caused a test difference in one somewhat
contrived test for vtables in comdats.
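
A rough sketch of the control flow described above, under stated assumptions: the types, the `Conforming` field, and `pickSummary` are hypothetical simplifications, and the exact ambiguity condition is a guess at the intent rather than a copy of the patch.

```cpp
#include <vector>

// Hypothetical, simplified model; the real summary classes are richer.
struct VTableSummary {
  bool IsLocal;
  bool Conforming; // stand-in for whatever makes a summary usable
  int Id;
};

struct SummaryEntry {
  bool HasLocal = false; // assumed to be maintained when summaries are added
  std::vector<VTableSummary> List;
};

// Returns the Id of the chosen summary, or -1 if none should be used.
int pickSummary(const SummaryEntry &E) {
  // Checked once, ahead of the loop: a local sharing its GUID with
  // other copies is treated as ambiguous.
  if (E.HasLocal && E.List.size() > 1)
    return -1;
  for (const VTableSummary &S : E.List)
    if (S.Conforming)
      return S.Id; // first conforming summary; the rest are not scanned
  return -1;
}
```

Returning at the first conforming summary is what changes the pick from "last" to "first" when several summaries qualify.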

[WPD]: Enable speculative devirtualization. (#159048)

2025-10-22T11:16:11+00:00

This patch implements the speculative devirtualization feature in the
LLVM backend.
It handles the case of single implementation devirtualization where
there is a single possible callee of a virtual function.
- Add cl::opt 'devirtualize-speculatively' to enable it.
- Flag is disabled by default.
- It works regardless of the visibility of the object.
- Not enabled for LTO for now.
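
Conceptually, speculating on a single implementation means guarding a direct call with a check and keeping the indirect call as a fallback. The sketch below models that idea in plain C++ with a hand-rolled one-slot "vtable"; it assumes this guard-and-fallback shape and does not reproduce the IR the pass emits. All names are hypothetical.

```cpp
// Hypothetical model: an explicit function-pointer slot stands in for
// the vtable entry that a virtual call would load.
struct Object;
using Method = int (*)(const Object &);
struct Object {
  Method Slot;
  int Value;
};

int onlyKnownImpl(const Object &O) { return O.Value + 1; }
int lateBoundImpl(const Object &O) { return O.Value * 100; }

// Unoptimized form: always an indirect call.
int callIndirect(const Object &O) { return O.Slot(O); }

// Speculative form: guard, direct call, indirect fallback.
int callSpeculative(const Object &O) {
  if (O.Slot == &onlyKnownImpl)
    return onlyKnownImpl(O); // direct call, eligible for inlining
  return O.Slot(O);          // fallback keeps unseen overrides correct
}
```

Because the fallback is always present, the transformation stays correct even if another implementation appears at run time, which would explain why it need not depend on the visibility of the object.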

[ThinLTO] Make SummaryList private (NFC) (#164355)

2025-10-21T13:53:40+00:00

In preparation for a follow-on change that will require checking every
time a new summary is added to the SummaryList for a GUID, make the
SummaryList private and require all accesses to go through one of two
new interfaces. Most changes are to access the list via the read-only
getSummaryList() method, and the few that add new summaries (e.g. while
building the combined summary) use the new addSummary() method.
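
The encapsulation pattern can be sketched as follows. Only the method names getSummaryList() and addSummary() come from the commit message; the class, the element type, the return type, and the `HasLocal` bookkeeping are simplified assumptions for illustration, not the real declarations.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified summary type.
struct Summary {
  bool IsLocal;
};

class SummaryEntry {
  std::vector<std::unique_ptr<Summary>> SummaryList; // private storage
  bool HasLocal = false; // illustrative per-insert bookkeeping

public:
  // Read-only view for the many callers that only iterate.
  const std::vector<std::unique_ptr<Summary>> &getSummaryList() const {
    return SummaryList;
  }

  // Single mutation point, so per-insert checks cannot be bypassed.
  void addSummary(std::unique_ptr<Summary> S) {
    HasLocal |= S->IsLocal;
    SummaryList.push_back(std::move(S));
  }

  bool hasLocal() const { return HasLocal; }
};
```

Funnelling every insertion through one method is what makes it possible to keep derived state, such as a flag, in sync without auditing each call site.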

[NFC][LLVM] Use namespace qualifier to define DenseMapInfo specializations (#162683)

2025-10-10T12:23:44+00:00

Use `llvm::DenseMapInfo` to define `DenseMapInfo` specializations
instead of surrounding it with `namespace llvm {}`.
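
The two spellings can be compared in a self-contained sketch. To avoid depending on LLVM headers, the namespace `lib`, the trait `KeyInfo`, and the key type `MyKey` are hypothetical stand-ins for `llvm`, `DenseMapInfo`, and a real key type.

```cpp
// Stand-in for a library namespace and its trait template.
namespace lib {
template <typename T> struct KeyInfo; // primary template, specialized by users
} // namespace lib

struct MyKey {
  unsigned Id;
};

// Old style: reopen the namespace around the specialization.
//   namespace lib {
//   template <> struct KeyInfo<MyKey> { ... };
//   } // namespace lib

// New style: name the template with its namespace qualifier instead.
template <> struct lib::KeyInfo<MyKey> {
  static unsigned getHashValue(const MyKey &K) { return K.Id * 37U; }
  static bool isEqual(const MyKey &A, const MyKey &B) { return A.Id == B.Id; }
};
```

The qualified form keeps the surrounding code out of the library namespace, so helpers declared next to the specialization do not end up there by accident.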

Cleanup the LLVM exported symbols namespace (#161240)

2025-10-01T22:32:07+00:00

There's a pattern throughout LLVM of cl::opts being exported. That in
itself is probably a bit unfortunate, but what's especially bad about it
is that a lot of those symbols are in the global namespace. Move them
into the llvm namespace.

While doing this, I noticed some other variables in the global namespace
and moved them as well.

[profcheck] Require `unknown` metadata have an origin parameter (#157594)

2025-09-10T22:34:35+00:00

Rather than passes using `!prof = !{!"unknown"}` for cases where we don't have enough information to emit profile values, this patch captures the pass (or some other information) that can help diagnostics, i.e. `!{!"unknown", !"some-pass-name"}`.

For example, suppose we emitted a `select` with the unknown metadata, and, later, end up needing to lower that to a conditional branch. If we observe (via sample profiling, for example) that the branch is biased and would have benefitted from a valid profile, the extra information can help speed up debugging.

We can also (in a subsequent pass) generate optimization remarks about such lowered selects, with a similar aim - identify patterns lowering to `select` that may be worth some extra investment in extracting a more precise profile.

[WPD] set the branch funnel function entry count (#155657)

2025-09-08T23:41:33+00:00

We can compute the entry count of branch funnel functions, which potentially avoids them being deemed cold. Keeping profile information coherent is also always good for performance.

Issue #147390

[LLD][COFF] Add more `--time-trace` tags for ThinLTO linking (#156471)

2025-09-05T19:28:19+00:00

In order to better see what's going on during ThinLTO linking, this PR
adds more profile tags when using `--time-trace` on a `lld-link.exe`
invocation.

After this PR, linking `clang.exe`: [time-trace screenshot not preserved]

Linking a custom (Unreal Engine game) binary gives a completely
different picture, probably because of using Unity files and the sheer
number of input files (here, providing over 60 GB of .OBJs/.LIBs).