| Age | Commit message | Author |
|
Implement the MIR2Vec embedder for generating vector representations of Machine IR instructions, basic blocks, and functions. This patch introduces the changes necessary to *embed* machine opcodes. Machine operands will be handled incrementally in upcoming patches.
|
|
This PR introduces the initial infrastructure and vocabulary necessary for generating embeddings for MIR (discussed briefly in the earlier IR2Vec RFC: https://discourse.llvm.org/t/rfc-enhancing-mlgo-inlining-with-ir2vec-embeddings). MIR2Vec embeddings are useful for driving target-specific optimizations that work on MIR, such as register allocation.
(Tracking issue - #141817)
|
|
In this commit:
(1) Added new pass manager support for `ReachingDefAnalysis`.
(2) Added a printer pass.
(3) Made the old pass manager use `ReachingDefInfoWrapperPass`.
|
|
Replace its only use in the AMDGPU R600 backend with a call to
finalizeBundles.
|
|
Same as https://github.com/llvm/llvm-project/pull/138829
Co-authored-by: Oke, Akshat <Akshat.Oke@amd.com>
|
|
I didn't find a test for this (but there are tests for the `Function`
version of this pass).
|
|
This completes the PreEmitPasses.
|
|
or unlikely prefixed sections based on profile information (#125756)
In this PR, the static-data-splitter pass finds local-linkage global
variables in the {`.rodata`, `.data.rel.ro`, `.bss`, `.data`} sections by
analyzing machine instruction operands, and aggregates their accesses
from code across functions.
A follow-up item is to analyze global variable initializers and account
for accesses from data.
* This limitation is demonstrated by `bss2` and `data3` in
`llvm/test/CodeGen/X86/global-variable-partition.ll`.
Some stats of static-data-splitter with this patch:
**section**|**bss**|**rodata**|**data**
:-----:|:-----:|:-----:|:-----:
hot-prefixed section coverage|99.75%|97.71%|91.30%
unlikely-prefixed section size percentage|67.94%|39.37%|63.10%
1. The coverage is defined as `#perf-samples in hot-prefixed <data>
section / #perf-samples in <data>.* sections` for each `<data>` section.
* The perf command samples
`MEM_INST_RETIRED.ALL_LOADS:u:pinned:precise=2` events at a high
frequency (`perf -c 2251`) for 30 seconds. The profiled binary is built
as non-PIE, so `.data.rel.ro` coverage data is not available.
2. The unlikely-prefixed `<data>` section size percentage is defined as
`unlikely <data> section size / the sum size of <data>.* sections` for
each `<data>` section.
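The two ratios defined above can be sketched as a small calculation. This is only an illustration: the function names, the dictionary layout, and all input numbers are hypothetical and are not taken from the patch or from the measurements in the table.

```python
def hot_coverage(hot_samples, total_samples):
    """Share of perf samples for one <data> section family that hit its hot-prefixed section."""
    if total_samples <= 0:
        raise ValueError("total_samples must be positive")
    return hot_samples / total_samples


def unlikely_size_share(section_sizes, unlikely_key="unlikely"):
    """Share of the summed <data>.* section sizes taken by the unlikely-prefixed section."""
    total = sum(section_sizes.values())
    if total <= 0:
        raise ValueError("section sizes must sum to a positive value")
    return section_sizes[unlikely_key] / total


# Made-up inputs for illustration only, not measurements from this patch.
samples_in_hot = 900      # samples landing in the hot-prefixed section
samples_in_all = 1000     # samples landing in any <data>.* section
sizes = {"hot": 100, "unprefixed": 600, "unlikely": 300}  # section sizes in bytes

print(f"coverage: {hot_coverage(samples_in_hot, samples_in_all):.2%}")
print(f"unlikely size share: {unlikely_size_share(sizes):.2%}")
```

High coverage together with a large unlikely share is the desirable combination: most sampled loads hit the hot-prefixed section while a large part of the data sits in the unlikely-prefixed one.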
|
|
This is meant as a preparation for PR #130988 "[AMDGPU] Implement IR
expansion for frem instruction" which implements the expansion of
another instruction in this pass. The more general name seems more
appropriate given this change and quite reasonable even without it.
|
|
EnableTailMerge is false by default and is handled by the pass builder.
Passes are independent of target pipeline options.
This completes the generic `MachineLateOptimization` passes for the NPM
pipeline.
|
|
Targets can set the `EnableSinkAndFold` option in `CGPassBuilderOptions` for
the NPM pipeline in `buildCodeGenPipeline(... &Opts, ...)`.
|
|
Leaving out NPM command line support for the next patch.
|
|
There are no standalone tests for this pass for backends implementing
the NPM yet.
|
|
`RegisterClassInfo` was supposed to be kept alive between pass runs, but
it wasn't, which led to recomputation and increased compile time.
Now the Impl class is a member of both the legacy and new passes, so it
is not reconstructed on every pass run.
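The effect of owning the Impl as a member can be modelled with a toy example. This is a language-neutral sketch written in Python for brevity, not the actual LLVM C++ code; every class name below is hypothetical, and `ExpensiveInfo` merely stands in for state such as `RegisterClassInfo`.

```python
class ExpensiveInfo:
    """Hypothetical stand-in for state that is costly to rebuild."""
    built = 0  # counts how many times the state was constructed

    def __init__(self):
        ExpensiveInfo.built += 1


class Impl:
    """Hypothetical Impl class that owns the expensive state."""

    def __init__(self):
        self.info = ExpensiveInfo()

    def run(self, function_name):
        return f"processed {function_name}"


class PassRebuildingImpl:
    """Models the old behaviour: a fresh Impl on every pass run."""

    def run(self, function_name):
        return Impl().run(function_name)


class PassOwningImpl:
    """Models the new behaviour: the Impl is a member and is reused."""

    def __init__(self):
        self.impl = Impl()

    def run(self, function_name):
        return self.impl.run(function_name)


def count_builds(pass_factory, function_names):
    """Run one pass object over several functions and report constructions."""
    ExpensiveInfo.built = 0
    pass_obj = pass_factory()
    for name in function_names:
        pass_obj.run(name)
    return ExpensiveInfo.built


functions = ["f", "g", "h"]
print(count_builds(PassRebuildingImpl, functions))
print(count_builds(PassOwningImpl, functions))
```

In this model the rebuilding variant constructs the state once per function, while the owning variant constructs it once per pass object.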
---------
Co-authored-by: Christudasan Devadasan <christudasan.devadasan@amd.com>
|
|
This reverts commit 5aa4979c47255770cac7b557f3e4a980d0131d69 while I
investigate what's causing the compile-time regression.
|
|
https://discourse.llvm.org/t/rfc-profile-guided-static-data-partitioning/83744
proposes to partition static data sections.
This patch introduces a codegen pass that records jump table hotness in
in-memory state (machine jump table info and entries). Target lowering
and the asm printer consume this state and produce the `.hot` section
suffix. The follow-up PR
https://github.com/llvm/llvm-project/pull/122215 implements those
changes.
---------
Co-authored-by: Ellis Hoag <ellis.sparky.hoag@gmail.com>
|
|
This registers the pass with PassRegistry so we can use -start-before
and other options for machine-function-splitter.
|
|
The existing analysis was already a pimpl wrapper.
I have extracted the legacy pass logic into an LDVImpl wrapper named
`LiveDebugVariables`, which is now the analysis::Result. It controls
whether to activate LDV (depending on `-live-debug-variables` and
`DISubprogram`).
The legacy and new analyses only construct the LiveDebugVariables.
VirtRegRewriter will test this.
|
|
With this, all machine SSA optimization passes are available in the new codegen pipeline.
|
|