path: root/bolt/lib/Passes/SplitFunctions.cpp
Age | Commit message | Author
2025-10-14 | [bolt] Fix typos discovered by codespell (#124726) | Christian Clauss
https://github.com/codespell-project/codespell

```bash
codespell bolt --skip="*.yaml,Maintainers.txt" --write-changes \
  --ignore-words-list=acount,alledges,ans,archtype,defin,iself,mis,mmaped,othere,outweight,vas
```
2025-10-03 | [BOLT][AArch64] Refuse to run CDSplit pass (#159351) | Paschalis Mpeis
LongJmp does not support warm blocks. On builds without assertions, this may lead to unexpected crashes. This patch exits with a clear message.
2024-11-22 | [BOLT] Use compact EH format for fixed-address executables (#117274) | Maksim Panchenko
Use ULEB128 format for emitting LSDAs for fixed-address executables, similar to what we use for PIEs/DSOs. The main difference is that we don't use landing pad trampolines when landing pads are not contained in a single fragment. Instead, we fall back to emitting larger fixed-address LSDAs, which is still better than adding trampoline instructions.
2024-11-21 | [BOLT] Avoid EH trampolines for PIEs/DSOs (#117106) | Maksim Panchenko
We used to emit EH trampolines for PIE/DSO whenever a function fragment contained a landing pad outside of it. However, it is common to have all landing pads in a cold fragment even when their throwers are in a hot one. To reduce the number of trampolines, analyze landing pads for any given function fragment, and if they all belong to the same (possibly different) fragment, designate that fragment as a landing pad fragment for the "thrower" fragment. Later, emit landing pad fragment symbol as an LPStart for the thrower LSDA.
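The landing pad analysis described above can be illustrated with a minimal sketch. It assumes a simplified, hypothetical model in which a thrower fragment is represented only by the list of fragment indices that contain its landing pads; `findLandingPadFragment` and `FragmentIndex` are illustrative names, not BOLT API.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical model: fragments are identified by their index in the function.
using FragmentIndex = unsigned;

// PadFragments holds, for each landing pad referenced by one thrower
// fragment, the index of the fragment that contains that landing pad.
// Returns the single fragment holding all of them, or std::nullopt when
// there are no landing pads or they are spread over several fragments.
std::optional<FragmentIndex>
findLandingPadFragment(const std::vector<FragmentIndex> &PadFragments,
                       unsigned NumFragments) {
  std::optional<FragmentIndex> Result;
  for (FragmentIndex F : PadFragments) {
    assert(F < NumFragments && "landing pad in unknown fragment");
    if (Result && *Result != F)
      return std::nullopt; // pads live in more than one fragment
    Result = F;
  }
  return Result;
}
```

In this model, a returned index is the fragment whose symbol could serve as LPStart for the thrower's LSDA; an empty result corresponds to the case where some other mechanism is still needed.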
2024-05-01 | [BOLT] Add split function support for the Linux kernel (#90541) | Maksim Panchenko
While rewriting the Linux kernel, we try to fit optimized functions into their original boundaries. When a function becomes larger, we skip it during the rewrite and end up with a less than optimal code layout. To overcome that issue, add support for the --split-functions option so that the hot part of the function can fit into the original space. The cold part should go to reserved space in the binary.
2024-03-31 | [BOLT][NFC] Clean includes, add license headers (#87200) | Amir Ayupov
2024-02-12 | [BOLT][NFC] Log through JournalingStreams (#81524) | Amir Ayupov
Make core BOLT functionality more friendly to being used as a library instead of in our standalone driver llvm-bolt. To accomplish this, we augment BinaryContext with journaling streams that are to be used by most BOLT code whenever something needs to be logged to the screen. Users of the library can decide if logs should be printed to a file, no file or to the screen, as before. To illustrate this, this patch adds a new option `--log-file` that allows the user to redirect BOLT logging to a file on disk or completely hide it by using `--log-file=/dev/null`. Future BOLT code should now use `BinaryContext::outs()` for printing important messages instead of `llvm::outs()`. A new test log.test enforces this by verifying that no strings are printed to the screen once the `--log-file` option is used. In previous patches we also added a new BOLTError class to report common and fatal errors, so code shouldn't call exit(1) now. To easily handle problems as before (by quitting with exit(1)), callers can now use `BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code needs to deal with BOLT errors. To test this, we have fatal.s that checks we are correctly quitting and printing a fatal error to the screen. Because this is a significant change by itself, not all code was yet ported. Code from Profiler libs (DataAggregator and friends) still prints errors directly to the screen. Co-authored-by: Rafael Auler <rafaelauler@fb.com> Test Plan: NFC
2024-02-12 | [BOLT][NFC] Return Error from BinaryFunctionPass::runOnFunctions (#81521) | Amir Ayupov
As part of the effort to refactor old error handling code that would directly call exit(1), in this patch we change the interface of `BinaryFunctionPass` to return an Error from `runOnFunctions()`. This gives passes the ability to report a serious problem to the caller (RewriteInstance class), so the caller may decide how to best handle the exceptional situation. Co-authored-by: Rafael Auler <rafaelauler@fb.com> Test Plan: NFC
2023-12-21 | [BOLT] Don't split likely fallthrough in CDSplit (#76164) | ShatianWang
This diff speeds up CDSplit by not considering any hot-warm splitting point that could break a fall-through branch from a basic block to its most likely successor. Co-authored-by: spupyrev <spupyrev@fb.com>
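A minimal sketch of this pruning rule follows, under the assumption that blocks are numbered by their layout position and that a split at index I places blocks 0..I in the hot fragment and the rest in the warm fragment. The function name and the data representation are hypothetical and are not taken from SplitFunctions.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// MostLikelySucc[I] is the layout index of block I's most likely successor,
// or std::nullopt if the block has no successors.
// Returns every split index I that does not separate block I from a most
// likely successor that immediately follows it in the layout.
std::vector<std::size_t> candidateSplitIndices(
    const std::vector<std::optional<std::size_t>> &MostLikelySucc) {
  std::vector<std::size_t> Candidates;
  const std::size_t NumBlocks = MostLikelySucc.size();
  // The last block cannot be a split index: nothing would follow it.
  for (std::size_t I = 0; I + 1 < NumBlocks; ++I) {
    const std::optional<std::size_t> &Succ = MostLikelySucc[I];
    assert((!Succ || *Succ < NumBlocks) && "successor out of range");
    if (Succ && *Succ == I + 1)
      continue; // splitting here would break a likely fall-through
    Candidates.push_back(I);
  }
  return Candidates;
}
```

Only the surviving indices would then need to be scored, which is where the speedup in this model comes from.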
2023-12-01 | [BOLT][NFC] Remove unused code for CDSplit (#74136) | ShatianWang
This diff removes JumpInfo related code that is no longer needed by CDSplit from SplitFunctions.cpp.
2023-11-30 | [BOLT] CDSplit main logic part 2/2 (#74032) | ShatianWang
This diff implements the main splitting logic of CDSplit. CDSplit processes functions in a binary in parallel. For each function BF, it assumes that all other functions are hot-cold split. For each possible hot-warm split point of BF, it computes its corresponding SplitScore, and chooses the split point with the best SplitScore. The SplitScore of each split point is computed in the following way: each call edge or jump edge has an edge score that is proportional to its execution count, and inversely proportional to its distance. The SplitScore of a split point is a sum of edge scores over a fixed set of edges whose distance can change due to hot-warm splitting BF. This set contains all cover calls in the form of X->Y or Y->X given function order [... X ... BF ... Y ...]; we refer to the sum of edge scores over the set of cover calls as CoverCallScore. This set also contains all jump edges (branches) within BF as well as all call edges originating from BF; we refer to the sum of edge scores over this set of edges as LocalScore. CDSplit finds the split index maximizing CoverCallScore + LocalScore.
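The selection step can be sketched as follows. This is an illustration only: it assumes the simplest score consistent with the description, count divided by distance, and it assumes the caller has already computed, for every candidate split index, the affected edges with the distances they would have under that split. The exact formula and data structures in BOLT may differ; `Edge`, `edgeScore`, `splitScore`, and `bestSplitIndex` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Edge {
  std::uint64_t Count;    // execution count of the call or jump
  std::uint64_t Distance; // distance between source and target, must be > 0
};

// Assumed edge score: proportional to count, inversely proportional to
// distance.
double edgeScore(const Edge &E) {
  assert(E.Distance > 0 && "distance must be positive");
  return static_cast<double>(E.Count) / static_cast<double>(E.Distance);
}

// Sum of edge scores over the affected edges (cover calls plus local jumps
// and calls), i.e. CoverCallScore + LocalScore in the text's terms.
double splitScore(const std::vector<Edge> &Edges) {
  double Score = 0.0;
  for (const Edge &E : Edges)
    Score += edgeScore(E);
  return Score;
}

// EdgesPerSplit[I] holds the affected edges with the distances they would
// have if the function were split at candidate I. Returns the position of
// the candidate with the highest score (the first one on ties).
std::size_t
bestSplitIndex(const std::vector<std::vector<Edge>> &EdgesPerSplit) {
  assert(!EdgesPerSplit.empty() && "need at least one candidate");
  std::size_t Best = 0;
  double BestScore = splitScore(EdgesPerSplit[0]);
  for (std::size_t I = 1; I < EdgesPerSplit.size(); ++I) {
    const double Score = splitScore(EdgesPerSplit[I]);
    if (Score > BestScore) {
      BestScore = Score;
      Best = I;
    }
  }
  return Best;
}
```

For example, with two candidates whose edges are {100/50, 10/5} and {100/10, 10/20}, the scores are 4.0 and 10.5, so the second candidate wins because it shortens the hottest edge.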
2023-11-30 | [BOLT] CDSplit main logic part 1/2 (#73895) | ShatianWang
This diff defines and initializes auxiliary variables used by CDSplit and implements two important helper functions. The first helper function approximates the block level size increase if a function is hot-warm split at a given split index (X86 specific). The second helper function finds all calls in the form of X->Y or Y->X for each BF given function order [... X ... BF ... Y ...]. These calls are referred to as "cover calls". Their distance will decrease if BF's hot fragment size is further reduced by hot-warm splitting. NFC.
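The notion of a cover call can be made concrete with a short sketch. It assumes functions are identified by their position in the fixed function order and that calls are given as (caller position, callee position) pairs; `Call`, `isCoverCall`, and `findCoverCalls` are hypothetical names, not the helper added by this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A call edge between two functions, each identified by its position in
// the function order.
struct Call {
  std::size_t CallerPos;
  std::size_t CalleePos;
};

// A call covers BF when its endpoints lie strictly on opposite sides of BF
// in the function order: X->Y or Y->X for [... X ... BF ... Y ...].
bool isCoverCall(const Call &C, std::size_t BFPos) {
  return (C.CallerPos < BFPos && BFPos < C.CalleePos) ||
         (C.CalleePos < BFPos && BFPos < C.CallerPos);
}

// Collects all cover calls of the function at position BFPos.
std::vector<Call> findCoverCalls(const std::vector<Call> &Calls,
                                 std::size_t BFPos,
                                 std::size_t NumFunctions) {
  assert(BFPos < NumFunctions && "BF must be part of the function order");
  std::vector<Call> Result;
  for (const Call &C : Calls) {
    assert(C.CallerPos < NumFunctions && C.CalleePos < NumFunctions);
    if (isCoverCall(C, BFPos))
      Result.push_back(C);
  }
  return Result;
}
```

Calls that start or end in BF itself are not cover calls in this model; the later commit accounts for those separately as part of LocalScore.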
2023-11-29 | [BOLT] Create .text.warm for 3-way splitting (#73863) | ShatianWang
This commit explicitly adds a warm code section, .text.warm, when -split-functions -split-strategy=cdsplit is used. This replaces the previous approach of using .text.cold.0 as warm and .text.cold.1 as cold in 3-way function splitting. NFC.
2023-11-29 | [BOLT] Add structure of CDSplit to SplitFunctions (#73430) | ShatianWang
This commit establishes the general structure of the CDSplit strategy in SplitFunctions without incorporating the exact splitting logic. With -split-functions -split-strategy=cdsplit, the SplitFunctions pass will run twice: the first time is before function reordering and functions are hot-cold split; the second time is after function reordering and functions are hot-warm-cold split based on the fixed function ordering. Currently, all functions are hot-warm split after the entry block in the second splitting pass. Subsequent commits will introduce the precise splitting logic. NFC.
2023-02-27 | [BOLT][NFC] Log reversing splitting decision | Amir Ayupov
Expose log for testing purposes. Reviewed By: #bolt, maksfb Differential Revision: https://reviews.llvm.org/D144674
2022-12-06 | [BOLT][NFC] Use std::optional in MCPlusBuilder | Amir Ayupov
Reviewed By: maksfb, #bolt Differential Revision: https://reviews.llvm.org/D139260
2022-09-08 | [BOLT] Emit LSDA call sites for all fragments | Fabian Parzefall
For exception handling, LSDA call sites have to be emitted for each fragment individually. With this patch, call sites and respective LSDA symbols are generated and associated with each fragment of their function, such that they can be used by the emitter. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D132052
2022-09-08 | [BOLT] Fragment all blocks (not just outlineable blocks) | Fabian Parzefall
To enable split strategies that require a view of the entire CFG (e.g. to estimate the cost of a path from the entry block), with this patch, all blocks of a function are passed to `SplitStrategy::fragment`. Because this might move non-outlineable blocks into a split fragment, these blocks are moved back into the main fragment after fragmenting. This also gives strategies the option to specify whether empty fragments should be kept or removed. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D132423
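The two-step flow described here (let the strategy see every block, then undo assignments for blocks that must not be outlined) might look roughly like the sketch below. `Block`, `FragmentFn`, and `splitFunction` are simplified stand-ins chosen for illustration; they do not reproduce the real BOLT types or signatures.

```cpp
#include <cassert>
#include <functional>
#include <vector>

struct Block {
  bool CanOutline;   // false for blocks that must stay in the main fragment
  unsigned Fragment; // 0 denotes the main fragment
};

// A strategy is modeled as any callable that assigns fragment numbers.
using FragmentFn = std::function<void(std::vector<Block> &)>;

void splitFunction(std::vector<Block> &Blocks, const FragmentFn &Strategy) {
  assert(!Blocks.empty() && "a function has at least an entry block");
  // Step 1: the strategy sees all blocks, outlineable or not.
  Strategy(Blocks);
  // Step 2: move non-outlineable blocks back into the main fragment.
  for (Block &B : Blocks)
    if (!B.CanOutline)
      B.Fragment = 0;
}
```

Keeping the fix-up outside the strategy means each strategy can reason about the whole CFG without having to know which blocks are outlineable.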
2022-09-08 | [BOLT] Introduce SplitStrategy ABC | Fabian Parzefall
This introduces an abstract base class for splitting strategies to document the interface a strategy needs to implement, and also to avoid code bloat of the `splitFunction` method. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D132054
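As a rough illustration of the pattern, the sketch below defines a small abstract base class and one concrete strategy that places each outlineable block in its own fragment. The method names, signatures, and the `Block` type are assumptions made for this example; the actual `SplitStrategy` interface in BOLT is not reproduced here.

```cpp
#include <cassert>
#include <vector>

struct Block {
  bool CanOutline;   // false for blocks that must stay in the main fragment
  unsigned Fragment; // 0 denotes the main fragment
};

// Hypothetical strategy interface: one decision hook and one hook that
// assigns fragment numbers.
class SplitStrategy {
public:
  virtual ~SplitStrategy() = default;
  virtual bool canSplit(const std::vector<Block> &Blocks) const = 0;
  virtual void fragment(std::vector<Block> &Blocks) const = 0;
};

// Example strategy: every outlineable block gets a fragment of its own.
class SplitAllSketch final : public SplitStrategy {
public:
  bool canSplit(const std::vector<Block> &Blocks) const override {
    return Blocks.size() > 1;
  }
  void fragment(std::vector<Block> &Blocks) const override {
    unsigned Next = 1;
    for (Block &B : Blocks)
      B.Fragment = B.CanOutline ? Next++ : 0;
  }
};

// The driver stays small because strategy-specific logic lives in the
// subclasses.
void splitFunction(std::vector<Block> &Blocks, const SplitStrategy &S) {
  assert(!Blocks.empty() && "a function has at least an entry block");
  if (S.canSplit(Blocks))
    S.fragment(Blocks);
}
```

Adding a new strategy in this model means adding a subclass, with no change to `splitFunction`.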
2022-09-03 | [BOLT] Use range-based for loops (NFC) | Kazu Hirata
LLVM Coding Standards discourage for_each unless callable objects already exist.
2022-08-18 | [BOLT] Insert EH trampolines for multiple fragments | Fabian Parzefall
This patch adds exception handling trampolines when a function is split into more than two fragments. Trampolines are tracked per-fragment, such that they can be removed if splitting is reversed. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D132048
2022-08-18 | [BOLT] Add randomN split strategy | Fabian Parzefall
This adds a strategy to split functions into a random number of fragments at randomly chosen split points. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D130647
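One way to realize such a strategy is sketched below: pick a random number of split points, pick that many distinct positions, and start a new fragment at each of them. The use of `std::mt19937`, the uniform choice of the number of split points, and the function name `randomFragments` are assumptions for illustration, not a description of the BOLT implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Returns, for each block in layout order, the fragment it is assigned to.
// Fragment numbers start at 0 and increase by one at every split point.
std::vector<unsigned> randomFragments(std::size_t NumBlocks,
                                      std::mt19937 &Gen) {
  assert(NumBlocks > 0 && "a function has at least an entry block");
  // Candidate split points: "start a new fragment at block I", I = 1..N-1.
  std::vector<std::size_t> Points(NumBlocks - 1);
  std::iota(Points.begin(), Points.end(), std::size_t{1});
  std::shuffle(Points.begin(), Points.end(), Gen);
  // Keep a random number of them (possibly none, possibly all).
  std::uniform_int_distribution<std::size_t> NumPoints(0, Points.size());
  Points.resize(NumPoints(Gen));
  std::sort(Points.begin(), Points.end());

  std::vector<unsigned> Fragment(NumBlocks, 0);
  std::size_t Next = 0;
  unsigned Current = 0;
  for (std::size_t I = 0; I < NumBlocks; ++I) {
    if (Next < Points.size() && Points[Next] == I) {
      ++Current;
      ++Next;
    }
    Fragment[I] = Current;
  }
  return Fragment;
}
```

Seeding the generator explicitly keeps a test run reproducible, which matters when random layouts are used to shake out bugs.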
2022-08-18 | [BOLT] Add split all blocks strategy | Fabian Parzefall
This adds a function splitting strategy that splits each outlineable basic block into its own fragment. This is exposed through a new command line option `--split-strategy`. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D129827
2022-08-18 | [BOLT] Generate sections for multiple fragments | Fabian Parzefall
This patch adds support to generate any number of sections that are assigned to fragments of functions that are split more than two-way. With this, a function's *nth* split fragment goes into section `.text.cold.n`. This also changes `FunctionLayout::erase` to make sure that there are no empty fragments at the end of the function. This sometimes happens when blocks are erased from the function. To avoid creating symbols pointing to these fragments, they need to be removed. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D130521
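The trailing-fragment cleanup can be pictured with a small sketch. It models a layout as a list of fragments, each a list of block identifiers, and is not the `FunctionLayout::erase` implementation; `eraseTrailingEmptyFragments` is a hypothetical name.

```cpp
#include <cassert>
#include <vector>

// Each inner vector is one fragment; index 0 is the main fragment.
using Layout = std::vector<std::vector<int>>;

// Drops empty fragments from the end of the layout so that no symbol would
// be created for them. The main fragment is always kept, and empty
// fragments in the middle are left untouched.
void eraseTrailingEmptyFragments(Layout &Fragments) {
  assert(!Fragments.empty() && "layout must contain the main fragment");
  while (Fragments.size() > 1 && Fragments.back().empty())
    Fragments.pop_back();
}
```

For a layout such as {{1, 2}, {}, {3}, {}, {}}, only the last two fragments are removed.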
2022-07-16 | [BOLT] Add function layout class | Fabian Parzefall
This patch adds a dedicated class to keep track of each function's layout. It also lays the groundwork for splitting functions into multiple fragments (as opposed to a strict hot/cold split). Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D129518
2022-07-14 | [BOLT] Replace uses of layout with basic block list | Fabian Parzefall
As we are moving towards support for multiple fragments, loops that iterate over all basic blocks of a function, but do not depend on the order of basic blocks in the final layout, should iterate over binary functions directly, rather than the layout. Eventually, all loops using the layout list should either iterate over the function, or be aware of multiple layouts. This patch replaces references to the binary function's block layout with the binary function itself where only small code changes are necessary. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D129585
2022-06-29 | [BOLT] Fix EH trampoline backout code | Maksim Panchenko
When the SplitFunctions pass adds trampoline code for exception landing pads (limited to shared objects), it may increase the size of the hot fragment, making it larger than the whole function pre-split. When this happens, the pass reverts the splitting action by restoring the original block order and marking all blocks hot. However, if createEHTrampolines() added new blocks to the CFG and modified invoke instructions, simply restoring the original block layout will not suffice as the new CFG has more blocks. For proper backout of the split, modify the original layout by merging in trampoline blocks immediately before their matching targets. As a result, the number of blocks increases, but the number of instructions and the function size remain the same as pre-split. Add an assertion for the number of blocks when updating a function layout. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D128696
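The merge step of the backout can be sketched as follows, assuming blocks are identified by name and trampolines are given as a map from target block to trampoline block. The representation and the name `mergeTrampolines` are hypothetical; the block-count assertion mirrors the kind of check the commit message mentions.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Rebuilds the pre-split layout with every trampoline block placed
// immediately before the block it targets.
// TrampolinesByTarget maps a target block name to a trampoline block name.
std::vector<std::string> mergeTrampolines(
    const std::vector<std::string> &OriginalLayout,
    const std::multimap<std::string, std::string> &TrampolinesByTarget) {
  std::vector<std::string> NewLayout;
  for (const std::string &BB : OriginalLayout) {
    auto Range = TrampolinesByTarget.equal_range(BB);
    for (auto It = Range.first; It != Range.second; ++It)
      NewLayout.push_back(It->second); // trampoline goes right before BB
    NewLayout.push_back(BB);
  }
  // Every trampoline must have found its target in the original layout.
  assert(NewLayout.size() ==
             OriginalLayout.size() + TrampolinesByTarget.size() &&
         "block count mismatch after merging trampolines");
  return NewLayout;
}
```

With layout {entry, body, lp} and one trampoline for `lp`, the result is {entry, body, lp.tramp, lp}: one more block than before the split, in the same relative order.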
2022-06-29 | [BOLT] Add option to randomize function split point | Fabian Parzefall
For test purposes, we want to split functions at a random split point to be able to test different layouts without relying on the profile. This patch introduces an option that randomly chooses a split point to partition blocks of a function into hot and cold regions. Reviewed By: Amir, yota9 Differential Revision: https://reviews.llvm.org/D128773
2022-06-24 | [BOLT] Mark option values of --split-functions deprecated | Fabian Parzefall
The SplitFunctions pass does not distinguish between various splitting modes anymore. This change updates the command line interface to reflect this behavior by deprecating values passed to the --split-functions option. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D128558
2022-06-23 | [BOLT][NFC] Use range-based STL wrappers | Amir Ayupov
Replace `std::` algorithms taking begin/end iterators with `llvm::` counterparts accepting ranges. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D128154
2022-06-19 | [BOLT] Split functions with exceptions in shared objects and PIEs | Maksim Panchenko
Add functionality to allow splitting code with C++ exceptions in shared libraries and PIEs. To overcome a limitation in exception ranges format, for functions with fragments spanning multiple sections, add trampoline landing pads in the same section as the corresponding throwing range. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D127936
2022-06-05 | [bolt] Remove unneeded cl::ZeroOrMore for cl::opt options | Fangrui Song
2021-12-28 | [BOLT][NFC] Fix braces usage in Passes | Amir Ayupov
Summary: Refactor bolt/*/Passes to follow the braces rule for if/else/loop from [LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html). (cherry picked from FBD33344642)
2021-12-21 | [BOLT][NFC] Fix file-description comments | Maksim Panchenko
Summary: Fix comments at the start of source files. (cherry picked from FBD33274597)
2021-12-14 | [BOLT][NFC] Reformat with clang-format | Maksim Panchenko
Summary: Selectively apply clang-format to BOLT code base. (cherry picked from FBD33119052)
2021-12-08 | [BOLT] Use more ADT data structures for BinaryFunction | Maksim Panchenko
Summary: Switched members of BinaryFunction to ADT where it was possible and made sense. As a result, the size of BinaryFunction on x86-64 Linux reduced from 1624 bytes to 1448. (cherry picked from FBD32981555)
2021-10-08 | Rebase: [NFC] Refactor sources to be buildable in shared mode | Rafael Auler
Summary: Moves source files into separate components and makes the components' dependencies on each other explicit, so the LLVM build system knows how to build BOLT with BUILD_SHARED_LIBS=ON. Please use the -c merge.renamelimit=230 git option when rebasing your work on top of this change. To achieve this, we create a new library to hold core IR files (most classes beginning with Binary in their names), a new library to hold Utils, some command line options shared across both RewriteInstance and core IR files, a new library called Rewrite to hold most classes concerned with running top-level functions coordinating the binary rewriting process, and a new library called Profile to hold classes dealing with profile reading and writing. To remove the dependency from BinaryContext into X86-specific classes, we do some refactoring on the BinaryContext constructor to receive a reference to the specific backend directly from RewriteInstance. Then, the dependency on X86 or AArch64-specific classes is transferred to the Rewrite library. We can't have the Core library depend on targets because targets depend on Core (which would create a cycle). Files implementing the entry point of a tool are transferred to the tools/ folder. All header files are transferred to the include/ folder. The src/ folder was renamed to lib/. (cherry picked from FBD32746834)