summaryrefslogtreecommitdiff
path: root/llvm/test/Bitcode/thinlto-function-summary-callgraph.ll
AgeCommit message (Collapse)Author
2024-08-27Reapply: Use an abbrev to reduce size of VALUE_GUID records in ThinLTO ↵Jan Voung
summaries (#106165) This retries #90692 which was reverted previously due to issues with lld-available being set, even if the copy of lld is not built from source. This does not change any code compared to #90692 to address the lld-available issue. The main change w.r.t, lld-available is xfailing tests in PR #99056 (until a longer term fix is available).
2024-05-06Revert "Reapply "Use an abbrev to reduce size of VALUE_GUID records in ↵Jan Voung
ThinLTO summaries" (#90610)" (#91194) Reverts llvm/llvm-project#90692 Breaking PPC buildbots. The bots are not meant to test LLD, but are running a test that is using an old version of LLD without the change (so is incompatible). Revert until a fix is found.
2024-05-01Reapply "Use an abbrev to reduce size of VALUE_GUID records in ThinLTO ↵Jan Voung
summaries" (#90610) (#90692) This reverts commit 2aabfc811670beb843074c765c056fff4a7b443b. Add fixes to LLD and Gold tests missed in original change. Co-authored-by: Jan Voung <jvoung@google.com>
2024-04-30Revert "Use an abbrev to reduce size of VALUE_GUID records in ThinLTO ↵Jan Voung
summaries" (#90610) Reverts llvm/llvm-project#90497 Broke some LLD tests.
2024-04-30Use an abbrev to reduce size of VALUE_GUID records in ThinLTO summaries (#90497)Jan Voung
GUID often have content in the higher bits of a 64-bit entry so using the unabbrev encoding is inefficient (lots of VBR control bits). Instead, use an abbrev with two 32-bit fixed width chunks. The abbrev also helps encode the "count" in one place instead of in every record. Reduces size of distributed backend summary files by 8.7% in one example app. Co-authored-by: Jan Voung <jvoung@google.com>
2023-12-06[ThinLTO] Add tail call flag to call edges in summary (#74043)Teresa Johnson
This adds support for a HasTailCall flag on function call edges in the ThinLTO summary. It is intended for use in aiding discovery of missing frames from tail calls in profiled call stacks for MemProf of profiled binaries that did not disable tail call elimination. A follow on change will add the use of this new flag during MemProf context disambiguation. The new flag is encoded in the bitcode along with either the hotness flag from the profile, or the relative block frequency under the -write-relbf-to-summary flag when there is no profile data. Because we now will always have some additional call edge information, I have removed the non-profile function summary record format, and we simply encode the tail call flag along with a hotness type of none when there is no profile information or relative block frequency. The change of record format and name caused most of the test case changes. I have added explicit testing of generation of the new tail call flag into the bitcode and IR assembly format as part of the changes to llvm/test/Bitcode/thinlto-function-summary-refgraph.ll. I have also added round trip testing through assembly and bitcode to llvm/test/Assembler/thinlto-summary.ll.
2023-04-20[ThinLTO] Remove BlockCount for non partial sample profile buildsTeresa Johnson
As pointed out in https://discourse.llvm.org/t/undeterministic-thin-index-file/69985, the block count added to distributed ThinLTO index files breaks incremental builds on ThinLTO - if any linked file has a different number of BBs, then the accumulated sum placed in the index files will change, causing all ThinLTO backend compiles to be redone. The block count is only used for scaling of partial sample profiles, and was added in D80403 for D79831. This patch simply removes this field from the index files of non partial sample profile compiles, which is NFC on the output of the compiler. We subsequently need to see if this can be removed for partial sample profiles without signficant performance loss, or redesigned in a way that does not destroy caching. Differential Revision: https://reviews.llvm.org/D148746
2020-05-28[ThinLTO] Compute the basic block count across modules.Hiroshi Yamauchi
Summary: Count the per-module number of basic blocks when the module summary is computed and sum them up during Thin LTO indexing. This is used to estimate the working set size under the partial sample PGO. This is split off of D79831. Reviewers: davidxl, espindola Subscribers: emaste, inglorion, hiraditya, MaskRay, steven_wu, dexonsmith, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80403
2019-07-05[ThinLTO] Attempt to recommit r365188 after alignment fixEugene Leviant
llvm-svn: 365215
2019-07-05Reverted r365188 due to alignment problems on i686-androidEugene Leviant
llvm-svn: 365206
2019-07-05[ThinLTO] Attempt to recommit r365040 after caching fixEugene Leviant
It's possible that some function can load and store the same variable using the same constant expression: store %Derived* @foo, %Derived** bitcast (%Base** @bar to %Derived**) %42 = load %Derived*, %Derived** bitcast (%Base** @bar to %Derived**) The bitcast expression was mistakenly cached while processing loads, and never examined later when processing store. This caused @bar to be mistakenly treated as read-only variable. See load-store-caching.ll. llvm-svn: 365188
2019-07-04Revert [ThinLTO] Optimize writeonly globals outReid Kleckner
This reverts r365040 (git commit 5cacb914758c7f436b47c8362100f10cef14bbc4) Speculatively reverting, since this appears to have broken check-lld on Linux. Partial analysis in https://crbug.com/981168. llvm-svn: 365097
2019-07-03[ThinLTO] Optimize writeonly globals outEugene Leviant
Differential revision: https://reviews.llvm.org/D63444 llvm-svn: 365040
2019-01-11[LTO] Record whether LTOUnit splitting is enabled in indexTeresa Johnson
Summary: Records in the module summary index whether the bitcode was compiled with the option necessary to enable splitting the LTO unit (e.g. -fsanitize=cfi, -fwhole-program-vtables, or -fsplit-lto-unit). The information is passed down to the ModuleSummaryIndex builder via a new module flag "EnableSplitLTOUnit", which is propagated onto a flag on the summary index. This is then used during the LTO link to check whether all linked summaries were built with the same value of this flag. If not, an error is issued when we detect a situation requiring whole program visibility of the class hierarchy. This is the case when both of the following conditions are met: 1) We are performing LowerTypeTests or Whole Program Devirtualization. 2) There are type tests or type checked loads in the code. Note I have also changed the ThinLTOBitcodeWriter to also gate the module splitting on the value of this flag. Reviewers: pcc Subscribers: ormris, mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D53890 llvm-svn: 350948
2018-12-13[ThinLTO] Compute synthetic function entry countEaswaran Raman
Summary: This patch computes the synthetic function entry count on the whole program callgraph (based on module summary) and writes the entry counts to the summary. After function importing, this count gets attached to the IR as metadata. Since it adds a new field to the summary, this bumps up the version. Reviewers: tejohnson Subscribers: mehdi_amini, inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D43521 llvm-svn: 349076
2018-11-16[ThinLTO] Internalize readonly globalsEugene Leviant
An attempt to recommit r346584 after failure on OSX build bot. Fixed cache key computation in ThinLTOCodeGenerator and added test case llvm-svn: 347033
2018-11-13Revert "[ThinLTO] Internalize readonly globals"Steven Wu
This reverts commit 10c84a8f35cae4a9fc421648d9608fccda3925f2. llvm-svn: 346768
2018-11-10[ThinLTO] Internalize readonly globalsEugene Leviant
This patch allows internalising globals if all accesses to them (from live functions) are from non-volatile load instructions Differential revision: https://reviews.llvm.org/D49362 llvm-svn: 346584
2018-02-07[ThinLTO] Serialize WithGlobalValueDeadStripping index flag for distributed ↵Teresa Johnson
backends Summary: A recent fix to drop dead symbols (r323633) did not work for ThinLTO distributed backends because we lose the WithGlobalValueDeadStripping set on the index during the thin link. This patch adds a new flags record to the bitcode format for the index, and serializes this flag for the combined index (it would always be 0 for the per-module index generated by the compile step, so no need to serialize the new flags record there until/unless we add another flag that applies to the per-module indexes). Generally this flag should always be set for the distributed backends, which are necessarily performed after the thin link. However, if we were to simply set this flag on the index applied to the distributed backends (invoked via clang), we would lose the ability to disable dead stripping via -compute-dead=false for debugging purposes. Reviewers: grimar, pcc Subscribers: mehdi_amini, inglorion, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D42799 llvm-svn: 324444
2017-08-04[ThinLTO] Add FunctionAttrs to ThinLTO indexCharles Saternos
Adds function attributes to index: ReadNone, ReadOnly, NoRecurse, NoAlias. This attributes will be used for future ThinLTO optimizations that will propagate function attributes across modules. llvm-svn: 310061
2017-06-27Bitcode: Write the irsymtab to disk.Peter Collingbourne
Differential Revision: https://reviews.llvm.org/D33973 llvm-svn: 306487
2017-05-31[ThinLTO] Reduce unnecessary map lookups during combined summary writeTeresa Johnson
Summary: Don't assign values to undefined references, simply don't emit those reference edges as they are not useful (we were already not emitting call edges to undefined refs). Also, streamline the later lookup of value ids when writing the summaries, by combining the check for value id existence with the access of that value id. Reviewers: pcc Subscribers: Prazek, llvm-commits, inglorion Differential Revision: https://reviews.llvm.org/D33634 llvm-svn: 304323
2017-04-17Bitcode: Add a string table to the bitcode format.Peter Collingbourne
Add a top-level STRTAB block containing a string table blob, and start storing strings for module codes FUNCTION, GLOBALVAR, ALIAS, IFUNC and COMDAT in the string table. This change allows us to share names between globals and comdats as well as between modules, and improves the efficiency of loading bitcode files by no longer using a bit encoding for symbol names. Once we start writing the irsymtab to the bitcode file we will also be able to share strings between it and the module. On my machine, link time for Chromium for Linux with ThinLTO decreases by about 7% for no-op incremental builds or about 1% for full builds. Total bitcode file size decreases by about 3%. As discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2017-April/111732.html Differential Revision: https://reviews.llvm.org/D31838 llvm-svn: 300464
2016-09-26[thinlto] Basic thinlto fdo heuristicPiotr Padlewski
Summary: This patch improves thinlto importer by importing 3x larger functions that are called from hot block. I compared performance with the trunk on spec, and there were about 2% on povray and 3.33% on milc. These results seems to be consistant and match the results Teresa got with her simple heuristic. Some benchmarks got slower but I think they are just noisy (mcf, xalancbmki, omnetpp)- running the benchmarks again with more iterations to confirm. Geomean of all benchmarks including the noisy ones were about +0.02%. I see much better improvement on google branch with Easwaran patch for pgo callsite inlining (the inliner actually inline those big functions) Over all I see +0.5% improvement, and I get +8.65% on povray. So I guess we will see much bigger change when Easwaran patch will land (it depends on new pass manager), but it is still worth putting this to trunk before it. Implementation details changes: - Removed CallsiteCount. - ProfileCount got replaced by Hotness - hot-import-multiplier is set to 3.0 for now, didn't have time to tune it up, but I see that we get most of the interesting functions with 3, so there is no much performance difference with higher, and binary size doesn't grow as much as with 10.0. Reviewers: eraman, mehdi_amini, tejohnson Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24638 llvm-svn: 282437
2016-04-27[ThinLTO] Use valueid instead of bitcode offsets in combined index fileTeresa Johnson
Summary: With the removal of support for lazy parsing of combined index summary records (e.g. r267344), we no longer need to include the summary record bitcode offset in the VST entries for definitions. Change the combined index format to be similar to the per-module index format in using value ids to cross-reference from the summary record to the VST entry (rather than the summary record bitcode offset to cross-reference in the other direction). The visible changes are: 1) Add the value id to the combined summary records 2) Remove the summary offset from the combined VST records, which has the following effects: - No longer need the VST_CODE_COMBINED_GVDEFENTRY record, as all combined index VST entries now only contain the value id and corresponding GUID. - No longer have duplicate VST entries in the case where there are multiple definitions of a symbol (e.g. weak/linkonce), as they all have the same value id and GUID. An implication of #2 above is that in order to hook up an alias to the correct aliasee based on the value id of the aliasee recorded in the combined index alias record, we need to scan the entries in the index for that GUID to find the one from the same module (i.e. the case where there are multiple entries for the aliasee). But the reader no longer has to maintain a special map to hook up the alias/aliasee. Reviewers: joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19481 llvm-svn: 267712
2016-04-24Add a version field in the bitcode for the summaryMehdi Amini
Differential Revision: http://reviews.llvm.org/D19456 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267318
2016-04-12Move summary creation out of llvm-as into optMehdi Amini
Summary: Let keep llvm-as "dumb": it converts textual IR to bitcode. This commit removes the dependency from llvm-as to libLLVMAnalysis. We'll add back summary in llvm-as if we get to a textual representation for it at some point. In the meantime, opt seems like a better place for that. Reviewers: tejohnson Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19032 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266131
2016-03-15[ThinLTO] Renaming of function index to module summary index (NFC)Teresa Johnson
(Resubmitting after fixing missing file issue) With the changes in r263275, there are now more than just functions in the summary. Completed the renaming of data structures (started in r263275) to reflect the wider scope. In particular, changed the FunctionIndex* data structures to ModuleIndex*, and renamed related variables and comments. Also renamed the files to reflect the changes. A companion clang patch will immediately succeed this patch to reflect this renaming. llvm-svn: 263513
2016-03-14Revert "[ThinLTO] Renaming of function index to module summary index (NFC)"Teresa Johnson
This reverts commit r263490. Missed a file. llvm-svn: 263493
2016-03-14[ThinLTO] Renaming of function index to module summary index (NFC)Teresa Johnson
With the changes in r263275, there are now more than just functions in the summary. Completed the renaming of data structures (started in r263275) to reflect the wider scope. In particular, changed the FunctionIndex* data structures to ModuleIndex*, and renamed related variables and comments. Also renamed the files to reflect the changes. A companion clang patch will immediately succeed this patch to reflect this renaming. llvm-svn: 263490
2016-03-11[ThinLTO] Support for reference graph in per-module and combined summary.Teresa Johnson
Summary: This patch adds support for including a full reference graph including call graph edges and other GV references in the summary. The reference graph edges can be used to make importing decisions without materializing any source modules, can be used in the plugin to make file staging decisions for distributed build systems, and is expected to have other uses. The call graph edges are recorded in each function summary in the bitcode via a list of <CalleeValueIds, StaticCount> tuples when no PGO data exists, or <CalleeValueId, StaticCount, ProfileCount> pairs when there is PGO, where the ValueId can be mapped to the function GUID via the ValueSymbolTable. In the function index in memory, the call graph edges reference the target via the CalleeGUID instead of the CalleeValueId. The reference graph edges are recorded in each summary record with a list of referenced value IDs, which can be mapped to value GUID via the ValueSymbolTable. Addtionally, a new summary record type is added to record references from global variable initializers. A number of bitcode records and data structures have been renamed to reflect the newly expanded scope of the summary beyond functions. More cleanup will follow. Reviewers: joker.eph, davidxl Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D17212 llvm-svn: 263275