llvm-project.git/llvm/lib/CodeGen/MachineFunction.cpp, branch users/Prabhuk/sprmain.asmprintercallgraphsection-emit-call-graph-section

[𝘀𝗽𝗿] changes to main this commit is based on

2024-04-03T22:20:04+00:00

Created using spr 1.3.6-beta.1

[skip ci]

[CodeGen] Use LocationSize for MMO getSize (#84751)

2024-03-17T18:15:56+00:00

This is part of #70452 that changes the type used for the external
interface of MMO to LocationSize as opposed to uint64_t. This means the
constructors take LocationSize, and convert ~UINT64_C(0) to
LocationSize::beforeOrAfter(). The getSize methods return a
LocationSize.

This allows us to be more precise with unknown sizes, not accidentally
treating them as unsigned values, and in the future should allow us to
add proper scalable vector support but none of that is included in this
patch. It should mostly be an NFC.

Global ISel is still expected to use the underlying LLT as it needs, and
are not expected to see unknown sizes for generic operations. Most of
the changes are hopefully fairly mechanical, adding a lot of getValue()
calls and protecting them with hasValue() where needed.

[SHT_LLVM_BB_ADDR_MAP] Allow basic-block-sections and labels be used together by decoupling the handling of the two features. (#74128)

2024-02-02T01:50:46+00:00

Today `-split-machine-functions` and `-fbasic-block-sections={all,list}`
cannot be combined with `-basic-block-sections=labels` (the labels
option will be ignored).
The inconsistency comes from the way basic block address map -- the
underlying mechanism for basic block labels -- encodes basic block
addresses
(https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html).
Specifically, basic block offsets are computed relative to the function
begin symbol. This relies on functions being contiguous which is not the
case for MFS and basic block section binaries. This means Propeller
cannot use binary profiles collected from these binaries, which limits
the applicability of Propeller for iterative optimization.
    
To make the `SHT_LLVM_BB_ADDR_MAP` feature work with basic block section
binaries, we propose modifying the encoding of this section as follows.

First let us review the current encoding which emits the address of each
function and its number of basic blocks, followed by basic block entries
for each basic block.

| | |
|--|--|
| Address of the function | Function Address |
|  Number of basic blocks in this function | NumBlocks |
|  BB entry 1
|  BB entry 2
|   ...
|  BB entry #NumBlocks
    
To make this work for basic block sections, we treat each basic block
section similar to a function, except that basic block sections of the
same function must be encapsulated in the same structure so we can map
all of them to their single function.
    
We modify the encoding to first emit the number of basic block sections
(BB ranges) in the function. Then we emit the address map of each basic
block section section as before: the base address of the section, its
number of blocks, and BB entries for its basic block. The first section
in the BB address map is always the function entry section.
| | |
|--|--|
|  Number of sections for this function   | NumBBRanges |
| Section 1 begin address                     | BaseAddress[1]  |
| Number of basic blocks in section 1 | NumBlocks[1]    |
| BB entries for Section 1
|..................|
| Section #NumBBRanges begin address | BaseAddress[NumBBRanges] |
| Number of basic blocks in section #NumBBRanges |
NumBlocks[NumBBRanges] |
| BB entries for Section #NumBBRanges
    
The encoding of basic block entries remains as before with the minor
change that each basic block offset is now computed relative to the
begin symbol of its containing BB section.
    
This patch adds a new boolean codegen option `-basic-block-address-map`.
Correspondingly, the front-end flag `-fbasic-block-address-map` and LLD
flag `--lto-basic-block-address-map` are introduced.
Analogously, we add a new TargetOption field `BBAddrMap`. This means BB
address maps are either generated for all functions in the compiling
unit, or for none (depending on `TargetOptions::BBAddrMap`).
    
This patch keeps the functionality of the old
`-fbasic-block-sections=labels` option but does not remove it. A
subsequent patch will remove the obsolete option.

We refactor the `BasicBlockSections` pass by separating the BB address
map and BB sections handing to their own functions (named
`handleBBAddrMap` and `handleBBSections`). `handleBBSections` renumbers
basic blocks and places them in their assigned sections.
`handleBBAddrMap` is invoked after `handleBBSections` (if requested) and
only renumbers the blocks.
  - New tests added:
- Two tests basic-block-address-map-with-basic-block-sections.ll and
basic-block-address-map-with-mfs.ll to exercise the combination of
`-basic-block-address-map` with `-basic-block-sections=list` and
'-split-machine-functions`.
- A driver sanity test for the `-fbasic-block-address-map` option
(basic-block-address-map.c).
- An LLD test for testing the `--lto-basic-block-address-map` option.
This reuses the LLVM IR from `lld/test/ELF/lto/basic-block-sections.ll`.
- Renamed and modified the two existing codegen tests for basic block
address map (`basic-block-sections-labels-functions-sections.ll` and
`basic-block-sections-labels.ll`)
- Removed `SHT_LLVM_BB_ADDR_MAP_V0` tests. Full deprecation of
`SHT_LLVM_BB_ADDR_MAP_V0` and `SHT_LLVM_BB_ADDR_MAP` version less than 2
will happen in a separate PR in a few months.

[CodeGen] Split off PseudoSourceValueManager into separate header (NFC) (#73327)

2023-12-04T09:17:59+00:00

Most users of PseudoSourceValue.h only need PseudoSourceValue, not the
PseudoSourceValueManager. However, this header pulls in some very
expensive dependencies like ValueMap.h, which is only used for the
manager.

Split off the manager into a separate header and include it only where
used.

[BasicBlockSections] Apply path cloning with -basic-block-sections. (#68860)

2023-10-28T04:49:39+00:00

https://github.com/llvm/llvm-project/commit/28b912687900bc0a67cd61c374fce296b09963c4
introduced the path cloning format in the basic-block-sections profile.

This PR validates and applies path clonings. 
A path cloning is valid if all of these conditions hold:
  1. All bb ids in the path are mapped to existing blocks.
2. Each two consecutive bb ids in the path have a successor relationship
in the CFG.
3. The path does not include a block with indirect branches, except
possibly as the last block.
 
Applying a path cloning involves cloning all blocks in the path (except
the first one) and setting up their branches.
Once all clonings are applied, the cluster information is used to guide
block layout in the modified function.

[X86, Peephole] Enable FoldImmediate for X86

2023-10-27T19:47:23+00:00

Enable FoldImmediate for X86 by implementing X86InstrInfo::FoldImmediate.

Also enhanced peephole by deleting identical instructions after FoldImmediate.

Differential Revision: https://reviews.llvm.org/D151848

[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295)

2023-09-14T21:10:14+00:00

This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.

This matches other nearby enums.

For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::

[X86] Use 64-bit jump table entries for large code model PIC

2023-08-31T21:13:38+00:00

With the large code model, the label difference may not fit into 32 bits.
Even if we assume that any individual function is no larger than 2^32
and use a difference from the function entry to the target destination,
things like BOLT can rearrange blocks (even if BOLT doesn't necessarily
work with the large code model right now).

set directives avoid static relocations in some 32-bit entry cases, but
don't worry about set directives for 64-bit jump table entries (we can
do that later if somebody really cares about it).

check-llvm in a bootstrapped clang with the large code model passes.

Fixes #62894

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D159297

MachineFunction: -fsanitize={function,kcfi}: ensure 4-byte alignment

2023-06-30T16:13:19+00:00

Fix https://github.com/llvm/llvm-project/issues/63579
```
% cat a.c
void foo() {}
% clang --target=arm-none-eabi -mthumb -mno-unaligned-access -fsanitize=kcfi a.c -S -o - | grep p2align
        .p2align        1
% clang --target=armv6m-none-eabi -fsanitize=function a.c -S -o - | grep p2align
        .p2align        1
```

Ensure that -fsanitize={function,kcfi} instrumented functions are aligned by at
least 4, so that loading the type hash before the function label will not cause
a misaligned access. This is especially important for -mno-unaligned-access
configurations that don't set `setMinFunctionAlignment` to 4 or greater.

With this patch, the generated assembly for the examples above will contain `.p2align 2`
before the type hash.

If `__attribute__((aligned(N)))` or `-falign-functions=N` is specified, the
larger alignment will be used.

Reviewed By: simon_tatham, samitolvanen

Differential Revision: https://reviews.llvm.org/D154125

[Analysis] Refactor MBB hotness/coldness into templated PSI functions.

2023-06-29T05:32:52+00:00

Currently, to use PSI->isFunctionHotInCallGraph, we first need to
calculate BPI->BFI, which is expensive. Instead, we can implement this
directly with MBFI. Also as @wenlei mentioned in another patch review,
that MachineSizeOpts already has isFunctionColdInCallGraph,
isFunctionHotInCallGraphNthPercentile, etc implemented. These can be
refactored and so they can be reused across MachineFunctionSplitting
and MachineSizeOpts passes.

This CL does this - it refactors out those internal static functions
into PSI as templated functions, so they can be accessed easily.

Differential Revision: https://reviews.llvm.org/D153927