<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/lib/ProfileData/SampleProfWriter.cpp, branch main</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[SampleFDO][TypeProf]Support vtable type profiling for ext-binary and text format (#148002)</title>
<updated>2025-09-12T22:58:16+00:00</updated>
<author>
<name>Mingming Liu</name>
<email>mingmingl@google.com</email>
</author>
<published>2025-09-12T22:58:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=52c583b3f95a0e666ab837e39a5db900b66adf15'/>
<id>52c583b3f95a0e666ab837e39a5db900b66adf15</id>
<content type='text'>
This change extends the SampleFDO ext-binary and text formats to record the
vtable symbols and their counts for virtual calls inside a function. The
vtable profiles will allow the compiler to annotate vtable types on IR
instructions and perform vtable-based indirect call promotion. The RFC is
at
https://discourse.llvm.org/t/rfc-vtable-type-profiling-for-samplefdo/87283

For the function below, the table shows its profile in text format
before and after this change:

```
__attribute__((noinline)) int loop_func(int i, int a, int b) {
    Base *ptr = createType(i);

    int sum = ptr-&gt;func(a, b);
    
    delete ptr;
    
    return sum;
}
```

| before | after |
| --- | --- |
| Samples collected in the function's body { &lt;br&gt; 0: 636241 &lt;br&gt; 1: 681458, calls: _Z10createTypei:681458 &lt;br&gt; 3: 543499, calls: _ZN12_GLOBAL__N_18Derived24funcEii:410621 _ZN8Derived14funcEii:132878 &lt;br&gt; 5.1: 602201, calls: _ZN12_GLOBAL__N_18Derived2D0Ev:454635 _ZN8Derived1D0Ev:147566 &lt;br&gt; 7: 511057 &lt;br&gt; } | Samples collected in the function's body { &lt;br&gt; 0: 636241 &lt;br&gt; 1: 681458, calls: _Z10createTypei:681458 &lt;br&gt; 3: 543499, calls: _ZN12_GLOBAL__N_18Derived24funcEii:410621 _ZN8Derived14funcEii:132878 &lt;br&gt; 3: vtables: _ZTV8Derived1:1377 _ZTVN12_GLOBAL__N_18Derived2E:4250 &lt;br&gt; 5.1: 602201, calls: _ZN12_GLOBAL__N_18Derived2D0Ev:454635 _ZN8Derived1D0Ev:147566 &lt;br&gt; 5.1: vtables: _ZTV8Derived1:227 _ZTVN12_GLOBAL__N_18Derived2E:765 &lt;br&gt; 7: 511057 &lt;br&gt; } |
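
As a rough illustration of the new `vtables:` body lines, the Python
sketch below parses one such line into a location and per-vtable counts.
The helper name `parse_vtable_line` is hypothetical, and the sketch
assumes only the line shape shown in the table; it is not the LLVM text
profile reader.

```python
def parse_vtable_line(line):
    """Parse a body line such as '3: vtables: SymA:1377 SymB:4250'.

    Returns (location, counts), or None when the line is not a vtable line.
    """
    location, sep, rest = line.strip().partition(": ")
    if not sep or not rest.startswith("vtables:"):
        return None
    counts = {}
    for item in rest[len("vtables:"):].split():
        # Split on the last ':' so the count is always the trailing field.
        symbol, _, count = item.rpartition(":")
        counts[symbol] = int(count)
    return location, counts


loc, counts = parse_vtable_line(
    "3: vtables: _ZTV8Derived1:1377 _ZTVN12_GLOBAL__N_18Derived2E:4250"
)
```

Lines without the `vtables:` marker, such as `0: 636241`, make the helper
return `None`, so plain sample lines pass through untouched.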

Key points for this change:
1. In-memory representation of vtable profiles
* A field of type `map&lt;LineLocation, map&lt;FunctionId, uint64_t&gt;&gt;` is
introduced in a function's in-memory representation
[FunctionSamples](https://github.com/llvm/llvm-project/blob/ccc416312ed72e92a885425d9cb9c01f9afa58eb/llvm/include/llvm/ProfileData/SampleProf.h#L749-L754).
2. The vtable counters for one LineLocation represent the relative
frequency among vtables for this LineLocation. They are not required to
be comparable across LineLocations.
3. For backward compatibility of ext-binary format, we take one bit from
ProfSummaryFlag as illustrated in the enum class `SecProfSummaryFlags`.
The ext-binary profile reader parses the integer type flag and reads
this bit. If it's set, the profile reader will parse vtable profiles.
4. The vtable profiles are optional in ext-binary format and are not
serialized out by default. We introduce an LLVM boolean option (named
`-extbinary-write-vtable-type-prof`). The ext-binary profile writer
reads the boolean option and decides whether to set the section flag bit
and serialize the in-memory class members corresponding to vtables.
5. This change doesn't implement `llvm-profdata overlap --sample` for
the vtable profiles. A subsequent change will do it to keep this one
focused on the profile format change.
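
The key points above can be sketched in Python as a toy model: a nested
map keyed by location, relative frequencies computed per location, and a
flag bit that gates reading and writing. The class name, the helper
names, and the bit value `0x20` are all hypothetical; the real bit is the
one defined in `SecProfSummaryFlags`.

```python
from collections import defaultdict

# Hypothetical bit value for illustration only; the real one is defined in
# the enum class `SecProfSummaryFlags`.
HAS_VTABLE_TYPE_PROF = 0x20


class FunctionSamplesSketch:
    """Toy stand-in for the vtable part of `FunctionSamples`."""

    def __init__(self):
        # (line offset, discriminator) maps to {vtable symbol: count}.
        self.vtable_counts = defaultdict(lambda: defaultdict(int))

    def add_vtable_count(self, location, vtable, count):
        self.vtable_counts[location][vtable] += count

    def relative_frequency(self, location):
        # Counts are only compared within one location, never across them.
        counts = self.vtable_counts[location]
        total = sum(counts.values())
        return {name: value / total for name, value in counts.items()}


def summary_flags(base_flags, write_vtable_prof):
    """Writer side: set the bit only when the option is enabled."""
    return base_flags | HAS_VTABLE_TYPE_PROF if write_vtable_prof else base_flags


def has_vtable_prof(flags):
    """Reader side: parse vtable profiles only when the bit is set."""
    return (flags | HAS_VTABLE_TYPE_PROF) == flags


samples = FunctionSamplesSketch()
samples.add_vtable_count((3, 0), "_ZTV8Derived1", 1377)
samples.add_vtable_count((3, 0), "_ZTVN12_GLOBAL__N_18Derived2E", 4250)
freq = samples.relative_frequency((3, 0))
```

A profile written with the option off leaves the bit clear, so in this
model the reader never looks for vtable data.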

We don't plan to add vtable support to the non-extensible format, mainly
because of the maintenance cost of keeping backward compatibility with
prior versions of profile data.
* The [non-extensible binary
format](https://github.com/llvm/llvm-project/blob/5c28af409978c19a35021855a29dcaa65e95da00/llvm/lib/ProfileData/SampleProfWriter.cpp#L899-L900)
does not have feature parity with the extensible binary format today. For
instance, the former doesn't support [profile symbol
list](https://github.com/llvm/llvm-project/blob/41e22aa31b1905aa3e9d83c0343a96ec0d5187ec/llvm/include/llvm/ProfileData/SampleProf.h#L1518-L1522)
or context-sensitive PGO, both of which give a measurable performance
boost. Presumably the non-extensible format is not in wide use.

---------

Co-authored-by: Paschalis Mpeis &lt;paschalis.mpeis@arm.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This change extends the SampleFDO ext-binary and text formats to record the
vtable symbols and their counts for virtual calls inside a function. The
vtable profiles will allow the compiler to annotate vtable types on IR
instructions and perform vtable-based indirect call promotion. The RFC is
at
https://discourse.llvm.org/t/rfc-vtable-type-profiling-for-samplefdo/87283

For the function below, the table shows its profile in text format
before and after this change:

```
__attribute__((noinline)) int loop_func(int i, int a, int b) {
    Base *ptr = createType(i);

    int sum = ptr-&gt;func(a, b);
    
    delete ptr;
    
    return sum;
}
```

| before | after |
| --- | --- |
| Samples collected in the function's body { &lt;br&gt; 0: 636241 &lt;br&gt; 1: 681458, calls: _Z10createTypei:681458 &lt;br&gt; 3: 543499, calls: _ZN12_GLOBAL__N_18Derived24funcEii:410621 _ZN8Derived14funcEii:132878 &lt;br&gt; 5.1: 602201, calls: _ZN12_GLOBAL__N_18Derived2D0Ev:454635 _ZN8Derived1D0Ev:147566 &lt;br&gt; 7: 511057 &lt;br&gt; } | Samples collected in the function's body { &lt;br&gt; 0: 636241 &lt;br&gt; 1: 681458, calls: _Z10createTypei:681458 &lt;br&gt; 3: 543499, calls: _ZN12_GLOBAL__N_18Derived24funcEii:410621 _ZN8Derived14funcEii:132878 &lt;br&gt; 3: vtables: _ZTV8Derived1:1377 _ZTVN12_GLOBAL__N_18Derived2E:4250 &lt;br&gt; 5.1: 602201, calls: _ZN12_GLOBAL__N_18Derived2D0Ev:454635 _ZN8Derived1D0Ev:147566 &lt;br&gt; 5.1: vtables: _ZTV8Derived1:227 _ZTVN12_GLOBAL__N_18Derived2E:765 &lt;br&gt; 7: 511057 &lt;br&gt; } |

Key points for this change:
1. In-memory representation of vtable profiles
* A field of type `map&lt;LineLocation, map&lt;FunctionId, uint64_t&gt;&gt;` is
introduced in a function's in-memory representation
[FunctionSamples](https://github.com/llvm/llvm-project/blob/ccc416312ed72e92a885425d9cb9c01f9afa58eb/llvm/include/llvm/ProfileData/SampleProf.h#L749-L754).
2. The vtable counters for one LineLocation represent the relative
frequency among vtables for this LineLocation. They are not required to
be comparable across LineLocations.
3. For backward compatibility of ext-binary format, we take one bit from
ProfSummaryFlag as illustrated in the enum class `SecProfSummaryFlags`.
The ext-binary profile reader parses the integer type flag and reads
this bit. If it's set, the profile reader will parse vtable profiles.
4. The vtable profiles are optional in ext-binary format and are not
serialized out by default. We introduce an LLVM boolean option (named
`-extbinary-write-vtable-type-prof`). The ext-binary profile writer
reads the boolean option and decides whether to set the section flag bit
and serialize the in-memory class members corresponding to vtables.
5. This change doesn't implement `llvm-profdata overlap --sample` for
the vtable profiles. A subsequent change will do it to keep this one
focused on the profile format change.

We don't plan to add vtable support to the non-extensible format, mainly
because of the maintenance cost of keeping backward compatibility with
prior versions of profile data.
* The [non-extensible binary
format](https://github.com/llvm/llvm-project/blob/5c28af409978c19a35021855a29dcaa65e95da00/llvm/lib/ProfileData/SampleProfWriter.cpp#L899-L900)
does not have feature parity with the extensible binary format today. For
instance, the former doesn't support [profile symbol
list](https://github.com/llvm/llvm-project/blob/41e22aa31b1905aa3e9d83c0343a96ec0d5187ec/llvm/include/llvm/ProfileData/SampleProf.h#L1518-L1522)
or context-sensitive PGO, both of which give a measurable performance
boost. Presumably the non-extensible format is not in wide use.

---------

Co-authored-by: Paschalis Mpeis &lt;paschalis.mpeis@arm.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>[NFC]Codestyle changes for SampleFDO library (#147840)</title>
<updated>2025-07-09T23:48:17+00:00</updated>
<author>
<name>Mingming Liu</name>
<email>mingmingl@google.com</email>
</author>
<published>2025-07-09T23:48:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=20daa73a0962efd22cee3bbf327ee35b22add39d'/>
<id>20daa73a0962efd22cee3bbf327ee35b22add39d</id>
<content type='text'>
* Introduce an error code for illegal_line_offset in sampleprof_error
namespace, and use it for line offset parsing error.
* Add `const` for `LineLocation::serialize`.
* Use structured binding, make_first/second_range in loops.

I'm working on a [sample-profile format
change](https://github.com/llvm/llvm-project/compare/users/mingmingl-llvm/samplefdo-profile-format)
to extend the SampleFDO profile with vtable profiles. This change splits
out the non-functional changes.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Introduce an error code for illegal_line_offset in sampleprof_error
namespace, and use it for line offset parsing error.
* Add `const` for `LineLocation::serialize`.
* Use structured binding, make_first/second_range in loops.

I'm working on a [sample-profile format
change](https://github.com/llvm/llvm-project/compare/users/mingmingl-llvm/samplefdo-profile-format)
to extend the SampleFDO profile with vtable profiles. This change splits
out the non-functional changes.</pre>
</div>
</content>
</entry>
<entry>
<title>[NFCI]Add SampleRecord::serialize and LineLocation::serialize to simplify FunctionSamples serialization (#141669)</title>
<updated>2025-05-27T23:32:35+00:00</updated>
<author>
<name>Mingming Liu</name>
<email>mingmingl@google.com</email>
</author>
<published>2025-05-27T23:32:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=2186c95a6f59d1b87c8becea2af6e437f02bf7cb'/>
<id>2186c95a6f59d1b87c8becea2af6e437f02bf7cb</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[NFCI]Print LineLocation using its print method to simplify the code. (#141545)</title>
<updated>2025-05-27T16:41:28+00:00</updated>
<author>
<name>Mingming Liu</name>
<email>mingmingl@google.com</email>
</author>
<published>2025-05-27T16:41:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=15c3adee9f96c09add3bd5bcca4bef434ffec810'/>
<id>15c3adee9f96c09add3bd5bcca4bef434ffec810</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[ProfileData] Remove unused includes (NFC) (#116751)</title>
<updated>2024-11-20T03:42:20+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2024-11-20T03:42:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=4f1b20f023626a2ae9aab627e918974ce81199fe'/>
<id>4f1b20f023626a2ae9aab627e918974ce81199fe</id>
<content type='text'>
Identified with misc-include-cleaner.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Identified with misc-include-cleaner.</pre>
</div>
</content>
</entry>
<entry>
<title>[llvm-profdata] Enabled functionality to write split-layout profile (#101795)</title>
<updated>2024-08-29T00:33:54+00:00</updated>
<author>
<name>William Junda Huang</name>
<email>williamjhuang@google.com</email>
</author>
<published>2024-08-29T00:33:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=75e9d191f52b047ea839f75ab2a7a7d9f8c6becd'/>
<id>75e9d191f52b047ea839f75ab2a7a7d9f8c6becd</id>
<content type='text'>
With the flag `-split_layout` in llvm-profdata merge, the output
profile can write profiles with and without inlined functions into two
different extbinary sections (and their FuncOffsetTable too). The
section without inlined functions is marked with `SecFlagFlat` and is
skipped by ThinLTO because it provides no useful info.

The split layout feature was already implemented in SampleProfWriter, but
previously there was no way to use it from llvm-profdata.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With the flag `-split_layout` in llvm-profdata merge, the output
profile can write profiles with and without inlined functions into two
different extbinary sections (and their FuncOffsetTable too). The
section without inlined functions is marked with `SecFlagFlat` and is
skipped by ThinLTO because it provides no useful info.

The split layout feature was already implemented in SampleProfWriter, but
previously there was no way to use it from llvm-profdata.</pre>
</div>
</content>
</entry>
<entry>
<title>[ProfileData] Use ArrayRef instead of const std::vector&lt;T&gt; &amp; (NFC) (#94878)</title>
<updated>2024-06-09T17:26:18+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2024-06-09T17:26:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=089c4bb589dd46d1484bd6ba1fe8f5c472339af4'/>
<id>089c4bb589dd46d1484bd6ba1fe8f5c472339af4</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[llvm-profdata] Do not create numerical strings for MD5 function names read from a Sample Profile. (#66164)</title>
<updated>2023-10-17T21:09:39+00:00</updated>
<author>
<name>William Junda Huang</name>
<email>williamjhuang@google.com</email>
</author>
<published>2023-10-17T21:09:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=ef0e0adccd94ffdb10546491ef2719669754d3c9'/>
<id>ef0e0adccd94ffdb10546491ef2719669754d3c9</id>
<content type='text'>
This is phase 2 of the MD5 refactoring on Sample Profile following
https://reviews.llvm.org/D147740
    
In the previous implementation, when an MD5 Sample Profile is read, the
reader first converts the MD5 values to strings and then creates a
StringRef as if the numerical strings were regular function names. Later
on, IPO transformation passes perform string comparison over these
numerical strings for profile matching. This is inefficient since it
causes many small heap allocations.
In this patch I created a class `ProfileFuncRef` that is similar to
`StringRef` but can represent a hash value directly without any
conversion, and it will be more efficient (I will attach some benchmark
results later) when used in associative containers.

ProfileFuncRef guarantees that the same function name in string form or in
MD5 form has the same hash value, which also fixes a few issues in IPO
passes where function matching/lookup only checks the function name
string and returns a no-match if the profile is MD5.
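
The same-hash guarantee can be illustrated with a small Python sketch.
The class `FuncRefSketch` and the helper `md5_hash64` are hypothetical
names. The sketch assumes the profile hash is the low 64 bits of the MD5
digest, and it treats equal hashes as equal refs, which is a
simplification and not necessarily how the LLVM class compares.

```python
import hashlib


def md5_hash64(name):
    # Assumption: the profile hash is the low 64 bits of the MD5 digest,
    # read as a little-endian integer.
    return int.from_bytes(hashlib.md5(name.encode()).digest()[:8], "little")


class FuncRefSketch:
    """Holds either a function name or an already computed hash value."""

    def __init__(self, name=None, hash_code=None):
        self.name = name
        # No numeric string is ever built for the MD5 form.
        self.hash_code = md5_hash64(name) if name is not None else hash_code

    def __hash__(self):
        return hash(self.hash_code)

    def __eq__(self, other):
        # Simplification: refs with equal hashes are treated as equal.
        return self.hash_code == other.hash_code


by_name = FuncRefSketch(name="_Z10createTypei")
by_md5 = FuncRefSketch(hash_code=md5_hash64("_Z10createTypei"))
table = {by_name: 681458}
```

Because both forms hash identically, `table[by_md5]` finds the entry that
was stored under the string form.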

When testing on an internal large profile (&gt; 1 GB, with more than 10
million functions), the full profile load time is reduced from 28 sec to
25 sec on average, and reading the function offset table from 0.78s to 0.7s</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is phase 2 of the MD5 refactoring on Sample Profile following
https://reviews.llvm.org/D147740
    
In the previous implementation, when an MD5 Sample Profile is read, the
reader first converts the MD5 values to strings and then creates a
StringRef as if the numerical strings were regular function names. Later
on, IPO transformation passes perform string comparison over these
numerical strings for profile matching. This is inefficient since it
causes many small heap allocations.
In this patch I created a class `ProfileFuncRef` that is similar to
`StringRef` but can represent a hash value directly without any
conversion, and it will be more efficient (I will attach some benchmark
results later) when used in associative containers.

ProfileFuncRef guarantees that the same function name in string form or in
MD5 form has the same hash value, which also fixes a few issues in IPO
passes where function matching/lookup only checks the function name
string and returns a no-match if the profile is MD5.

When testing on an internal large profile (&gt; 1 GB, with more than 10
million functions), the full profile load time is reduced from 28 sec to
25 sec on average, and reading the function offset table from 0.78s to 0.7s</pre>
</div>
</content>
</entry>
<entry>
<title>Use llvm::endianness::{big,little,native} (NFC)</title>
<updated>2023-10-13T04:21:45+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2023-10-13T04:21:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=4a0ccfa865437fe29ef2ecb18152df7694dddb7f'/>
<id>4a0ccfa865437fe29ef2ecb18152df7694dddb7f</id>
<content type='text'>
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an
enum. This patch replaces support::{big,little,native} with
llvm::endianness::{big,little,native}.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an
enum. This patch replaces support::{big,little,native} with
llvm::endianness::{big,little,native}.
</pre>
</div>
</content>
</entry>
<entry>
<title>[llvm] Use range-based for loops (NFC)</title>
<updated>2023-09-22T07:41:37+00:00</updated>
<author>
<name>Kazu Hirata</name>
<email>kazu@google.com</email>
</author>
<published>2023-09-22T07:41:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=4c14638b55ad0e3b2ee33adfaaa88f23f60a96c1'/>
<id>4c14638b55ad0e3b2ee33adfaaa88f23f60a96c1</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
</feed>
