<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/lib/Target/SPIRV/SPIRVModuleAnalysis.cpp, branch users/chapuni/cov/single/loop</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[SPIR-V] Overhaul module analysis to improve translation speed and simplify the underlying logics (#120415)</title>
<updated>2025-01-07T09:42:23+00:00</updated>
<author>
<name>Vyacheslav Levytskyy</name>
<email>vyacheslav.levytskyy@intel.com</email>
</author>
<published>2025-01-07T09:42:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=83c1d003118a2cb8136fe49e2ec43958c93d9d6b'/>
<id>83c1d003118a2cb8136fe49e2ec43958c93d9d6b</id>
<content type='text'>
This PR is to address legacy issues with module analysis that currently
uses a complicated and not so efficient approach to trace dependencies
between SPIR-V id's via a duplicate tracker data structures and an
explicitly built dependency graph. Even a quick performance check
without any specialized benchmarks points to this part of the
implementation as a biggest bottleneck.

This PR specifically:
* eliminates a need to build a dependency graph as a data structure,
* updates the test suite (mainly, by fixing incorrect CHECK's referring
to a hardcoded order of definitions, contradicting the spec requirement
to allow certain definitions to go "in any order", see
https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_logical_layout_of_a_module),
* improves function pointers implementation so that it now passes
EXPENSIVE_CHECKS (thus removing 3 XFAIL's in the test suite).

As a quick sanity check of whether goals of the PR are achieved, we can
measure time of translation for any big LLVM IR. While testing the PR in
the local development environment, improvements of the x5 order have
been observed.

For example, the SYCL test case "group barrier" that is a ~1Mb binary IR
input shows the following values of the naive performance metric that we
can nevertheless apply here to roughly estimate effects of the PR.

before the PR:
```
$ time llc -O0 -mtriple=spirv64v1.6-unknown-unknown _group_barrier_phi.bc -o 1 --filetype=obj

real    3m33.241s
user    3m14.688s
sys     0m18.530s
```

after the PR

```
$ time llc -O0 -mtriple=spirv64v1.6-unknown-unknown _group_barrier_phi.bc -o 1 --filetype=obj

real    0m42.031s
user    0m38.834s
sys     0m3.193s
```

Next work should probably address Duplicate Tracker further, as it needs
analysis now from the perspective of what parts of it are not necessary
now, after changing the approach to implementation of the module
analysis step.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This PR is to address legacy issues with module analysis that currently
uses a complicated and not so efficient approach to trace dependencies
between SPIR-V id's via a duplicate tracker data structures and an
explicitly built dependency graph. Even a quick performance check
without any specialized benchmarks points to this part of the
implementation as a biggest bottleneck.

This PR specifically:
* eliminates a need to build a dependency graph as a data structure,
* updates the test suite (mainly, by fixing incorrect CHECK's referring
to a hardcoded order of definitions, contradicting the spec requirement
to allow certain definitions to go "in any order", see
https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_logical_layout_of_a_module),
* improves function pointers implementation so that it now passes
EXPENSIVE_CHECKS (thus removing 3 XFAIL's in the test suite).

As a quick sanity check of whether goals of the PR are achieved, we can
measure time of translation for any big LLVM IR. While testing the PR in
the local development environment, improvements of the x5 order have
been observed.

For example, the SYCL test case "group barrier" that is a ~1Mb binary IR
input shows the following values of the naive performance metric that we
can nevertheless apply here to roughly estimate effects of the PR.

before the PR:
```
$ time llc -O0 -mtriple=spirv64v1.6-unknown-unknown _group_barrier_phi.bc -o 1 --filetype=obj

real    3m33.241s
user    3m14.688s
sys     0m18.530s
```

after the PR

```
$ time llc -O0 -mtriple=spirv64v1.6-unknown-unknown _group_barrier_phi.bc -o 1 --filetype=obj

real    0m42.031s
user    0m38.834s
sys     0m3.193s
```

Next work should probably address Duplicate Tracker further, as it needs
analysis now from the perspective of what parts of it are not necessary
now, after changing the approach to implementation of the module
analysis step.</pre>
</div>
</content>
</entry>
<entry>
<title>[SPIR-V] Add saturation and float rounding mode decorations, a subset of arithmetic constrained floating-point intrinsics, and SPV_INTEL_float_controls2 extension (#119862)</title>
<updated>2024-12-16T09:29:46+00:00</updated>
<author>
<name>Vyacheslav Levytskyy</name>
<email>vyacheslav.levytskyy@intel.com</email>
</author>
<published>2024-12-16T09:29:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=978de2d6664a74864471d62244700c216fdc6741'/>
<id>978de2d6664a74864471d62244700c216fdc6741</id>
<content type='text'>
This PR adds the following features:
* saturation and float rounding mode decorations,
* arithmetic constrained floating-point intrinsics (strict_fadd,
strict_fsub, strict_fmul, strict_fdiv, strict_frem, strict_fma and
strict_fldexp),
* and SPV_INTEL_float_controls2 extension,
* using recent improvements of emit-intrinsics step, this PR also
simplifies pre- and post-legalizer steps and improves instruction
selection.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This PR adds the following features:
* saturation and float rounding mode decorations,
* arithmetic constrained floating-point intrinsics (strict_fadd,
strict_fsub, strict_fmul, strict_fdiv, strict_frem, strict_fma and
strict_fldexp),
* and SPV_INTEL_float_controls2 extension,
* using recent improvements of emit-intrinsics step, this PR also
simplifies pre- and post-legalizer steps and improves instruction
selection.</pre>
</div>
</content>
</entry>
<entry>
<title>[SPIR-V] Improve general validity of emitted code between passes (#119202)</title>
<updated>2024-12-09T20:10:09+00:00</updated>
<author>
<name>Vyacheslav Levytskyy</name>
<email>vyacheslav.levytskyy@intel.com</email>
</author>
<published>2024-12-09T20:10:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=42633cf27bd2cfb44e9f332c33cfd6750b9d7be4'/>
<id>42633cf27bd2cfb44e9f332c33cfd6750b9d7be4</id>
<content type='text'>
This PR improves general validity of emitted code between passes due to
generation of `TargetOpcode::PHI` instead of `SPIRV::OpPhi` after
Instruction Selection, fixing generation of OpTypePointer instructions
and using of proper virtual register classes.

Using `TargetOpcode::PHI` instead of `SPIRV::OpPhi` after Instruction
Selection has a benefit to support existing optimization passes
immediately, as an alternative path to disable those passes that use
`MI.isPHI()`. This PR makes it possible thus to revert
https://github.com/llvm/llvm-project/pull/116060 actions and get back to
use the `MachineSink` pass.

This PR is a solution of the problem discussed in details in
https://github.com/llvm/llvm-project/pull/110507. It accepts an advice
from code reviewers of the PR #110507 to postpone generation of OpPhi
rather than to patch CodeGen. This solution allows to unblock
improvements wrt. expensive checks and makes it unrelated to the general
points of the discussion about OpPhi vs. G_PHI/PHI.

This PR contains numerous small patches of emitted code validity that
allows to substantially pass rate with expensive checks. Namely, the
test suite with expensive checks set ON now has only 12 fails out of 569
total test cases.

FYI @bogner</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This PR improves general validity of emitted code between passes due to
generation of `TargetOpcode::PHI` instead of `SPIRV::OpPhi` after
Instruction Selection, fixing generation of OpTypePointer instructions
and using of proper virtual register classes.

Using `TargetOpcode::PHI` instead of `SPIRV::OpPhi` after Instruction
Selection has a benefit to support existing optimization passes
immediately, as an alternative path to disable those passes that use
`MI.isPHI()`. This PR makes it possible thus to revert
https://github.com/llvm/llvm-project/pull/116060 actions and get back to
use the `MachineSink` pass.

This PR is a solution of the problem discussed in details in
https://github.com/llvm/llvm-project/pull/110507. It accepts an advice
from code reviewers of the PR #110507 to postpone generation of OpPhi
rather than to patch CodeGen. This solution allows to unblock
improvements wrt. expensive checks and makes it unrelated to the general
points of the discussion about OpPhi vs. G_PHI/PHI.

This PR contains numerous small patches of emitted code validity that
allows to substantially pass rate with expensive checks. Namely, the
test suite with expensive checks set ON now has only 12 fails out of 569
total test cases.

FYI @bogner</pre>
</div>
</content>
</entry>
<entry>
<title>[SPIR-V] Add SPV_INTEL_joint_matrix extension (#118578)</title>
<updated>2024-12-04T18:00:19+00:00</updated>
<author>
<name>Dmitry Sidorov</name>
<email>dmitry.sidorov@intel.com</email>
</author>
<published>2024-12-04T18:00:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=d057b53a7db43f33f4a9fd832115e613ebe0a67b'/>
<id>d057b53a7db43f33f4a9fd832115e613ebe0a67b</id>
<content type='text'>
The spec is available here:
https://github.com/intel/llvm/pull/12497

The PR doesn't add OpCooperativeMatrixApplyFunctionINTEL instruction as
it's still experimental and not properly tested E2E.

The PR also fixes few bugs in the related code:
1. CooperativeMatrixMulAddKHR optional operand must be literal, not a
constant;
2. Fixed available capabilities table creation for a case, when a single
extension adds few capabilities, that occupy not contiguous op codes.

---------

Signed-off-by: Sidorov, Dmitry &lt;dmitry.sidorov@intel.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The spec is available here:
https://github.com/intel/llvm/pull/12497

The PR doesn't add OpCooperativeMatrixApplyFunctionINTEL instruction as
it's still experimental and not properly tested E2E.

The PR also fixes few bugs in the related code:
1. CooperativeMatrixMulAddKHR optional operand must be literal, not a
constant;
2. Fixed available capabilities table creation for a case, when a single
extension adds few capabilities, that occupy not contiguous op codes.

---------

Signed-off-by: Sidorov, Dmitry &lt;dmitry.sidorov@intel.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>[SPIR-V] Fix emission of debug and annotation instructions and add SPV_EXT_optnone SPIR-V extension (#118402)</title>
<updated>2024-12-03T15:18:06+00:00</updated>
<author>
<name>Vyacheslav Levytskyy</name>
<email>vyacheslav.levytskyy@intel.com</email>
</author>
<published>2024-12-03T15:18:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=874b4fb6adf742344fabd7df898be31360a563b9'/>
<id>874b4fb6adf742344fabd7df898be31360a563b9</id>
<content type='text'>
This PR fixes:
* emission of OpNames (added newly inserted internal intrinsics and
basic blocks)
* emission of function attributes (SRet is added)
* implementation of SPV_INTEL_optnone so that it emits OptNoneINTEL
Function Control flag, and add implementation of the SPV_EXT_optnone
SPIR-V extension.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This PR fixes:
* emission of OpNames (added newly inserted internal intrinsics and
basic blocks)
* emission of function attributes (SRet is added)
* implementation of SPV_INTEL_optnone so that it emits OptNoneINTEL
Function Control flag, and add implementation of the SPV_EXT_optnone
SPIR-V extension.</pre>
</div>
</content>
</entry>
<entry>
<title>Add support for SPIR-V extension: SPV_INTEL_media_block_io (#118024)</title>
<updated>2024-12-03T12:47:18+00:00</updated>
<author>
<name>Viktoria Maximova</name>
<email>viktoria.maksimova@intel.com</email>
</author>
<published>2024-12-03T12:47:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=4a6ecd38218219ede3df34eeea72f3ffccec3413'/>
<id>4a6ecd38218219ede3df34eeea72f3ffccec3413</id>
<content type='text'>
This changes implements SPV_INTEL_media_block_io extension in SPIR-V
backend.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This changes implements SPV_INTEL_media_block_io extension in SPIR-V
backend.</pre>
</div>
</content>
</entry>
<entry>
<title>[SPIRV] Use `Op[S|U]Dot` when possible for integer dot product (#115095)</title>
<updated>2024-11-21T22:32:46+00:00</updated>
<author>
<name>Finn Plummer</name>
<email>50529406+inbelic@users.noreply.github.com</email>
</author>
<published>2024-11-21T22:32:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=dcd69ddefb66c0627f41547b5fdc166030d76ccb'/>
<id>dcd69ddefb66c0627f41547b5fdc166030d76ccb</id>
<content type='text'>
```
- use the new OpSDot/OpUDot instructions when capabilites allow in SPIRVInstructionSelector.cpp
- correct functionality of capability check onto input operand and not return operand type in SPIRVModuleAnalysis.cpp

- add test cases to demonstrate use case in idot.ll
```

Resolves #114632</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
```
- use the new OpSDot/OpUDot instructions when capabilites allow in SPIRVInstructionSelector.cpp
- correct functionality of capability check onto input operand and not return operand type in SPIRVModuleAnalysis.cpp

- add test cases to demonstrate use case in idot.ll
```

Resolves #114632</pre>
</div>
</content>
</entry>
<entry>
<title>[HLSL] Implement WaveActiveAnyTrue intrinsic (#115902)</title>
<updated>2024-11-21T17:44:58+00:00</updated>
<author>
<name>Ashley Coleman</name>
<email>ascoleman@microsoft.com</email>
</author>
<published>2024-11-21T17:44:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=6735c5ebd414b4f0a28dfc6549187c06c67c1e32'/>
<id>6735c5ebd414b4f0a28dfc6549187c06c67c1e32</id>
<content type='text'>
Resolves https://github.com/llvm/llvm-project/issues/99160

- [x]  Implement `WaveActiveAnyTrue` clang builtin,
- [x]  Link `WaveActiveAnyTrue` clang builtin with `hlsl_intrinsics.h`
- [x] Add sema checks for `WaveActiveAnyTrue` to
`CheckHLSLBuiltinFunctionCall` in `SemaChecking.cpp`
- [x] Add codegen for `WaveActiveAnyTrue` to `EmitHLSLBuiltinExpr` in
`CGBuiltin.cpp`
- [x] Add codegen tests to
`clang/test/CodeGenHLSL/builtins/WaveActiveAnyTrue.hlsl`
- [x] Add sema tests to
`clang/test/SemaHLSL/BuiltIns/WaveActiveAnyTrue-errors.hlsl`
- [x] Create the `int_dx_WaveActiveAnyTrue` intrinsic in
`IntrinsicsDirectX.td`
- [x] Create the `DXILOpMapping` of `int_dx_WaveActiveAnyTrue` to `113`
in `DXIL.td`
- [x] Create the `WaveActiveAnyTrue.ll` and
`WaveActiveAnyTrue_errors.ll` tests in `llvm/test/CodeGen/DirectX/`
- [x] Create the `int_spv_WaveActiveAnyTrue` intrinsic in
`IntrinsicsSPIRV.td`
- [x] In SPIRVInstructionSelector.cpp create the `WaveActiveAnyTrue`
lowering and map it to `int_spv_WaveActiveAnyTrue` in
`SPIRVInstructionSelector::selectIntrinsic`.
- [x] Create SPIR-V backend test case in
`llvm/test/CodeGen/SPIRV/hlsl-intrinsics/WaveActiveAnyTrue.ll`

---------

Co-authored-by: Finn Plummer &lt;50529406+inbelic@users.noreply.github.com&gt;
Co-authored-by: Greg Roth &lt;grroth@microsoft.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Resolves https://github.com/llvm/llvm-project/issues/99160

- [x]  Implement `WaveActiveAnyTrue` clang builtin,
- [x]  Link `WaveActiveAnyTrue` clang builtin with `hlsl_intrinsics.h`
- [x] Add sema checks for `WaveActiveAnyTrue` to
`CheckHLSLBuiltinFunctionCall` in `SemaChecking.cpp`
- [x] Add codegen for `WaveActiveAnyTrue` to `EmitHLSLBuiltinExpr` in
`CGBuiltin.cpp`
- [x] Add codegen tests to
`clang/test/CodeGenHLSL/builtins/WaveActiveAnyTrue.hlsl`
- [x] Add sema tests to
`clang/test/SemaHLSL/BuiltIns/WaveActiveAnyTrue-errors.hlsl`
- [x] Create the `int_dx_WaveActiveAnyTrue` intrinsic in
`IntrinsicsDirectX.td`
- [x] Create the `DXILOpMapping` of `int_dx_WaveActiveAnyTrue` to `113`
in `DXIL.td`
- [x] Create the `WaveActiveAnyTrue.ll` and
`WaveActiveAnyTrue_errors.ll` tests in `llvm/test/CodeGen/DirectX/`
- [x] Create the `int_spv_WaveActiveAnyTrue` intrinsic in
`IntrinsicsSPIRV.td`
- [x] In SPIRVInstructionSelector.cpp create the `WaveActiveAnyTrue`
lowering and map it to `int_spv_WaveActiveAnyTrue` in
`SPIRVInstructionSelector::selectIntrinsic`.
- [x] Create SPIR-V backend test case in
`llvm/test/CodeGen/SPIRV/hlsl-intrinsics/WaveActiveAnyTrue.ll`

---------

Co-authored-by: Finn Plummer &lt;50529406+inbelic@users.noreply.github.com&gt;
Co-authored-by: Greg Roth &lt;grroth@microsoft.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>[SPIRV] Add write to image buffer for shaders. (#115927)</title>
<updated>2024-11-18T14:06:05+00:00</updated>
<author>
<name>Steven Perron</name>
<email>stevenperron@google.com</email>
</author>
<published>2024-11-18T14:06:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=756fe54dc7f7e7fcdfefb11d8f51b1687322daf7'/>
<id>756fe54dc7f7e7fcdfefb11d8f51b1687322daf7</id>
<content type='text'>
This commit adds an intrinsic that will write to an image buffer. We
chose to match the name of the DXIL intrinsic for simplicity in clang.

We cannot reuse the existing openCL write_image function because that is
not a reserved name in HLSL. There is not much common code to factor
out.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This commit adds an intrinsic that will write to an image buffer. We
chose to match the name of the DXIL intrinsic for simplicity in clang.

We cannot reuse the existing openCL write_image function because that is
not a reserved name in HLSL. There is not much common code to factor
out.</pre>
</div>
</content>
</entry>
<entry>
<title>[HLSL] Adding HLSL `clip` function. (#114588)</title>
<updated>2024-11-15T07:34:07+00:00</updated>
<author>
<name>joaosaffran</name>
<email>126493771+joaosaffran@users.noreply.github.com</email>
</author>
<published>2024-11-15T07:34:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=bc6c0681271788ca7078fb679ac67b56944de1a6'/>
<id>bc6c0681271788ca7078fb679ac67b56944de1a6</id>
<content type='text'>
Adding HLSL `clip` function.
 - adding llvm intrinsic
 - adding sema checks
 - adding dxil lowering
 - ading spirv lowering
 - adding sema tests
 - adding codegen tests
 - adding lowering tests

Closes #99093

---------

Co-authored-by: Joao Saffran &lt;jderezende@microsoft.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Adding HLSL `clip` function.
 - adding llvm intrinsic
 - adding sema checks
 - adding dxil lowering
 - ading spirv lowering
 - adding sema tests
 - adding codegen tests
 - adding lowering tests

Closes #99093

---------

Co-authored-by: Joao Saffran &lt;jderezende@microsoft.com&gt;</pre>
</div>
</content>
</entry>
</feed>
