<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/utils/TableGen/DecoderEmitter.cpp, branch main</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[NFC][TableGen] Fix namespace usage in various files (#161839)</title>
<updated>2025-10-03T16:27:04+00:00</updated>
<author>
<name>Rahul Joshi</name>
<email>rjoshi@nvidia.com</email>
</author>
<published>2025-10-03T16:27:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=bd7e228fa43d7da93aef19892cd8a7de3350835e'/>
<id>bd7e228fa43d7da93aef19892cd8a7de3350835e</id>
<content type='text'>
- Move standalone functions and variables out of anonymous namespace and
make them static.
- Eliminate `namespace llvm {}` wrapping all code in .cpp files, and
instead use namespace qualifier to define such functions
(https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions)
- Add namespace for X86DisassemblerShared.h.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Move standalone functions and variables out of anonymous namespace and
make them static.
- Eliminate `namespace llvm {}` wrapping all code in .cpp files, and
instead use namespace qualifier to define such functions
(https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions)
- Add namespace for X86DisassemblerShared.h.</pre>
</div>
</content>
</entry>
<entry>
<title>[TableGen][DecoderEmitter][RISCV] Always handle `bits&lt;0&gt;` (#159951)</title>
<updated>2025-09-22T17:50:17+00:00</updated>
<author>
<name>Sergei Barannikov</name>
<email>barannikov88@gmail.com</email>
</author>
<published>2025-09-22T17:50:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=6a43c669d17ca6f47beda6c5b2428eb34a24fa4f'/>
<id>6a43c669d17ca6f47beda6c5b2428eb34a24fa4f</id>
<content type='text'>
Previously, `bits&lt;0&gt;` only had effect if `ignore-non-decodable-operands`
wasn't specified. Handle it even if the option was specified. This
should allow for a smoother transition to the option removed.

The change revealed a couple of inaccuracies in RISCV compressed
instruction definitions.
* `C_ADDI4SPN` has `bits&lt;5&gt; rs1` field, but `rs1` is not encoded. It
should be `bits&lt;0&gt;`.
* `C_ADDI16SP` has `bits&lt;5&gt; rd` in the base class, but it is unused
since `Inst{11-7}` is overwritten with constant bits.
We should instead set `rd = 2` and `Inst{11-7} = rd`. There are a couple
of alternative fixes, but this one is the shortest.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Previously, `bits&lt;0&gt;` only had effect if `ignore-non-decodable-operands`
wasn't specified. Handle it even if the option was specified. This
should allow for a smoother transition to the option removed.

The change revealed a couple of inaccuracies in RISCV compressed
instruction definitions.
* `C_ADDI4SPN` has `bits&lt;5&gt; rs1` field, but `rs1` is not encoded. It
should be `bits&lt;0&gt;`.
* `C_ADDI16SP` has `bits&lt;5&gt; rd` in the base class, but it is unused
since `Inst{11-7}` is overwritten with constant bits.
We should instead set `rd = 2` and `Inst{11-7} = rd`. There are a couple
of alternative fixes, but this one is the shortest.</pre>
</div>
</content>
</entry>
<entry>
<title>[TableGen][DecoderEmitter] Rework table construction/emission (#155889)</title>
<updated>2025-09-20T01:58:53+00:00</updated>
<author>
<name>Sergei Barannikov</name>
<email>barannikov88@gmail.com</email>
</author>
<published>2025-09-20T01:58:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=60bdf0965441ef244a4fd79e4cd056359b9d31d5'/>
<id>60bdf0965441ef244a4fd79e4cd056359b9d31d5</id>
<content type='text'>
### Current state

We have FilterChooser class, which can be thought of as a **tree of
encodings**. Tree nodes are instances of FilterChooser itself, and come
in two types:

* A node containing single encoding that has *constant* bits in the
specified bit range, a.k.a. singleton node.
* A node containing only child nodes, where each child represents a set
of encodings that have the same *constant* bits in the specified bit
range.

Either of these nodes can have an additional child, which represents a
set of encodings that have some *unknown* bits in the same bit range.

As can be seen, the **data structure is very high level**.

The encoding tree represented by FilterChooser is then converted into a
finite-state machine (FSM), represented as **byte array**. The
translation is straightforward: for each node of the tree we emit a
sequence of opcodes that check encoding bits and predicates for each
encoding. For a singleton node we also emit a terminal "decode" opcode.

The translation is done in one go, and this has negative consequences:

* We miss optimization opportunities.
* We have to use "fixups" when encoding transitions in the FSM since we
don't know the size of the data we want to jump over in advance. We have
to emit the data first and then fix up the location of the jump. This
means the fixup size has to be large enough to encode the longest jump,
so **most of the transitions are encoded inefficiently**.
* Finally, when converting the FSM into human readable form, we have to
**decode the byte array we've just emitted**. This is also done in one
go, so we **can't do any pretty printing**.

### This PR

We introduce an intermediary data structure, decoder tree, that can be
thought as **AST of the decoder program**.
This data structure is **low level** and as such allows for optimization
and analysis.
It resolves all the issues listed above. We now can:
* Emit more optimal opcode sequences.
* Compute the size of the data to be emitted in advance, avoiding
fixups.
* Do pretty printing.

Serialization is done by a new class, DecoderTableEmitter, which
converts the AST into a FSM in **textual form**, streamed right into the
output file.

### Results
* The new approach immediately resulted in 12% total table size savings
across all in-tree targets, without implementing any optimizations on
the AST. Many tables observe ~20% size reduction.
* The generated file is much more readable.
* The implementation is arguably simpler and more straightforward (the
diff is only +150~200 lines, which feels rather small for the benefits
the change gives).</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
### Current state

We have FilterChooser class, which can be thought of as a **tree of
encodings**. Tree nodes are instances of FilterChooser itself, and come
in two types:

* A node containing single encoding that has *constant* bits in the
specified bit range, a.k.a. singleton node.
* A node containing only child nodes, where each child represents a set
of encodings that have the same *constant* bits in the specified bit
range.

Either of these nodes can have an additional child, which represents a
set of encodings that have some *unknown* bits in the same bit range.

As can be seen, the **data structure is very high level**.

The encoding tree represented by FilterChooser is then converted into a
finite-state machine (FSM), represented as **byte array**. The
translation is straightforward: for each node of the tree we emit a
sequence of opcodes that check encoding bits and predicates for each
encoding. For a singleton node we also emit a terminal "decode" opcode.

The translation is done in one go, and this has negative consequences:

* We miss optimization opportunities.
* We have to use "fixups" when encoding transitions in the FSM since we
don't know the size of the data we want to jump over in advance. We have
to emit the data first and then fix up the location of the jump. This
means the fixup size has to be large enough to encode the longest jump,
so **most of the transitions are encoded inefficiently**.
* Finally, when converting the FSM into human readable form, we have to
**decode the byte array we've just emitted**. This is also done in one
go, so we **can't do any pretty printing**.

### This PR

We introduce an intermediary data structure, decoder tree, that can be
thought as **AST of the decoder program**.
This data structure is **low level** and as such allows for optimization
and analysis.
It resolves all the issues listed above. We now can:
* Emit more optimal opcode sequences.
* Compute the size of the data to be emitted in advance, avoiding
fixups.
* Do pretty printing.

Serialization is done by a new class, DecoderTableEmitter, which
converts the AST into a FSM in **textual form**, streamed right into the
output file.

### Results
* The new approach immediately resulted in 12% total table size savings
across all in-tree targets, without implementing any optimizations on
the AST. Many tables observe ~20% size reduction.
* The generated file is much more readable.
* The implementation is arguably simpler and more straightforward (the
diff is only +150~200 lines, which feels rather small for the benefits
the change gives).</pre>
</div>
</content>
</entry>
<entry>
<title>[TableGen] Remove unused Target from InstructionEncoding methods (NFC) (#159833)</title>
<updated>2025-09-20T00:05:55+00:00</updated>
<author>
<name>Sergei Barannikov</name>
<email>barannikov88@gmail.com</email>
</author>
<published>2025-09-20T00:05:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=386301cd5ccbbf2b65edf627d6e05f14146d127b'/>
<id>386301cd5ccbbf2b65edf627d6e05f14146d127b</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>CodeGen: Add RegisterClass by HwMode (#158269)</title>
<updated>2025-09-19T11:08:51+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2025-09-19T11:08:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=6b54c92be02079eff4f4edfbe667e60c3a1949df'/>
<id>6b54c92be02079eff4f4edfbe667e60c3a1949df</id>
<content type='text'>
This is a generalization of the LookupPtrRegClass mechanism.
AMDGPU has several use cases for swapping the register class of
instruction operands based on the subtarget, but none of them
really fit into the box of being pointer-like.

The current system requires manual management of an arbitrary integer
ID. For the AMDGPU use case, this would end up being around 40 new
entries to manage.

This just introduces the base infrastructure. I have ports of all
the target specific usage of PointerLikeRegClass ready.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is a generalization of the LookupPtrRegClass mechanism.
AMDGPU has several use cases for swapping the register class of
instruction operands based on the subtarget, but none of them
really fit into the box of being pointer-like.

The current system requires manual management of an arbitrary integer
ID. For the AMDGPU use case, this would end up being around 40 new
entries to manage.

This just introduces the base infrastructure. I have ports of all
the target specific usage of PointerLikeRegClass ready.</pre>
</div>
</content>
</entry>
<entry>
<title>[TableGen][DecoderEmitter] Sink repeated code after the switch (NFC) (#159549)</title>
<updated>2025-09-18T11:09:35+00:00</updated>
<author>
<name>Sergei Barannikov</name>
<email>barannikov88@gmail.com</email>
</author>
<published>2025-09-18T11:09:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=3defab36b73d60e616f5d6fe0e88e435c3dfc0dc'/>
<id>3defab36b73d60e616f5d6fe0e88e435c3dfc0dc</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[TableGen][DecoderEmitter] Simplify FilterChooser::getIslands() (NFC) (#159218)</title>
<updated>2025-09-17T19:15:20+00:00</updated>
<author>
<name>Sergei Barannikov</name>
<email>barannikov88@gmail.com</email>
</author>
<published>2025-09-17T19:15:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=2c2fec331d550a894b7af1de4ed492119a7733eb'/>
<id>2c2fec331d550a894b7af1de4ed492119a7733eb</id>
<content type='text'>
Also turn the method into a static function so it can be used without
an instance of the class.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Also turn the method into a static function so it can be used without
an instance of the class.
</pre>
</div>
</content>
</entry>
<entry>
<title>[TableGen][DecoderEmitter] Inline `insertBits()` (NFC) (#159353)</title>
<updated>2025-09-17T13:50:00+00:00</updated>
<author>
<name>Sergei Barannikov</name>
<email>barannikov88@gmail.com</email>
</author>
<published>2025-09-17T13:50:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=db204d92191b370891ef69c621c092a5c9c417bf'/>
<id>db204d92191b370891ef69c621c092a5c9c417bf</id>
<content type='text'>
`tmp` is always of integer type, so we can use bitwise OR and shift.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
`tmp` is always of integer type, so we can use bitwise OR and shift.</pre>
</div>
</content>
</entry>
<entry>
<title>[TableGen][DecoderEmitter] Merge OPC_Decode with OPC_TryDecode (#159178)</title>
<updated>2025-09-16T23:28:47+00:00</updated>
<author>
<name>Sergei Barannikov</name>
<email>barannikov88@gmail.com</email>
</author>
<published>2025-09-16T23:28:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=82acc31fd4b26723b61dae76da120c0c649d0b97'/>
<id>82acc31fd4b26723b61dae76da120c0c649d0b97</id>
<content type='text'>
OPC_Decode is a specialized OPC_TryDecode. The difference between them
is that OPC_TryDecode performs a "completeness check", while  OPC_Decode
asserts that the check passes.

The check is just a boolean test, which is nothing compared to the
complexity of the decoding process, so there is no point in having a
special opcode that optimizes the check.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
OPC_Decode is a specialized OPC_TryDecode. The difference between them
is that OPC_TryDecode performs a "completeness check", while  OPC_Decode
asserts that the check passes.

The check is just a boolean test, which is nothing compared to the
complexity of the decoding process, so there is no point in having a
special opcode that optimizes the check.</pre>
</div>
</content>
</entry>
<entry>
<title>[TableGen][DecoderEmitter] Replace opcode mask with booleans (NFC) (#159113)</title>
<updated>2025-09-16T16:23:29+00:00</updated>
<author>
<name>Sergei Barannikov</name>
<email>barannikov88@gmail.com</email>
</author>
<published>2025-09-16T16:23:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=0864965e54a2ae050898ca0a3bb1d07f8fabe954'/>
<id>0864965e54a2ae050898ca0a3bb1d07f8fabe954</id>
<content type='text'>
Extracted from #155889, which removes inclusion of `MCDecoderOps.h`.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Extracted from #155889, which removes inclusion of `MCDecoderOps.h`.</pre>
</div>
</content>
</entry>
</feed>
