llvm-project.git/llvm/lib/Target/AMDGPU/SIModeRegister.cpp, branch main

[AMDGPU][NPM] Port SIModeRegister to NPM (#129014)

2025-03-04T05:21:03+00:00

[AMDGPU][True16][CodeGen] true16 codegen pat for fptrunc_round (#124044)

2025-01-30T23:31:52+00:00

true16 codegen pattern for fptrunc_round f32 to f16.

For the MIR test, split into preGFX11 and postGFX11, and add a true16
and a fake16 test accordingly.

[AMDGPU][True16][MC] 16bit vsrc and vdst support in MC (#104510)

2024-09-11T14:48:11+00:00

This is a large patch that includes the MC-level support for
V_CVT_F16_F32, V_CVT_F32_F16 and V_LDEXP_F16 in true16 format.

This patch includes the asm/disasm changes to encode/decode the 16-bit
vsrc, vdst and src modifiers for the VOP and DPP formats. This patch is
a dependency for many 16-bit instructions, but only three instructions
are updated to make it easier to review.

There will be another patch to support these three instructions at the
codegen level; this patch just replaces these two instructions with
their fake16 format.

AMDGPU: Add f64 to f32 support for llvm.fptrunc.round (#107481)

2024-09-06T05:57:27+00:00

AMDGPU: Add tonearest and towardzero roundings for intrinsic llvm.fptrunc.round (#104486)

2024-08-17T18:22:47+00:00

This work simplifies and generalizes the instruction definition for
intrinsic llvm.fptrunc.round. We no longer name the instruction with the
rounding mode. Instead, we introduce an immediate operand for the
rounding mode for the pseudo instruction. This immediate will be used to
set up the hardware mode register at the time the real instruction is
generated. We name the pseudo instruction as FPTRUNC_ROUND_F16_F32 (for
f32 -> f16), which is easy to generalize for other types.

"round.towardzero" and "round.tonearest" are added for f32 -> f16
truncating, in addition to the existing "round.upward" and
"round.downward". Other rounding modes are not supported by hardware at
this moment.
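
The idea of carrying the rounding mode as an immediate and turning it
into a mode-register update can be sketched as below. This is only an
illustration, not the code from the patch: the names RoundMode,
withDPRounding and needsSetReg are hypothetical, and both the 2-bit
values and the placement of the f16/f64 rounding field in bits [3:2] of
the MODE register are assumptions stated in the comments.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical enum. The numeric values are ASSUMED to match the 2-bit
// hardware rounding encoding; check the ISA documentation before relying
// on them.
enum class RoundMode : uint32_t {
  NearestEven = 0, // "round.tonearest"
  Upward = 1,      // "round.upward", toward +infinity
  Downward = 2,    // "round.downward", toward -infinity
  TowardZero = 3   // "round.towardzero"
};

// ASSUMED layout: bits [1:0] of MODE control f32 rounding and bits [3:2]
// control f16/f64 rounding.
constexpr uint32_t DPRoundShift = 2;
constexpr uint32_t DPRoundMask = 0x3u << DPRoundShift;

// Return Mode with only the f16/f64 rounding field replaced by RM.
inline uint32_t withDPRounding(uint32_t Mode, RoundMode RM) {
  uint32_t Bits = static_cast<uint32_t>(RM);
  assert(Bits <= 3 && "rounding mode must fit in two bits");
  return (Mode & ~DPRoundMask) | (Bits << DPRoundShift);
}

// A mode-register write is only needed when the field actually changes.
inline bool needsSetReg(uint32_t Mode, RoundMode RM) {
  return withDPRounding(Mode, RM) != Mode;
}
```

With this sketch, a pseudo instruction whose immediate is
RoundMode::TowardZero would require a register write when the current
field holds any other value, and none when the field is already 3.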

[AMDGPU] Fix mode register pass for constrained FP operations (#90085)

2024-05-03T17:47:15+00:00

This PR fixes the si-mode-register pass, which inserts an extra setreg
instruction for constrained FP operations. The pass is now skipped for
strictfp functions.

[AMDGPU][NFC] Have helpers to deal with encoding fields. (#82772)

2024-02-23T17:34:55+00:00

These helpers are intended to provide more convenient and less
error-prone facilities for encoding and decoding fields than manually
defined constants and functions.
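
To show the kind of helper this describes, here is a minimal sketch of
a typed bit-field wrapper. It is not the LLVM helper added by the
patch: BitField, HwregId, HwregOffset, HwregWidthM1 and encodeHwreg are
hypothetical names, and the field layout used for the setreg immediate
(id in bits [5:0], offset in bits [10:6], width minus one in bits
[15:11]) is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: a field of Width bits starting at bit Offset of
// a 32-bit word. Keeping offset and width in the type avoids repeating
// hand-written shift and mask constants at each use.
template <unsigned Offset, unsigned Width> struct BitField {
  static_assert(Width > 0 && Width < 32 && Offset + Width <= 32,
                "field must fit in a 32-bit word");

  static constexpr uint32_t maxValue() { return (1u << Width) - 1u; }

  // Place Value into the field position; all other bits are zero.
  static uint32_t encode(uint32_t Value) {
    assert(Value <= maxValue() && "value does not fit in field");
    return Value << Offset;
  }

  // Extract the field from an encoded word.
  static constexpr uint32_t decode(uint32_t Encoded) {
    return (Encoded >> Offset) & maxValue();
  }
};

// ASSUMED layout of the 16-bit setreg immediate.
using HwregId = BitField<0, 6>;
using HwregOffset = BitField<6, 5>;
using HwregWidthM1 = BitField<11, 5>;

// Combine the three fields; Width is stored as Width - 1.
inline uint32_t encodeHwreg(uint32_t Id, uint32_t Off, uint32_t Width) {
  assert(Width >= 1 && Width <= 32 && "width must be in [1, 32]");
  return HwregId::encode(Id) | HwregOffset::encode(Off) |
         HwregWidthM1::encode(Width - 1);
}
```

Under these assumptions, encodeHwreg(1, 0, 4) yields 0x1801, and each
field can be read back with the matching decode() call.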

[AMDGPU] Reapply 'Sign extend simm16 in setreg intrinsic' (#78492)

2024-01-18T01:23:46+00:00

We currently force users to use a negative constant in the intrinsic
call. Changing it to zext would break existing programs, so just sign
extend the argument.
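
The sign extension itself is simple arithmetic; a minimal sketch is
given below. signExtend16 is a hypothetical name and not the function
used in the patch. Any 16-bit immediate with bit 15 set reads as
negative once it is interpreted as a signed value, which is why
existing callers pass negative constants.

```cpp
#include <cassert>
#include <cstdint>

// Interpret the low 16 bits of Imm as a signed two's-complement value.
// Flipping bit 15 and then subtracting 0x8000 maps 0x0000..0x7FFF to
// themselves and 0x8000..0xFFFF to -32768..-1 without relying on
// implementation-defined narrowing conversions.
inline int64_t signExtend16(uint32_t Imm) {
  assert(Imm <= 0xFFFFu && "expected a 16-bit immediate");
  return static_cast<int64_t>(Imm ^ 0x8000u) - 0x8000;
}
```

For example, signExtend16(0xF801) is -2047, while signExtend16(0x1801)
stays 6145 because bit 15 is clear.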

[AMDGPU] Modernize Status and BlockData (NFC)

2023-04-16T20:03:02+00:00

Identified with modernize-use-default-member-init.

[Target] Use llvm::count{l,r}_{zero,one} (NFC)

2023-01-28T17:23:07+00:00