llvm-project.git/llvm/test/CodeGen/AMDGPU/llvm.log.ll, branch users/mingmingl-llvm/samplefdo-profile-format

[AMDGPU] Remove `ApproxFuncFPMath` uses (#155578)

2025-08-28T03:09:01+00:00

`ApproxFuncFPMath` is one of the options handled in `resetTargetOptions`;
this removes its uses in the AMDGPU backend.

[AMDGPU] Remove `UnsafeFPMath` uses (#151079)

2025-07-31T09:36:57+00:00

Remove `UnsafeFPMath` uses in the AMDGPU backend. The option blocks some
clang-related bugfixes, and the ultimate goal is to remove the
`resetTargetOptions` method from `TargetMachine`; see the FIXME in
`resetTargetOptions`.
See also
https://discourse.llvm.org/t/rfc-honor-pragmas-with-ffp-contract-fast

https://discourse.llvm.org/t/allowfpopfusion-vs-sdnodeflags-hasallowcontract

---------

Co-authored-by: Matt Arsenault

[InstSimplify] Add poison propagation for trivially vectorizable intrinsics (#149243)

2025-07-20T02:37:21+00:00

Fixes https://github.com/llvm/llvm-project/issues/146769

Test cases added to
`llvm/test/Transforms/InstSimplify/fold-intrinsics.ll`

[AMDGPU][True16][Codegen] remove packed build_vector pattern from true16 (#148715)

2025-07-18T16:55:11+00:00

Some of the packed build_vector patterns use vgpr_32 for i16/f16/bf16.

In gfx11, bf16 arithmetic gets promoted to f32, and this is done via a
v2i16 pack. In true16 mode this v2i16 pack is selected to a
build_vector/v_lshlrev pattern which only accepts VGPR32. This causes
isel to insert an illegal copy "vgpr32 = copy vgpr16" between def and
use. In the end this illegal copy confuses the CSE pass and triggers
incorrect code elimination.

Remove the packed build_vector pattern from true16. After removal, ISel
will use vgpr16 build_vector patterns instead.

MachineScheduler: Reset next cluster candidate for each node (#139513)

2025-05-28T06:53:46+00:00

When a node is picked, we should reset its next cluster candidate to
null before releasing its successors/predecessors.

[DAGCombiner] Eliminate fp casts if we have the right fast math flags (#131345)

2025-04-28T10:21:51+00:00

When floating-point operations are legalized to operations of a higher
precision (e.g. an f16 fadd being legalized to an f32 fadd), we get a
narrowing followed by a widening operation between each pair of
operations. With the appropriate fast math flags (nnan ninf contract) we
can eliminate these casts.
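
As a rough illustration of why the fold has to be gated on flags, the
following Python sketch (not LLVM code) models an f16 fadd chain that has
been promoted to a wider type. It compares the result with and without the
intermediate narrowing step. The names `round_to_f16`,
`fadd_chain_with_casts` and `fadd_chain_without_casts` are hypothetical, and
Python's double-precision float stands in for f32 here, which is an
assumption that holds for these inputs because the intermediate sums are
exact in both types.

```python
import struct


def round_to_f16(x: float) -> float:
    """Round x to the nearest IEEE half-precision value (models fptrunc)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fadd_chain_with_casts(a: float, b: float, c: float) -> float:
    # First fadd in the wider type, then narrow to f16. The widening cast
    # back is exact, so it needs no modelling.
    t = round_to_f16(a + b)
    # Second fadd in the wider type, then narrow the final result.
    return round_to_f16(t + c)


def fadd_chain_without_casts(a: float, b: float, c: float) -> float:
    # Same chain with the intermediate narrow/widen pair removed:
    # only the final result is rounded to f16.
    return round_to_f16((a + b) + c)


if __name__ == "__main__":
    # f16 has a spacing of 2 between 2048 and 4096, so 2049 is not
    # representable and the intermediate rounding changes the answer.
    print(fadd_chain_with_casts(2048.0, 1.0, 1.0))
    print(fadd_chain_without_casts(2048.0, 1.0, 1.0))
```

The two functions disagree on this input, so dropping the casts is not
value-preserving in general and is only acceptable when the flags on the
operations permit the change in rounding.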

[AMDGPU][True16][CodeGen] update GFX11Plus codegen test with true16 flag (#135078)

2025-04-23T17:06:52+00:00

This is an NFC patch.

This patch runs a bulk update on the CodeGen tests that are affected by
the true16 features. The following steps are applied:
1. Duplicate the GFX11plus run lines and apply them with
"-mattr=+real-true16" and "-mattr=-real-true16".
2. Update the tests with the update script.

For some GISEL run lines, the current CodeGen does not fully support the
true16 version. The run lines are still updated, but the failing ones are
commented out and marked with a "FIXME-TRUE16" comment for easier
tracking. These tests will be fixed in follow-up patches.

The code base is in a transition state in which both "+real-true16" and
"-real-true16" are supported. We plan to move to "+real-true16" as the
default, and finally remove the "-real-true16" mode and its test lines.

AMDGPU: Replace some float undef test uses with poison (#131090)

2025-03-13T13:07:48+00:00

Reland "[AMDGPU] Remove s_delay_alu for VALU->SGPR->SALU (#127212)" (#131111)

2025-03-13T09:26:20+00:00

Consider a VALU->SGPR->SALU dependency (a VALU instruction writing to an
SGPR and an SALU instruction reading from it). When the VALU instruction
is issued, it increments the internal counter VA_SDST, which tracks use
of this SGPR. The SALU instruction will not issue until VA_SDST is zero,
that is, until the VALU instruction has finished writing. Therefore, the
delays added by s_delay_alu are not needed in this situation.
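
The interlock described above can be pictured with a toy Python model. This
is only a sketch of the reasoning, not a description of the hardware: the
class `SgprScoreboard` and its method names are hypothetical, and the
assumption that the counter is decremented when the VALU write completes is
inferred from the text rather than stated in it.

```python
class SgprScoreboard:
    """Toy model of a VA_SDST-style counter guarding SGPR reads by SALU."""

    def __init__(self) -> None:
        self.va_sdst = 0  # outstanding VALU writes to SGPRs

    def issue_valu_sgpr_write(self) -> None:
        # Issuing a VALU instruction that writes an SGPR bumps the counter.
        self.va_sdst += 1

    def complete_valu_sgpr_write(self) -> None:
        # Assumed: the counter drops again once the write has landed.
        if self.va_sdst == 0:
            raise RuntimeError("no outstanding VALU SGPR write")
        self.va_sdst -= 1

    def salu_can_issue(self) -> bool:
        # The SALU reader is held back until the counter is zero.
        return self.va_sdst == 0


if __name__ == "__main__":
    sb = SgprScoreboard()
    sb.issue_valu_sgpr_write()
    print(sb.salu_can_issue())   # SALU is blocked while the write is pending
    sb.complete_valu_sgpr_write()
    print(sb.salu_can_issue())   # SALU may issue once the write is done
```

Because the reader is already held back by the counter in this model, an
additional software-inserted delay would not change when it can issue.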

Revert "[AMDGPU] Remove s_delay_alu for VALU->SGPR->SALU (#127212)"

2025-03-12T19:09:09+00:00

This reverts commit 71582c6667a6334c688734cae628e906b3c1ac1d.

Multiple buildbot failures have been reported:
https://github.com/llvm/llvm-project/pull/127212