| author | Konstantin Bogdanov <thevar1able@users.noreply.github.com> | 2025-06-14 09:32:54 +0300 |
|---|---|---|
| committer | Tom Stellard <tstellar@redhat.com> | 2025-07-08 16:06:32 -0700 |
| commit | 87f0227cb60147a26a1eeb4fb06e3b505e9c7261 | |
| tree | 3794c285c20e5a47fbdc83d87b185a63546739ca /offload | |
| parent | df43f93388b7587c9843838a237dd57a9bd19b52 | |
[InstCombine] Avoid folding `select(umin(X, Y), X)` with min/max values in false arm (#143020)
Refs: llvmorg-20.1.8, release/20.x
Fixes https://github.com/llvm/llvm-project/issues/139050.
This patch adds a check that stops InstCombine from folding a min/max reduction into a select, because the folded form may block loop vectorization.
The issue is that the following snippet:
```
declare i8 @llvm.umin.i8(i8, i8)
define i8 @masked_min_fold_bug(i8 %acc, i8 %val, i8 %mask) {
; CHECK-LABEL: @masked_min_fold_bug(
; CHECK: %cond = icmp eq i8 %mask, 0
; CHECK: %masked_val = select i1 %cond, i8 %val, i8 255
; CHECK: call i8 @llvm.umin.i8(i8 %acc, i8 %masked_val)
;
%cond = icmp eq i8 %mask, 0
%masked_val = select i1 %cond, i8 %val, i8 255
%res = call i8 @llvm.umin.i8(i8 %acc, i8 %masked_val)
ret i8 %res
}
```
is currently optimized into the following code, which cannot be vectorized
later:
```
declare i8 @llvm.umin.i8(i8, i8) #0
define i8 @masked_min_fold_bug(i8 %acc, i8 %val, i8 %mask) {
%cond = icmp eq i8 %mask, 0
%1 = call i8 @llvm.umin.i8(i8 %acc, i8 %val)
%res = select i1 %cond, i8 %1, i8 %acc
ret i8 %res
}
attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
```
Expected:
```
declare i8 @llvm.umin.i8(i8, i8) #0
define i8 @masked_min_fold_bug(i8 %acc, i8 %val, i8 %mask) {
%cond = icmp eq i8 %mask, 0
%masked_val = select i1 %cond, i8 %val, i8 -1
%res = call i8 @llvm.umin.i8(i8 %acc, i8 %masked_val)
ret i8 %res
}
attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
```
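As a side note, the fold itself is value-preserving: 255 (printed as `i8 -1` above) is the identity for unsigned min on i8, so `umin(acc, 255)` is always `acc`. The problem is only the shape of the resulting IR. The following Python model is a minimal sketch of that equivalence, not part of the patch; the helper names `original`, `folded`, and `forms_agree` are hypothetical and exist only for illustration.

```python
# Illustrative model of the two IR forms on unsigned 8-bit values.
# Not part of the patch; all function names here are hypothetical.

I8_UMAX = 255  # identity element for unsigned min on i8 (same bits as i8 -1)


def umin(a, b):
    """Model of @llvm.umin.i8 on values already in the range 0..255."""
    return min(a, b)


def original(acc, val, mask):
    """Form the patch preserves: select feeds the umin."""
    masked_val = val if mask == 0 else I8_UMAX
    return umin(acc, masked_val)


def folded(acc, val, mask):
    """Form produced by the fold: umin feeds the select."""
    return umin(acc, val) if mask == 0 else acc


def forms_agree():
    """Compare both forms for every acc/val pair.

    Only `mask == 0` versus `mask != 0` matters, so a few
    representative mask values are enough.
    """
    return all(
        original(acc, val, mask) == folded(acc, val, mask)
        for acc in range(256)
        for val in range(256)
        for mask in (0, 1, 255)
    )
```

Both forms compute the same result, which is why the fold is legal; the patch skips it only so that the `umin` stays the last operation on the accumulator.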
https://godbolt.org/z/cYMheKE5r
(cherry picked from commit 07fa6d1d90c714fa269529c3e5004a063d814c4a)
Diffstat (limited to 'offload')
0 files changed, 0 insertions, 0 deletions
