path: root/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
Age | Commit message | Author
2025-11-20 | [SLP]Check if the non-schedulable phi parent node has unique operands | Alexey Bataev
Need to check whether the non-schedulable phi parent node has unique operands when the incoming node has copyables and the node is commutative. Otherwise, the dependencies may be calculated incorrectly. Fixes #168589
2025-11-19 | [SLP]Fix insertion point for setting for the nodes | Alexey Bataev
Many of the def-use chain problems in the SLP vectorizer stem from some nodes reusing the same instruction as their insertion point. An insertion point is not an instruction but the place between instructions. To set it correctly, it is better to generate a pseudo instruction immediately after the last instruction and use it as the insertion point. This resolves the issues in most cases. Fixes #168512 #168576
2025-11-19 | [SLPVectorizer] Widen constant strided loads. (#162324) | Mikhail Gudim
Given a set of pointers, check if they can be rearranged as follows (%s is a constant):

    %b + 0 * %s + 0
    %b + 0 * %s + 1
    %b + 0 * %s + 2
    ...
    %b + 0 * %s + w

    %b + 1 * %s + 0
    %b + 1 * %s + 1
    %b + 1 * %s + 2
    ...
    %b + 1 * %s + w
    ...

If the pointers can be rearranged in the above pattern, it means that the memory can be accessed with strided loads of width `w` and stride `%s`.
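The rearrangement check can be illustrated on plain integer offsets. The following is a minimal standalone sketch, not the LLVM implementation: `findConstantStride` and `StridedGroups` are hypothetical names, offsets are assumed to be element offsets from a common base, and `Width` is taken to mean the number of consecutive elements per group. Because the offsets are sorted first, a descending access pattern is reported with a positive stride.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical result type: each group holds `Width` consecutive elements,
// and consecutive groups start `Stride` elements apart.
struct StridedGroups {
  std::size_t Width;
  std::int64_t Stride;
};

// Hypothetical helper, not LLVM code. `Offsets` are element offsets from a
// common base pointer, given in any order.
std::optional<StridedGroups>
findConstantStride(std::vector<std::int64_t> Offsets) {
  if (Offsets.size() < 2)
    return std::nullopt;
  std::sort(Offsets.begin(), Offsets.end());

  // The first run of consecutive offsets fixes the group width.
  std::size_t Width = 1;
  while (Width < Offsets.size() && Offsets[Width] == Offsets[Width - 1] + 1)
    ++Width;
  // A single contiguous run is not a strided pattern, and a partial trailing
  // group does not fit the layout.
  if (Width == Offsets.size() || Offsets.size() % Width != 0)
    return std::nullopt;

  const std::int64_t Stride = Offsets[Width] - Offsets[0];
  if (Stride <= 0)
    return std::nullopt; // duplicate offsets

  // Every offset must sit at base + Group * Stride + Lane.
  for (std::size_t I = 0; I < Offsets.size(); ++I) {
    const std::int64_t Group = static_cast<std::int64_t>(I / Width);
    const std::int64_t Lane = static_cast<std::int64_t>(I % Width);
    if (Offsets[I] != Offsets[0] + Group * Stride + Lane)
      return std::nullopt;
  }
  return StridedGroups{Width, Stride};
}
```

For example, the offsets `{9, 0, 17, 8, 1, 16}` sort into three groups of two elements whose starts are eight elements apart, while `{0, 1, 2, 3}` is rejected as an ordinary contiguous access.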
2025-11-19 | [NFC][LLVM] Namespace cleanup in SLPVectorizer (#168623) | Rahul Joshi
- Move file-local functions out of the `llvm` or anonymous namespace and make them static.
- Use namespace qualifier to define `BoUpSLP` class and several template specializations.
2025-11-19 | [TTI] Use MemIntrinsicCostAttributes for getMaskedMemoryOpCost (#168029) | Shih-Po Hung
- Split from #165532. This is a step toward a unified interface for masked/gather-scatter/strided/expand-compress cost modeling.
- Replace the ad-hoc parameter list with a single attributes object.

API change:
```
- InstructionCost getMaskedMemoryOpCost(Opcode, Src, Alignment,
-                                       AddressSpace, CostKind);
+ InstructionCost getMaskedMemoryOpCost(MemIntrinsicCostAttributes,
+                                       CostKind);
```

Notes:
- NFCI intended: callers populate MemIntrinsicCostAttributes with the same information as before.
- Follow-up: migrate gather/scatter, strided, and expand/compress cost queries to the same attributes-based entry point.
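The shape of this refactoring, a positional parameter list folded into one attributes object, can be sketched outside LLVM. Everything below is hypothetical: `MemCostAttrs`, `maskedCostPositional`, `maskedCost` and the toy cost formula are stand-ins chosen for illustration and do not mirror the fields or behavior of `MemIntrinsicCostAttributes`.

```cpp
#include <cstdint>

// Hypothetical stand-ins, not LLVM types.
enum class CostKind { Throughput, Latency };

struct MemCostAttrs {
  unsigned NumElements;         // vector width of the access
  std::uint64_t AlignmentBytes; // known alignment of the address
  unsigned AddressSpace;
  bool IsStore;
};

// Before: every property is a separate positional argument, so each new
// property changes the signature at every call site.
int maskedCostPositional(bool IsStore, unsigned NumElements,
                         std::uint64_t AlignmentBytes, unsigned AddressSpace,
                         CostKind Kind) {
  int Cost = static_cast<int>(NumElements); // toy model: one unit per lane
  if (AlignmentBytes < 16)
    Cost *= 2; // toy penalty for under-aligned accesses
  if (IsStore)
    Cost += 1;
  if (AddressSpace != 0)
    Cost += 1;
  if (Kind == CostKind::Latency)
    Cost += 1;
  return Cost;
}

// After: callers pack the same information into one object; the query
// signature stays stable when new attributes are added.
int maskedCost(const MemCostAttrs &A, CostKind Kind) {
  return maskedCostPositional(A.IsStore, A.NumElements, A.AlignmentBytes,
                              A.AddressSpace, Kind);
}
```

In this sketch both entry points return the same value for the same inputs, which is the sense in which such a change is meant to be behavior-preserving.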
2025-11-18 | [SLP] Invariant loads cannot have a memory dependency on stores. (#167929) | Michael Bedy
2025-11-16 | [SLP]Do not consider split nodes, when checking parent PHI-based nodes | Alexey Bataev
The compiler should not consider split vectorize nodes when checking for non-schedulable PHI-based parent nodes. Only pure PHI nodes must be considered: only they can be treated as explicit users, split nodes cannot. Fixes #168268
2025-11-14 | [SLP]Check if the copyable element is a sub instruciton with abs in isCommutable | Alexey Bataev
Need to check if the non-copyable element is an instruction before actually trying to check its NSW attribute.
2025-11-14 | Revert "[SLP]Check if the copyable element is a sub instruciton with abs in isCommutable" | Alexey Bataev
This reverts commit ddf5bb0a2e2d2dd77bce66173387d62ab7174d9f to fix buildbots https://lab.llvm.org/buildbot/#/builders/11/builds/28083.
2025-11-14 | [SLP]Check if the copyable element is a sub instruciton with abs in isCommutable | Alexey Bataev
Need to check if the non-copyable element is an instruction before actually trying to check its NSW attribute.
2025-11-14 | [SLP]Enable Sub as a base instruction in copyables | Alexey Bataev
The patch adds support for sub instructions as the main instruction in copyable elements. It also adds a check that treats the base instruction as not profitable for selection if at least one instruction with the main opcode is used as an immediate operand.
Reviewers: RKSimon, hiraditya
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/163231
2025-11-11 | [SLP]Be careful when trying match/vectorize copyable nodes with external uses only | Alexey Bataev
Need to be careful when trying to match and/or build a copyable node with instructions that are used only outside the block, if their operands immediately precede such instructions. In this case the insertion point might be the same, which may cause a broken def-use chain. Fixes #167366
2025-11-08 | [llvm] Remove unused local variables (NFC) (#167185) | Kazu Hirata
Identified with bugprone-unused-local-non-trivial-variable.
2025-11-08 | [llvm] Remove unused local variables (NFC) (#167106) | Kazu Hirata
Identified with bugprone-unused-local-non-trivial-variable.
2025-11-06 | [SLP]Gather copyable node, if its parent is copyable, but this node is still used outside of the block only | Alexey Bataev
If the current node is a copyable node, its parent is copyable too, and the current node is still used only outside the block, it is better to cancel scheduling for such a node, because otherwise a wrong def-use chain might be built during vectorization. Fixes #166775
2025-11-03 | [SLP]Do not create copyable node, if parent node is non-schedulable and has a use in binop. | Alexey Bataev
If the parent node is non-schedulable (only externally used instructions), and at least one instruction has multiple uses and is used in the binop, such a copyable node should not be created. Otherwise, it may contain a wrong def-use chain model, which cannot be effectively detected. Fixes #166035
2025-10-31 | [SLP]Fix the minbitwidth analysis for slternate opcodes | Alexey Bataev
If the alternate operation is stricter than the main operation, we cannot rely on the analysis of the main operation. In such a case, it is better to avoid doing the analysis at all, since it may affect the overall result and lead to incorrect optimization. Fixes #165878
2025-10-29 | [SLP] Do not match the gather node with copyable parent, containing insert instruction | Alexey Bataev
If the gather/buildvector node has a match, this matching node has a scheduled copyable parent, and the parent node of the original node has a last instruction that is non-schedulable and is part of the scheduled copyable parent, such a matching node should be excluded as non-matching, since it produces a wrong def-use chain. Fixes #165435
2025-10-28 | [SLP]Check only instructions with unique parent instruction user | Alexey Bataev
Need to re-check the instruction with the non-schedulable parent only if this parent has a user phi node (i.e. it is used only outside the block) and the user instruction has a unique parent instruction. Fixes the issue reported in https://github.com/llvm/llvm-project/commit/20675ee67d048a42482c246e25b284637d55347c#commitcomment-168863594
2025-10-26 | [SLP]Consider non-inst operands, when checking insts, used outside only | Alexey Bataev
If the instructions in the node do not require scheduling and are used only outside the basic block, still need to check whether their operands are non-instructions too. Such nodes should be emitted at the beginning of the block. Fixes #165151
2025-10-21 | [SLP] Check all copyable children for non-schedulable parent nodes | Alexey Bataev
If the parent node is non-schedulable and it includes several copies of the same instruction, its operand might be replaced by the copyable nodes in multiple children nodes, and if the instruction is commutative, they can be used in different operands. The compiler shall consider this opportunity, taking into account that non-copyable children are scheduled only once for the same parent instruction. Fixes #164242
2025-10-20 | Revert "[SLP] Check all copyable children for non-schedulable parent nodes" | Alexey Bataev
This reverts commit e7f370f910701b6c67d41dab80e645227692c58b to fix buildbots https://lab.llvm.org/buildbot/#/builders/213/builds/1056.
2025-10-20 | [SLP] Check all copyable children for non-schedulable parent nodes | Alexey Bataev
If the parent node is non-schedulable and it includes several copies of the same instruction, its operand might be replaced by the copyable nodes in multiple children nodes, and if the instruction is commutative, they can be used in different operands. The compiler shall consider this opportunity, taking into account that non-copyable children are scheduled only once for the same parent instruction. Fixes #164242
2025-10-20 | [SLP]Do not pack div-like copyable values | Alexey Bataev
If the main instruction in the copyables is a div-like instruction, the compiler cannot pack duplicates by extending with poisons: these instructions, once vectorized, would result in undefined behavior. Fixes #164185
2025-10-19 | [SLP]Correctly calculate number of copyable operands | Alexey Bataev
The compiler shall not check for overflow of the number of copyable operands counter, otherwise non-copyable operand can be counted as copyable and lead to a compiler crash. Fixes #164164
2025-10-18 | [SLPVectorizer] Refactor isStridedLoad, NFC. (#163844) | Mikhail Gudim
Move the checks that all strides are the same from `isStridedLoad` to a new function `analyzeConstantStrideCandidate`. This is to reduce the diff for the following MRs which will modify the logic in `analyzeConstantStrideCandidate` to cover the case of widening of the strided load. All the checks that are left in `isStridedLoad` will be reused.
2025-10-17 | [SLP]Fix insert point for copyable node with the last inst, used only outside the block | Alexey Bataev
If the copyable entry has the last instruction used only outside the block, the insertion point for the vector code should be the last instruction itself, not the following one. This prevents wrong def-use sequences, which might be generated for the buildvector nodes. Fixes #163404
2025-10-13 | [slp][profcheck] Mark `select`s as having unknown profile (#162960) | Mircea Trofin
There are 2 cases:
- either the `select` condition is a vector of bools, in which case we don't currently have a way to represent the per-element branch probabilities anyway;
- or the select condition is a scalar, for example from a `llvm.vector.reduce`. We could potentially try and do more here, for instance if the reduced vector contained conditions from other selects.

In either case, IIUC, chances are the `select` doesn't get lowered to a branch; at least I'm not seeing any evidence of that in an internal complex application (CSFDO + ThinLTO). It seems sufficient to mark the selects as unknown (for profiled functions); since that metadata carries with it the pass name (`DEBUG_TYPE`) that marked it as such, we can revisit this if we detect later lowerings of these selects that would have required an actual profile.

Issue #147390
2025-10-13 | [SLP]Enable support for logical ops in copyables (#162945) | Alexey Bataev
Allows using And, Or and Xor instructions as the base for copyables.
2025-10-12 | [SLP]INsert postponed vector value after all uses, if the parent node is PHI | Alexey Bataev
Need to insert the vector value for the postponed gather/buildvector node after all uses not only if the vector value of the user node is a phi, but also if the user node itself is a PHI node, which may produce vector phi + shuffle. Fixes #162799
2025-10-12 | [SLP]Support non-ordered copyable argument in non-commutative instructions | Alexey Bataev
If the non-commutative user has several identical operands and at least one of them (but not the first) is copyable, need to consider this opportunity when calculating the number of dependencies. Otherwise, the schedule bundle might not be scheduled correctly and cause a compiler crash. Fixes #162925
2025-10-10 | [SLP]Do not allow undefs being combined with divs | Alexey Bataev
Undefs/poisons with divs in vector operations lead to undefined behavior, so this combination is disabled. Fixes #162663
2025-10-10 | [SLPVectorizer] Move size checks (NFC). (#161867) | Mikhail Gudim
Add the `analyzeRtStrideCandidate` function. In the future commits we're going to add the capability to widen strided loads to it. So, in this commit, we move the size / type checks into it, since it can possibly change size / type of load.
2025-10-08 | [SLP]Enable SDiv/UDiv support as main op in copyables (#161892) | Alexey Bataev
Allow SDiv/UDiv as a main operation in copyables support.
2025-10-06 | [SLP]Enable Shl as a base opcode in copyables (#156766) | Alexey Bataev
Enables Shl matching for the nodes, where copyable can be modelled as shl %v, 0.
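The modelling idea can be shown on plain integers: a lane that carries an unshifted value still fits a uniform vector shift if its shift amount is 0. This is a standalone sketch, not SLP vectorizer code; `vecShl` is a hypothetical helper and shift amounts are assumed to be smaller than the 32-bit lane width.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Lane-wise left shift over four 32-bit lanes, standing in for a vector shl.
// Assumes every shift amount is below 32.
std::array<std::uint32_t, 4>
vecShl(const std::array<std::uint32_t, 4> &Vals,
       const std::array<std::uint32_t, 4> &Amts) {
  std::array<std::uint32_t, 4> Result{};
  for (std::size_t I = 0; I < Vals.size(); ++I)
    Result[I] = Vals[I] << Amts[I];
  return Result;
}
```

With values `{1, 2, 3, 4}` and shift amounts `{3, 1, 0, 2}`, the third lane plays the role of the copyable element: it passes through unchanged because shifting by 0 is the identity.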
2025-10-01 | [SLPVectorizer] Change arguments of 'isStridedLoad' (NFC) (#160401) | Mikhail Gudim
This is needed to reduce the diff for the future work on widening strided loads. Also, with this change we'll be able to re-use this for the case when each pointer represents a start of a group of contiguous loads.
2025-09-30 | [SLPVectorizer] Clear `TreeEntryToStridedPtrInfoMap`. (#160544) | Mikhail Gudim
We need to clear `TreeEntryToStridedPtrInfoMap` in `deleteTree`.
2025-09-29 | [SLP]Fix mixing xor instructions in the same opcode analysis | Alexey Bataev
Xor with 0 operand should not be compatible with multiplication-based instructions, only with or/xor/add/sub. Fixes #161140
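One way to read this restriction, offered here as an interpretation rather than the commit's stated rationale, is that 0 leaves the left operand unchanged for or, xor, add and sub, but not for mul, whose neutral element is 1. The sketch below checks that on a few sample integers; `Op`, `apply` and `zeroKeepsValue` are hypothetical names unrelated to LLVM.

```cpp
#include <cstdint>
#include <initializer_list>

enum class Op { Add, Sub, Or, Xor, Mul };

// Evaluate `L op R` on plain 64-bit integers.
std::int64_t apply(Op O, std::int64_t L, std::int64_t R) {
  switch (O) {
  case Op::Add:
    return L + R;
  case Op::Sub:
    return L - R;
  case Op::Or:
    return L | R;
  case Op::Xor:
    return L ^ R;
  case Op::Mul:
    return L * R;
  }
  return 0; // not reached for valid enumerators
}

// Returns true if `x op 0 == x` holds for every sampled x, i.e. a 0 on the
// right-hand side behaves as a neutral operand for this operation.
bool zeroKeepsValue(Op O) {
  for (std::int64_t X : {-7, -1, 1, 2, 42})
    if (apply(O, X, 0) != X)
      return false;
  return true;
}
```

Here `zeroKeepsValue` holds for `Add`, `Sub`, `Or` and `Xor` and fails for `Mul`, since `x * 0` collapses every sample to 0.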
2025-09-25 | [SLP]Correctly set the insert point for insertlements with copyable arguments | Alexey Bataev
Need to find the last insertelement instruction in the list for the copyable arguments, otherwise a wrong def-use chain may be built. Fixes #160671
2025-09-23 | [SLPVectorizer] Move size checks (NFC) (#159361) | Mikhail Gudim
Move size checks inside `isStridedLoad`. In the future we plan to possibly change the size and type of strided load there.
2025-09-18 | [SLP]Clear the operands deps of non-schedulable nodes, if previously all operands were copyable | Alexey Bataev
If all operands of the non-schedulable nodes were previously only copyables, need to clear the dependencies of the original schedule data for such copyable operands and recalculate them to correctly handle the number of dependencies. Fixes #159406
2025-09-17 | [PatternMatch] Introduce match functor (NFC) (#159386) | Ramkumar Ramachandra
A common idiom is the usage of the PatternMatch match function within a functional algorithm like all_of. Introduce a match functor to shorten this idiom.
Co-authored-by: Luke Lau <luke@igalia.com>
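The idiom and its shortened form can be sketched without LLVM. In the snippet below the toy pattern `IsConstant`, the free function `match` and the helper `matchFn` are all hypothetical illustrations written for this example; they are not the PatternMatch API and the functor's real name and signature are not taken from the commit.

```cpp
#include <algorithm>
#include <vector>

// Toy pattern: matches integers equal to a fixed constant.
struct IsConstant {
  int Expected;
  bool match(int V) const { return V == Expected; }
};

// Toy counterpart of a `match(Value, Pattern)` entry point.
template <typename Val, typename Pattern>
bool match(const Val &V, const Pattern &P) {
  return P.match(V);
}

// Hypothetical functor factory: binds the pattern once and returns a
// callable that can be handed directly to an algorithm.
template <typename Pattern> auto matchFn(Pattern P) {
  return [P](const auto &V) { return match(V, P); };
}

// The long form: a lambda written out at the call site.
bool allZeroVerbose(const std::vector<int> &Vals) {
  return std::all_of(Vals.begin(), Vals.end(),
                     [](int V) { return match(V, IsConstant{0}); });
}

// The short form: the functor replaces the hand-written lambda.
bool allZero(const std::vector<int> &Vals) {
  return std::all_of(Vals.begin(), Vals.end(), matchFn(IsConstant{0}));
}
```

Both functions give the same answer; the second only removes the boilerplate lambda around the pattern.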
2025-09-17 | [SLP][NFC] Refactor a long `if` into an early `return` (#156410) | Piotr Fusik
2025-09-16 | [SLPVectorizer][NFC] Save stride in a map. (#157706) | Mikhail Gudim
To avoid calculating the stride of a strided load twice, save it in a map.
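The caching pattern behind this change can be sketched as a small memo table. This is an illustration only: `StrideCache`, its integer node ids and the two-offset stride computation are hypothetical, and the sketch does not reproduce the key or value types of the map used in SLPVectorizer.cpp.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical memo table keyed by a node id.
class StrideCache {
public:
  // Returns the stride for `NodeId`, computing it from two offsets only on
  // the first request and serving later requests from the map.
  std::int64_t getStride(int NodeId, std::int64_t FirstOffset,
                         std::int64_t SecondOffset) {
    auto It = Cache.find(NodeId);
    if (It != Cache.end())
      return It->second;
    ++NumComputations;
    const std::int64_t Stride = SecondOffset - FirstOffset;
    Cache.emplace(NodeId, Stride);
    return Stride;
  }

  // Cached entries must be dropped together with the nodes they describe.
  void clear() { Cache.clear(); }

  unsigned NumComputations = 0; // counts actual stride computations

private:
  std::unordered_map<int, std::int64_t> Cache;
};
```

A cache like this has to be emptied whenever the nodes it describes are destroyed, otherwise a stale entry could be returned for a reused key.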
2025-09-15 | [SLP]Add a check if the user itself is commutable | Alexey Bataev
If the commutable instruction can be represented as a non-commutable vector instruction (like add 0, %v can be represented as a part of sub nodes with operation sub %v, 0), its operands might still be reordered, and this should be accounted for when checking for copyables in operands. Fixes #158293
2025-09-14 | [SLPVectorizer] Test -1 stride loads. (#158358) | Mikhail Gudim
Add a test to generate -1 stride load and flags to force this behaviour.
2025-09-11 | [SLP][NFC] Remove unused local variable in lambda (#156835) | Garth Lei
2025-09-10 | [SLP]Recalculate deps if the original instruction scheduled after being copyable | Alexey Bataev
If the original instruction is going to be scheduled after the same instruction has been scheduled as copyable, need to recalculate dependencies. Otherwise, the dependencies may be calculated incorrectly.
2025-09-08 | [SLP]Do not consider SExt/ZExt profitable for demotion, if the user is a bitcast to float | Alexey Bataev
If the user node of the SExt/ZExt node is a bitcast to a floating-point type, the node itself should not be considered legal to demote, since the cast is still required to match the size of the floating-point type. Fixes #157277
2025-09-07 | [SLP]Correctly schedule standalone schedule data, which is part of tree entry | Alexey Bataev
If standalone schedule data relates to a vectorized instruction, still need to schedule it as part of a pseudo-bundle to correctly handle dependencies between its child nodes.