<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/llvm/lib/Transforms/Vectorize/VectorCombine.cpp, branch users/chapuni/cov/single/unify</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[VectorCombine] Use getInstructionCost to cost Shuffle. (#122068)</title>
<updated>2025-01-08T20:48:40+00:00</updated>
<author>
<name>David Green</name>
<email>david.green@arm.com</email>
</author>
<published>2025-01-08T20:48:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=676c641718d0417a000b69917721bcc003d71d93'/>
<id>676c641718d0417a000b69917721bcc003d71d93</id>
<content type='text'>
This allows it to produce a more accurate cost for the shuffle, using
the more accurate calls to getShuffleCost in getInstructionCost. It
helps fix some of the regressions from vector combine a little while
ago, now that we have better subvector extract costs.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This allows it to produce a more accurate cost for the shuffle, using
the more accurate calls to getShuffleCost in getInstructionCost. It
helps fix some of the regressions from vector combine a little while
ago, now that we have better subvector extract costs.</pre>
</div>
</content>
</entry>
<entry>
<title>[CostModel][X86] getVectorInstrCost - correctly cost v4f32 insertelement into index 0</title>
<updated>2025-01-07T12:23:45+00:00</updated>
<author>
<name>Simon Pilgrim</name>
<email>llvm-dev@redking.me.uk</email>
</author>
<published>2025-01-07T12:23:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=a5e129ccdedf5c269a8e0fcad5e21381a7f0342c'/>
<id>a5e129ccdedf5c269a8e0fcad5e21381a7f0342c</id>
<content type='text'>
This is just the MOVSS instruction (SSE41 INSERTPS is still necessary for index != 0)

This exposed an issue in VectorCombine::foldInsExtFNeg - we need to use the more general SK_PermuteTwoSrc shuffle kind to allow getShuffleCost to match other shuffle kinds (not just SK_Select).
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is just the MOVSS instruction (SSE41 INSERTPS is still necessary for index != 0)

This exposed an issue in VectorCombine::foldInsExtFNeg - we need to use the more general SK_PermuteTwoSrc shuffle kind to allow getShuffleCost to match other shuffle kinds (not just SK_Select).
</pre>
</div>
</content>
</entry>
<entry>
<title>[VectorCombine] Remove superfluous whitespace from debug log comment. NFC.</title>
<updated>2025-01-06T15:37:15+00:00</updated>
<author>
<name>Simon Pilgrim</name>
<email>llvm-dev@redking.me.uk</email>
</author>
<published>2025-01-06T15:09:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=d993b11b86dcae75b582939337770eaf1c1a228b'/>
<id>d993b11b86dcae75b582939337770eaf1c1a228b</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[VectorCombine] foldInsExtVectorToShuffle - ignore shuffle costs for 'identity' insertion masks</title>
<updated>2025-01-05T13:02:31+00:00</updated>
<author>
<name>Simon Pilgrim</name>
<email>llvm-dev@redking.me.uk</email>
</author>
<published>2025-01-05T13:02:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=054e7c59713c67ad7b65a92e4b8887076d3881b9'/>
<id>054e7c59713c67ad7b65a92e4b8887076d3881b9</id>
<content type='text'>
&lt;u,1,u,u&gt; 'inplace' single src shuffles can be treated as free identity shuffles - ignore any shuffle cost (similar to what we already do in other folds like foldShuffleOfShuffles) - eventually getShuffleCost should just return TCC_Free in these cases but in a lot of the targets' shuffle cost logic this currently ends up treated as a generic SK_PermuteSingleSrc.

We still want to generate the shuffle as it will help further shuffle folds with the additional PoisonMaskElem 'undemanded' elements.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
&lt;u,1,u,u&gt; 'inplace' single src shuffles can be treated as free identity shuffles - ignore any shuffle cost (similar to what we already do in other folds like foldShuffleOfShuffles) - eventually getShuffleCost should just return TCC_Free in these cases but in a lot of the targets' shuffle cost logic this currently ends up treated as a generic SK_PermuteSingleSrc.

We still want to generate the shuffle as it will help further shuffle folds with the additional PoisonMaskElem 'undemanded' elements.
</pre>
</div>
</content>
</entry>
<entry>
<title>[VectorCombine] foldShuffleOfBinops - fold shuffle(binop(shuffle(x),shuffle(z)),binop(shuffle(y),shuffle(w)) -&gt; binop(shuffle(x,z),shuffle(y,w)) (#120984)</title>
<updated>2025-01-03T10:29:07+00:00</updated>
<author>
<name>Simon Pilgrim</name>
<email>llvm-dev@redking.me.uk</email>
</author>
<published>2025-01-03T10:29:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=e3ec5a728674fd775bb85a7d159acdb4fa1d69c2'/>
<id>e3ec5a728674fd775bb85a7d159acdb4fa1d69c2</id>
<content type='text'>
Some patterns (in particular horizontal style patterns) can end up with shuffles straddling both sides of a binop/cmp.

Where individually the folds aren't worth it, by merging the (oneuse) shuffles we can notably reduce the net instruction count and cost.

One of the final steps towards finally addressing #34072</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some patterns (in particular horizontal style patterns) can end up with shuffles straddling both sides of a binop/cmp.

Where individually the folds aren't worth it, by merging the (oneuse) shuffles we can notably reduce the net instruction count and cost.

One of the final steps towards finally addressing #34072</pre>
</div>
</content>
</entry>
<entry>
<title>[VectorCombine] eraseInstruction - ensure we reattempt to fold other users of an erased instruction's operands (REAPPLIED)</title>
<updated>2025-01-02T18:19:02+00:00</updated>
<author>
<name>Simon Pilgrim</name>
<email>llvm-dev@redking.me.uk</email>
</author>
<published>2025-01-02T18:18:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=035e64c0ec02b237a266ebc672718037fdd53eb2'/>
<id>035e64c0ec02b237a266ebc672718037fdd53eb2</id>
<content type='text'>
As we're reducing the use count of the operands its more likely that they will now fold, as they were previously being prevented by a m_OneUse check, or the cost of retaining the extra instruction had been too high.

This is necessary for some upcoming patches, although the only change so far is instruction ordering as it allows some SSE folds of 256/512-bit with 128-bit subvectors to occur earlier in foldShuffleToIdentity as the subvector concats are free.

Reapplied with a fix for foldSingleElementStore/scalarizeLoadExtract which were replacing/removing memory operations - we need to ensure that the worklist is populated in the correct order so all users of the old memory operations are erased first, so there are no remaining users of the loads when its time to remove them as well.

Pulled out of #120984
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As we're reducing the use count of the operands its more likely that they will now fold, as they were previously being prevented by a m_OneUse check, or the cost of retaining the extra instruction had been too high.

This is necessary for some upcoming patches, although the only change so far is instruction ordering as it allows some SSE folds of 256/512-bit with 128-bit subvectors to occur earlier in foldShuffleToIdentity as the subvector concats are free.

Reapplied with a fix for foldSingleElementStore/scalarizeLoadExtract which were replacing/removing memory operations - we need to ensure that the worklist is populated in the correct order so all users of the old memory operations are erased first, so there are no remaining users of the loads when its time to remove them as well.

Pulled out of #120984
</pre>
</div>
</content>
</entry>
<entry>
<title>[VectorCombine] replaceValue - add "VC: Replacing" debug message to help the log show replacement for old/new.</title>
<updated>2025-01-02T17:23:13+00:00</updated>
<author>
<name>Simon Pilgrim</name>
<email>llvm-dev@redking.me.uk</email>
</author>
<published>2025-01-02T17:17:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=f739aa4004165dc64d3a1f418d5ad3c84886f01a'/>
<id>f739aa4004165dc64d3a1f418d5ad3c84886f01a</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[VectorCombine] scalarizeLoadExtract - consistently use LoadInst and ExtractElementInst specific operand getters. NFC</title>
<updated>2024-12-31T14:42:39+00:00</updated>
<author>
<name>Simon Pilgrim</name>
<email>llvm-dev@redking.me.uk</email>
</author>
<published>2024-12-31T14:42:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=b195bb87e1a0120d8bc6f7fd7e6a7424bd664004'/>
<id>b195bb87e1a0120d8bc6f7fd7e6a7424bd664004</id>
<content type='text'>
Noticed while investigating the hung builds reported after af83093933ca73bc82c33130f8bda9f1ae54aae2
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Noticed while investigating the hung builds reported after af83093933ca73bc82c33130f8bda9f1ae54aae2
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert af83093933ca73bc82c33130f8bda9f1ae54aae2 "[VectorCombine] eraseInstruction - ensure we reattempt to fold other users of an erased instruction's operands"</title>
<updated>2024-12-30T21:20:56+00:00</updated>
<author>
<name>Simon Pilgrim</name>
<email>llvm-dev@redking.me.uk</email>
</author>
<published>2024-12-30T21:20:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=d5a96eb125eaf661f7f9aad4cd184973fe750528'/>
<id>d5a96eb125eaf661f7f9aad4cd184973fe750528</id>
<content type='text'>
Reports of hung builds, but I don't have time to investigate at the moment.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reports of hung builds, but I don't have time to investigate at the moment.
</pre>
</div>
</content>
</entry>
<entry>
<title>[VectorCombine] eraseInstruction - ensure we reattempt to fold other users of an erased instruction's operands</title>
<updated>2024-12-30T17:52:42+00:00</updated>
<author>
<name>Simon Pilgrim</name>
<email>llvm-dev@redking.me.uk</email>
</author>
<published>2024-12-23T17:32:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=af83093933ca73bc82c33130f8bda9f1ae54aae2'/>
<id>af83093933ca73bc82c33130f8bda9f1ae54aae2</id>
<content type='text'>
As we're reducing the use count of the operands its more likely that they will now fold, as they were previously being prevented by a m_OneUse check, or the cost of retaining the extra instruction had been too high.

This is necessary for some upcoming patches, although the only change so far is instruction ordering as it allows some SSE folds of 256/512-bit with 128-bit subvectors to occur earlier in foldShuffleToIdentity as the subvector concats are free.

Pulled out of #120984
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As we're reducing the use count of the operands its more likely that they will now fold, as they were previously being prevented by a m_OneUse check, or the cost of retaining the extra instruction had been too high.

This is necessary for some upcoming patches, although the only change so far is instruction ordering as it allows some SSE folds of 256/512-bit with 128-bit subvectors to occur earlier in foldShuffleToIdentity as the subvector concats are free.

Pulled out of #120984
</pre>
</div>
</content>
</entry>
</feed>
