<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/libclc, branch users/boomanaiden154/main.lit-remove-python-27-code-paths-in-builtin-diff</title>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[NFC][libclc] Replace _CLC_V_V_VP_VECTORIZE macro with use of unary_def_with_ptr_scalarize.inc (#157002)</title>
<updated>2025-09-09T00:11:27+00:00</updated>
<author>
<name>Wenju He</name>
<email>wenju.he@intel.com</email>
</author>
<published>2025-09-09T00:11:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=00b13c4103ae65a6c30cf3b0615fa0d5169eed2c'/>
<id>00b13c4103ae65a6c30cf3b0615fa0d5169eed2c</id>
<content type='text'>
Commit d50f2ef437ae removed _CLC_V_V_VP_VECTORIZE from the header file, but
the macro is still used in our downstream code:
https://github.com/intel/llvm/blob/0433e4d6f5c9/libclc/libspirv/lib/ptx-nvidiacl/math/modf.cl#L30
https://github.com/intel/llvm/blob/0433e4d6f5c9/libclc/libspirv/lib/ptx-nvidiacl/math/sincos.cl#L31

We can either revert d50f2ef437ae or replace the macro with
unary_def_with_ptr_scalarize.inc. This PR takes the latter approach.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit d50f2ef437ae removed _CLC_V_V_VP_VECTORIZE from the header file, but
the macro is still used in our downstream code:
https://github.com/intel/llvm/blob/0433e4d6f5c9/libclc/libspirv/lib/ptx-nvidiacl/math/modf.cl#L30
https://github.com/intel/llvm/blob/0433e4d6f5c9/libclc/libspirv/lib/ptx-nvidiacl/math/sincos.cl#L31

We can either revert d50f2ef437ae or replace the macro with
unary_def_with_ptr_scalarize.inc. This PR takes the latter approach.</pre>
</div>
</content>
</entry>
<entry>
<title>[libclc] Implement erf/erfc vector function with loop since scalar function is large (#157055)</title>
<updated>2025-09-05T11:58:24+00:00</updated>
<author>
<name>Wenju He</name>
<email>wenju.he@intel.com</email>
</author>
<published>2025-09-05T11:58:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=a271d07488a85ce677674bbe8101b10efff58c95'/>
<id>a271d07488a85ce677674bbe8101b10efff58c95</id>
<content type='text'>
This PR reduces amdgcn--amdhsa.bc size by 1.8% and nvptx64--nvidiacl.bc
size by 4%.
The loop trip count is constant, so the backend can decide whether to unroll.
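
The idea can be sketched in Python, modelling a vector as a list. This is only an illustration under assumptions: `erf_vec` and `VEC_SIZE` are hypothetical names, `math.erf` stands in for the large scalar CLC function, and the actual change is written in OpenCL C.

```python
import math

VEC_SIZE = 4  # assumed lane count, fixed when the code is written


def erf_vec(x):
    # One call site for the large scalar function, inside a loop whose
    # trip count is the constant VEC_SIZE.
    result = [0.0] * VEC_SIZE
    for lane in range(VEC_SIZE):
        result[lane] = math.erf(x[lane])
    return result


values = erf_vec([0.0, 0.5, 1.0, 2.0])
```

Writing the scalar call once inside the loop, rather than once per lane, is what keeps the generated code small in this model.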

---------

Co-authored-by: Copilot &lt;175728472+Copilot@users.noreply.github.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This PR reduces amdgcn--amdhsa.bc size by 1.8% and nvptx64--nvidiacl.bc
size by 4%.
The loop trip count is constant, so the backend can decide whether to unroll.

---------

Co-authored-by: Copilot &lt;175728472+Copilot@users.noreply.github.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>[libclc] Override generic symbol using llvm-link --override flag instead of using weak linkage (#156778)</title>
<updated>2025-09-05T11:58:07+00:00</updated>
<author>
<name>Wenju He</name>
<email>wenju.he@intel.com</email>
</author>
<published>2025-09-05T11:58:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=28d9255aa7c05738c7fd88711006d71d4dfc952a'/>
<id>28d9255aa7c05738c7fd88711006d71d4dfc952a</id>
<content type='text'>
Before this PR, weak linkage was applied to a few CLC generic functions
to allow a target-specific implementation to override the generic one.
However, weak linkage has the side effect of preventing
inter-procedural optimizations, such as PostOrderFunctionAttrsPass,
because a weak function doesn't have an exact definition (as determined by
hasExactDefinition in the pass).

This PR resolves the issue by passing the --override flag for every
non-generic bitcode file in the llvm-link run. This approach eliminates the
need for weak linkage while still allowing a target-specific
implementation to override the generic one.
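
As a rough model of this linking scheme, the Python sketch below treats each bitcode file as a dictionary from symbol name to definition. The `link` helper and the symbol names are hypothetical, and the sketch assumes that an override input replaces any earlier definition of the same symbol while other duplicates are rejected. It does not call llvm-link.

```python
def link(base_modules, override_modules):
    linked = {}
    # Ordinary inputs: a second definition of the same symbol is an error.
    for module in base_modules:
        for name, body in module.items():
            if name in linked:
                raise ValueError("duplicate definition of " + name)
            linked[name] = body
    # Override inputs: their definitions replace any earlier ones.
    for module in override_modules:
        linked.update(module)
    return linked


generic = {"clc_sqrt": "generic sqrt", "clc_fma": "generic fma"}
amdgcn = {"clc_sqrt": "amdgcn sqrt"}
result = link([generic], [amdgcn])
```

In this model the target-specific `clc_sqrt` wins and `clc_fma` keeps its generic definition, with no symbol needing to be marked weak.
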
llvm-diff shows improved attribute deduction for some functions in
amdgcn--amdhsa.bc, e.g.
  %23 = tail call half @llvm.sqrt.f16(half %22)
=&gt;
  %23 = tail call noundef half @llvm.sqrt.f16(half %22)</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Before this PR, weak linkage was applied to a few CLC generic functions
to allow a target-specific implementation to override the generic one.
However, weak linkage has the side effect of preventing
inter-procedural optimizations, such as PostOrderFunctionAttrsPass,
because a weak function doesn't have an exact definition (as determined by
hasExactDefinition in the pass).

This PR resolves the issue by passing the --override flag for every
non-generic bitcode file in the llvm-link run. This approach eliminates the
need for weak linkage while still allowing a target-specific
implementation to override the generic one.
llvm-diff shows improved attribute deduction for some functions in
amdgcn--amdhsa.bc, e.g.
  %23 = tail call half @llvm.sqrt.f16(half %22)
=&gt;
  %23 = tail call noundef half @llvm.sqrt.f16(half %22)</pre>
</div>
</content>
</entry>
<entry>
<title>[NFC][libclc] Set MACRO_ARCH to ${ARCH} unconditionally before customizing (#156789)</title>
<updated>2025-09-04T23:35:40+00:00</updated>
<author>
<name>Wenju He</name>
<email>wenju.he@intel.com</email>
</author>
<published>2025-09-04T23:35:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=fb5626fdd52905a0f4ade44221f83f0e469b9a8c'/>
<id>fb5626fdd52905a0f4ade44221f83f0e469b9a8c</id>
<content type='text'>
Our downstream libclc adds a few more targets that customize build_flags
and opt_flags, and each customization block defines MACRO_ARCH as
${ARCH}.
Hoisting the MACRO_ARCH definition out of the if-else-end block avoids code
duplication. This also avoids potential errors when the MACRO_ARCH definition
is forgotten, e.g. in https://github.com/intel/llvm/pull/19971.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Our downstream libclc adds a few more targets that customize build_flags
and opt_flags, and each customization block defines MACRO_ARCH as
${ARCH}.
Hoisting the MACRO_ARCH definition out of the if-else-end block avoids code
duplication. This also avoids potential errors when the MACRO_ARCH definition
is forgotten, e.g. in https://github.com/intel/llvm/pull/19971.</pre>
</div>
</content>
</entry>
<entry>
<title>[NFC][libclc] Remove unused -DCLC_INTERNAL build flag, remove unused M_LOG210 (#156590)</title>
<updated>2025-09-04T22:44:37+00:00</updated>
<author>
<name>Wenju He</name>
<email>wenju.he@intel.com</email>
</author>
<published>2025-09-04T22:44:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=3dff9ac495d4ad98743f6cf267439d3a3c3b2d6b'/>
<id>3dff9ac495d4ad98743f6cf267439d3a3c3b2d6b</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[NFC][libclc] Move _CLC_V_V_VP_VECTORIZE macro into clc_lgamma_r.cl and delete clcmacro.h (#156280)</title>
<updated>2025-09-03T00:23:01+00:00</updated>
<author>
<name>Wenju He</name>
<email>wenju.he@intel.com</email>
</author>
<published>2025-09-03T00:23:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=d50f2ef437aeb1784f7556fd63639487f245ffaa'/>
<id>d50f2ef437aeb1784f7556fd63639487f245ffaa</id>
<content type='text'>
clcmacro.h only defines _CLC_V_V_VP_VECTORIZE, which is used only in
clc/lib/generic/math/clc_lgamma_r.cl.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
clcmacro.h only defines _CLC_V_V_VP_VECTORIZE, which is used only in
clc/lib/generic/math/clc_lgamma_r.cl.</pre>
</div>
</content>
</entry>
<entry>
<title>[libclc] update __clc_mem_fence: add MemorySemantic arg and use __builtin_amdgcn_fence for AMDGPU (#152275)</title>
<updated>2025-09-01T03:03:45+00:00</updated>
<author>
<name>Wenju He</name>
<email>wenju.he@intel.com</email>
</author>
<published>2025-09-01T03:03:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=a247da4f9363116c54b91a37755edd994c56dbf8'/>
<id>a247da4f9363116c54b91a37755edd994c56dbf8</id>
<content type='text'>
AMDGPU needs a MemorySemantic argument, which identifies the memory or
address space to which the memory ordering is applied.

The MemorySemantic is also necessary for implementing the SPIR-V
MemoryBarrier instruction. Additionally, the implementation of
__clc_mem_fence on Intel GPUs requires the MemorySemantic argument.

Using __builtin_amdgcn_fence for AMDGPU is a follow-up to
https://github.com/llvm/llvm-project/pull/151446#discussion_r2254006508

llvm-diff shows no change to nvptx64--nvidiacl.bc.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
AMDGPU needs a MemorySemantic argument, which identifies the memory or
address space to which the memory ordering is applied.

The MemorySemantic is also necessary for implementing the SPIR-V
MemoryBarrier instruction. Additionally, the implementation of
__clc_mem_fence on Intel GPUs requires the MemorySemantic argument.

Using __builtin_amdgcn_fence for AMDGPU is a follow-up to
https://github.com/llvm/llvm-project/pull/151446#discussion_r2254006508

llvm-diff shows no change to nvptx64--nvidiacl.bc.</pre>
</div>
</content>
</entry>
<entry>
<title>libclc: CMake: include GetClangResourceDir (#155836)</title>
<updated>2025-08-28T16:56:33+00:00</updated>
<author>
<name>Romaric Jodin</name>
<email>rjodin@google.com</email>
</author>
<published>2025-08-28T16:56:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=4a7205f892761bedf5208f30c8d30144c84fdd9f'/>
<id>4a7205f892761bedf5208f30c8d30144c84fdd9f</id>
<content type='text'>
`get_clang_resource_dir` is not guaranteed to be available. Make sure it is
by including `GetClangResourceDir`.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
`get_clang_resource_dir` is not guaranteed to be available. Make sure it is
by including `GetClangResourceDir`.</pre>
</div>
</content>
</entry>
<entry>
<title>[libclc] Only create a target per each compile command for cmake MSVC generator (#154479)</title>
<updated>2025-08-21T23:45:42+00:00</updated>
<author>
<name>Wenju He</name>
<email>wenju.he@intel.com</email>
</author>
<published>2025-08-21T23:45:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=e6d095e89c0d762c4b71dc8f6c99481893df07a9'/>
<id>e6d095e89c0d762c4b71dc8f6c99481893df07a9</id>
<content type='text'>
The libclc sequential build issue addressed in commit 0c21d6b4c8ad is
specific to the cmake MSVC generator. Therefore, this PR avoids creating a
large number of targets when a non-MSVC generator is used, such as the
Ninja generator, which is used in pre-merge CI on Windows in the
llvm-project repo. We plan to migrate from the MSVC generator to the Ninja
generator in our downstream CI to fix the flaky cmake bug `Cannot restore
timestamp`, which might be related to the large number of targets.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The libclc sequential build issue addressed in commit 0c21d6b4c8ad is
specific to the cmake MSVC generator. Therefore, this PR avoids creating a
large number of targets when a non-MSVC generator is used, such as the
Ninja generator, which is used in pre-merge CI on Windows in the
llvm-project repo. We plan to migrate from the MSVC generator to the Ninja
generator in our downstream CI to fix the flaky cmake bug `Cannot restore
timestamp`, which might be related to the large number of targets.</pre>
</div>
</content>
</entry>
<entry>
<title>[libclc] Use elementwise ctlz/cttz builtins for CLC clz/ctz (#154535)</title>
<updated>2025-08-21T08:32:03+00:00</updated>
<author>
<name>Fraser Cormack</name>
<email>fraser@codeplay.com</email>
</author>
<published>2025-08-21T08:32:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=5c411b3c0bd6b5ba9546a09919f977fe6bc6ad4c'/>
<id>5c411b3c0bd6b5ba9546a09919f977fe6bc6ad4c</id>
<content type='text'>
Using the elementwise builtin optimizes the vector case; instead of
scalarizing we can compile directly to the vector intrinsics.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Using the elementwise builtin optimizes the vector case; instead of
scalarizing we can compile directly to the vector intrinsics.</pre>
</div>
</content>
</entry>
</feed>
