<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/flang-rt/lib/runtime/assign.cpp, branch main</title>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[flang] Add special genre for allocatable and pointer device component (#157731)</title>
<updated>2025-09-09T20:12:20+00:00</updated>
<author>
<name>Valentin Clement (バレンタイン クレメン)</name>
<email>clementval@gmail.com</email>
</author>
<published>2025-09-09T20:12:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=d3c09c45aa9807e980f8fb029f2b94d8eb175265'/>
<id>d3c09c45aa9807e980f8fb029f2b94d8eb175265</id>
<content type='text'>
Allocatable and pointer device components need a different allocator
index to be set in their descriptor when it is established. This PR adds
two genres for these components, `AllocatableDevice` and `PointerDevice`,
so the correct allocator index can be set accordingly.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Allocatable and pointer device components need a different allocator
index to be set in their descriptor when it is established. This PR adds
two genres for these components, `AllocatableDevice` and `PointerDevice`,
so the correct allocator index can be set accordingly.</pre>
</div>
</content>
</entry>
<entry>
<title>[Flang][OpenMP][Runtime] Minor Flang runtime for OpenMP AMDGPU modifications (#152631)</title>
<updated>2025-08-29T21:04:48+00:00</updated>
<author>
<name>agozillon</name>
<email>Andrew.Gozillon@amd.com</email>
</author>
<published>2025-08-29T21:04:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=30d2cb5a7ecd34cf19843533681a53aa7bd8d351'/>
<id>30d2cb5a7ecd34cf19843533681a53aa7bd8d351</id>
<content type='text'>
We have some modifications downstream to compile the flang runtime for
amdgpu using clang OpenMP, some more hacky than others, to work around
(hopefully temporary) compiler issues. The additions here are the
non-hacky alterations.

Main changes:
* Create freestanding versions of memcpy, strlen and memmove, and
  replace std:: references with these so that we can default to std:: when
  it's available, or our own Flang implementation when it's not.
* Wrap more bits and pieces of the library in declare target wrappers
  (RT_* macros).
* Fix some warnings that'll pose issues with -Werror on, in this case
  having the namespace in front of variables passed to templates.

Another minor issue that'll likely still pop up, depending on the
program you're linking with, is that abort will be undefined. It is
perhaps possible to solve this with a freestanding implementation, as with
memcpy etc., but we end up with multiple definitions in this case. An
alternative is to create an extern "C" version (which can be empty
or forward on to the builtin).

Co-author: Dan Palermo Dan.Palermo@amd.com</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We have some modifications downstream to compile the flang runtime for
amdgpu using clang OpenMP, some more hacky than others, to work around
(hopefully temporary) compiler issues. The additions here are the
non-hacky alterations.

Main changes:
* Create freestanding versions of memcpy, strlen and memmove, and
  replace std:: references with these so that we can default to std:: when
  it's available, or our own Flang implementation when it's not.
* Wrap more bits and pieces of the library in declare target wrappers
  (RT_* macros).
* Fix some warnings that'll pose issues with -Werror on, in this case
  having the namespace in front of variables passed to templates.

Another minor issue that'll likely still pop up, depending on the
program you're linking with, is that abort will be undefined. It is
perhaps possible to solve this with a freestanding implementation, as with
memcpy etc., but we end up with multiple definitions in this case. An
alternative is to create an extern "C" version (which can be empty
or forward on to the builtin).

Co-author: Dan Palermo Dan.Palermo@amd.com</pre>
</div>
</content>
</entry>
<entry>
<title>[flang][runtime] Handle ALLOCATE(..., short SOURCE=) (#155715)</title>
<updated>2025-08-29T14:50:17+00:00</updated>
<author>
<name>Peter Klausler</name>
<email>pklausler@nvidia.com</email>
</author>
<published>2025-08-29T14:50:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=6111c9cfdcc054306de0a17d9eab5274ca6a34e1'/>
<id>6111c9cfdcc054306de0a17d9eab5274ca6a34e1</id>
<content type='text'>
Ensure that blank padding takes place when a fixed-length character
allocatable is allocated with a short SOURCE= specifier. While here,
clean up DoFromSourceAssign() so that it uses a temporary descriptor on
the stack rather than allocating one from the heap.

Fixes https://github.com/llvm/llvm-project/issues/155703.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Ensure that blank padding takes place when a fixed-length character
allocatable is allocated with a short SOURCE= specifier. While here,
clean up DoFromSourceAssign() so that it uses a temporary descriptor on
the stack rather than allocating one from the heap.

Fixes https://github.com/llvm/llvm-project/issues/155703.</pre>
</div>
</content>
</entry>
<entry>
<title>[flang][runtime] Fix AllocateAssignmentLHS for monomorphic LHS (#153073)</title>
<updated>2025-08-18T21:42:16+00:00</updated>
<author>
<name>Peter Klausler</name>
<email>pklausler@nvidia.com</email>
</author>
<published>2025-08-18T21:42:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=50b55a5ee9c6fd0999c71aeab85c10f1430acb27'/>
<id>50b55a5ee9c6fd0999c71aeab85c10f1430acb27</id>
<content type='text'>
When the left-hand side of an assignment statement is an allocatable
that has a monomorphic derived type, and the right-hand side of the
assignment has a type that is an extension of that type, *don't* change
the incoming type or element size of the descriptor before allocating
it.

Fixes https://github.com/llvm/llvm-project/issues/152758.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When the left-hand side of an assignment statement is an allocatable
that has a monomorphic derived type, and the right-hand side of the
assignment has a type that is an extension of that type, *don't* change
the incoming type or element size of the descriptor before allocating
it.

Fixes https://github.com/llvm/llvm-project/issues/152758.</pre>
</div>
</content>
</entry>
<entry>
<title>[flang][runtime][NFC] Add a comment to intrinsic assignment (#153260)</title>
<updated>2025-08-13T21:38:24+00:00</updated>
<author>
<name>Peter Klausler</name>
<email>pklausler@nvidia.com</email>
</author>
<published>2025-08-13T21:38:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=022bd53b889e9e113f7ec73cb60b1978845fa474'/>
<id>022bd53b889e9e113f7ec73cb60b1978845fa474</id>
<content type='text'>
Add a comment explaining why intrinsic derived type assignment
unconditionally deallocates all allocated allocatable subobject
components of the left-hand side variable, so that I won't forget the
reasoning here the next time this comes into question.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a comment explaining why intrinsic derived type assignment
unconditionally deallocates all allocated allocatable subobject
components of the left-hand side variable, so that I won't forget the
reasoning here the next time this comes into question.</pre>
</div>
</content>
</entry>
<entry>
<title>[flang][runtime] Further work on speeding up work queue operations (#149189)</title>
<updated>2025-07-18T20:44:25+00:00</updated>
<author>
<name>Peter Klausler</name>
<email>pklausler@nvidia.com</email>
</author>
<published>2025-07-18T20:44:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=97a8476068bad449c0340021398b0356a44857aa'/>
<id>97a8476068bad449c0340021398b0356a44857aa</id>
<content type='text'>
This patch avoids a trip through the work queue engine for cases on a
CPU where finalization and destruction actions during assignment were
handled without enqueueing another task.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch avoids a trip through the work queue engine for cases on a
CPU where finalization and destruction actions during assignment were
handled without enqueueing another task.</pre>
</div>
</content>
</entry>
<entry>
<title>[flang][runtime] Speed up initialization &amp; destruction (#148087)</title>
<updated>2025-07-14T18:14:02+00:00</updated>
<author>
<name>Peter Klausler</name>
<email>pklausler@nvidia.com</email>
</author>
<published>2025-07-14T18:14:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=2e53a68c09b1bb4bc6c31146c8e69789ae9848ae'/>
<id>2e53a68c09b1bb4bc6c31146c8e69789ae9848ae</id>
<content type='text'>
Rework derived type initialization in the runtime to just initialize the
first element of any array, and then memcpy it to the others, rather
than exercising the per-component paths for each element.

Rework derived type destruction in the runtime to detect and exploit a
fast path for allocatable components whose types themselves don't need
nested destruction.

Small tweaks were made in hot paths exposed by profiling in descriptor
operations and derived type assignment.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rework derived type initialization in the runtime to just initialize the
first element of any array, and then memcpy it to the others, rather
than exercising the per-component paths for each element.

Rework derived type destruction in the runtime to detect and exploit a
fast path for allocatable components whose types themselves don't need
nested destruction.

Small tweaks were made in hot paths exposed by profiling in descriptor
operations and derived type assignment.</pre>
</div>
</content>
</entry>
<entry>
<title>Fix the type of offset that broke 32-bit flang-rt build to use `uint64_t` consistently (#147359)</title>
<updated>2025-07-08T14:01:43+00:00</updated>
<author>
<name>Daniel Chen</name>
<email>cdchen@ca.ibm.com</email>
</author>
<published>2025-07-08T14:01:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=b84696db745e127cc6fb1d49bbf39f3c0819b6d9'/>
<id>b84696db745e127cc6fb1d49bbf39f3c0819b6d9</id>
<content type='text'>
A recent change to `flang-rt` introduced code like `std::size_t
offset{offset_};`.
It broke the 32-bit `flang-rt` build because `Component::offset_` is of
type `uint64_t`, while the width of `size_t` varies by target.
Clang complains:
```
error: non-constant-expression cannot be narrowed from type 'std::uint64_t' (aka 'unsigned long long') to 'std::size_t' (aka 'unsigned long') in initializer list [-Wc++11-narrowing]
  143 |   std::size_t offset{offset_};
      |                      ^~~~~~~

```

This patch uses `uint64_t` consistently for the offset.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A recent change to `flang-rt` introduced code like `std::size_t
offset{offset_};`.
It broke the 32-bit `flang-rt` build because `Component::offset_` is of
type `uint64_t`, while the width of `size_t` varies by target.
Clang complains:
```
error: non-constant-expression cannot be narrowed from type 'std::uint64_t' (aka 'unsigned long long') to 'std::size_t' (aka 'unsigned long') in initializer list [-Wc++11-narrowing]
  143 |   std::size_t offset{offset_};
      |                      ^~~~~~~

```

This patch uses `uint64_t` consistently for the offset.</pre>
</div>
</content>
</entry>
<entry>
<title>[flang] Restructure runtime to avoid recursion (relanding) (#143993)</title>
<updated>2025-06-16T21:37:01+00:00</updated>
<author>
<name>Peter Klausler</name>
<email>pklausler@nvidia.com</email>
</author>
<published>2025-06-16T21:37:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=2bf3ccabfa37ee1b2d74da7b370cdb16a5cc8ac0'/>
<id>2bf3ccabfa37ee1b2d74da7b370cdb16a5cc8ac0</id>
<content type='text'>
Recursion, both direct and indirect, prevents accurate stack size
calculation at link time for GPU device code. Restructure the following
recursive (often mutually so) routines in the Fortran runtime with new
implementations based on an iterative work queue with
suspendable/resumable work tickets: Assign, Initialize, InitializeClone,
Finalize, and Destroy.

Default derived type I/O is also recursive, but already disabled. It can
be added to this new framework later if the overall approach succeeds.

Note that derived type FINAL subroutine calls, defined assignments, and
defined I/O procedures all perform callbacks into user code, which may
well reenter the runtime library. This kind of recursion is not handled
by this change, although it may be possible to do so in the future using
thread-local work queues.

(Relanding this patch after reverting the initial attempt due to some test
failures that needed some time to analyze and fix.)

Fixes https://github.com/llvm/llvm-project/issues/142481.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Recursion, both direct and indirect, prevents accurate stack size
calculation at link time for GPU device code. Restructure the following
recursive (often mutually so) routines in the Fortran runtime with new
implementations based on an iterative work queue with
suspendable/resumable work tickets: Assign, Initialize, InitializeClone,
Finalize, and Destroy.

Default derived type I/O is also recursive, but already disabled. It can
be added to this new framework later if the overall approach succeeds.

Note that derived type FINAL subroutine calls, defined assignments, and
defined I/O procedures all perform callbacks into user code, which may
well reenter the runtime library. This kind of recursion is not handled
by this change, although it may be possible to do so in the future using
thread-local work queues.

(Relanding this patch after reverting the initial attempt due to some test
failures that needed some time to analyze and fix.)

Fixes https://github.com/llvm/llvm-project/issues/142481.</pre>
</div>
</content>
</entry>
<entry>
<title>Revert runtime work queue patch, it breaks some tests that need investigation (#143713)</title>
<updated>2025-06-11T14:55:06+00:00</updated>
<author>
<name>Peter Klausler</name>
<email>pklausler@nvidia.com</email>
</author>
<published>2025-06-11T14:55:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=10f512f7bbda076ca2a0f9e3fcb2e7be0cb07199'/>
<id>10f512f7bbda076ca2a0f9e3fcb2e7be0cb07199</id>
<content type='text'>
Revert "[flang][runtime] Another try to fix build failure"

This reverts commit 13869cac2b5051e453aa96ad71220d9d33404620.

Revert "[flang][runtime] Fix build bot flang-runtime-cuda-gcc errors
(#143650)"

This reverts commit d75e28477af0baa063a4d4cc7b3cf657cfadd758.

Revert "[flang][runtime] Replace recursion with iterative work queue
(#137727)"

This reverts commit 163c67ad3d1bf7af6590930d8f18700d65ad4564.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Revert "[flang][runtime] Another try to fix build failure"

This reverts commit 13869cac2b5051e453aa96ad71220d9d33404620.

Revert "[flang][runtime] Fix build bot flang-runtime-cuda-gcc errors
(#143650)"

This reverts commit d75e28477af0baa063a4d4cc7b3cf657cfadd758.

Revert "[flang][runtime] Replace recursion with iterative work queue
(#137727)"

This reverts commit 163c67ad3d1bf7af6590930d8f18700d65ad4564.</pre>
</div>
</content>
</entry>
</feed>
