<feed xmlns='http://www.w3.org/2005/Atom'>
<title>gcc.git/libgomp/libgomp-plugin.h, branch master</title>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/'/>
<entry>
<title>openmp, nvptx: ompx_gnu_managed_mem_alloc</title>
<updated>2025-11-13T14:16:09+00:00</updated>
<author>
<name>Andrew Stubbs</name>
<email>ams@codesourcery.com</email>
</author>
<published>2024-06-28T10:24:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=62174ec27b686bb9656c62a3b53d6cc08c9addb4'/>
<id>62174ec27b686bb9656c62a3b53d6cc08c9addb4</id>
<content type='text'>
This adds support for using Cuda Managed Memory with omp_alloc.  AMD support
will be added in a future patch.

There is one new predefined allocator, "ompx_gnu_managed_mem_alloc", plus a
corresponding memory space, which can be used to allocate memory in the
"managed" space.

The nvptx plugin is modified to make the necessary Cuda calls, via two new
(optional) plugin interfaces.

gcc/fortran/ChangeLog:

	* openmp.cc (is_predefined_allocator): Use GOMP_OMP_PREDEF_ALLOC_MAX
	and GOMP_OMPX_PREDEF_ALLOC_MIN/MAX instead of hardcoded values in the
	comment.

include/ChangeLog:

	* cuda/cuda.h (cuMemAllocManaged): Add declaration and related
	CU_MEM_ATTACH_GLOBAL flag.
	* gomp-constants.h (GOMP_OMPX_PREDEF_ALLOC_MAX): Update to 201.
	(GOMP_OMP_PREDEF_MEMSPACE_MAX): New constant.
	(GOMP_OMPX_PREDEF_MEMSPACE_MIN): New constant.
	(GOMP_OMPX_PREDEF_MEMSPACE_MAX): New constant.

libgomp/ChangeLog:

	* allocator.c (ompx_gnu_max_predefined_alloc): Update to
	ompx_gnu_managed_mem_alloc.
	(_Static_assert): Fix assertion messages for allocators and add
	new assertions for memspace constants.
	(omp_max_predefined_mem_space): New define.
	(ompx_gnu_min_predefined_mem_space): New define.
	(ompx_gnu_max_predefined_mem_space): New define.
	(MEMSPACE_ALLOC): Add check for non-standard memspaces.
	(MEMSPACE_CALLOC): Likewise.
	(MEMSPACE_REALLOC): Likewise.
	(MEMSPACE_VALIDATE): Likewise.
	(predefined_ompx_gnu_alloc_mapping): Add ompx_gnu_managed_mem_space.
	(omp_init_allocator): Add ompx_gnu_managed_mem_space validation.
	* config/gcn/allocator.c (gcn_memspace_alloc): Add check for
	non-standard memspaces.
	(gcn_memspace_calloc): Likewise.
	(gcn_memspace_realloc): Likewise.
	(gcn_memspace_validate): Update to validate standard vs non-standard
	memspaces.
	* config/linux/allocator.c (linux_memspace_alloc): Add managed
	memory space handling.
	(linux_memspace_calloc): Likewise.
	(linux_memspace_free): Likewise.
	(linux_memspace_realloc): Likewise (returns NULL for fallback).
	* config/nvptx/allocator.c (nvptx_memspace_alloc): Add check for
	non-standard memspaces.
	(nvptx_memspace_calloc): Likewise.
	(nvptx_memspace_realloc): Likewise.
	(nvptx_memspace_validate): Update to validate standard vs non-standard
	memspaces.
	* env.c (parse_allocator): Add ompx_gnu_managed_mem_alloc,
	ompx_gnu_managed_mem_space, and some static asserts so I don't forget
	them again.
	* libgomp-plugin.h (GOMP_OFFLOAD_managed_alloc): New declaration.
	(GOMP_OFFLOAD_managed_free): New declaration.
	* libgomp.h (gomp_managed_alloc): New declaration.
	(gomp_managed_free): New declaration.
	(struct gomp_device_descr): Add managed_alloc_func and
	managed_free_func fields.
	* libgomp.texi: Document ompx_gnu_managed_mem_alloc and
	ompx_gnu_managed_mem_space, add C++ template documentation, and
	describe NVPTX and AMD support.
	* omp.h.in: Add ompx_gnu_managed_mem_space and
	ompx_gnu_managed_mem_alloc enumerators, and gnu_managed_mem C++
	allocator template.
	* omp_lib.f90.in: Add Fortran bindings for new allocator and
	memory space.
	* omp_lib.h.in: Likewise.
	* plugin/cuda-lib.def: Add cuMemAllocManaged.
	* plugin/plugin-nvptx.c (nvptx_alloc): Add managed parameter to
	support cuMemAllocManaged.
	(GOMP_OFFLOAD_alloc): Move contents to ...
	(cleanup_and_alloc): ... this new function, and add managed support.
	(GOMP_OFFLOAD_managed_alloc): New function.
	(GOMP_OFFLOAD_managed_free): New function.
	* target.c (gomp_managed_alloc): New function.
	(gomp_managed_free): New function.
	(gomp_load_plugin_for_device): Load optional managed_alloc
	and managed_free plugin APIs.
	* testsuite/lib/libgomp.exp: Add check_effective_target_omp_managedmem.
	* testsuite/libgomp.c++/alloc-managed-1.C: New test.
	* testsuite/libgomp.c/alloc-managed-1.c: New test.
	* testsuite/libgomp.c/alloc-managed-2.c: New test.
	* testsuite/libgomp.c/alloc-managed-3.c: New test.
	* testsuite/libgomp.c/alloc-managed-4.c: New test.
	* testsuite/libgomp.fortran/alloc-managed-1.f90: New test.

Co-authored-by: Kwok Cheung Yeung &lt;kcyeung@baylibre.com&gt;
Co-authored-by: Thomas Schwinge &lt;tschwinge@baylibre.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds support for using Cuda Managed Memory with omp_alloc.  AMD support
will be added in a future patch.

There is one new predefined allocator, "ompx_gnu_managed_mem_alloc", plus a
corresponding memory space, which can be used to allocate memory in the
"managed" space.

The nvptx plugin is modified to make the necessary Cuda calls, via two new
(optional) plugin interfaces.

gcc/fortran/ChangeLog:

	* openmp.cc (is_predefined_allocator): Use GOMP_OMP_PREDEF_ALLOC_MAX
	and GOMP_OMPX_PREDEF_ALLOC_MIN/MAX instead of hardcoded values in the
	comment.

include/ChangeLog:

	* cuda/cuda.h (cuMemAllocManaged): Add declaration and related
	CU_MEM_ATTACH_GLOBAL flag.
	* gomp-constants.h (GOMP_OMPX_PREDEF_ALLOC_MAX): Update to 201.
	(GOMP_OMP_PREDEF_MEMSPACE_MAX): New constant.
	(GOMP_OMPX_PREDEF_MEMSPACE_MIN): New constant.
	(GOMP_OMPX_PREDEF_MEMSPACE_MAX): New constant.

libgomp/ChangeLog:

	* allocator.c (ompx_gnu_max_predefined_alloc): Update to
	ompx_gnu_managed_mem_alloc.
	(_Static_assert): Fix assertion messages for allocators and add
	new assertions for memspace constants.
	(omp_max_predefined_mem_space): New define.
	(ompx_gnu_min_predefined_mem_space): New define.
	(ompx_gnu_max_predefined_mem_space): New define.
	(MEMSPACE_ALLOC): Add check for non-standard memspaces.
	(MEMSPACE_CALLOC): Likewise.
	(MEMSPACE_REALLOC): Likewise.
	(MEMSPACE_VALIDATE): Likewise.
	(predefined_ompx_gnu_alloc_mapping): Add ompx_gnu_managed_mem_space.
	(omp_init_allocator): Add ompx_gnu_managed_mem_space validation.
	* config/gcn/allocator.c (gcn_memspace_alloc): Add check for
	non-standard memspaces.
	(gcn_memspace_calloc): Likewise.
	(gcn_memspace_realloc): Likewise.
	(gcn_memspace_validate): Update to validate standard vs non-standard
	memspaces.
	* config/linux/allocator.c (linux_memspace_alloc): Add managed
	memory space handling.
	(linux_memspace_calloc): Likewise.
	(linux_memspace_free): Likewise.
	(linux_memspace_realloc): Likewise (returns NULL for fallback).
	* config/nvptx/allocator.c (nvptx_memspace_alloc): Add check for
	non-standard memspaces.
	(nvptx_memspace_calloc): Likewise.
	(nvptx_memspace_realloc): Likewise.
	(nvptx_memspace_validate): Update to validate standard vs non-standard
	memspaces.
	* env.c (parse_allocator): Add ompx_gnu_managed_mem_alloc,
	ompx_gnu_managed_mem_space, and some static asserts so I don't forget
	them again.
	* libgomp-plugin.h (GOMP_OFFLOAD_managed_alloc): New declaration.
	(GOMP_OFFLOAD_managed_free): New declaration.
	* libgomp.h (gomp_managed_alloc): New declaration.
	(gomp_managed_free): New declaration.
	(struct gomp_device_descr): Add managed_alloc_func and
	managed_free_func fields.
	* libgomp.texi: Document ompx_gnu_managed_mem_alloc and
	ompx_gnu_managed_mem_space, add C++ template documentation, and
	describe NVPTX and AMD support.
	* omp.h.in: Add ompx_gnu_managed_mem_space and
	ompx_gnu_managed_mem_alloc enumerators, and gnu_managed_mem C++
	allocator template.
	* omp_lib.f90.in: Add Fortran bindings for new allocator and
	memory space.
	* omp_lib.h.in: Likewise.
	* plugin/cuda-lib.def: Add cuMemAllocManaged.
	* plugin/plugin-nvptx.c (nvptx_alloc): Add managed parameter to
	support cuMemAllocManaged.
	(GOMP_OFFLOAD_alloc): Move contents to ...
	(cleanup_and_alloc): ... this new function, and add managed support.
	(GOMP_OFFLOAD_managed_alloc): New function.
	(GOMP_OFFLOAD_managed_free): New function.
	* target.c (gomp_managed_alloc): New function.
	(gomp_managed_free): New function.
	(gomp_load_plugin_for_device): Load optional managed_alloc
	and managed_free plugin APIs.
	* testsuite/lib/libgomp.exp: Add check_effective_target_omp_managedmem.
	* testsuite/libgomp.c++/alloc-managed-1.C: New test.
	* testsuite/libgomp.c/alloc-managed-1.c: New test.
	* testsuite/libgomp.c/alloc-managed-2.c: New test.
	* testsuite/libgomp.c/alloc-managed-3.c: New test.
	* testsuite/libgomp.c/alloc-managed-4.c: New test.
	* testsuite/libgomp.fortran/alloc-managed-1.f90: New test.

Co-authored-by: Kwok Cheung Yeung &lt;kcyeung@baylibre.com&gt;
Co-authored-by: Thomas Schwinge &lt;tschwinge@baylibre.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libgomp, nvptx: Cuda pinned memory</title>
<updated>2025-10-23T11:08:06+00:00</updated>
<author>
<name>Andrew Stubbs</name>
<email>ams@baylibre.com</email>
</author>
<published>2025-10-14T11:22:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=3b8d9d579c2931f1d8d2c89ff67735bc77df55ad'/>
<id>3b8d9d579c2931f1d8d2c89ff67735bc77df55ad</id>
<content type='text'>
Use Cuda to pin memory, instead of Linux mlock, when available.

There are two advantages: firstly, this gives a significant speed boost for
NVPTX offloading, and secondly, it side-steps the usual OS ulimit/rlimit
setting.

The design adds a device-independent plugin API for allocating pinned memory,
and then implements it for NVPTX.  At present, the other supported devices do
not have equivalent capabilities (or requirements).

libgomp/ChangeLog:

	* config/linux/allocator.c: Include assert.h.
	(using_device_for_page_locked): New variable.
	(linux_memspace_alloc): Add init0 parameter. Support device pinning.
	(linux_memspace_calloc): Set init0 to true.
	(linux_memspace_free): Support device pinning.
	(linux_memspace_realloc): Support device pinning.
	(MEMSPACE_ALLOC): Set init0 to false.
	* libgomp-plugin.h
	(GOMP_OFFLOAD_page_locked_host_alloc): New prototype.
	(GOMP_OFFLOAD_page_locked_host_free): Likewise.
	* libgomp.h (gomp_page_locked_host_alloc): Likewise.
	(gomp_page_locked_host_free): Likewise.
	(struct gomp_device_descr): Add page_locked_host_alloc_func and
	page_locked_host_free_func.
	* libgomp.texi: Adjust the docs for the pinned trait.
	* plugin/plugin-nvptx.c
	(GOMP_OFFLOAD_page_locked_host_alloc): New function.
	(GOMP_OFFLOAD_page_locked_host_free): Likewise.
	* target.c (device_for_page_locked): New variable.
	(get_device_for_page_locked): New function.
	(gomp_page_locked_host_alloc): Likewise.
	(gomp_page_locked_host_free): Likewise.
	(gomp_load_plugin_for_device): Add page_locked_host_alloc and
	page_locked_host_free.
	* testsuite/libgomp.c/alloc-pinned-1.c: Change expectations for NVPTX
	devices.
	* testsuite/libgomp.c/alloc-pinned-2.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-3.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-4.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-5.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-6.c: Likewise.

Co-Authored-By: Thomas Schwinge &lt;thomas@codesourcery.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use Cuda to pin memory, instead of Linux mlock, when available.

There are two advantages: firstly, this gives a significant speed boost for
NVPTX offloading, and secondly, it side-steps the usual OS ulimit/rlimit
setting.

The design adds a device-independent plugin API for allocating pinned memory,
and then implements it for NVPTX.  At present, the other supported devices do
not have equivalent capabilities (or requirements).

libgomp/ChangeLog:

	* config/linux/allocator.c: Include assert.h.
	(using_device_for_page_locked): New variable.
	(linux_memspace_alloc): Add init0 parameter. Support device pinning.
	(linux_memspace_calloc): Set init0 to true.
	(linux_memspace_free): Support device pinning.
	(linux_memspace_realloc): Support device pinning.
	(MEMSPACE_ALLOC): Set init0 to false.
	* libgomp-plugin.h
	(GOMP_OFFLOAD_page_locked_host_alloc): New prototype.
	(GOMP_OFFLOAD_page_locked_host_free): Likewise.
	* libgomp.h (gomp_page_locked_host_alloc): Likewise.
	(gomp_page_locked_host_free): Likewise.
	(struct gomp_device_descr): Add page_locked_host_alloc_func and
	page_locked_host_free_func.
	* libgomp.texi: Adjust the docs for the pinned trait.
	* plugin/plugin-nvptx.c
	(GOMP_OFFLOAD_page_locked_host_alloc): New function.
	(GOMP_OFFLOAD_page_locked_host_free): Likewise.
	* target.c (device_for_page_locked): New variable.
	(get_device_for_page_locked): New function.
	(gomp_page_locked_host_alloc): Likewise.
	(gomp_page_locked_host_free): Likewise.
	(gomp_load_plugin_for_device): Add page_locked_host_alloc and
	page_locked_host_free.
	* testsuite/libgomp.c/alloc-pinned-1.c: Change expectations for NVPTX
	devices.
	* testsuite/libgomp.c/alloc-pinned-2.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-3.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-4.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-5.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-6.c: Likewise.

Co-Authored-By: Thomas Schwinge &lt;thomas@codesourcery.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libgomp: Init hash table for 'indirect'-clause of 'declare target' on the host [PR114445, PR119857]</title>
<updated>2025-09-17T06:47:36+00:00</updated>
<author>
<name>Tobias Burnus</name>
<email>tburnus@baylibre.com</email>
</author>
<published>2025-09-17T06:47:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=da5803c794d16deb461c93588461856fbf6e54ac'/>
<id>da5803c794d16deb461c93588461856fbf6e54ac</id>
<content type='text'>
Especially with unified-shared memory and especially with C++'s virtual
functions, it is not uncommon to have on the device a function pointer
that points to the host function - but has an associated device.
If the pointed-to function is (explicitly or implicitly) 'declare target'
with the 'indirect' clause, it is added to the lookup table.

Before this commit, the conversion of the lookup table into a lookup
hash table happened every time a device kernel was launched on the first
team - albeit if already converted, the function immediately returned.

Ignoring the overhead, there was also a race: If multiple teams were
launched, it could happen that another team of the same target region
already tried to use the lookup table while it was still being created.
Likewise, when launching a kernel with 'nowait' and directly afterward
another kernel, there could be a race in creating the table.

With this commit, the creation of the hash table has been moved to the
host-plugin's GOMP_OFFLOAD_load_image. The previous code stored a
pointer to the host/device pointer array, which makes it hard when
creating the hash table on the host (data is needed for finding the
slot) - but accessing it on the device (where the lookup has to work
as well). As the hash-table implementation (only) supports integral
values as payload (0 and 1 having special meaning), the solution was
to move to a uint128_t variable to store both the host and device
address.

As the host-side library is typically dynamically linked and the
device-side one statically, there is the problem of backward
compatibility. The current implementation permits both older
binaries and newer libgomp and newer binaries with older libgomp.
I could imagine us breaking the latter eventually, but for now
there is upward and downward compatibility. (Obviously, the race is
only fixed if new + new is combined.)

Code-wise, on the device there exists GOMP_INDIRECT_ADDR_MAP, which was
updated to point to the host/device-address array. Now additionally
GOMP_INDIRECT_ADDR_HMAP exists, which contains the hash-table map.

If the latter exists, libgomp only updates it and the former remains
a NULL pointer; it is also untouched if there are no indirect functions.
Being NULL therefore avoids the call to the device-side build_indirect_map.
The code also currently supports having no hash table and a linear walk
instead. I think that remained from testing; due to the backward-compat
feature, it can actually be turned off on either side.

libgomp/ChangeLog:

	PR libgomp/119857
	PR libgomp/114445
	* config/accel/target-indirect.c: Change to use uint128_t instead
	of a struct as data structure and add GOMP_INDIRECT_ADDR_HMAP as
	host-accessible variable.
	(struct indirect_map_t): Remove.
	(USE_HASHTAB_LOOKUP, INDIRECT_DEV_ADDR, INDIRECT_HOST_ADDR,
	SET_INDIRECT_HOST_ADDR, SET_INDIRECT_ADDRS): Define.
	(htab_free): Use __builtin_unreachable.
	(htab_hash, htab_eq, GOMP_target_map_indirect_ptr,
	build_indirect_map): Update for new representation and new
	pointer-to-hash variable.
	* config/gcn/team.c (gomp_gcn_enter_kernel): Only call
	build_indirect_map when GOMP_INDIRECT_ADDR_MAP.
	* config/nvptx/team.c (gomp_nvptx_main): Likewise.
	* libgomp-plugin.h (GOMP_INDIRECT_ADDR_HMAP): Define.
	* plugin/plugin-gcn.c: Conditionally include
	build-target-indirect-htab.h.
	(USE_HASHTAB_LOOKUP_FOR_INDIRECT): Define.
	(create_target_indirect_map): New prototype.
	(GOMP_OFFLOAD_load_image): Update to create the device's
	indirect-function hash table on the host.
	* plugin/plugin-nvptx.c: Conditionally include
	build-target-indirect-htab.h.
	(USE_HASHTAB_LOOKUP_FOR_INDIRECT): Define.
	(create_target_indirect_map): New prototype.
	(GOMP_OFFLOAD_load_image): Update to create the device's
	indirect-function hash table on the host.
	* plugin/build-target-indirect-htab.h: New file.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Especially with unified-shared memory and especially with C++'s virtual
functions, it is not uncommon to have on the device a function pointer
that points to the host function - but has an associated device.
If the pointed-to function is (explicitly or implicitly) 'declare target'
with the 'indirect' clause, it is added to the lookup table.

Before this commit, the conversion of the lookup table into a lookup
hash table happened every time a device kernel was launched on the first
team - albeit if already converted, the function immediately returned.

Ignoring the overhead, there was also a race: If multiple teams were
launched, it could happen that another team of the same target region
already tried to use the lookup table while it was still being created.
Likewise, when launching a kernel with 'nowait' and directly afterward
another kernel, there could be a race in creating the table.

With this commit, the creation of the hash table has been moved to the
host-plugin's GOMP_OFFLOAD_load_image. The previous code stored a
pointer to the host/device pointer array, which makes it hard when
creating the hash table on the host (data is needed for finding the
slot) - but accessing it on the device (where the lookup has to work
as well). As the hash-table implementation (only) supports integral
values as payload (0 and 1 having special meaning), the solution was
to move to a uint128_t variable to store both the host and device
address.

As the host-side library is typically dynamically linked and the
device-side one statically, there is the problem of backward
compatibility. The current implementation permits both older
binaries and newer libgomp and newer binaries with older libgomp.
I could imagine us breaking the latter eventually, but for now
there is upward and downward compatibility. (Obviously, the race is
only fixed if new + new is combined.)

Code-wise, on the device there exists GOMP_INDIRECT_ADDR_MAP, which was
updated to point to the host/device-address array. Now additionally
GOMP_INDIRECT_ADDR_HMAP exists, which contains the hash-table map.

If the latter exists, libgomp only updates it and the former remains
a NULL pointer; it is also untouched if there are no indirect functions.
Being NULL therefore avoids the call to the device-side build_indirect_map.
The code also currently supports having no hash table and a linear walk
instead. I think that remained from testing; due to the backward-compat
feature, it can actually be turned off on either side.

libgomp/ChangeLog:

	PR libgomp/119857
	PR libgomp/114445
	* config/accel/target-indirect.c: Change to use uint128_t instead
	of a struct as data structure and add GOMP_INDIRECT_ADDR_HMAP as
	host-accessible variable.
	(struct indirect_map_t): Remove.
	(USE_HASHTAB_LOOKUP, INDIRECT_DEV_ADDR, INDIRECT_HOST_ADDR,
	SET_INDIRECT_HOST_ADDR, SET_INDIRECT_ADDRS): Define.
	(htab_free): Use __builtin_unreachable.
	(htab_hash, htab_eq, GOMP_target_map_indirect_ptr,
	build_indirect_map): Update for new representation and new
	pointer-to-hash variable.
	* config/gcn/team.c (gomp_gcn_enter_kernel): Only call
	build_indirect_map when GOMP_INDIRECT_ADDR_MAP.
	* config/nvptx/team.c (gomp_nvptx_main): Likewise.
	* libgomp-plugin.h (GOMP_INDIRECT_ADDR_HMAP): Define.
	* plugin/plugin-gcn.c: Conditionally include
	build-target-indirect-htab.h.
	(USE_HASHTAB_LOOKUP_FOR_INDIRECT): Define.
	(create_target_indirect_map): New prototype.
	(GOMP_OFFLOAD_load_image): Update to create the device's
	indirect-function hash table on the host.
	* plugin/plugin-nvptx.c: Conditionally include
	build-target-indirect-htab.h.
	(USE_HASHTAB_LOOKUP_FOR_INDIRECT): Define.
	(create_target_indirect_map): New prototype.
	(GOMP_OFFLOAD_load_image): Update to create the device's
	indirect-function hash table on the host.
	* plugin/build-target-indirect-htab.h: New file.
</pre>
</div>
</content>
</entry>
<entry>
<title>libgomp: Add OpenMP's omp_target_memset/omp_target_memset_async</title>
<updated>2025-06-02T15:43:57+00:00</updated>
<author>
<name>Tobias Burnus</name>
<email>tburnus@baylibre.com</email>
</author>
<published>2025-06-02T15:43:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=4e47e2f833732c5d9a3c3e69dc753f99b3a56737'/>
<id>4e47e2f833732c5d9a3c3e69dc753f99b3a56737</id>
<content type='text'>
	PR libgomp/120444

include/ChangeLog:

	* cuda/cuda.h (cuMemsetD8, cuMemsetD8Async): Declare.

libgomp/ChangeLog:

	* libgomp-plugin.h (GOMP_OFFLOAD_memset): Declare.
	* libgomp.h (struct gomp_device_descr): Add memset_func.
	* libgomp.map (GOMP_6.0.1): Add omp_target_memset{,_async}.
	* libgomp.texi (Device Memory Routines): Document them.
	* omp.h.in (omp_target_memset, omp_target_memset_async): Declare.
	* omp_lib.f90.in (omp_target_memset, omp_target_memset_async):
	Add interfaces.
	* omp_lib.h.in (omp_target_memset, omp_target_memset_async): Likewise.
	* plugin/cuda-lib.def: Add cuMemsetD8.
	* plugin/plugin-gcn.c (struct hsa_runtime_fn_info): Add
	hsa_amd_memory_fill_fn.
	(init_hsa_runtime_functions): DLSYM_OPT_FN load it.
	(GOMP_OFFLOAD_memset): New.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memset): New.
	* target.c (omp_target_memset_int, omp_target_memset,
	omp_target_memset_async_helper, omp_target_memset_async): New.
	(gomp_load_plugin_for_device): Add DLSYM (memset).
	* testsuite/libgomp.c-c++-common/omp_target_memset.c: New test.
	* testsuite/libgomp.c-c++-common/omp_target_memset-2.c: New test.
	* testsuite/libgomp.c-c++-common/omp_target_memset-3.c: New test.
	* testsuite/libgomp.fortran/omp_target_memset.f90: New test.
	* testsuite/libgomp.fortran/omp_target_memset-2.f90: New test.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
	PR libgomp/120444

include/ChangeLog:

	* cuda/cuda.h (cuMemsetD8, cuMemsetD8Async): Declare.

libgomp/ChangeLog:

	* libgomp-plugin.h (GOMP_OFFLOAD_memset): Declare.
	* libgomp.h (struct gomp_device_descr): Add memset_func.
	* libgomp.map (GOMP_6.0.1): Add omp_target_memset{,_async}.
	* libgomp.texi (Device Memory Routines): Document them.
	* omp.h.in (omp_target_memset, omp_target_memset_async): Declare.
	* omp_lib.f90.in (omp_target_memset, omp_target_memset_async):
	Add interfaces.
	* omp_lib.h.in (omp_target_memset, omp_target_memset_async): Likewise.
	* plugin/cuda-lib.def: Add cuMemsetD8.
	* plugin/plugin-gcn.c (struct hsa_runtime_fn_info): Add
	hsa_amd_memory_fill_fn.
	(init_hsa_runtime_functions): DLSYM_OPT_FN load it.
	(GOMP_OFFLOAD_memset): New.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memset): New.
	* target.c (omp_target_memset_int, omp_target_memset,
	omp_target_memset_async_helper, omp_target_memset_async): New.
	(gomp_load_plugin_for_device): Add DLSYM (memset).
	* testsuite/libgomp.c-c++-common/omp_target_memset.c: New test.
	* testsuite/libgomp.c-c++-common/omp_target_memset-2.c: New test.
	* testsuite/libgomp.c-c++-common/omp_target_memset-3.c: New test.
	* testsuite/libgomp.fortran/omp_target_memset.f90: New test.
	* testsuite/libgomp.fortran/omp_target_memset-2.f90: New test.
</pre>
</div>
</content>
</entry>
<entry>
<title>libgomp: Add OpenACC's acc_memcpy_device{,_async} routines [PR93226]</title>
<updated>2025-05-29T20:47:06+00:00</updated>
<author>
<name>Tobias Burnus</name>
<email>tburnus@baylibre.com</email>
</author>
<published>2025-05-29T20:47:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=f4aa6b5a8d63050f5d61fcec222ed87be4c0a266'/>
<id>f4aa6b5a8d63050f5d61fcec222ed87be4c0a266</id>
<content type='text'>
libgomp/ChangeLog:

	PR libgomp/93226
	* libgomp-plugin.h (GOMP_OFFLOAD_openacc_async_dev2dev): New
	prototype.
	* libgomp.h (struct acc_dispatch_t): Add dev2dev_func.
	(gomp_copy_dev2dev): New prototype.
	* libgomp.map (OACC_2.6.1): New; add acc_memcpy_device{,_async}.
	* libgomp.texi (acc_memcpy_device): New.
	* oacc-mem.c (memcpy_tofrom_device): Change to take from/to
	device boolean; use memcpy not memmove; add early return if
	size == 0 or same device + same ptr.
	(acc_memcpy_to_device, acc_memcpy_to_device_async,
	acc_memcpy_from_device, acc_memcpy_from_device_async): Update.
	(acc_memcpy_device, acc_memcpy_device_async): New.
	* openacc.f90 (acc_memcpy_device, acc_memcpy_device_async):
	Add interface.
	* openacc_lib.h (acc_memcpy_device, acc_memcpy_device_async):
	Likewise.
	* openacc.h (acc_memcpy_device, acc_memcpy_device_async): Add
	prototype.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_openacc_async_host2dev):
	Update comment.
	(GOMP_OFFLOAD_openacc_async_dev2host): Update call.
	(GOMP_OFFLOAD_openacc_async_dev2dev): New.
	* plugin/plugin-nvptx.c (cuda_memcpy_dev_sanity_check): New.
	(GOMP_OFFLOAD_dev2dev): Call it.
	(GOMP_OFFLOAD_openacc_async_dev2dev): New.
	* target.c (gomp_copy_dev2dev): New.
	(gomp_load_plugin_for_device): Load dev2dev and async_dev2dev.
	* testsuite/libgomp.oacc-c-c++-common/acc_memcpy_device-1.c: New test.
	* testsuite/libgomp.oacc-fortran/acc_memcpy_device-1.f90: New test.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
libgomp/ChangeLog:

	PR libgomp/93226
	* libgomp-plugin.h (GOMP_OFFLOAD_openacc_async_dev2dev): New
	prototype.
	* libgomp.h (struct acc_dispatch_t): Add dev2dev_func.
	(gomp_copy_dev2dev): New prototype.
	* libgomp.map (OACC_2.6.1): New; add acc_memcpy_device{,_async}.
	* libgomp.texi (acc_memcpy_device): New.
	* oacc-mem.c (memcpy_tofrom_device): Change to take from/to
	device boolean; use memcpy not memmove; add early return if
	size == 0 or same device + same ptr.
	(acc_memcpy_to_device, acc_memcpy_to_device_async,
	acc_memcpy_from_device, acc_memcpy_from_device_async): Update.
	(acc_memcpy_device, acc_memcpy_device_async): New.
	* openacc.f90 (acc_memcpy_device, acc_memcpy_device_async):
	Add interface.
	* openacc_lib.h (acc_memcpy_device, acc_memcpy_device_async):
	Likewise.
	* openacc.h (acc_memcpy_device, acc_memcpy_device_async): Add
	prototype.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_openacc_async_host2dev):
	Update comment.
	(GOMP_OFFLOAD_openacc_async_dev2host): Update call.
	(GOMP_OFFLOAD_openacc_async_dev2dev): New.
	* plugin/plugin-nvptx.c (cuda_memcpy_dev_sanity_check): New.
	(GOMP_OFFLOAD_dev2dev): Call it.
	(GOMP_OFFLOAD_openacc_async_dev2dev): New.
	* target.c (gomp_copy_dev2dev): New.
	(gomp_load_plugin_for_device): Load dev2dev and async_dev2dev.
	* testsuite/libgomp.oacc-c-c++-common/acc_memcpy_device-1.c: New test.
	* testsuite/libgomp.oacc-fortran/acc_memcpy_device-1.f90: New test.
</pre>
</div>
</content>
</entry>
<entry>
<title>OpenMP: 'interop' construct - add ME support + target-independent libgomp</title>
<updated>2025-03-21T18:24:16+00:00</updated>
<author>
<name>Paul-Antoine Arras</name>
<email>parras@baylibre.com</email>
</author>
<published>2025-03-13T16:16:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=99e2906ae255fc7b8edb008d7cd47b28b078a809'/>
<id>99e2906ae255fc7b8edb008d7cd47b28b078a809</id>
<content type='text'>
This patch partially enables use of the OpenMP interop construct by adding
middle end support, mostly in the omplower pass, and in the target-independent
part of the libgomp runtime. It follows up on previous patches for C, C++ and
Fortran front ends support. The full interop feature requires another patch to
enable foreign runtime support in libgomp plugins.

gcc/ChangeLog:

	* builtin-types.def
	(BT_FN_VOID_INT_INT_PTR_PTR_PTR_INT_PTR_INT_PTR_UINT_PTR): New.
	* gimple-low.cc (lower_stmt): Handle GIMPLE_OMP_INTEROP.
	* gimple-pretty-print.cc (dump_gimple_omp_interop): New function.
	(pp_gimple_stmt_1): Handle GIMPLE_OMP_INTEROP.
	* gimple.cc (gimple_build_omp_interop): New function.
	(gimple_copy): Handle GIMPLE_OMP_INTEROP.
	* gimple.def (GIMPLE_OMP_INTEROP): Define.
	* gimple.h (gimple_build_omp_interop): Declare.
	(gimple_omp_interop_clauses): New function.
	(gimple_omp_interop_clauses_ptr): Likewise.
	(gimple_omp_interop_set_clauses): Likewise.
	(gimple_return_set_retval): Handle GIMPLE_OMP_INTEROP.
	* gimplify.cc (gimplify_scan_omp_clauses): Handle OMP_CLAUSE_INIT,
	OMP_CLAUSE_USE and OMP_CLAUSE_DESTROY.
	(gimplify_omp_interop): New function.
	(gimplify_expr): Replace sorry with call to gimplify_omp_interop.
	* omp-builtins.def (BUILT_IN_GOMP_INTEROP): Define.
	* omp-low.cc (scan_sharing_clauses): Handle OMP_CLAUSE_INIT,
	OMP_CLAUSE_USE and OMP_CLAUSE_DESTROY.
	(scan_omp_1_stmt): Handle GIMPLE_OMP_INTEROP.
	(lower_omp_interop_action_clauses): New function.
	(lower_omp_interop): Likewise.
	(lower_omp_1): Handle GIMPLE_OMP_INTEROP.

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_omp_clause_destroy): Make addressable.
	(c_parser_omp_clause_init): Make addressable.

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_omp_clause_init): Make addressable.

gcc/fortran/ChangeLog:

	* trans-openmp.cc (gfc_trans_omp_clauses): Make OMP_CLAUSE_DESTROY and
	OMP_CLAUSE_INIT addressable.
	* types.def (BT_FN_VOID_INT_INT_PTR_PTR_PTR_INT_PTR_INT_PTR_UINT_PTR):
	New.

include/ChangeLog:

	* gomp-constants.h (GOMP_DEVICE_DEFAULT_OMP_61, GOMP_INTEROP_TARGET,
	GOMP_INTEROP_TARGETSYNC, GOMP_INTEROP_FLAG_NOWAIT): Define.

libgomp/ChangeLog:

	* icv-device.c (omp_set_default_device): Check
	GOMP_DEVICE_DEFAULT_OMP_61.
	* libgomp-plugin.h (struct interop_obj_t): New.
	(enum gomp_interop_flag): New.
	(GOMP_OFFLOAD_interop): Declare.
	(GOMP_OFFLOAD_get_interop_int): Declare.
	(GOMP_OFFLOAD_get_interop_ptr): Declare.
	(GOMP_OFFLOAD_get_interop_str): Declare.
	(GOMP_OFFLOAD_get_interop_type_desc): Declare.
	* libgomp.h (_LIBGOMP_OMP_LOCK_DEFINED): Define.
	(struct gomp_device_descr): Add interop_func, get_interop_int_func,
	get_interop_ptr_func, get_interop_str_func, get_interop_type_desc_func.
	* libgomp.map: Add GOMP_interop.
	* libgomp_g.h (GOMP_interop): Declare.
	* target.c (resolve_device): Handle GOMP_DEVICE_DEFAULT_OMP_61.
	(omp_get_interop_int): Replace stub with actual implementation.
	(omp_get_interop_ptr): Likewise.
	(omp_get_interop_str): Likewise.
	(omp_get_interop_type_desc): Likewise.
	(struct interop_data_t): Define.
	(gomp_interop_internal): New function.
	(GOMP_interop): Likewise.
	(gomp_load_plugin_for_device): Load symbols for get_interop_int,
	get_interop_ptr, get_interop_str and get_interop_type_desc.
	* testsuite/libgomp.c-c++-common/interop-1.c: New test.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/interop-1.c: Remove dg-prune-output "sorry".
	* c-c++-common/gomp/interop-2.c: Likewise.
	* c-c++-common/gomp/interop-3.c: Likewise.
	* c-c++-common/gomp/interop-4.c: Remove dg-message "not supported".
	* g++.dg/gomp/interop-5.C: Likewise.
	* gfortran.dg/gomp/interop-4.f90: Likewise.
	* c-c++-common/gomp/interop-5.c: New test.
	* gfortran.dg/gomp/interop-5.f90: New test.

Co-authored-by: Tobias Burnus &lt;tburnus@baylibre.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch partially enables use of the OpenMP interop construct by adding
middle end support, mostly in the omplower pass, and in the target-independent
part of the libgomp runtime. It follows up on previous patches for C, C++ and
Fortran front end support. The full interop feature requires another patch to
enable foreign runtime support in libgomp plugins.

gcc/ChangeLog:

	* builtin-types.def
	(BT_FN_VOID_INT_INT_PTR_PTR_PTR_INT_PTR_INT_PTR_UINT_PTR): New.
	* gimple-low.cc (lower_stmt): Handle GIMPLE_OMP_INTEROP.
	* gimple-pretty-print.cc (dump_gimple_omp_interop): New function.
	(pp_gimple_stmt_1): Handle GIMPLE_OMP_INTEROP.
	* gimple.cc (gimple_build_omp_interop): New function.
	(gimple_copy): Handle GIMPLE_OMP_INTEROP.
	* gimple.def (GIMPLE_OMP_INTEROP): Define.
	* gimple.h (gimple_build_omp_interop): Declare.
	(gimple_omp_interop_clauses): New function.
	(gimple_omp_interop_clauses_ptr): Likewise.
	(gimple_omp_interop_set_clauses): Likewise.
	(gimple_return_set_retval): Handle GIMPLE_OMP_INTEROP.
	* gimplify.cc (gimplify_scan_omp_clauses): Handle OMP_CLAUSE_INIT,
	OMP_CLAUSE_USE and OMP_CLAUSE_DESTROY.
	(gimplify_omp_interop): New function.
	(gimplify_expr): Replace sorry with call to gimplify_omp_interop.
	* omp-builtins.def (BUILT_IN_GOMP_INTEROP): Define.
	* omp-low.cc (scan_sharing_clauses): Handle OMP_CLAUSE_INIT,
	OMP_CLAUSE_USE and OMP_CLAUSE_DESTROY.
	(scan_omp_1_stmt): Handle GIMPLE_OMP_INTEROP.
	(lower_omp_interop_action_clauses): New function.
	(lower_omp_interop): Likewise.
	(lower_omp_1): Handle GIMPLE_OMP_INTEROP.

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_omp_clause_destroy): Make addressable.
	(c_parser_omp_clause_init): Make addressable.

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_omp_clause_init): Make addressable.

gcc/fortran/ChangeLog:

	* trans-openmp.cc (gfc_trans_omp_clauses): Make OMP_CLAUSE_DESTROY and
	OMP_CLAUSE_INIT addressable.
	* types.def (BT_FN_VOID_INT_INT_PTR_PTR_PTR_INT_PTR_INT_PTR_UINT_PTR):
	New.

include/ChangeLog:

	* gomp-constants.h (GOMP_DEVICE_DEFAULT_OMP_61, GOMP_INTEROP_TARGET,
	GOMP_INTEROP_TARGETSYNC, GOMP_INTEROP_FLAG_NOWAIT): Define.

libgomp/ChangeLog:

	* icv-device.c (omp_set_default_device): Check
	GOMP_DEVICE_DEFAULT_OMP_61.
	* libgomp-plugin.h (struct interop_obj_t): New.
	(enum gomp_interop_flag): New.
	(GOMP_OFFLOAD_interop): Declare.
	(GOMP_OFFLOAD_get_interop_int): Declare.
	(GOMP_OFFLOAD_get_interop_ptr): Declare.
	(GOMP_OFFLOAD_get_interop_str): Declare.
	(GOMP_OFFLOAD_get_interop_type_desc): Declare.
	* libgomp.h (_LIBGOMP_OMP_LOCK_DEFINED): Define.
	(struct gomp_device_descr): Add interop_func, get_interop_int_func,
	get_interop_ptr_func, get_interop_str_func, get_interop_type_desc_func.
	* libgomp.map: Add GOMP_interop.
	* libgomp_g.h (GOMP_interop): Declare.
	* target.c (resolve_device): Handle GOMP_DEVICE_DEFAULT_OMP_61.
	(omp_get_interop_int): Replace stub with actual implementation.
	(omp_get_interop_ptr): Likewise.
	(omp_get_interop_str): Likewise.
	(omp_get_interop_type_desc): Likewise.
	(struct interop_data_t): Define.
	(gomp_interop_internal): New function.
	(GOMP_interop): Likewise.
	(gomp_load_plugin_for_device): Load symbols for get_interop_int,
	get_interop_ptr, get_interop_str and get_interop_type_desc.
	* testsuite/libgomp.c-c++-common/interop-1.c: New test.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/interop-1.c: Remove dg-prune-output "sorry".
	* c-c++-common/gomp/interop-2.c: Likewise.
	* c-c++-common/gomp/interop-3.c: Likewise.
	* c-c++-common/gomp/interop-4.c: Remove dg-message "not supported".
	* g++.dg/gomp/interop-5.C: Likewise.
	* gfortran.dg/gomp/interop-4.f90: Likewise.
	* c-c++-common/gomp/interop-5.c: New test.
	* gfortran.dg/gomp/interop-5.f90: New test.

Co-authored-by: Tobias Burnus &lt;tburnus@baylibre.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Update copyright years.</title>
<updated>2025-01-02T10:59:57+00:00</updated>
<author>
<name>Jakub Jelinek</name>
<email>jakub@redhat.com</email>
</author>
<published>2025-01-02T10:59:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=6441eb6dc020faae0672ea724dfdb38c6a9bf6a1'/>
<id>6441eb6dc020faae0672ea724dfdb38c6a9bf6a1</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>OpenMP: Add get_device_from_uid/omp_get_uid_from_device routines</title>
<updated>2024-09-20T07:25:33+00:00</updated>
<author>
<name>Tobias Burnus</name>
<email>tburnus@baylibre.com</email>
</author>
<published>2024-09-20T07:25:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=bf4a5efa80ef8438deb0a99c9a02b1f550aaf814'/>
<id>bf4a5efa80ef8438deb0a99c9a02b1f550aaf814</id>
<content type='text'>
These TR13/OpenMP 6.0 routines permit reproducible offloading to
a specific device by mapping an OpenMP device number to a
unique ID (UID). GPU device UIDs should be universally unique;
the host's UID is not.

gcc/ChangeLog:

	* omp-general.cc (omp_runtime_api_procname): Add
	get_device_from_uid and omp_get_uid_from_device routines.

include/ChangeLog:

	* cuda/cuda.h (cuDeviceGetUuid): Declare.
	(cuDeviceGetUuid_v2): Add prototype.

libgomp/ChangeLog:

	* config/gcn/target.c (omp_get_uid_from_device,
	omp_get_device_from_uid): Add stub implementation.
	* config/nvptx/target.c (omp_get_uid_from_device,
	omp_get_device_from_uid): Likewise.
	* fortran.c (omp_get_uid_from_device_,
	omp_get_uid_from_device_8_): New functions.
	* libgomp-plugin.h (GOMP_OFFLOAD_get_uid): Add prototype.
	* libgomp.h (struct gomp_device_descr): Add 'uid' and 'get_uid_func'.
	* libgomp.map (GOMP_6.0): New, including the new UID routines.
	* libgomp.texi (OpenMP Technical Report 13): Mark UID routines as 'Y'.
	(Device Information Routines): Document new UID routines.
	(Offload-Target Specifics): Document UID format.
	* omp.h.in (omp_get_device_from_uid, omp_get_uid_from_device):
	New prototype.
	* omp_lib.f90.in (omp_get_device_from_uid, omp_get_uid_from_device):
	New interface.
	* omp_lib.h.in: Likewise.
	* plugin/cuda-lib.def: Add cuDeviceGetUuid and cuDeviceGetUuid_v2 via
	CUDA_ONE_CALL_MAYBE_NULL.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_uid): New.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_uid): New.
	* target.c (str_omp_initial_device): New static var.
	(STR_OMP_DEV_PREFIX): Define.
	(gomp_get_uid_for_device, omp_get_uid_from_device,
	omp_get_device_from_uid): New.
	(gomp_load_plugin_for_device): DLSYM_OPT the function 'get_uid'.
	(gomp_target_init): Set the device's 'uid' field to NULL.
	* testsuite/libgomp.c/device_uid.c: New test.
	* testsuite/libgomp.fortran/device_uid.f90: New test.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
These TR13/OpenMP 6.0 routines permit reproducible offloading to
a specific device by mapping an OpenMP device number to a
unique ID (UID). GPU device UIDs should be universally unique;
the host's UID is not.

gcc/ChangeLog:

	* omp-general.cc (omp_runtime_api_procname): Add
	get_device_from_uid and omp_get_uid_from_device routines.

include/ChangeLog:

	* cuda/cuda.h (cuDeviceGetUuid): Declare.
	(cuDeviceGetUuid_v2): Add prototype.

libgomp/ChangeLog:

	* config/gcn/target.c (omp_get_uid_from_device,
	omp_get_device_from_uid): Add stub implementation.
	* config/nvptx/target.c (omp_get_uid_from_device,
	omp_get_device_from_uid): Likewise.
	* fortran.c (omp_get_uid_from_device_,
	omp_get_uid_from_device_8_): New functions.
	* libgomp-plugin.h (GOMP_OFFLOAD_get_uid): Add prototype.
	* libgomp.h (struct gomp_device_descr): Add 'uid' and 'get_uid_func'.
	* libgomp.map (GOMP_6.0): New, including the new UID routines.
	* libgomp.texi (OpenMP Technical Report 13): Mark UID routines as 'Y'.
	(Device Information Routines): Document new UID routines.
	(Offload-Target Specifics): Document UID format.
	* omp.h.in (omp_get_device_from_uid, omp_get_uid_from_device):
	New prototype.
	* omp_lib.f90.in (omp_get_device_from_uid, omp_get_uid_from_device):
	New interface.
	* omp_lib.h.in: Likewise.
	* plugin/cuda-lib.def: Add cuDeviceGetUuid and cuDeviceGetUuid_v2 via
	CUDA_ONE_CALL_MAYBE_NULL.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_uid): New.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_uid): New.
	* target.c (str_omp_initial_device): New static var.
	(STR_OMP_DEV_PREFIX): Define.
	(gomp_get_uid_for_device, omp_get_uid_from_device,
	omp_get_device_from_uid): New.
	(gomp_load_plugin_for_device): DLSYM_OPT the function 'get_uid'.
	(gomp_target_init): Set the device's 'uid' field to NULL.
	* testsuite/libgomp.c/device_uid.c: New test.
	* testsuite/libgomp.fortran/device_uid.f90: New test.
</pre>
</div>
</content>
</entry>
<entry>
<title>Update copyright years.</title>
<updated>2024-01-03T11:19:35+00:00</updated>
<author>
<name>Jakub Jelinek</name>
<email>jakub@redhat.com</email>
</author>
<published>2024-01-03T11:19:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=a945c346f57ba40fc80c14ac59be0d43624e559d'/>
<id>a945c346f57ba40fc80c14ac59be0d43624e559d</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>openmp: Add support for the 'indirect' clause in C/C++</title>
<updated>2023-11-07T15:44:50+00:00</updated>
<author>
<name>Kwok Cheung Yeung</name>
<email>kcy@codesourcery.com</email>
</author>
<published>2023-11-07T15:18:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=a49c7d3193bb0fd5589e12e725f5a130725ae171'/>
<id>a49c7d3193bb0fd5589e12e725f5a130725ae171</id>
<content type='text'>
This adds support for the 'indirect' clause in the 'declare target'
directive.  Functions declared as indirect may be called via function
pointers passed from the host in offloaded code.

Virtual calls to member functions via the object pointer in C++ are
currently not supported in target regions.

2023-11-07  Kwok Cheung Yeung  &lt;kcy@codesourcery.com&gt;

gcc/c-family/
	* c-attribs.cc (c_common_attribute_table): Add attribute for
	indirect functions.
	* c-pragma.h (enum pragma_omp_clause): Add entry for indirect clause.

gcc/c/
	* c-decl.cc (c_decl_attributes): Add attribute for indirect
	functions.
	* c-lang.h (c_omp_declare_target_attr): Add indirect field.
	* c-parser.cc (c_parser_omp_clause_name): Handle indirect clause.
	(c_parser_omp_clause_indirect): New.
	(c_parser_omp_all_clauses): Handle indirect clause.
	(OMP_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
	(c_parser_omp_declare_target): Handle indirect clause.  Emit error
	message if device_type or indirect clauses used alone.  Emit error
	if indirect clause used with device_type that is not 'any'.
	(OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
	(c_parser_omp_begin): Handle indirect clause.
	* c-typeck.cc (c_finish_omp_clauses): Handle indirect clause.

gcc/cp/
	* cp-tree.h (cp_omp_declare_target_attr): Add indirect field.
	* decl2.cc (cplus_decl_attributes): Add attribute for indirect
	functions.
	* parser.cc (cp_parser_omp_clause_name): Handle indirect clause.
	(cp_parser_omp_clause_indirect): New.
	(cp_parser_omp_all_clauses): Handle indirect clause.
	(handle_omp_declare_target_clause): Add extra parameter.  Add
	indirect attribute for indirect functions.
	(OMP_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
	(cp_parser_omp_declare_target): Handle indirect clause.  Emit error
	message if device_type or indirect clauses used alone.  Emit error
	if indirect clause used with device_type that is not 'any'.
	(OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
	(cp_parser_omp_begin): Handle indirect clause.
	* semantics.cc (finish_omp_clauses): Handle indirect clause.

gcc/
	* lto-cgraph.cc (enum LTO_symtab_tags): Add tag for indirect
	functions.
	(output_offload_tables): Write indirect functions.
	(input_offload_tables): Read indirect functions.
	* lto-section-names.h (OFFLOAD_IND_FUNC_TABLE_SECTION_NAME): New.
	* omp-builtins.def (BUILT_IN_GOMP_TARGET_MAP_INDIRECT_PTR): New.
	* omp-offload.cc (offload_ind_funcs): New.
	(omp_discover_implicit_declare_target): Add functions marked with
	'omp declare target indirect' to indirect functions list.
	(omp_finish_file): Add indirect functions to section for offload
	indirect functions.
	(execute_omp_device_lower): Redirect indirect calls on target by
	passing function pointer to BUILT_IN_GOMP_TARGET_MAP_INDIRECT_PTR.
	(pass_omp_device_lower::gate): Run pass_omp_device_lower if
	indirect functions are present on an accelerator device.
	* omp-offload.h (offload_ind_funcs): New.
	* tree-core.h (omp_clause_code): Add OMP_CLAUSE_INDIRECT.
	* tree.cc (omp_clause_num_ops): Add entry for OMP_CLAUSE_INDIRECT.
	(omp_clause_code_name): Likewise.
	* tree.h (OMP_CLAUSE_INDIRECT_EXPR): New.
	* config/gcn/mkoffload.cc (process_asm): Process offload_ind_funcs
	section.  Count number of indirect functions.
	(process_obj): Emit number of indirect functions.
	* config/nvptx/mkoffload.cc (ind_func_ids, ind_funcs_tail): New.
	(process): Emit offload_ind_func_table in PTX code.  Emit indirect
	function names and count in image.
	* config/nvptx/nvptx.cc (nvptx_record_offload_symbol): Mark
	indirect functions in PTX code with IND_FUNC_MAP.

gcc/testsuite/
	* c-c++-common/gomp/declare-target-7.c: Update expected error message.
	* c-c++-common/gomp/declare-target-indirect-1.c: New.
	* c-c++-common/gomp/declare-target-indirect-2.c: New.
	* g++.dg/gomp/attrs-21.C (v12): Update expected error message.
	* g++.dg/gomp/declare-target-indirect-1.C: New.
	* gcc.dg/gomp/attrs-21.c (v12): Update expected error message.

include/
	* gomp-constants.h (GOMP_VERSION): Increment to 3.
	(GOMP_VERSION_SUPPORTS_INDIRECT_FUNCS): New.

libgcc/
	* offloadstuff.c (OFFLOAD_IND_FUNC_TABLE_SECTION_NAME): New.
	(__offload_ind_func_table): New.
	(__offload_ind_funcs_end): New.
	(__OFFLOAD_TABLE__): Add entries for indirect functions.

libgomp/
	* Makefile.am (libgomp_la_SOURCES): Add target-indirect.c.
	* Makefile.in: Regenerate.
	* libgomp-plugin.h (GOMP_INDIRECT_ADDR_MAP): New define.
	(GOMP_OFFLOAD_load_image): Add extra argument.
	* libgomp.h (struct indirect_splay_tree_key_s): New.
	(indirect_splay_tree_node, indirect_splay_tree,
	indirect_splay_tree_key): New.
	(indirect_splay_compare): New.
	* libgomp.map (GOMP_5.1.1): Add GOMP_target_map_indirect_ptr.
	* libgomp.texi (OpenMP 5.1): Update documentation on indirect
	calls in target region and on indirect clause.
	(Other new OpenMP 5.2 features): Add entry for virtual function calls.
	* libgomp_g.h (GOMP_target_map_indirect_ptr): Add prototype.
	* oacc-host.c (host_load_image): Add extra argument.
	* target.c (gomp_load_image_to_device): If the GOMP_VERSION is high
	enough, read host indirect functions table and pass to
	load_image_func.
	* config/accel/target-indirect.c: New.
	* config/linux/target-indirect.c: New.
	* config/gcn/team.c (build_indirect_map): Add prototype.
	(gomp_gcn_enter_kernel): Initialize support for indirect
	function calls on GCN target.
	* config/nvptx/team.c (build_indirect_map): Add prototype.
	(gomp_nvptx_main): Initialize support for indirect function
	calls on NVPTX target.
	* plugin/plugin-gcn.c (struct gcn_image_desc): Add field for
	indirect functions count.
	(GOMP_OFFLOAD_load_image): Add extra argument.  If the GOMP_VERSION
	is high enough, build address translation table and copy it to target
	memory.
	* plugin/plugin-nvptx.c (nvptx_tdata): Add field for indirect
	functions count.
	(GOMP_OFFLOAD_load_image): Add extra argument.  If the GOMP_VERSION
	is high enough, build address translation table and copy it to target
	memory.
	* testsuite/libgomp.c-c++-common/declare-target-indirect-1.c: New.
	* testsuite/libgomp.c-c++-common/declare-target-indirect-2.c: New.
	* testsuite/libgomp.c++/declare-target-indirect-1.C: New.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds support for the 'indirect' clause in the 'declare target'
directive.  Functions declared as indirect may be called via function
pointers passed from the host in offloaded code.

Virtual calls to member functions via the object pointer in C++ are
currently not supported in target regions.

2023-11-07  Kwok Cheung Yeung  &lt;kcy@codesourcery.com&gt;

gcc/c-family/
	* c-attribs.cc (c_common_attribute_table): Add attribute for
	indirect functions.
	* c-pragma.h (enum pragma_omp_clause): Add entry for indirect clause.

gcc/c/
	* c-decl.cc (c_decl_attributes): Add attribute for indirect
	functions.
	* c-lang.h (c_omp_declare_target_attr): Add indirect field.
	* c-parser.cc (c_parser_omp_clause_name): Handle indirect clause.
	(c_parser_omp_clause_indirect): New.
	(c_parser_omp_all_clauses): Handle indirect clause.
	(OMP_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
	(c_parser_omp_declare_target): Handle indirect clause.  Emit error
	message if device_type or indirect clauses used alone.  Emit error
	if indirect clause used with device_type that is not 'any'.
	(OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
	(c_parser_omp_begin): Handle indirect clause.
	* c-typeck.cc (c_finish_omp_clauses): Handle indirect clause.

gcc/cp/
	* cp-tree.h (cp_omp_declare_target_attr): Add indirect field.
	* decl2.cc (cplus_decl_attributes): Add attribute for indirect
	functions.
	* parser.cc (cp_parser_omp_clause_name): Handle indirect clause.
	(cp_parser_omp_clause_indirect): New.
	(cp_parser_omp_all_clauses): Handle indirect clause.
	(handle_omp_declare_target_clause): Add extra parameter.  Add
	indirect attribute for indirect functions.
	(OMP_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
	(cp_parser_omp_declare_target): Handle indirect clause.  Emit error
	message if device_type or indirect clauses used alone.  Emit error
	if indirect clause used with device_type that is not 'any'.
	(OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
	(cp_parser_omp_begin): Handle indirect clause.
	* semantics.cc (finish_omp_clauses): Handle indirect clause.

gcc/
	* lto-cgraph.cc (enum LTO_symtab_tags): Add tag for indirect
	functions.
	(output_offload_tables): Write indirect functions.
	(input_offload_tables): Read indirect functions.
	* lto-section-names.h (OFFLOAD_IND_FUNC_TABLE_SECTION_NAME): New.
	* omp-builtins.def (BUILT_IN_GOMP_TARGET_MAP_INDIRECT_PTR): New.
	* omp-offload.cc (offload_ind_funcs): New.
	(omp_discover_implicit_declare_target): Add functions marked with
	'omp declare target indirect' to indirect functions list.
	(omp_finish_file): Add indirect functions to section for offload
	indirect functions.
	(execute_omp_device_lower): Redirect indirect calls on target by
	passing function pointer to BUILT_IN_GOMP_TARGET_MAP_INDIRECT_PTR.
	(pass_omp_device_lower::gate): Run pass_omp_device_lower if
	indirect functions are present on an accelerator device.
	* omp-offload.h (offload_ind_funcs): New.
	* tree-core.h (omp_clause_code): Add OMP_CLAUSE_INDIRECT.
	* tree.cc (omp_clause_num_ops): Add entry for OMP_CLAUSE_INDIRECT.
	(omp_clause_code_name): Likewise.
	* tree.h (OMP_CLAUSE_INDIRECT_EXPR): New.
	* config/gcn/mkoffload.cc (process_asm): Process offload_ind_funcs
	section.  Count number of indirect functions.
	(process_obj): Emit number of indirect functions.
	* config/nvptx/mkoffload.cc (ind_func_ids, ind_funcs_tail): New.
	(process): Emit offload_ind_func_table in PTX code.  Emit indirect
	function names and count in image.
	* config/nvptx/nvptx.cc (nvptx_record_offload_symbol): Mark
	indirect functions in PTX code with IND_FUNC_MAP.

gcc/testsuite/
	* c-c++-common/gomp/declare-target-7.c: Update expected error message.
	* c-c++-common/gomp/declare-target-indirect-1.c: New.
	* c-c++-common/gomp/declare-target-indirect-2.c: New.
	* g++.dg/gomp/attrs-21.C (v12): Update expected error message.
	* g++.dg/gomp/declare-target-indirect-1.C: New.
	* gcc.dg/gomp/attrs-21.c (v12): Update expected error message.

include/
	* gomp-constants.h (GOMP_VERSION): Increment to 3.
	(GOMP_VERSION_SUPPORTS_INDIRECT_FUNCS): New.

libgcc/
	* offloadstuff.c (OFFLOAD_IND_FUNC_TABLE_SECTION_NAME): New.
	(__offload_ind_func_table): New.
	(__offload_ind_funcs_end): New.
	(__OFFLOAD_TABLE__): Add entries for indirect functions.

libgomp/
	* Makefile.am (libgomp_la_SOURCES): Add target-indirect.c.
	* Makefile.in: Regenerate.
	* libgomp-plugin.h (GOMP_INDIRECT_ADDR_MAP): New define.
	(GOMP_OFFLOAD_load_image): Add extra argument.
	* libgomp.h (struct indirect_splay_tree_key_s): New.
	(indirect_splay_tree_node, indirect_splay_tree,
	indirect_splay_tree_key): New.
	(indirect_splay_compare): New.
	* libgomp.map (GOMP_5.1.1): Add GOMP_target_map_indirect_ptr.
	* libgomp.texi (OpenMP 5.1): Update documentation on indirect
	calls in target region and on indirect clause.
	(Other new OpenMP 5.2 features): Add entry for virtual function calls.
	* libgomp_g.h (GOMP_target_map_indirect_ptr): Add prototype.
	* oacc-host.c (host_load_image): Add extra argument.
	* target.c (gomp_load_image_to_device): If the GOMP_VERSION is high
	enough, read host indirect functions table and pass to
	load_image_func.
	* config/accel/target-indirect.c: New.
	* config/linux/target-indirect.c: New.
	* config/gcn/team.c (build_indirect_map): Add prototype.
	(gomp_gcn_enter_kernel): Initialize support for indirect
	function calls on GCN target.
	* config/nvptx/team.c (build_indirect_map): Add prototype.
	(gomp_nvptx_main): Initialize support for indirect function
	calls on NVPTX target.
	* plugin/plugin-gcn.c (struct gcn_image_desc): Add field for
	indirect functions count.
	(GOMP_OFFLOAD_load_image): Add extra argument.  If the GOMP_VERSION
	is high enough, build address translation table and copy it to target
	memory.
	* plugin/plugin-nvptx.c (nvptx_tdata): Add field for indirect
	functions count.
	(GOMP_OFFLOAD_load_image): Add extra argument.  If the GOMP_VERSION
	is high enough, build address translation table and copy it to target
	memory.
	* testsuite/libgomp.c-c++-common/declare-target-indirect-1.c: New.
	* testsuite/libgomp.c-c++-common/declare-target-indirect-2.c: New.
	* testsuite/libgomp.c++/declare-target-indirect-1.C: New.
</pre>
</div>
</content>
</entry>
</feed>
