<feed xmlns='http://www.w3.org/2005/Atom'>
<title>gcc.git/libgomp/libgomp.h, branch master</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/'/>
<entry>
<title>openmp, nvptx: ompx_gnu_managed_mem_alloc</title>
<updated>2025-11-13T14:16:09+00:00</updated>
<author>
<name>Andrew Stubbs</name>
<email>ams@codesourcery.com</email>
</author>
<published>2024-06-28T10:24:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=62174ec27b686bb9656c62a3b53d6cc08c9addb4'/>
<id>62174ec27b686bb9656c62a3b53d6cc08c9addb4</id>
<content type='text'>
This adds support for using Cuda Managed Memory with omp_alloc.  AMD support
will be added in a future patch.

There is one new predefined allocator, "ompx_gnu_managed_mem_alloc", plus a
corresponding memory space, which can be used to allocate memory in the
"managed" space.

The nvptx plugin is modified to make the necessary Cuda calls, via two new
(optional) plugin interfaces.

gcc/fortran/ChangeLog:

	* openmp.cc (is_predefined_allocator): Use GOMP_OMP_PREDEF_ALLOC_MAX
	and GOMP_OMPX_PREDEF_ALLOC_MIN/MAX instead of hardcoded values in the
	comment.

include/ChangeLog:

	* cuda/cuda.h (cuMemAllocManaged): Add declaration and related
	CU_MEM_ATTACH_GLOBAL flag.
	* gomp-constants.h (GOMP_OMPX_PREDEF_ALLOC_MAX): Update to 201.
	(GOMP_OMP_PREDEF_MEMSPACE_MAX): New constant.
	(GOMP_OMPX_PREDEF_MEMSPACE_MIN): New constant.
	(GOMP_OMPX_PREDEF_MEMSPACE_MAX): New constant.

libgomp/ChangeLog:

	* allocator.c (ompx_gnu_max_predefined_alloc): Update to
	ompx_gnu_managed_mem_alloc.
	(_Static_assert): Fix assertion messages for allocators and add
	new assertions for memspace constants.
	(omp_max_predefined_mem_space): New define.
	(ompx_gnu_min_predefined_mem_space): New define.
	(ompx_gnu_max_predefined_mem_space): New define.
	(MEMSPACE_ALLOC): Add check for non-standard memspaces.
	(MEMSPACE_CALLOC): Likewise.
	(MEMSPACE_REALLOC): Likewise.
	(MEMSPACE_VALIDATE): Likewise.
	(predefined_ompx_gnu_alloc_mapping): Add ompx_gnu_managed_mem_space.
	(omp_init_allocator): Add ompx_gnu_managed_mem_space validation.
	* config/gcn/allocator.c (gcn_memspace_alloc): Add check for
	non-standard memspaces.
	(gcn_memspace_calloc): Likewise.
	(gcn_memspace_realloc): Likewise.
	(gcn_memspace_validate): Update to validate standard vs non-standard
	memspaces.
	* config/linux/allocator.c (linux_memspace_alloc): Add managed
	memory space handling.
	(linux_memspace_calloc): Likewise.
	(linux_memspace_free): Likewise.
	(linux_memspace_realloc): Likewise (returns NULL for fallback).
	* config/nvptx/allocator.c (nvptx_memspace_alloc): Add check for
	non-standard memspaces.
	(nvptx_memspace_calloc): Likewise.
	(nvptx_memspace_realloc): Likewise.
	(nvptx_memspace_validate): Update to validate standard vs non-standard
	memspaces.
	* env.c (parse_allocator): Add ompx_gnu_managed_mem_alloc,
	ompx_gnu_managed_mem_space, and some static asserts so I don't forget
	them again.
	* libgomp-plugin.h (GOMP_OFFLOAD_managed_alloc): New declaration.
	(GOMP_OFFLOAD_managed_free): New declaration.
	* libgomp.h (gomp_managed_alloc): New declaration.
	(gomp_managed_free): New declaration.
	(struct gomp_device_descr): Add managed_alloc_func and
	managed_free_func fields.
	* libgomp.texi: Document ompx_gnu_managed_mem_alloc and
	ompx_gnu_managed_mem_space, add C++ template documentation, and
	describe NVPTX and AMD support.
	* omp.h.in: Add ompx_gnu_managed_mem_space and
	ompx_gnu_managed_mem_alloc enumerators, and gnu_managed_mem C++
	allocator template.
	* omp_lib.f90.in: Add Fortran bindings for new allocator and
	memory space.
	* omp_lib.h.in: Likewise.
	* plugin/cuda-lib.def: Add cuMemAllocManaged.
	* plugin/plugin-nvptx.c (nvptx_alloc): Add managed parameter to
	support cuMemAllocManaged.
	(GOMP_OFFLOAD_alloc): Move contents to ...
	(cleanup_and_alloc): ... this new function, and add managed support.
	(GOMP_OFFLOAD_managed_alloc): New function.
	(GOMP_OFFLOAD_managed_free): New function.
	* target.c (gomp_managed_alloc): New function.
	(gomp_managed_free): New function.
	(gomp_load_plugin_for_device): Load optional managed_alloc
	and managed_free plugin APIs.
	* testsuite/lib/libgomp.exp: Add check_effective_target_omp_managedmem.
	* testsuite/libgomp.c++/alloc-managed-1.C: New test.
	* testsuite/libgomp.c/alloc-managed-1.c: New test.
	* testsuite/libgomp.c/alloc-managed-2.c: New test.
	* testsuite/libgomp.c/alloc-managed-3.c: New test.
	* testsuite/libgomp.c/alloc-managed-4.c: New test.
	* testsuite/libgomp.fortran/alloc-managed-1.f90: New test.

Co-authored-by: Kwok Cheung Yeung &lt;kcyeung@baylibre.com&gt;
Co-authored-by: Thomas Schwinge &lt;tschwinge@baylibre.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds support for using Cuda Managed Memory with omp_alloc.  AMD support
will be added in a future patch.

There is one new predefined allocator, "ompx_gnu_managed_mem_alloc", plus a
corresponding memory space, which can be used to allocate memory in the
"managed" space.

The nvptx plugin is modified to make the necessary Cuda calls, via two new
(optional) plugin interfaces.

gcc/fortran/ChangeLog:

	* openmp.cc (is_predefined_allocator): Use GOMP_OMP_PREDEF_ALLOC_MAX
	and GOMP_OMPX_PREDEF_ALLOC_MIN/MAX instead of hardcoded values in the
	comment.

include/ChangeLog:

	* cuda/cuda.h (cuMemAllocManaged): Add declaration and related
	CU_MEM_ATTACH_GLOBAL flag.
	* gomp-constants.h (GOMP_OMPX_PREDEF_ALLOC_MAX): Update to 201.
	(GOMP_OMP_PREDEF_MEMSPACE_MAX): New constant.
	(GOMP_OMPX_PREDEF_MEMSPACE_MIN): New constant.
	(GOMP_OMPX_PREDEF_MEMSPACE_MAX): New constant.

libgomp/ChangeLog:

	* allocator.c (ompx_gnu_max_predefined_alloc): Update to
	ompx_gnu_managed_mem_alloc.
	(_Static_assert): Fix assertion messages for allocators and add
	new assertions for memspace constants.
	(omp_max_predefined_mem_space): New define.
	(ompx_gnu_min_predefined_mem_space): New define.
	(ompx_gnu_max_predefined_mem_space): New define.
	(MEMSPACE_ALLOC): Add check for non-standard memspaces.
	(MEMSPACE_CALLOC): Likewise.
	(MEMSPACE_REALLOC): Likewise.
	(MEMSPACE_VALIDATE): Likewise.
	(predefined_ompx_gnu_alloc_mapping): Add ompx_gnu_managed_mem_space.
	(omp_init_allocator): Add ompx_gnu_managed_mem_space validation.
	* config/gcn/allocator.c (gcn_memspace_alloc): Add check for
	non-standard memspaces.
	(gcn_memspace_calloc): Likewise.
	(gcn_memspace_realloc): Likewise.
	(gcn_memspace_validate): Update to validate standard vs non-standard
	memspaces.
	* config/linux/allocator.c (linux_memspace_alloc): Add managed
	memory space handling.
	(linux_memspace_calloc): Likewise.
	(linux_memspace_free): Likewise.
	(linux_memspace_realloc): Likewise (returns NULL for fallback).
	* config/nvptx/allocator.c (nvptx_memspace_alloc): Add check for
	non-standard memspaces.
	(nvptx_memspace_calloc): Likewise.
	(nvptx_memspace_realloc): Likewise.
	(nvptx_memspace_validate): Update to validate standard vs non-standard
	memspaces.
	* env.c (parse_allocator): Add ompx_gnu_managed_mem_alloc,
	ompx_gnu_managed_mem_space, and some static asserts so I don't forget
	them again.
	* libgomp-plugin.h (GOMP_OFFLOAD_managed_alloc): New declaration.
	(GOMP_OFFLOAD_managed_free): New declaration.
	* libgomp.h (gomp_managed_alloc): New declaration.
	(gomp_managed_free): New declaration.
	(struct gomp_device_descr): Add managed_alloc_func and
	managed_free_func fields.
	* libgomp.texi: Document ompx_gnu_managed_mem_alloc and
	ompx_gnu_managed_mem_space, add C++ template documentation, and
	describe NVPTX and AMD support.
	* omp.h.in: Add ompx_gnu_managed_mem_space and
	ompx_gnu_managed_mem_alloc enumerators, and gnu_managed_mem C++
	allocator template.
	* omp_lib.f90.in: Add Fortran bindings for new allocator and
	memory space.
	* omp_lib.h.in: Likewise.
	* plugin/cuda-lib.def: Add cuMemAllocManaged.
	* plugin/plugin-nvptx.c (nvptx_alloc): Add managed parameter to
	support cuMemAllocManaged.
	(GOMP_OFFLOAD_alloc): Move contents to ...
	(cleanup_and_alloc): ... this new function, and add managed support.
	(GOMP_OFFLOAD_managed_alloc): New function.
	(GOMP_OFFLOAD_managed_free): New function.
	* target.c (gomp_managed_alloc): New function.
	(gomp_managed_free): New function.
	(gomp_load_plugin_for_device): Load optional managed_alloc
	and managed_free plugin APIs.
	* testsuite/lib/libgomp.exp: Add check_effective_target_omp_managedmem.
	* testsuite/libgomp.c++/alloc-managed-1.C: New test.
	* testsuite/libgomp.c/alloc-managed-1.c: New test.
	* testsuite/libgomp.c/alloc-managed-2.c: New test.
	* testsuite/libgomp.c/alloc-managed-3.c: New test.
	* testsuite/libgomp.c/alloc-managed-4.c: New test.
	* testsuite/libgomp.fortran/alloc-managed-1.f90: New test.

Co-authored-by: Kwok Cheung Yeung &lt;kcyeung@baylibre.com&gt;
Co-authored-by: Thomas Schwinge &lt;tschwinge@baylibre.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libgomp: fine-grained pinned memory allocator</title>
<updated>2025-10-23T11:08:07+00:00</updated>
<author>
<name>Andrew Stubbs</name>
<email>ams@baylibre.com</email>
</author>
<published>2025-10-20T14:57:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=9e5a9aa49051c3d60f6d0071c0711b2202902479'/>
<id>9e5a9aa49051c3d60f6d0071c0711b2202902479</id>
<content type='text'>
This patch introduces a new custom memory allocator for use with pinned
memory (in the case where the Cuda allocator isn't available).  In future,
this allocator will also be used for Managed Memory.  Both memories are
incompatible with the system malloc because allocated memory cannot share a
page with memory allocated for other purposes.

This means that small allocations will no longer consume an entire page of
pinned memory.  Unfortunately, it also means that pinned memory pages will
never be unmapped (although they may be reused).  This isn't a technical
limitation; the "free" algorithm could be extended in future, if needed.

The implementation is not perfect; there are various corner cases (especially
related to extending onto new pages) where allocations and reallocations may
be sub-optimal, but it should still be a step forward in support for small
allocations.

I have considered using libmemkind's "fixed" memory but rejected it for three
reasons: 1) libmemkind may not always be present at runtime, 2) there's no
currently documented means to extend a "fixed" kind one page at a time
(although the code appears to have an undocumented function that may do the
job, and/or extending libmemkind to support the MAP_LOCKED mmap flag with its
regular kinds would be straightforward), 3) Managed Memory benefits from
having the metadata located in different memory and using an external
implementation makes it hard to guarantee this.

libgomp/ChangeLog:

	* Makefile.am (libgomp_la_SOURCES): Add simple-allocator.c.
	* Makefile.in: Regenerate.
	* basic-allocator.c: Mention simple-allocator in the comment.
	* config/linux/allocator.c: Include unistd.h.
	(pin_ctx): New variable.
	(ctxlock): New variable.
	(linux_init_pin_ctx): New function.
	(linux_memspace_alloc): Use simple-allocator for pinned memory.
	(linux_memspace_free): Likewise.
	(linux_memspace_realloc): Likewise.
	* libgomp.h (gomp_simple_alloc_init_context): New prototype.
	(gomp_simple_alloc_register_memory): New prototype.
	(gomp_simple_alloc): New prototype.
	(gomp_simple_free): New prototype.
	(gomp_simple_realloc): New prototype.
	* libgomp.texi: Update pinned memory trait documentation.
	* testsuite/libgomp.c/alloc-pinned-8.c: New test.
	* simple-allocator.c: New file.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch introduces a new custom memory allocator for use with pinned
memory (in the case where the Cuda allocator isn't available).  In future,
this allocator will also be used for Managed Memory.  Both memories are
incompatible with the system malloc because allocated memory cannot share a
page with memory allocated for other purposes.

This means that small allocations will no longer consume an entire page of
pinned memory.  Unfortunately, it also means that pinned memory pages will
never be unmapped (although they may be reused).  This isn't a technical
limitation; the "free" algorithm could be extended in future, if needed.

The implementation is not perfect; there are various corner cases (especially
related to extending onto new pages) where allocations and reallocations may
be sub-optimal, but it should still be a step forward in support for small
allocations.

I have considered using libmemkind's "fixed" memory but rejected it for three
reasons: 1) libmemkind may not always be present at runtime, 2) there's no
currently documented means to extend a "fixed" kind one page at a time
(although the code appears to have an undocumented function that may do the
job, and/or extending libmemkind to support the MAP_LOCKED mmap flag with its
regular kinds would be straightforward), 3) Managed Memory benefits from
having the metadata located in different memory and using an external
implementation makes it hard to guarantee this.

libgomp/ChangeLog:

	* Makefile.am (libgomp_la_SOURCES): Add simple-allocator.c.
	* Makefile.in: Regenerate.
	* basic-allocator.c: Mention simple-allocator in the comment.
	* config/linux/allocator.c: Include unistd.h.
	(pin_ctx): New variable.
	(ctxlock): New variable.
	(linux_init_pin_ctx): New function.
	(linux_memspace_alloc): Use simple-allocator for pinned memory.
	(linux_memspace_free): Likewise.
	(linux_memspace_realloc): Likewise.
	* libgomp.h (gomp_simple_alloc_init_context): New prototype.
	(gomp_simple_alloc_register_memory): New prototype.
	(gomp_simple_alloc): New prototype.
	(gomp_simple_free): New prototype.
	(gomp_simple_realloc): New prototype.
	* libgomp.texi: Update pinned memory trait documentation.
	* testsuite/libgomp.c/alloc-pinned-8.c: New test.
	* simple-allocator.c: New file.
</pre>
</div>
</content>
</entry>
<entry>
<title>libgomp, nvptx: Cuda pinned memory</title>
<updated>2025-10-23T11:08:06+00:00</updated>
<author>
<name>Andrew Stubbs</name>
<email>ams@baylibre.com</email>
</author>
<published>2025-10-14T11:22:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=3b8d9d579c2931f1d8d2c89ff67735bc77df55ad'/>
<id>3b8d9d579c2931f1d8d2c89ff67735bc77df55ad</id>
<content type='text'>
Use Cuda to pin memory, instead of Linux mlock, when available.

There are two advantages: firstly, this gives a significant speed boost for
NVPTX offloading, and secondly, it side-steps the usual OS ulimit/rlimit
setting.

The design adds a device independent plugin API for allocating pinned memory,
and then implements it for NVPTX.  At present, the other supported devices do
not have equivalent capabilities (or requirements).

libgomp/ChangeLog:

	* config/linux/allocator.c: Include assert.h.
	(using_device_for_page_locked): New variable.
	(linux_memspace_alloc): Add init0 parameter. Support device pinning.
	(linux_memspace_calloc): Set init0 to true.
	(linux_memspace_free): Support device pinning.
	(linux_memspace_realloc): Support device pinning.
	(MEMSPACE_ALLOC): Set init0 to false.
	* libgomp-plugin.h
	(GOMP_OFFLOAD_page_locked_host_alloc): New prototype.
	(GOMP_OFFLOAD_page_locked_host_free): Likewise.
	* libgomp.h (gomp_page_locked_host_alloc): Likewise.
	(gomp_page_locked_host_free): Likewise.
	(struct gomp_device_descr): Add page_locked_host_alloc_func and
	page_locked_host_free_func.
	* libgomp.texi: Adjust the docs for the pinned trait.
	* plugin/plugin-nvptx.c
	(GOMP_OFFLOAD_page_locked_host_alloc): New function.
	(GOMP_OFFLOAD_page_locked_host_free): Likewise.
	* target.c (device_for_page_locked): New variable.
	(get_device_for_page_locked): New function.
	(gomp_page_locked_host_alloc): Likewise.
	(gomp_page_locked_host_free): Likewise.
	(gomp_load_plugin_for_device): Add page_locked_host_alloc and
	page_locked_host_free.
	* testsuite/libgomp.c/alloc-pinned-1.c: Change expectations for NVPTX
	devices.
	* testsuite/libgomp.c/alloc-pinned-2.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-3.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-4.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-5.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-6.c: Likewise.

Co-Authored-By: Thomas Schwinge &lt;thomas@codesourcery.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use Cuda to pin memory, instead of Linux mlock, when available.

There are two advantages: firstly, this gives a significant speed boost for
NVPTX offloading, and secondly, it side-steps the usual OS ulimit/rlimit
setting.

The design adds a device independent plugin API for allocating pinned memory,
and then implements it for NVPTX.  At present, the other supported devices do
not have equivalent capabilities (or requirements).

libgomp/ChangeLog:

	* config/linux/allocator.c: Include assert.h.
	(using_device_for_page_locked): New variable.
	(linux_memspace_alloc): Add init0 parameter. Support device pinning.
	(linux_memspace_calloc): Set init0 to true.
	(linux_memspace_free): Support device pinning.
	(linux_memspace_realloc): Support device pinning.
	(MEMSPACE_ALLOC): Set init0 to false.
	* libgomp-plugin.h
	(GOMP_OFFLOAD_page_locked_host_alloc): New prototype.
	(GOMP_OFFLOAD_page_locked_host_free): Likewise.
	* libgomp.h (gomp_page_locked_host_alloc): Likewise.
	(gomp_page_locked_host_free): Likewise.
	(struct gomp_device_descr): Add page_locked_host_alloc_func and
	page_locked_host_free_func.
	* libgomp.texi: Adjust the docs for the pinned trait.
	* plugin/plugin-nvptx.c
	(GOMP_OFFLOAD_page_locked_host_alloc): New function.
	(GOMP_OFFLOAD_page_locked_host_free): Likewise.
	* target.c (device_for_page_locked): New variable.
	(get_device_for_page_locked): New function.
	(gomp_page_locked_host_alloc): Likewise.
	(gomp_page_locked_host_free): Likewise.
	(gomp_load_plugin_for_device): Add page_locked_host_alloc and
	page_locked_host_free.
	* testsuite/libgomp.c/alloc-pinned-1.c: Change expectations for NVPTX
	devices.
	* testsuite/libgomp.c/alloc-pinned-2.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-3.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-4.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-5.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-6.c: Likewise.

Co-Authored-By: Thomas Schwinge &lt;thomas@codesourcery.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libgomp: Add OpenMP's omp_target_memset/omp_target_memset_async</title>
<updated>2025-06-02T15:43:57+00:00</updated>
<author>
<name>Tobias Burnus</name>
<email>tburnus@baylibre.com</email>
</author>
<published>2025-06-02T15:43:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=4e47e2f833732c5d9a3c3e69dc753f99b3a56737'/>
<id>4e47e2f833732c5d9a3c3e69dc753f99b3a56737</id>
<content type='text'>
	PR libgomp/120444

include/ChangeLog:

	* cuda/cuda.h (cuMemsetD8, cuMemsetD8Async): Declare.

libgomp/ChangeLog:

	* libgomp-plugin.h (GOMP_OFFLOAD_memset): Declare.
	* libgomp.h (struct gomp_device_descr): Add memset_func.
	* libgomp.map (GOMP_6.0.1): Add omp_target_memset{,_async}.
	* libgomp.texi (Device Memory Routines): Document them.
	* omp.h.in (omp_target_memset, omp_target_memset_async): Declare.
	* omp_lib.f90.in (omp_target_memset, omp_target_memset_async):
	Add interfaces.
	* omp_lib.h.in (omp_target_memset, omp_target_memset_async): Likewise.
	* plugin/cuda-lib.def: Add cuMemsetD8.
	* plugin/plugin-gcn.c (struct hsa_runtime_fn_info): Add
	hsa_amd_memory_fill_fn.
	(init_hsa_runtime_functions): DLSYM_OPT_FN load it.
	(GOMP_OFFLOAD_memset): New.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memset): New.
	* target.c (omp_target_memset_int, omp_target_memset,
	omp_target_memset_async_helper, omp_target_memset_async): New.
	(gomp_load_plugin_for_device): Add DLSYM (memset).
	* testsuite/libgomp.c-c++-common/omp_target_memset.c: New test.
	* testsuite/libgomp.c-c++-common/omp_target_memset-2.c: New test.
	* testsuite/libgomp.c-c++-common/omp_target_memset-3.c: New test.
	* testsuite/libgomp.fortran/omp_target_memset.f90: New test.
	* testsuite/libgomp.fortran/omp_target_memset-2.f90: New test.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
	PR libgomp/120444

include/ChangeLog:

	* cuda/cuda.h (cuMemsetD8, cuMemsetD8Async): Declare.

libgomp/ChangeLog:

	* libgomp-plugin.h (GOMP_OFFLOAD_memset): Declare.
	* libgomp.h (struct gomp_device_descr): Add memset_func.
	* libgomp.map (GOMP_6.0.1): Add omp_target_memset{,_async}.
	* libgomp.texi (Device Memory Routines): Document them.
	* omp.h.in (omp_target_memset, omp_target_memset_async): Declare.
	* omp_lib.f90.in (omp_target_memset, omp_target_memset_async):
	Add interfaces.
	* omp_lib.h.in (omp_target_memset, omp_target_memset_async): Likewise.
	* plugin/cuda-lib.def: Add cuMemsetD8.
	* plugin/plugin-gcn.c (struct hsa_runtime_fn_info): Add
	hsa_amd_memory_fill_fn.
	(init_hsa_runtime_functions): DLSYM_OPT_FN load it.
	(GOMP_OFFLOAD_memset): New.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memset): New.
	* target.c (omp_target_memset_int, omp_target_memset,
	omp_target_memset_async_helper, omp_target_memset_async): New.
	(gomp_load_plugin_for_device): Add DLSYM (memset).
	* testsuite/libgomp.c-c++-common/omp_target_memset.c: New test.
	* testsuite/libgomp.c-c++-common/omp_target_memset-2.c: New test.
	* testsuite/libgomp.c-c++-common/omp_target_memset-3.c: New test.
	* testsuite/libgomp.fortran/omp_target_memset.f90: New test.
	* testsuite/libgomp.fortran/omp_target_memset-2.f90: New test.
</pre>
</div>
</content>
</entry>
<entry>
<title>libgomp: Add OpenACC's acc_memcpy_device{,_async} routines [PR93226]</title>
<updated>2025-05-29T20:47:06+00:00</updated>
<author>
<name>Tobias Burnus</name>
<email>tburnus@baylibre.com</email>
</author>
<published>2025-05-29T20:47:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=f4aa6b5a8d63050f5d61fcec222ed87be4c0a266'/>
<id>f4aa6b5a8d63050f5d61fcec222ed87be4c0a266</id>
<content type='text'>
libgomp/ChangeLog:

	PR libgomp/93226
	* libgomp-plugin.h (GOMP_OFFLOAD_openacc_async_dev2dev): New
	prototype.
	* libgomp.h (struct acc_dispatch_t): Add dev2dev_func.
	(gomp_copy_dev2dev): New prototype.
	* libgomp.map (OACC_2.6.1): New; add acc_memcpy_device{,_async}.
	* libgomp.texi (acc_memcpy_device): New.
	* oacc-mem.c (memcpy_tofrom_device): Change to take from/to
	device boolean; use memcpy not memmove; add early return if
	size == 0 or same device + same ptr.
	(acc_memcpy_to_device, acc_memcpy_to_device_async,
	acc_memcpy_from_device, acc_memcpy_from_device_async): Update.
	(acc_memcpy_device, acc_memcpy_device_async): New.
	* openacc.f90 (acc_memcpy_device, acc_memcpy_device_async):
	Add interface.
	* openacc_lib.h (acc_memcpy_device, acc_memcpy_device_async):
	Likewise.
	* openacc.h (acc_memcpy_device, acc_memcpy_device_async): Add
	prototype.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_openacc_async_host2dev):
	Update comment.
	(GOMP_OFFLOAD_openacc_async_dev2host): Update call.
	(GOMP_OFFLOAD_openacc_async_dev2dev): New.
	* plugin/plugin-nvptx.c (cuda_memcpy_dev_sanity_check): New.
	(GOMP_OFFLOAD_dev2dev): Call it.
	(GOMP_OFFLOAD_openacc_async_dev2dev): New.
	* target.c (gomp_copy_dev2dev): New.
	(gomp_load_plugin_for_device): Load dev2dev and async_dev2dev.
	* testsuite/libgomp.oacc-c-c++-common/acc_memcpy_device-1.c: New test.
	* testsuite/libgomp.oacc-fortran/acc_memcpy_device-1.f90: New test.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
libgomp/ChangeLog:

	PR libgomp/93226
	* libgomp-plugin.h (GOMP_OFFLOAD_openacc_async_dev2dev): New
	prototype.
	* libgomp.h (struct acc_dispatch_t): Add dev2dev_func.
	(gomp_copy_dev2dev): New prototype.
	* libgomp.map (OACC_2.6.1): New; add acc_memcpy_device{,_async}.
	* libgomp.texi (acc_memcpy_device): New.
	* oacc-mem.c (memcpy_tofrom_device): Change to take from/to
	device boolean; use memcpy not memmove; add early return if
	size == 0 or same device + same ptr.
	(acc_memcpy_to_device, acc_memcpy_to_device_async,
	acc_memcpy_from_device, acc_memcpy_from_device_async): Update.
	(acc_memcpy_device, acc_memcpy_device_async): New.
	* openacc.f90 (acc_memcpy_device, acc_memcpy_device_async):
	Add interface.
	* openacc_lib.h (acc_memcpy_device, acc_memcpy_device_async):
	Likewise.
	* openacc.h (acc_memcpy_device, acc_memcpy_device_async): Add
	prototype.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_openacc_async_host2dev):
	Update comment.
	(GOMP_OFFLOAD_openacc_async_dev2host): Update call.
	(GOMP_OFFLOAD_openacc_async_dev2dev): New.
	* plugin/plugin-nvptx.c (cuda_memcpy_dev_sanity_check): New.
	(GOMP_OFFLOAD_dev2dev): Call it.
	(GOMP_OFFLOAD_openacc_async_dev2dev): New.
	* target.c (gomp_copy_dev2dev): New.
	(gomp_load_plugin_for_device): Load dev2dev and async_dev2dev.
	* testsuite/libgomp.oacc-c-c++-common/acc_memcpy_device-1.c: New test.
	* testsuite/libgomp.oacc-fortran/acc_memcpy_device-1.f90: New test.
</pre>
</div>
</content>
</entry>
<entry>
<title>OpenMP: Fix mapping of zero-sized arrays with non-literal size: map(var[:n]), n = 0</title>
<updated>2025-05-14T18:06:49+00:00</updated>
<author>
<name>Tobias Burnus</name>
<email>tburnus@baylibre.com</email>
</author>
<published>2025-05-14T18:06:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=814e29e390b1e9253f9a38e0d84f5ebe5de0c13e'/>
<id>814e29e390b1e9253f9a38e0d84f5ebe5de0c13e</id>
<content type='text'>
For map(ptr[:0]), the used map kind is GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION
and it is permitted that 'ptr' does not exist. 'ptr' is set to the device
pointee if it exists or to the host value otherwise.

For map(ptr[:3]), the variable is first mapped and then ptr is updated to point
to the just-mapped device data; the attachment uses GOMP_MAP_ATTACH.

For map(ptr[:n]), GCC always generates a GOMP_MAP_ATTACH, but when n == 0, it
was failing with:
   "pointer target not mapped for attach"

The solution is not to fail but first to check whether it was mapped before.
It turned out that for the mapping part, GCC adds a run-time check whether
n == 0 - and uses GOMP_MAP_ZERO_LEN_ARRAY_SECTION for the mapping.
Thus, we just have to check whether there is such a mapping for the address
for which the GOMP_MAP_ATTACH was requested. If there is, the
error diagnostic can be skipped.

Unsurprisingly, this issue occurs in real-world code; it was detected in
code that distributes work via MPI, where some bounds ended up being zero
for some processes.

libgomp/ChangeLog:

	* target.c (gomp_attach_pointer): Return bool; accept additional
	bool to optionally silence the fatal pointee-not-found error.
	(gomp_map_vars_internal): If the pointee could not be found,
	check whether it was mapped as GOMP_MAP_ZERO_LEN_ARRAY_SECTION.
	* libgomp.h (gomp_attach_pointer): Update prototype.
	* oacc-mem.c (acc_attach_async, goacc_enter_data_internal): Update
	calls.
	* testsuite/libgomp.c/target-map-zero-sized.c: New test.
	* testsuite/libgomp.c/target-map-zero-sized-2.c: New test.
	* testsuite/libgomp.c/target-map-zero-sized-3.c: New test.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For map(ptr[:0]), the used map kind is GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION
and it is permitted that 'ptr' does not exist. 'ptr' is set to the device
pointee if it exists or to the host value otherwise.

For map(ptr[:3]), the variable is first mapped and then ptr is updated to point
to the just-mapped device data; the attachment uses GOMP_MAP_ATTACH.

For map(ptr[:n]), GCC always generates a GOMP_MAP_ATTACH, but when n == 0, it
was failing with:
   "pointer target not mapped for attach"

The solution is not to fail but first to check whether it was mapped before.
It turned out that for the mapping part, GCC adds a run-time check whether
n == 0 - and uses GOMP_MAP_ZERO_LEN_ARRAY_SECTION for the mapping.
Thus, we just have to check whether there is such a mapping for the address
for which the GOMP_MAP_ATTACH was requested. If there is, the
error diagnostic can be skipped.

Unsurprisingly, this issue occurs in real-world code; it was detected in
code that distributes work via MPI, where some bounds ended up being zero
for some processes.

libgomp/ChangeLog:

	* target.c (gomp_attach_pointer): Return bool; accept additional
	bool to optionally silence the fatal pointee-not-found error.
	(gomp_map_vars_internal): If the pointee could not be found,
	check whether it was mapped as GOMP_MAP_ZERO_LEN_ARRAY_SECTION.
	* libgomp.h (gomp_attach_pointer): Update prototype.
	* oacc-mem.c (acc_attach_async, goacc_enter_data_internal): Update
	calls.
	* testsuite/libgomp.c/target-map-zero-sized.c: New test.
	* testsuite/libgomp.c/target-map-zero-sized-2.c: New test.
	* testsuite/libgomp.c/target-map-zero-sized-3.c: New test.
</pre>
</div>
</content>
</entry>
<entry>
<title>OpenMP: 'interop' construct - add ME support + target-independent libgomp</title>
<updated>2025-03-21T18:24:16+00:00</updated>
<author>
<name>Paul-Antoine Arras</name>
<email>parras@baylibre.com</email>
</author>
<published>2025-03-13T16:16:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=99e2906ae255fc7b8edb008d7cd47b28b078a809'/>
<id>99e2906ae255fc7b8edb008d7cd47b28b078a809</id>
<content type='text'>
This patch partially enables use of the OpenMP interop construct by adding
middle end support, mostly in the omplower pass, and in the target-independent
part of the libgomp runtime. It follows up on previous patches for C, C++ and
Fortran front-end support. The full interop feature requires another patch to
enable foreign runtime support in libgomp plugins.

gcc/ChangeLog:

	* builtin-types.def
	(BT_FN_VOID_INT_INT_PTR_PTR_PTR_INT_PTR_INT_PTR_UINT_PTR): New.
	* gimple-low.cc (lower_stmt): Handle GIMPLE_OMP_INTEROP.
	* gimple-pretty-print.cc (dump_gimple_omp_interop): New function.
	(pp_gimple_stmt_1): Handle GIMPLE_OMP_INTEROP.
	* gimple.cc (gimple_build_omp_interop): New function.
	(gimple_copy): Handle GIMPLE_OMP_INTEROP.
	* gimple.def (GIMPLE_OMP_INTEROP): Define.
	* gimple.h (gimple_build_omp_interop): Declare.
	(gimple_omp_interop_clauses): New function.
	(gimple_omp_interop_clauses_ptr): Likewise.
	(gimple_omp_interop_set_clauses): Likewise.
	(gimple_return_set_retval): Handle GIMPLE_OMP_INTEROP.
	* gimplify.cc (gimplify_scan_omp_clauses): Handle OMP_CLAUSE_INIT,
	OMP_CLAUSE_USE and OMP_CLAUSE_DESTROY.
	(gimplify_omp_interop): New function.
	(gimplify_expr): Replace sorry with call to gimplify_omp_interop.
	* omp-builtins.def (BUILT_IN_GOMP_INTEROP): Define.
	* omp-low.cc (scan_sharing_clauses): Handle OMP_CLAUSE_INIT,
	OMP_CLAUSE_USE and OMP_CLAUSE_DESTROY.
	(scan_omp_1_stmt): Handle GIMPLE_OMP_INTEROP.
	(lower_omp_interop_action_clauses): New function.
	(lower_omp_interop): Likewise.
	(lower_omp_1): Handle GIMPLE_OMP_INTEROP.

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_omp_clause_destroy): Make addressable.
	(c_parser_omp_clause_init): Make addressable.

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_omp_clause_init): Make addressable.

gcc/fortran/ChangeLog:

	* trans-openmp.cc (gfc_trans_omp_clauses): Make OMP_CLAUSE_DESTROY and
	OMP_CLAUSE_INIT addressable.
	* types.def (BT_FN_VOID_INT_INT_PTR_PTR_PTR_INT_PTR_INT_PTR_UINT_PTR):
	New.

include/ChangeLog:

	* gomp-constants.h (GOMP_DEVICE_DEFAULT_OMP_61, GOMP_INTEROP_TARGET,
	GOMP_INTEROP_TARGETSYNC, GOMP_INTEROP_FLAG_NOWAIT): Define.

libgomp/ChangeLog:

	* icv-device.c (omp_set_default_device): Check
	GOMP_DEVICE_DEFAULT_OMP_61.
	* libgomp-plugin.h (struct interop_obj_t): New.
	(enum gomp_interop_flag): New.
	(GOMP_OFFLOAD_interop): Declare.
	(GOMP_OFFLOAD_get_interop_int): Declare.
	(GOMP_OFFLOAD_get_interop_ptr): Declare.
	(GOMP_OFFLOAD_get_interop_str): Declare.
	(GOMP_OFFLOAD_get_interop_type_desc): Declare.
	* libgomp.h (_LIBGOMP_OMP_LOCK_DEFINED): Define.
	(struct gomp_device_descr): Add interop_func, get_interop_int_func,
	get_interop_ptr_func, get_interop_str_func, get_interop_type_desc_func.
	* libgomp.map: Add GOMP_interop.
	* libgomp_g.h (GOMP_interop): Declare.
	* target.c (resolve_device): Handle GOMP_DEVICE_DEFAULT_OMP_61.
	(omp_get_interop_int): Replace stub with actual implementation.
	(omp_get_interop_ptr): Likewise.
	(omp_get_interop_str): Likewise.
	(omp_get_interop_type_desc): Likewise.
	(struct interop_data_t): Define.
	(gomp_interop_internal): New function.
	(GOMP_interop): Likewise.
	(gomp_load_plugin_for_device): Load symbols for get_interop_int,
	get_interop_ptr, get_interop_str and get_interop_type_desc.
	* testsuite/libgomp.c-c++-common/interop-1.c: New test.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/interop-1.c: Remove dg-prune-output "sorry".
	* c-c++-common/gomp/interop-2.c: Likewise.
	* c-c++-common/gomp/interop-3.c: Likewise.
	* c-c++-common/gomp/interop-4.c: Remove dg-message "not supported".
	* g++.dg/gomp/interop-5.C: Likewise.
	* gfortran.dg/gomp/interop-4.f90: Likewise.
	* c-c++-common/gomp/interop-5.c: New test.
	* gfortran.dg/gomp/interop-5.f90: New test.

Co-authored-by: Tobias Burnus &lt;tburnus@baylibre.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch partially enables use of the OpenMP interop construct by adding
middle end support, mostly in the omplower pass, and in the target-independent
part of the libgomp runtime. It follows up on previous patches for C, C++ and
Fortran front ends support. The full interop feature requires another patch to
enable foreign runtime support in libgomp plugins.

gcc/ChangeLog:

	* builtin-types.def
	(BT_FN_VOID_INT_INT_PTR_PTR_PTR_INT_PTR_INT_PTR_UINT_PTR): New.
	* gimple-low.cc (lower_stmt): Handle GIMPLE_OMP_INTEROP.
	* gimple-pretty-print.cc (dump_gimple_omp_interop): New function.
	(pp_gimple_stmt_1): Handle GIMPLE_OMP_INTEROP.
	* gimple.cc (gimple_build_omp_interop): New function.
	(gimple_copy): Handle GIMPLE_OMP_INTEROP.
	* gimple.def (GIMPLE_OMP_INTEROP): Define.
	* gimple.h (gimple_build_omp_interop): Declare.
	(gimple_omp_interop_clauses): New function.
	(gimple_omp_interop_clauses_ptr): Likewise.
	(gimple_omp_interop_set_clauses): Likewise.
	(gimple_return_set_retval): Handle GIMPLE_OMP_INTEROP.
	* gimplify.cc (gimplify_scan_omp_clauses): Handle OMP_CLAUSE_INIT,
	OMP_CLAUSE_USE and OMP_CLAUSE_DESTROY.
	(gimplify_omp_interop): New function.
	(gimplify_expr): Replace sorry with call to gimplify_omp_interop.
	* omp-builtins.def (BUILT_IN_GOMP_INTEROP): Define.
	* omp-low.cc (scan_sharing_clauses): Handle OMP_CLAUSE_INIT,
	OMP_CLAUSE_USE and OMP_CLAUSE_DESTROY.
	(scan_omp_1_stmt): Handle GIMPLE_OMP_INTEROP.
	(lower_omp_interop_action_clauses): New function.
	(lower_omp_interop): Likewise.
	(lower_omp_1): Handle GIMPLE_OMP_INTEROP.

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_omp_clause_destroy): Make addressable.
	(c_parser_omp_clause_init): Make addressable.

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_omp_clause_init): Make addressable.

gcc/fortran/ChangeLog:

	* trans-openmp.cc (gfc_trans_omp_clauses): Make OMP_CLAUSE_DESTROY and
	OMP_CLAUSE_INIT addressable.
	* types.def (BT_FN_VOID_INT_INT_PTR_PTR_PTR_INT_PTR_INT_PTR_UINT_PTR):
	New.

include/ChangeLog:

	* gomp-constants.h (GOMP_DEVICE_DEFAULT_OMP_61, GOMP_INTEROP_TARGET,
	GOMP_INTEROP_TARGETSYNC, GOMP_INTEROP_FLAG_NOWAIT): Define.

libgomp/ChangeLog:

	* icv-device.c (omp_set_default_device): Check
	GOMP_DEVICE_DEFAULT_OMP_61.
	* libgomp-plugin.h (struct interop_obj_t): New.
	(enum gomp_interop_flag): New.
	(GOMP_OFFLOAD_interop): Declare.
	(GOMP_OFFLOAD_get_interop_int): Declare.
	(GOMP_OFFLOAD_get_interop_ptr): Declare.
	(GOMP_OFFLOAD_get_interop_str): Declare.
	(GOMP_OFFLOAD_get_interop_type_desc): Declare.
	* libgomp.h (_LIBGOMP_OMP_LOCK_DEFINED): Define.
	(struct gomp_device_descr): Add interop_func, get_interop_int_func,
	get_interop_ptr_func, get_interop_str_func, get_interop_type_desc_func.
	* libgomp.map: Add GOMP_interop.
	* libgomp_g.h (GOMP_interop): Declare.
	* target.c (resolve_device): Handle GOMP_DEVICE_DEFAULT_OMP_61.
	(omp_get_interop_int): Replace stub with actual implementation.
	(omp_get_interop_ptr): Likewise.
	(omp_get_interop_str): Likewise.
	(omp_get_interop_type_desc): Likewise.
	(struct interop_data_t): Define.
	(gomp_interop_internal): New function.
	(GOMP_interop): Likewise.
	(gomp_load_plugin_for_device): Load symbols for get_interop_int,
	get_interop_ptr, get_interop_str and get_interop_type_desc.
	* testsuite/libgomp.c-c++-common/interop-1.c: New test.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/interop-1.c: Remove dg-prune-output "sorry".
	* c-c++-common/gomp/interop-2.c: Likewise.
	* c-c++-common/gomp/interop-3.c: Likewise.
	* c-c++-common/gomp/interop-4.c: Remove dg-message "not supported".
	* g++.dg/gomp/interop-5.C: Likewise.
	* gfortran.dg/gomp/interop-4.f90: Likewise.
	* c-c++-common/gomp/interop-5.c: New test.
	* gfortran.dg/gomp/interop-5.f90: New test.

Co-authored-by: Tobias Burnus &lt;tburnus@baylibre.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Update copyright years.</title>
<updated>2025-01-02T10:59:57+00:00</updated>
<author>
<name>Jakub Jelinek</name>
<email>jakub@redhat.com</email>
</author>
<published>2025-01-02T10:59:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=6441eb6dc020faae0672ea724dfdb38c6a9bf6a1'/>
<id>6441eb6dc020faae0672ea724dfdb38c6a9bf6a1</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>OpenMP: Add get_device_from_uid/omp_get_uid_from_device routines</title>
<updated>2024-09-20T07:25:33+00:00</updated>
<author>
<name>Tobias Burnus</name>
<email>tburnus@baylibre.com</email>
</author>
<published>2024-09-20T07:25:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=bf4a5efa80ef8438deb0a99c9a02b1f550aaf814'/>
<id>bf4a5efa80ef8438deb0a99c9a02b1f550aaf814</id>
<content type='text'>
These TR13/OpenMP 6.0 routines permit reproducible offloading to
a specific device by mapping an OpenMP device number to a
unique ID (UID). The GPU device UIDs should be universally unique;
the one for the host is not.

gcc/ChangeLog:

	* omp-general.cc (omp_runtime_api_procname): Add
	get_device_from_uid and omp_get_uid_from_device routines.

include/ChangeLog:

	* cuda/cuda.h (cuDeviceGetUuid): Declare.
	(cuDeviceGetUuid_v2): Add prototype.

libgomp/ChangeLog:

	* config/gcn/target.c (omp_get_uid_from_device,
	omp_get_device_from_uid): Add stub implementation.
	* config/nvptx/target.c (omp_get_uid_from_device,
	omp_get_device_from_uid): Likewise.
	* fortran.c (omp_get_uid_from_device_,
	omp_get_uid_from_device_8_): New functions.
	* libgomp-plugin.h (GOMP_OFFLOAD_get_uid): Add prototype.
	* libgomp.h (struct gomp_device_descr): Add 'uid' and 'get_uid_func'.
	* libgomp.map (GOMP_6.0): New, including the new UID routines.
	* libgomp.texi (OpenMP Technical Report 13): Mark UID routines as 'Y'.
	(Device Information Routines): Document new UID routines.
	(Offload-Target Specifics): Document UID format.
	* omp.h.in (omp_get_device_from_uid, omp_get_uid_from_device):
	New prototype.
	* omp_lib.f90.in (omp_get_device_from_uid, omp_get_uid_from_device):
	New interface.
	* omp_lib.h.in: Likewise.
	* plugin/cuda-lib.def: Add cuDeviceGetUuid and cuDeviceGetUuid_v2 via
	CUDA_ONE_CALL_MAYBE_NULL.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_uid): New.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_uid): New.
	* target.c (str_omp_initial_device): New static var.
	(STR_OMP_DEV_PREFIX): Define.
	(gomp_get_uid_for_device, omp_get_uid_from_device,
	omp_get_device_from_uid): New.
	(gomp_load_plugin_for_device): DLSYM_OPT the function 'get_uid'.
	(gomp_target_init): Set the device's 'uid' field to NULL.
	* testsuite/libgomp.c/device_uid.c: New test.
	* testsuite/libgomp.fortran/device_uid.f90: New test.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
These TR13/OpenMP 6.0 routines permit reproducible offloading to
a specific device by mapping an OpenMP device number to a
unique ID (UID). The GPU device UIDs should be universally unique;
the one for the host is not.

gcc/ChangeLog:

	* omp-general.cc (omp_runtime_api_procname): Add
	get_device_from_uid and omp_get_uid_from_device routines.

include/ChangeLog:

	* cuda/cuda.h (cuDeviceGetUuid): Declare.
	(cuDeviceGetUuid_v2): Add prototype.

libgomp/ChangeLog:

	* config/gcn/target.c (omp_get_uid_from_device,
	omp_get_device_from_uid): Add stub implementation.
	* config/nvptx/target.c (omp_get_uid_from_device,
	omp_get_device_from_uid): Likewise.
	* fortran.c (omp_get_uid_from_device_,
	omp_get_uid_from_device_8_): New functions.
	* libgomp-plugin.h (GOMP_OFFLOAD_get_uid): Add prototype.
	* libgomp.h (struct gomp_device_descr): Add 'uid' and 'get_uid_func'.
	* libgomp.map (GOMP_6.0): New, including the new UID routines.
	* libgomp.texi (OpenMP Technical Report 13): Mark UID routines as 'Y'.
	(Device Information Routines): Document new UID routines.
	(Offload-Target Specifics): Document UID format.
	* omp.h.in (omp_get_device_from_uid, omp_get_uid_from_device):
	New prototype.
	* omp_lib.f90.in (omp_get_device_from_uid, omp_get_uid_from_device):
	New interface.
	* omp_lib.h.in: Likewise.
	* plugin/cuda-lib.def: Add cuDeviceGetUuid and cuDeviceGetUuid_v2 via
	CUDA_ONE_CALL_MAYBE_NULL.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_uid): New.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_uid): New.
	* target.c (str_omp_initial_device): New static var.
	(STR_OMP_DEV_PREFIX): Define.
	(gomp_get_uid_for_device, omp_get_uid_from_device,
	omp_get_device_from_uid): New.
	(gomp_load_plugin_for_device): DLSYM_OPT the function 'get_uid'.
	(gomp_target_init): Set the device's 'uid' field to NULL.
	* testsuite/libgomp.c/device_uid.c: New test.
	* testsuite/libgomp.fortran/device_uid.f90: New test.
</pre>
</div>
</content>
</entry>
<entry>
<title>OpenACC 2.7: Adjust acc_map_data/acc_unmap_data interaction with reference counters</title>
<updated>2024-04-16T09:04:11+00:00</updated>
<author>
<name>Chung-Lin Tang</name>
<email>cltang@baylibre.com</email>
</author>
<published>2024-04-16T09:03:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/gcc.git/commit/?id=a7578a077ed8b64b94282aa55faf7037690abbc5'/>
<id>a7578a077ed8b64b94282aa55faf7037690abbc5</id>
<content type='text'>
This patch adjusts the implementation of acc_map_data/acc_unmap_data API library
routines to better fit the description in the OpenACC 2.7 specification.

Instead of using REFCOUNT_INFINITY, we now define a REFCOUNT_ACC_MAP_DATA
special value to mark acc_map_data-created mappings. Adjustments to
mapping-related code to respect OpenACC semantics are also added.

libgomp/ChangeLog:

	* libgomp.h (REFCOUNT_ACC_MAP_DATA): Define as (REFCOUNT_SPECIAL | 2).
	* oacc-mem.c (acc_map_data): Adjust to use REFCOUNT_ACC_MAP_DATA,
	initialize dynamic_refcount as 1.
	(acc_unmap_data): Adjust to use REFCOUNT_ACC_MAP_DATA.
	(goacc_map_var_existing): Add REFCOUNT_ACC_MAP_DATA case.
	(goacc_exit_datum_1): Add REFCOUNT_ACC_MAP_DATA case, respect
	REFCOUNT_ACC_MAP_DATA when decrementing/finalizing. Force lowest
	dynamic_refcount to be 1 for REFCOUNT_ACC_MAP_DATA.
	(goacc_enter_data_internal): Add REFCOUNT_ACC_MAP_DATA case.
	* target.c (gomp_increment_refcount): Return early for
	REFCOUNT_ACC_MAP_DATA case.
	(gomp_decrement_refcount): Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-96.c: New testcase.
	* testsuite/libgomp.oacc-c-c++-common/unmap-infinity-1.c: Adjust
	testcase error output scan test.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adjusts the implementation of acc_map_data/acc_unmap_data API library
routines to better fit the description in the OpenACC 2.7 specification.

Instead of using REFCOUNT_INFINITY, we now define a REFCOUNT_ACC_MAP_DATA
special value to mark acc_map_data-created mappings. Adjustments to
mapping-related code to respect OpenACC semantics are also added.

libgomp/ChangeLog:

	* libgomp.h (REFCOUNT_ACC_MAP_DATA): Define as (REFCOUNT_SPECIAL | 2).
	* oacc-mem.c (acc_map_data): Adjust to use REFCOUNT_ACC_MAP_DATA,
	initialize dynamic_refcount as 1.
	(acc_unmap_data): Adjust to use REFCOUNT_ACC_MAP_DATA.
	(goacc_map_var_existing): Add REFCOUNT_ACC_MAP_DATA case.
	(goacc_exit_datum_1): Add REFCOUNT_ACC_MAP_DATA case, respect
	REFCOUNT_ACC_MAP_DATA when decrementing/finalizing. Force lowest
	dynamic_refcount to be 1 for REFCOUNT_ACC_MAP_DATA.
	(goacc_enter_data_internal): Add REFCOUNT_ACC_MAP_DATA case.
	* target.c (gomp_increment_refcount): Return early for
	REFCOUNT_ACC_MAP_DATA case.
	(gomp_decrement_refcount): Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-96.c: New testcase.
	* testsuite/libgomp.oacc-c-c++-common/unmap-infinity-1.c: Adjust
	testcase error output scan test.
</pre>
</div>
</content>
</entry>
</feed>
