summaryrefslogtreecommitdiff
path: root/libgomp/testsuite/libgomp.c
AgeCommit message (Collapse)Author
2025-11-21ipa: Remove LTO requirement for builtin callback carriers.Josef Melcr
Due to the if statement in ipa_compute_jump_functions_for_bb, callback edges were never constructed for builtin functions unless LTO was enabled. This patch corrects this behavior, allowing GCC to optimize callbacks more broadly. It also extends our testing capabilities. gcc/ChangeLog: * attr-callback.cc (callback_edge_callee_has_attr): New function. * attr-callback.h (callback_edge_callee_has_attr): New function decl. * ipa-prop.cc (ipa_compute_jump_functions_for_bb): Don't skip callback carriers when calculating jump functions. libgomp/ChangeLog: * testsuite/libgomp.c/ipcp-cb-spec1.c: Remove LTO requirement. * testsuite/libgomp.c/ipcp-cb-spec2.c: Likewise. * testsuite/libgomp.c/ipcp-cb1.c: Likewise. Signed-off-by: Josef Melcr <josef.melcr@suse.com>
2025-11-13openmp, nvptx: ompx_gnu_managed_mem_allocAndrew Stubbs
This adds support for using Cuda Managed Memory with omp_alloc. AMD support will be added in a future patch. There is one new predefined allocator, "ompx_gnu_managed_mem_alloc", plus a corresponding memory space, which can be used to allocate memory in the "managed" space. The nvptx plugin is modified to make the necessary Cuda calls, via two new (optional) plugin interfaces. gcc/fortran/ChangeLog: * openmp.cc (is_predefined_allocator): Use GOMP_OMP_PREDEF_ALLOC_MAX and GOMP_OMPX_PREDEF_ALLOC_MIN/MAX instead of hardcoded values in the comment. include/ChangeLog: * cuda/cuda.h (cuMemAllocManaged): Add declaration and related CU_MEM_ATTACH_GLOBAL flag. * gomp-constants.h (GOMP_OMPX_PREDEF_ALLOC_MAX): Update to 201. (GOMP_OMP_PREDEF_MEMSPACE_MAX): New constant. (GOMP_OMPX_PREDEF_MEMSPACE_MIN): New constant. (GOMP_OMPX_PREDEF_MEMSPACE_MAX): New constant. libgomp/ChangeLog: * allocator.c (ompx_gnu_max_predefined_alloc): Update to ompx_gnu_managed_mem_alloc. (_Static_assert): Fix assertion messages for allocators and add new assertions for memspace constants. (omp_max_predefined_mem_space): New define. (ompx_gnu_min_predefined_mem_space): New define. (ompx_gnu_max_predefined_mem_space): New define. (MEMSPACE_ALLOC): Add check for non-standard memspaces. (MEMSPACE_CALLOC): Likewise. (MEMSPACE_REALLOC): Likewise. (MEMSPACE_VALIDATE): Likewise. (predefined_ompx_gnu_alloc_mapping): Add ompx_gnu_managed_mem_space. (omp_init_allocator): Add ompx_gnu_managed_mem_space validation. * config/gcn/allocator.c (gcn_memspace_alloc): Add check for non-standard memspaces. (gcn_memspace_calloc): Likewise. (gcn_memspace_realloc): Likewise. (gcn_memspace_validate): Update to validate standard vs non-standard memspaces. * config/linux/allocator.c (linux_memspace_alloc): Add managed memory space handling. (linux_memspace_calloc): Likewise. (linux_memspace_free): Likewise. (linux_memspace_realloc): Likewise (returns NULL for fallback). * config/nvptx/allocator.c (nvptx_memspace_alloc): Add check for non-standard memspaces. (nvptx_memspace_calloc): Likewise. (nvptx_memspace_realloc): Likewise. (nvptx_memspace_validate): Update to validate standard vs non-standard memspaces. * env.c (parse_allocator): Add ompx_gnu_managed_mem_alloc, ompx_gnu_managed_mem_space, and some static asserts so I don't forget them again. * libgomp-plugin.h (GOMP_OFFLOAD_managed_alloc): New declaration. (GOMP_OFFLOAD_managed_free): New declaration. * libgomp.h (gomp_managed_alloc): New declaration. (gomp_managed_free): New declaration. (struct gomp_device_descr): Add managed_alloc_func and managed_free_func fields. * libgomp.texi: Document ompx_gnu_managed_mem_alloc and ompx_gnu_managed_mem_space, add C++ template documentation, and describe NVPTX and AMD support. * omp.h.in: Add ompx_gnu_managed_mem_space and ompx_gnu_managed_mem_alloc enumerators, and gnu_managed_mem C++ allocator template. * omp_lib.f90.in: Add Fortran bindings for new allocator and memory space. * omp_lib.h.in: Likewise. * plugin/cuda-lib.def: Add cuMemAllocManaged. * plugin/plugin-nvptx.c (nvptx_alloc): Add managed parameter to support cuMemAllocManaged. (GOMP_OFFLOAD_alloc): Move contents to ... (cleanup_and_alloc): ... this new function, and add managed support. (GOMP_OFFLOAD_managed_alloc): New function. (GOMP_OFFLOAD_managed_free): New function. * target.c (gomp_managed_alloc): New function. (gomp_managed_free): New function. (gomp_load_plugin_for_device): Load optional managed_alloc and managed_free plugin APIs. * testsuite/lib/libgomp.exp: Add check_effective_target_omp_managedmem. * testsuite/libgomp.c++/alloc-managed-1.C: New test. * testsuite/libgomp.c/alloc-managed-1.c: New test. * testsuite/libgomp.c/alloc-managed-2.c: New test. * testsuite/libgomp.c/alloc-managed-3.c: New test. * testsuite/libgomp.c/alloc-managed-4.c: New test. * testsuite/libgomp.fortran/alloc-managed-1.f90: New test. Co-authored-by: Kwok Cheung Yeung <kcyeung@baylibre.com> Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com>
2025-11-12OpenMP: Add omp_default_device named constant [PR119677]Tobias Burnus
OpenMP TR 14 (OpenMP 6.1) adds omp_default_device < -1 as named constant alongside omp_initial_device and omp_default_device. GCC supports it already internally via GOMP_DEVICE_DEFAULT_OMP_61, but this patch now adds the omp_default_device enum/PARAMETER to omp.h / omp_lib. Note that PR119677 requests some cleanups, which still have to be done. PR libgomp/119677 gcc/fortran/ChangeLog: * intrinsic.texi (OpenMP Modules): Add omp_default_device. * openmp.cc (gfc_resolve_omp_context_selector): Accept omp_default_device as conforming device number. libgomp/ChangeLog: * omp.h.in (omp_default_device): New enum value. * omp_lib.f90.in: New parameter. * omp_lib.h.in: Likewise * target.c (gomp_get_default_device): New. Split off from ... (resolve_device): ... here; call it. (omp_target_alloc, omp_target_free, omp_target_is_present, omp_target_memcpy_check, omp_target_memset, omp_target_memset_async, omp_target_associate_ptr, omp_get_mapped_ptr, omp_target_is_accessible, omp_pause_resource, omp_get_uid_from_device): Handle omp_default_device. * testsuite/libgomp.c/device_uid.c: Likewise. * testsuite/libgomp.fortran/device_uid.f90: Likewise. * testsuite/libgomp.c-c++-common/omp-default-device.c: New test. * testsuite/libgomp.fortran/omp-default-device.f90: New test.
2025-11-03Fix 'libgomp.c/pr122281.c' for non-USM offloading execution [PR122281]Thomas Schwinge
... where it currently runs into: libgomp: cuCtxSynchronize error: an illegal memory access was encountered ... for nvptx, or similarly for GCN: Memory access fault by GPU node-1 (Agent handle: 0x34d77290) on address 0x7fff3c553000. Reason: Page not present or supervisor privilege. Fix-up for commit r16-4961-ge2cbcd1b27c0da92bdcd96664064d3d0c1d44e6f "Fix gimple_copy for OpenMP atomic load/store [PR122281, PR105001]". PR libgomp/122281 libgomp/ * testsuite/libgomp.c/pr122281.c: Fix for non-USM offloading execution.
2025-11-03Fix gimple_copy for OpenMP atomic load/store [PR122281, PR105001]Tobias Burnus
PR libgomp/122281 PR middle-end/105001 gcc/ChangeLog: * gimple.cc (gimple_copy): Add missing unshare_expr for GIMPLE_OMP_ATOMIC_LOAD and GIMPLE_OMP_ATOMIC_STORE.
2025-10-23libgomp: fine-grained pinned memory allocatorAndrew Stubbs
This patch introduces a new custom memory allocator for use with pinned memory (in the case where the Cuda allocator isn't available). In future, this allocator will also be used for Managed Memory. Both memories are incompatible with the system malloc because allocated memory cannot share a page with memory allocated for other purposes. This means that small allocations will no longer consume an entire page of pinned memory. Unfortunately, it also means that pinned memory pages will never be unmapped (although they may be reused). This isn't a technical limitation; the "free" algorithm could be extended in future, if needed. The implementation is not perfect; there are various corner cases (especially related to extending onto new pages) where allocations and reallocations may be sub-optimal, but it should still be a step forward in support for small allocations. I have considered using libmemkind's "fixed" memory but rejected it for three reasons: 1) libmemkind may not always be present at runtime, 2) there's no currently documented means to extend a "fixed" kind one page at a time (although the code appears to have an undocumented function that may do the job, and/or extending libmemkind to support the MAP_LOCKED mmap flag with its regular kinds would be straight-forward), 3) Managed Memory benefits from having the metadata located in different memory and using an external implementation makes it hard to guarantee this. libgomp/ChangeLog: * Makefile.am (libgomp_la_SOURCES): Add simple-allocator.c. * Makefile.in: Regenerate. * basic-allocator.c: Mention simple-allocator in the comment. * config/linux/allocator.c: Include unistd.h. (pin_ctx): New variable. (ctxlock): New variable. (linux_init_pin_ctx): New function. (linux_memspace_alloc): Use simple-allocator for pinned memory. (linux_memspace_free): Likewise. (linux_memspace_realloc): Likewise. * libgomp.h (gomp_simple_alloc_init_context): New prototype. (gomp_simple_alloc_register_memory): New prototype. (gomp_simple_alloc): New prototype. (gomp_simple_free): New prototype. (gomp_simple_realloc): New prototype. * libgomp.texi: Update pinned memory trait documentation. * testsuite/libgomp.c/alloc-pinned-8.c: New test. * simple-allocator.c: New file.
2025-10-23libgomp, nvptx: Cuda pinned memoryAndrew Stubbs
Use Cuda to pin memory, instead of Linux mlock, when available. There are two advantages: firstly, this gives a significant speed boost for NVPTX offloading, and secondly, it side-steps the usual OS ulimit/rlimit setting. The design adds a device independent plugin API for allocating pinned memory, and then implements it for NVPTX. At present, the other supported devices do not have equivalent capabilities (or requirements). libgomp/ChangeLog: * config/linux/allocator.c: Include assert.h. (using_device_for_page_locked): New variable. (linux_memspace_alloc): Add init0 parameter. Support device pinning. (linux_memspace_calloc): Set init0 to true. (linux_memspace_free): Support device pinning. (linux_memspace_realloc): Support device pinning. (MEMSPACE_ALLOC): Set init0 to false. * libgomp-plugin.h (GOMP_OFFLOAD_page_locked_host_alloc): New prototype. (GOMP_OFFLOAD_page_locked_host_free): Likewise. * libgomp.h (gomp_page_locked_host_alloc): Likewise. (gomp_page_locked_host_free): Likewise. (struct gomp_device_descr): Add page_locked_host_alloc_func and page_locked_host_free_func. * libgomp.texi: Adjust the docs for the pinned trait. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_page_locked_host_alloc): New function. (GOMP_OFFLOAD_page_locked_host_free): Likewise. * target.c (device_for_page_locked): New variable. (get_device_for_page_locked): New function. (gomp_page_locked_host_alloc): Likewise. (gomp_page_locked_host_free): Likewise. (gomp_load_plugin_for_device): Add page_locked_host_alloc and page_locked_host_free. * testsuite/libgomp.c/alloc-pinned-1.c: Change expectations for NVPTX devices. * testsuite/libgomp.c/alloc-pinned-2.c: Likewise. * testsuite/libgomp.c/alloc-pinned-3.c: Likewise. * testsuite/libgomp.c/alloc-pinned-4.c: Likewise. * testsuite/libgomp.c/alloc-pinned-5.c: Likewise. * testsuite/libgomp.c/alloc-pinned-6.c: Likewise. Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2025-10-21testsuite: Move ipcp-cb* from ipa to libgompJosef Melcr
This patch addresses the incorrectly placed tests, which fail if the testsuite is ran and gcc has not been installed yet, as discussed here: https://gcc.gnu.org/pipermail/gcc-patches/2025-October/698095.html. gcc/testsuite/ChangeLog: * gcc.dg/ipa/ipcp-cb-spec1.c: Moved to libgomp/testsuite/libgomp.c/. * gcc.dg/ipa/ipcp-cb-spec2.c: Likewise. * gcc.dg/ipa/ipcp-cb1.c: Likewise. libgomp/ChangeLog: * testsuite/libgomp.c/ipcp-cb-spec1.c: Moved from gcc/testsuite/gcc.dg/ipa/. * testsuite/libgomp.c/ipcp-cb-spec2.c: Likewise. * testsuite/libgomp.c/ipcp-cb1.c: Likewise. Signed-off-by: Josef Melcr <jmelcr02@gmail.com>
2025-10-16libgomp.c/declare-variant-4-gfx*: Add missing archs + dg-excess-errorsTobias Burnus
Add missing tests for gfx* context selectors; mark all but the default-arch declare-variant-4.c with 'dg-excess-errors' to silence libgomp not-found errors (still passing the scan-offload-tree-dump check) - or at least causing just UNRESOLVED errors if the error is "built without library support ... consider compiling for the associated generic architecture". In case the multilib is configured, the result will be an XPASS. libgomp/ChangeLog: * testsuite/libgomp.c/declare-variant-4-gfx10-3-generic.c: Add dg-excess-errors to handle possible missing libgomp multi lib. * testsuite/libgomp.c/declare-variant-4-gfx1030.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx1036.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx11-generic.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx1100.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx1103.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx9-4-generic.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx9-generic.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx900.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx906.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx908.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx90a.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx90c.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx942.c: Likewise. * testsuite/libgomp.c/declare-variant-4-gfx1031.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1032.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1033.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1034.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1035.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1101.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1102.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1150.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1151.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1152.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1153.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx902.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx904.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx909.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx950.c: New test.
2025-10-15gcn: Add missing GFX9_4_GENERIC, OpenMP context-selector updateTobias Burnus
The definition for gfx942 and gfx950 missed the GFX9_4_GENERIC family flag. For OpenMP context selectors: The t-omp-device file missed the generic selectors. Additionally, there is now a note in the OpenMP documentation that there is a one-to-one match for ISA names, ignoring any compatibility. For instance, for Nvidia GPUs 'isa("sm_70")' is only true when compiling for 'sm_70', even though sm < 7.0 code also runs on sm_70 hardware. And, for AMD GPUs, gfx9-4-generic neither matches 'gfx942' (even though such generic code runs on gfx942) - nor the reverse (although all gfx9-4-generic code runs on gfx942). gcc/ChangeLog: * config/gcn/gcn-devices.def (gfx942, gfx950): Set generic name to GFX9_4_GENERIC. * config/gcn/t-omp-device: Include generic names for OpenMP's ISA trait. libgomp/ChangeLog: * libgomp.texi (OpenMP Context Selectors): Add note that there is currently an exact match between ISA and compilation, ignoring compatibilities in both ways. * testsuite/libgomp.c/declare-variant-4.h: Add missing variant functions for specific and generic AMD GPUs. * testsuite/libgomp.c/declare-variant-4-gfx10-3-generic.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx11-generic.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx9-4-generic.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx9-generic.c: New test.
2025-06-10gcn: Add experimental MI300 (gfx942) supportTobias Burnus
As gfx942 and gfx950 belong to gfx9-4-generic, the latter two are also added. Note that there are no specific optimizations for MI300, yet. For none of the mentioned devices, any multilib is build by default; use '--with-multilib-list=' when configuring GCC to build them alongside. gfx942 was added in LLVM (and its mc assembler, used by GCC) in version 18, generic support in LLVM 19 and gfx950 in LLVM 20. gcc/ChangeLog: * config/gcn/gcn-devices.def: Add gfx942, gfx950 and gfx9-4-generic. * config/gcn/gcn-opts.h (TARGET_CDNA3, TARGET_CDNA3_PLUS, TARGET_GLC_NAME, TARGET_TARGET_SC_CACHE): Define. (TARGET_ARCHITECTED_FLAT_SCRATCH): Use also for CDNA3. * config/gcn/gcn.h (gcn_isa): Add ISA_CDNA3 to the enum. * config/gcn/gcn.cc (print_operand): Update 'g' to use TARGET_GLC_NAME; add 'G' to print TARGET_GLC_NAME unconditionally. * config/gcn/gcn-valu.md (scatter, gather): Use TARGET_GLC_NAME. * config/gcn/gcn.md: Use %G<num> instead of glc; use 'buffer_inv sc1' for TARGET_TARGET_SC_CACHE. * doc/invoke.texi (march): Add gfx942, gfx950 and gfx9-4-generic. * doc/install.texi (amdgcn*-*-*): Add gfx942, gfx950 and gfx9-4-generic. * config/gcn/gcn-tables.opt: Regenerate. libgomp/ChangeLog: * testsuite/libgomp.c/declare-variant-4.h (gfx942): New variant function. * testsuite/libgomp.c/declare-variant-4-gfx942.c: New test.
2025-06-06libgomp.c/target-map-zero-sized-3.c: Fix code for non-USM offload [PR120530]Tobias Burnus
A mapping clause was missing, causing the code to fail with offloading when a host pointer was not device accessible. libgomp/ChangeLog: PR target/120530 * testsuite/libgomp.c/target-map-zero-sized-3.c (main): Add missing map clause; remove unused variable.
2025-05-14OpenMP: Fix mapping of zero-sized arrays with non-literal size: ↵Tobias Burnus
map(var[:n]), n = 0 For map(ptr[:0]), the used map kind is GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION and it is permitted that 'ptr' does not exist. 'ptr' is set to the device pointee if it exists or to the host value otherwise. For map(ptr[:3]), the variable is first mapped and then ptr is updated to point to the just-mapped device data; the attachment uses GOMP_MAP_ATTACH. For map(ptr[:n]), generates always a GOMP_MAP_ATTACH, but when n == 0, it was failing with: "pointer target not mapped for attach" The solution is not to fail but first to check whether it was mapped before. It turned out that for the mapping part, GCC adds a run-time check whether n == 0 - and uses GOMP_MAP_ZERO_LEN_ARRAY_SECTION for the mapping. Thus, we just have to check whether there such a mapping for the address for which the GOMP_MAP_ATTACH. was requested. And, if there was, the error diagnostic can be skipped. Unsurprisingly, this issue occurs in real-world code; it was detected in a code that distributes work via MPI and for some processes, some bounds ended up to be zero. libgomp/ChangeLog: * target.c (gomp_attach_pointer): Return bool; accept additional bool to optionally silence the fatal pointee-not-found error. (gomp_map_vars_internal): If the pointee could not be found, check whether it was mapped as GOMP_MAP_ZERO_LEN_ARRAY_SECTION. * libgomp.h (gomp_attach_pointer): Update prototype. * oacc-mem.c (acc_attach_async, goacc_enter_data_internal): Update calls. * testsuite/libgomp.c/target-map-zero-sized.c: New test. * testsuite/libgomp.c/target-map-zero-sized-2.c: New test. * testsuite/libgomp.c/target-map-zero-sized-3.c: New test.
2025-05-12nvptx: Support '-march=sm_61'Thomas Schwinge
gcc/ * config/nvptx/nvptx-sm.def: Add '61'. * config/nvptx/nvptx-gen.h: Regenerate. * config/nvptx/nvptx-gen.opt: Likewise. * config/nvptx/nvptx.cc (first_ptx_version_supporting_sm): Adjust. * config/nvptx/nvptx.opt (-march-map=sm_61, -march-map=sm_62): Likewise. * config.gcc: Likewise. * doc/invoke.texi (Nvidia PTX Options): Document '-march=sm_61'. * config/nvptx/gen-multilib-matches-tests: Extend. gcc/testsuite/ * gcc.target/nvptx/march-map=sm_61.c: Adjust. * gcc.target/nvptx/march-map=sm_62.c: Likewise. * gcc.target/nvptx/march=sm_61.c: New. libgomp/ * testsuite/libgomp.c/declare-variant-3-sm61.c: New. * testsuite/libgomp.c/declare-variant-3.h: Adjust.
2025-05-09libgomp.{c,fortran}/interop-{hip,cuda}: Fix dg-run target selectionTobias Burnus
While the tests checked whether the CUDA/HIP runtime is available before processing them, the execution was then done unconditionally, leading to FAIL when the default device was the host (or the wrong offload device). Now the test is only executed ('run') when the default device is an Nvidia or AMD GPU (depending on the test case, cf. the test file name). Otherwise, only a 'link' test is done. (Except when the effective-target check cannot find the runtime lib - then the test is skipped [as before].) Note: The cublas/hipblas tests use variant functions and iterate over all devices, such that the cublas or hipblas, respectively, is only called when the active device is an AMD or Nvidia device, respectively, while for the host and other device types the fallback is called. libgomp/ChangeLog: * testsuite/libgomp.c/interop-cuda-full.c: Use 'link' instead of 'run' when the default device is "! offload_device_nvptx". * testsuite/libgomp.c/interop-cuda-libonly.c: Likewise. * testsuite/libgomp.c/interop-hip-nvidia-full.c: Likewise. * testsuite/libgomp.c/interop-hip-nvidia-no-headers.c: Likewise. * testsuite/libgomp.c/interop-hip-nvidia-no-hip-header.c: Likewise. * testsuite/libgomp.fortran/interop-hip-nvidia-full.F90: Likewise. * testsuite/libgomp.fortran/interop-hip-nvidia-no-module.F90: Likewise. * testsuite/libgomp.c/interop-hip-amd-full.c: Use 'link' instead of 'run' when the default device is "! offload_device_gcn". * testsuite/libgomp.c/interop-hip-amd-no-hip-header.c: Likewise. * testsuite/libgomp.fortran/interop-hip-amd-full.F90: Likewise. * testsuite/libgomp.fortran/interop-hip-amd-no-module.F90: Likewise.
2025-05-05'libgomp.c/interop-hsa.c': GCN offloading onlyThomas Schwinge
Fix-up for commit 8d84ea28510054fbbb8a2b7441916bd75e29163f "OpenMP, GCN: Add interop-hsa testcase", which added 'libgomp.c/interop-hsa.c'. If nvptx offloading compilation is enabled in addition to GCN, the former ICEs: during RTL pass: final [...]/libgomp.c/interop-hsa.c: In function 'get_kernel_ptr': [...]/libgomp.c/interop-hsa.c:34:1: internal compiler error: RTL check: expected code 'subreg', have 'reg' in nvptx_print_operand, at config/nvptx/nvptx.cc:3082 0x1ccdb96 internal_error(char const*, ...) [...]/gcc/diagnostic-global-context.cc:517 0x7446c3 rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int, char const*) [...]/gcc/rtl.cc:770 0x7fa533 nvptx_print_operand [...]/gcc/config/nvptx/nvptx.cc:3082 0xb25f34 output_operand(rtx_def*, int) [...]/gcc/final.cc:3641 0xb26f07 output_asm_insn(char const*, rtx_def**) [...]/gcc/final.cc:3534 0xb29d91 output_asm_insn(char const*, rtx_def**) [...]/gcc/final.cc:2639 0xb29d91 final_scan_insn_1 [...]/gcc/final.cc:2642 0xb2a59f final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*) [...]/gcc/final.cc:2892 0xb2a68c final_1 [...]/gcc/final.cc:1983 0xb2b378 rest_of_handle_final [...]/gcc/final.cc:4250 0xb2b378 execute [...]/gcc/final.cc:4328 Regardless of the issue that nvptx offloading compilation probably shouldn't ICE, the 'asm' insert clearly is valid for GCN only. libgomp/ * testsuite/libgomp.c/interop-hsa.c: GCN offloading only.
2025-04-25OpenMP, GCN: Add interop-hsa testcaseAndrew Stubbs
This testcase ensures that the interop HSA support is sufficient to run a kernel manually on the same device. libgomp/ChangeLog: * testsuite/libgomp.c/interop-hsa.c: New test.
2025-04-24libgomp/testsuite: Fix hip_header_nvidia check, add workaround to testTobias Burnus
This is all about using the AMD's HIP header files with __HIP_PLATFORM_NVIDIA__ defined, i.e. HIP with Nvidia/CUDA; in that case, HIP is a thin layer on top of CUDA. First, the check_effective_target_gomp_hip_header_nvidia check failed; to fix it, -Wno-deprecated-declarations was added - and likewise to the two affected testcases that actually used the HIP headers on Nvidia. Doing so, the HIP tested was successful but the HIP-BLAS one showed two issues: * One seems to be related to include search paths as the HIP header uses #include "library_types.h" to include that CUDA header. Seemingly, it tried to included (again) the HIP header hip/library_types.h, not the CUDA one. I guess, some tweaking of -isystem vs. -I could have prevented this, but the simpler workaround was to just explicitly include the CUDA one before the HIP header files. * Once done, everything compiled but linking failed as the association between three HIP-BLAS functions and their CUDA-BLAS ones did not work. Solution: Just add three #define for mapping them. libgomp/ChangeLog: * testsuite/lib/libgomp.exp (check_effective_target_gomp_hip_header_nvidia): Compile with "-Wno-deprecated-declarations". * testsuite/libgomp.c/interop-hip-nvidia-full.c: Likewise. * testsuite/libgomp.c/interop-hipblas-nvidia-full.c: Likewise. * testsuite/libgomp.c/interop-hipblas.h: Add workarounds when using the HIP headers with __HIP_PLATFORM_NVIDIA__.
2025-04-24libgomp: Add additional OpenMP interop runtime testsTobias Burnus
Add checks for nowait/depend and for checks that the returned CUDA, CUDA_DRIVER and HIP interop objects actually work. While the CUDA/CUDA_DRIVER ones are only for Nvidia GPUs, HIP works on both AMD and Nvidia GPUs; on Nvidia GPUs, it is a very thin wrapper around CUDA. For Fortran, only a HIP test has been added - using hipfort. While libgomp.c-c++-common/interop-2.c always works - even without GPU - and checks for depend / nowait, all others require that runtime libraries are found at link (and execution) time: For Nvidia GPUs, libcuda + libcudart or libcublas, For AMD GPUs, libamdhip64 or libhipblas. The header files and hipfort modules do not need to be present as a fallback has been implemented, but if they are, they get used. Due to the combinations, the basic 1x C/C++, 4x C and 1x Fortran tests yield 1x C/C++, 14x C and 4 Fortran run-test files. libgomp/ChangeLog: * testsuite/lib/libgomp.exp (check_effective_target_openacc_cublas, check_effective_target_openacc_cudart): Update description as the check requires more. (check_effective_target_openacc_libcuda, check_effective_target_openacc_libcublas, check_effective_target_openacc_libcudart, check_effective_target_gomp_hip_header_amd, check_effective_target_gomp_hip_header_nvidia, check_effective_target_gomp_hipfort_module, check_effective_target_gomp_libamdhip64, check_effective_target_gomp_libhipblas): New. * testsuite/libgomp.c-c++-common/interop-2.c: New test. * testsuite/libgomp.c/interop-cublas-full.c: New test. * testsuite/libgomp.c/interop-cublas-libonly.c: New test. * testsuite/libgomp.c/interop-cuda-full.c: New test. * testsuite/libgomp.c/interop-cuda-libonly.c: New test. * testsuite/libgomp.c/interop-hip-amd-full.c: New test. * testsuite/libgomp.c/interop-hip-amd-no-hip-header.c: New test. * testsuite/libgomp.c/interop-hip-nvidia-full.c: New test. * testsuite/libgomp.c/interop-hip-nvidia-no-headers.c: New test. * testsuite/libgomp.c/interop-hip-nvidia-no-hip-header.c: New test. * testsuite/libgomp.c/interop-hip.h: New test. * testsuite/libgomp.c/interop-hipblas-amd-full.c: New test. * testsuite/libgomp.c/interop-hipblas-amd-no-hip-header.c: New test. * testsuite/libgomp.c/interop-hipblas-nvidia-full.c: New test. * testsuite/libgomp.c/interop-hipblas-nvidia-no-headers.c: New test. * testsuite/libgomp.c/interop-hipblas-nvidia-no-hip-header.c: New test. * testsuite/libgomp.c/interop-hipblas.h: New test. * testsuite/libgomp.fortran/interop-hip-amd-full.F90: New test. * testsuite/libgomp.fortran/interop-hip-amd-no-module.F90: New test. * testsuite/libgomp.fortran/interop-hip-nvidia-full.F90: New test. * testsuite/libgomp.fortran/interop-hip-nvidia-no-module.F90: New test. * testsuite/libgomp.fortran/interop-hip.h: New test.
2025-04-08OpenMP: Fix append_args handling in modify_call_for_omp_dispatchTobias Burnus
At tree level, the addr ref is also required for array dummy arguments, contrary to C; the GOMP_interop calls in modify_call_for_omp_dispatch were updated accordingly (using build_fold_addr_expr). As the GOMP_interop calls had no location data associated with them, the init call happened as soon as executing the previous line of code, which was confusing; solution: use the location data of the function call itself. PR middle-end/119662 gcc/ChangeLog: * gimplify.cc (modify_call_for_omp_dispatch): Fix GOMP_interop arg passing; add location info to function calls. libgomp/ChangeLog: * testsuite/libgomp.c/append-args-fr-1.c: New test. * testsuite/libgomp.c/append-args-fr.h: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/append-args-interop.c: Update for fixed GOMP_interop call. * g++.dg/gomp/append-args-8.C: Likewise. * gfortran.dg/gomp/append-args-interop.f90: Likewise.
2025-03-21libgomp/plugin: Add initial interop support to nvptx + gcnTobias Burnus
The interop directive operates on an opaque object that represents a foreign runtime. This commit adds support for this to the two offloading plugins. For nvptx, it supports cuda, cuda_driver and hip; the latter is AMD's version of CUDA which for Nvidia devices boils down to normal CUDA. Thus, at the end for this limited use, cuda/cuda_driver/hip are all the same - and for plugin-nvptx.c, the they differ only in terms of what gets fr_id, fr_name and get_interop_type_desc return. For gcn, it supports hip and hsa. Regarding get-mapped-ptr-1.c: That's actually a fix for the GOMP_interop commit r15-8654-g99e2906ae255fc that added GOMP_DEVICE_DEFAULT_OMP_61 alias omp_default_device, which is a conforming device number. But that test used -5 as check for a non-conforming device number. libgomp/ChangeLog: * plugin/plugin-gcn.c (_LIBGOMP_PLUGIN_INCLUDE): Define. (struct hsa_runtime_fn_info): Add two queue functions. (hipError_t, hipCtx_t, hipStream_s, hipStream_t): New types. (struct hip_runtime_fn_info): New. (hip_runtime_lib, hip_fns): New global vars. (init_environment_variables): Handle hip_runtime_lib. (init_hsa_runtime_functions): Load the two queue functions. (init_hip_runtime_functions, GOMP_OFFLOAD_interop, GOMP_OFFLOAD_get_interop_int, GOMP_OFFLOAD_get_interop_ptr, GOMP_OFFLOAD_get_interop_str, GOMP_OFFLOAD_get_interop_type_desc): New. * plugin/plugin-nvptx.c (_LIBGOMP_PLUGIN_INCLUDE): Define. (GOMP_OFFLOAD_interop, GOMP_OFFLOAD_get_interop_int, GOMP_OFFLOAD_get_interop_ptr, GOMP_OFFLOAD_get_interop_str, GOMP_OFFLOAD_get_interop_type_desc): New. * testsuite/libgomp.c/interop-fr-1.c: New test. * testsuite/libgomp.c-c++-common/get-mapped-ptr-1.c: Use -6 not -5 as non-conforming device number.
2025-01-02Update copyright years.Jakub Jelinek
2024-12-06nvptx: Support '-march=sm_89'Thomas Schwinge
gcc/ * config/nvptx/nvptx-sm.def: Add '89'. * config/nvptx/nvptx-gen.h: Regenerate. * config/nvptx/nvptx-gen.opt: Likewise. * config/nvptx/nvptx.cc (first_ptx_version_supporting_sm): Adjust. * config/nvptx/nvptx.opt (-march-map=sm_89, -march-map=sm_90) (march-map=sm_90a): Likewise. * config.gcc: Likewise. * doc/invoke.texi (Nvidia PTX Options): Document '-march=sm_89'. * config/nvptx/gen-multilib-matches-tests: Extend. gcc/testsuite/ * gcc.target/nvptx/march-map=sm_89.c: Adjust. * gcc.target/nvptx/march-map=sm_90.c: Likewise. * gcc.target/nvptx/march-map=sm_90a.c: Likewise. * gcc.target/nvptx/march=sm_89.c: New. libgomp/ * testsuite/libgomp.c/declare-variant-3-sm89.c: New. * testsuite/libgomp.c/declare-variant-3.h: Adjust.
2024-12-06nvptx: Support '-march=sm_52'Thomas Schwinge
gcc/ * config/nvptx/nvptx-sm.def: Add '52'. * config/nvptx/nvptx-gen.h: Regenerate. * config/nvptx/nvptx-gen.opt: Likewise. * config/nvptx/nvptx.cc (first_ptx_version_supporting_sm): Adjust. * config/nvptx/nvptx.opt (-march-map=sm_52): Likewise. * config.gcc: Likewise. * doc/invoke.texi (Nvidia PTX Options): Document '-march=sm_52'. * config/nvptx/gen-multilib-matches-tests: Extend. gcc/testsuite/ * gcc.target/nvptx/march-map=sm_52.c: Adjust. * gcc.target/nvptx/march=sm_52.c: New. libgomp/ * testsuite/libgomp.c/declare-variant-3-sm52.c: New. * testsuite/libgomp.c/declare-variant-3.h: Adjust.
2024-12-06nvptx: Support '-march=sm_37'Thomas Schwinge
gcc/ * config/nvptx/nvptx-sm.def: Add '37'. * config/nvptx/nvptx-gen.h: Regenerate. * config/nvptx/nvptx-gen.opt: Likewise. * config/nvptx/nvptx.cc (first_ptx_version_supporting_sm): Adjust. * config/nvptx/nvptx.opt (-march-map=sm_37, -march-map=sm_50): Likewise. * config.gcc: Likewise. * doc/invoke.texi (Nvidia PTX Options): Document '-march=sm_37'. * config/nvptx/gen-multilib-matches-tests: Extend. gcc/testsuite/ * gcc.target/nvptx/march-map=sm_37.c: Adjust. * gcc.target/nvptx/march-map=sm_50.c: Likewise. * gcc.target/nvptx/march-map=sm_52.c: Likewise. * gcc.target/nvptx/march=sm_37.c: New. libgomp/ * testsuite/libgomp.c/declare-variant-3-sm37.c: New. * testsuite/libgomp.c/declare-variant-3.h: Adjust.
2024-12-03OpenMP: 'allocate' directive - fixes for 'alignof' and [[omp::decl]]Tobias Burnus
Fixed a check to permit [[omp::decl(allocate,...)]] parsing in C. Additionaly, we discussed that 'allocate align' should not affect 'alignof' to avoid issues like with: int a; _Alignas(_Alignof(a)) int b; #pragma omp allocate(a) align(128) _Alignas(_Alignof(a)) int c; Thus, the alignment is no longer set in the C and Fortran front ends, but for static variables now in varpool_node::finalize_decl. (For stack variables, the alignment is handled in gimplify_bind_expr.) NOTE: 'omp allocate' is not yet supported in C++. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_allocate): Only check scope if not in_omp_decl_attribute. Remove setting the alignment. gcc/ChangeLog: * cgraphunit.cc (varpool_node::finalize_decl): Set alignment based on OpenMP's 'omp allocate' attribute/directive. gcc/fortran/ChangeLog: * trans-decl.cc (gfc_finish_var_decl): Remove setting the alignment. libgomp/ChangeLog: * libgomp.texi (Memory allocation): Mention (non-)effect of 'align' on _Alignof. * testsuite/libgomp.c/allocate-7.c: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/allocate-18.c: Check that alignof is unaffected by 'omp allocate'. * c-c++-common/gomp/allocate-19.c: Likewise.
2024-11-21libgomp: testsuite: Fix libgomp.c/alloc-pinned-3.c etc. for C23 on non-LinuxRainer Orth
Since the switch to a C23 default, three libgomp tests FAIL on Solaris: FAIL: libgomp.c/alloc-pinned-3.c (test for excess errors) UNRESOLVED: libgomp.c/alloc-pinned-3.c compilation failed to produce executable FAIL: libgomp.c/alloc-pinned-4.c (test for excess errors) UNRESOLVED: libgomp.c/alloc-pinned-4.c compilation failed to produce executable FAIL: libgomp.c/alloc-pinned-6.c (test for excess errors) UNRESOLVED: libgomp.c/alloc-pinned-6.c compilation failed to produce executable Excess errors: /vol/gcc/src/hg/master/local/libgomp/testsuite/libgomp.c/alloc-pinned-3.c:104:3: error: too many arguments to function 'set_pin_limit' Fixed by adding the missing size argument to the stub functions. Tested on i386-pc-solaris2.11 and sparc-sun-solaris2.11. 2024-11-20 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> libgomp: * testsuite/libgomp.c/alloc-pinned-3.c [!__linux__] (set_pin_limit): Add size arg. * testsuite/libgomp.c/alloc-pinned-4.c [!__linux__] (set_pin_limit): Likewise. * testsuite/libgomp.c/alloc-pinned-6.c [!__linux__] (set_pin_limit): Likewise.
2024-11-10Adjust 'libgomp.c/max_vf-*.c'Thomas Schwinge
For configurations where both GCN and nvptx offloading are enabled, we get: PASS: libgomp.c/max_vf-1.c (test for excess errors) PASS: libgomp.c/max_vf-1.c scan-tree-dump-times ompexp "GOMP_MAX_VF" 2 PASS: libgomp.c/max_vf-1.c scan-tree-dump-times ompexp "__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \\(.*, D\\.[0-9]*, 0\\);" 1 PASS: libgomp.c/max_vf-1.c scan-amdgcn-amdhsa-offload-tree-dump-times optimized "__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \\(.*, 64, 0\\);" 1 FAIL: libgomp.c/max_vf-1.c scan-nvptx-none-offload-tree-dump-times optimized "__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \\(.*, 64, 0\\);" 1 FAIL: libgomp.c/max_vf-1.c scan-amdgcn-amdhsa-offload-tree-dump-times optimized "__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \\(.*, 7, 0\\);" 1 PASS: libgomp.c/max_vf-1.c scan-nvptx-none-offload-tree-dump-times optimized "__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \\(.*, 7, 0\\);" 1 Avoid these FAILs via 'only_for_offload_target [...]'. Also, for consistency with other libgomp test cases, use effective-target specifiers of the libgomp test suite. Fix-up for recent commit d334f729e53867b838e867375b3f475ba793d96e "openmp: Add testcases for omp_max_vf". libgomp/ * testsuite/libgomp.c/max_vf-1.c: Adjust. * testsuite/libgomp.c/max_vf-2.c: Likewise.
2024-11-07openmp: Fix max_vf testcases with -march=cascadelakeAndrew Stubbs
Apparently we need to explicitly disable AVX, not just enabled SSE, to guarentee the 16-lane vectors we need for the pattern match. libgomp/ChangeLog: * testsuite/libgomp.c/max_vf-1.c: Add -mno-avx. gcc/testsuite/ChangeLog: * gcc.dg/gomp/max_vf-1.c: Add -mno-avx.
2024-11-06openmp: Add testcases for omp_max_vfAndrew Stubbs
Ensure that the GOMP_MAX_VF does the right thing for explicit schedules, when offloading is enabled ("target" directives are present), and is inactive otherwise. libgomp/ChangeLog: * testsuite/libgomp.c/max_vf-1.c: New test. * testsuite/libgomp.c/max_vf-2.c: New test. gcc/testsuite/ChangeLog: * gcc.dg/gomp/max_vf-1.c: New test.
2024-09-20OpenMP: Add get_device_from_uid/omp_get_uid_from_device routinesTobias Burnus
Those TR13/OpenMP 6.0 routines permit a reproducible offloading to a specific device by mapping an OpenMP device number to a unique ID (UID). The GPU device UIDs should be universally unique, the one for the host is not. gcc/ChangeLog: * omp-general.cc (omp_runtime_api_procname): Add get_device_from_uid and omp_get_uid_from_device routines. include/ChangeLog: * cuda/cuda.h (cuDeviceGetUuid): Declare. (cuDeviceGetUuid_v2): Add prototype. libgomp/ChangeLog: * config/gcn/target.c (omp_get_uid_from_device, omp_get_device_from_uid): Add stub implementation. * config/nvptx/target.c (omp_get_uid_from_device, omp_get_device_from_uid): Likewise. * fortran.c (omp_get_uid_from_device_, omp_get_uid_from_device_8_): New functions. * libgomp-plugin.h (GOMP_OFFLOAD_get_uid): Add prototype. * libgomp.h (struct gomp_device_descr): Add 'uid' and 'get_uid_func'. * libgomp.map (GOMP_6.0): New, includind the new UID routines. * libgomp.texi (OpenMP Technical Report 13): Mark UID routines as 'Y'. (Device Information Routines): Document new UID routines. (Offload-Target Specifics): Document UID format. * omp.h.in (omp_get_device_from_uid, omp_get_uid_from_device): New prototype. * omp_lib.f90.in (omp_get_device_from_uid, omp_get_uid_from_device): New interface. * omp_lib.h.in: Likewise. * plugin/cuda-lib.def: Add cuDeviceGetUuid and cuDeviceGetUuid_v2 via CUDA_ONE_CALL_MAYBE_NULL. * plugin/plugin-gcn.c (GOMP_OFFLOAD_get_uid): New. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_uid): New. * target.c (str_omp_initial_device): New static var. (STR_OMP_DEV_PREFIX): Define. (gomp_get_uid_for_device, omp_get_uid_from_device, omp_get_device_from_uid): New. (gomp_load_plugin_for_device): DLSYM_OPT the function 'get_uid'. (gomp_target_init): Set the device's 'uid' field to NULL. * testsuite/libgomp.c/device_uid.c: New test. * testsuite/libgomp.fortran/device_uid.f90: New test.
2024-09-02amdgcn: remove gfx803 "Fiji" supportAndrew Stubbs
The gfx803 "Fiji" device was deprecated in GCC 14, removed from LLVM 18, and hasn't worked properly with the drivers since about ROCm 4. This patch removes the device from GCC options and documentation, and removes the direct mentions from the internals. The TARGET_GCN3 support in the back-end is now unused and can be removed (in a follow-up patch). gcc/ChangeLog: * config.gcc (amdgcn-*-*): Remove "fiji" from with_arch checks. * config/gcn/gcn-hsa.h (ABI_VERSION_SPEC): Remove fiji alternative. (NO_XNACK): Likewise. (NO_SRAM_ECC): Likewise. (ASM_SPEC): Remove "%{}" around ABI_VERSION_SPEC. * config/gcn/gcn-opts.h (enum processor_type): Remove PROCESSOR_FIJI. (TARGET_FIJI): Delete. * config/gcn/gcn.cc (gcn_option_override): Remove Fiji. (gcn_omp_device_kind_arch_isa): Likewise. (output_file_start): Likewise. * config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Likewise. * config/gcn/gcn.opt (gpu_type): Likewise. (march, mtune): Change default to PROCESSOR_VEGA10. * config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX803): Delete. (copy_early_debug_info): Remove elf_flags_actual. Use ELFABIVERSION_AMDGPU_HSA_V4 unconditionally. (get_arch): Remove Fiji. (main): Remove gfx803. * config/gcn/t-omp-device (omp-device-properties-gcn): Remove fiji and gfx803. * doc/install.texi (amdgcn*-*-*): Remove fiji and special instructions. * doc/invoke.texi: Remove fiji. libgomp/ChangeLog: * libgomp.texi: Remove fiji and gfx803. * testsuite/libgomp.c/declare-variant-4.h: Remove fiji and gfx803. * testsuite/libgomp.c/declare-variant-4-fiji.c: Removed. * testsuite/libgomp.c/declare-variant-4-gfx803.c: Removed.
2024-07-31testsuite: fix dg-require-effective-target order vs dg-additional-sourcesSam James
Per gccint, 'dg-require-effective-target' must come before any 'dg-additional-sources' directives. Fix a handful of deviant cases. gcc/testsuite/ChangeLog: * gcc.target/aarch64/aapcs64/func-ret-3.c: Fix dg-require-effective-target directive order. * gcc.target/aarch64/aapcs64/func-ret-4.c: Likewise. * gfortran.dg/PR100914.f90: Likewise. libgomp/ChangeLog: * testsuite/libgomp.c++/pr24455.C: Fix dg-require-effective-target directive order. * testsuite/libgomp.c/pr24455.c: Likewise.
2024-07-01libgomp, openmp: Add ompx_gnu_pinned_mem_allocAndrew Stubbs
This creates a new predefined allocator as a shortcut for using pinned memory with OpenMP. This is not in the OpenMP standard so it uses the "ompx" namespace and an independent enum baseline of 200 (selected to not clash with other known implementations). The allocator is equivalent to using a custom allocator with the pinned trait and the null fallback trait. One motivation for having this feature is for use by the (planned) -foffload-memory=pinned feature. gcc/fortran/ChangeLog: * openmp.cc (is_predefined_allocator): Update valid ranges to incorporate ompx_gnu_pinned_mem_alloc. libgomp/ChangeLog: * allocator.c (ompx_gnu_min_predefined_alloc): New. (ompx_gnu_max_predefined_alloc): New. (predefined_alloc_mapping): Rename to ... (predefined_omp_alloc_mapping): ... this. (predefined_ompx_gnu_alloc_mapping): New. (_Static_assert): Adjust for the new name, and add a new assert for the new table. (predefined_allocator_p): New. (predefined_alloc_mapping): New. (omp_aligned_alloc): Support ompx_gnu_pinned_mem_alloc. Use predefined_allocator_p and predefined_alloc_mapping. (omp_free): Likewise. (omp_alligned_calloc): Likewise. (omp_realloc): Likewise. * env.c (parse_allocator): Add ompx_gnu_pinned_mem_alloc. * libgomp.texi: Document ompx_gnu_pinned_mem_alloc. * omp.h.in (omp_allocator_handle_t): Add ompx_gnu_pinned_mem_alloc. * omp_lib.f90.in: Add ompx_gnu_pinned_mem_alloc. * omp_lib.h.in: Add ompx_gnu_pinned_mem_alloc. * testsuite/libgomp.c/alloc-pinned-5.c: New test. * testsuite/libgomp.c/alloc-pinned-6.c: New test. * testsuite/libgomp.fortran/alloc-pinned-1.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/allocate-pinned-1.f90: New test. Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2024-07-01libgomp: change alloc-pinned tests failure modeAndrew Stubbs
The feature doesn't work on non-Linux hosts, at present, so skip the tests entirely. On Linux systems that have insufficient lockable memory configured we still need to fail or else the feature won't be getting tested when we think it is, but now there's a message to explain why. libgomp/ChangeLog: * testsuite/libgomp.c/alloc-pinned-1.c: Change dg-xfail-run-if to dg-skip-if. Correct spelling mistake. Abort on insufficient lockable memory. Use #error on non-linux hosts. * testsuite/libgomp.c/alloc-pinned-2.c: Likewise.
2024-05-02libgomp: Add gfx90c, 1036 and 1103 declare variant testsJakub Jelinek
Recently -march=gfx{90c,1036,1103} support has been added, but corresponding changes weren't done in the testsuite. The following patch adds that. Tested on x86_64-linux (with fiji and gfx1103 devices; had to use OMP_DEFAULT_DEVICE=1 there, fiji doesn't really work due to LLVM dropping support, but we still list those as offloading devices). 2024-05-02 Jakub Jelinek <jakub@redhat.com> * testsuite/libgomp.c/declare-variant-4.h (gfx90c, gfx1036, gfx1103): New functions. (f): Add #pragma omp declare variant directives for those. * testsuite/libgomp.c/declare-variant-4-gfx90c.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1036.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1103.c: New test.
2024-04-05nvptx: In mkoffload.cc, call diagnostic_color_init + gcc_init_libintl: ↵Thomas Schwinge
Restore 'libgomp.c/reverse-offload-sm30.c' testing With commit 7520a4992c94254016085a461c58c972497c4483 "nvptx: In mkoffload.cc, call diagnostic_color_init + gcc_init_libintl", we regressed: [-PASS:-]{+FAIL:+} libgomp.c/reverse-offload-sm30.c at line 15 (test for warnings, line ) [-PASS:-]{+FAIL:+} libgomp.c/reverse-offload-sm30.c (test for excess errors) libgomp/ * testsuite/libgomp.c/reverse-offload-sm30.c: Set 'GCC_COLORS' to the empty string.
2024-03-06Revert "Set num_threads to 50 on 32-bit hppa in two libgomp loop tests"John David Anglin
This reverts commit b14209715e659f6d3ca0f9eef9a4851e7bd6e373.
2024-02-12libgomp: testsuite: Don't XPASS libgomp.c/alloc-pinned-1.c etc. on non-Linux ↵Rainer Orth
targets [PR113448] Two libgomp tests XPASS on Solaris (any non-Linux target actually) since their introduction: XPASS: libgomp.c/alloc-pinned-1.c execution test XPASS: libgomp.c/alloc-pinned-2.c execution test The problem is that the test just prints OS unsupported and exits successfully, while the test is XFAILed: /* { dg-xfail-run-if "Pinning not implemented on this host" { ! *-*-linux-gnu } } */ Fixed by aborting immediately after the message above in the non-Linux case. Tested on i386-pc-solaris2.11 and i686-pc-linux-gnu. 2024-02-02 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> libgomp: PR testsuite/113448 * testsuite/libgomp.c/alloc-pinned-1.c [!__linux__] (CHECK_SIZE): Call abort. * testsuite/libgomp.c/alloc-pinned-2.c [!__linux__] (CHECK_SIZE): Likewise.
2024-02-01Set num_threads to 50 on 32-bit hppa in two libgomp loop testsJohn David Anglin
We support a maximum of 50 threads on 32-bit hppa. 2024-02-01 John David Anglin <danglin@gcc.gnu.org> libgomp/ChangeLog: * testsuite/libgomp.c++/loop-3.C: Set num_threads to 50 on 32-bit hppa. * testsuite/libgomp.c/omp-loop03.c: Likewise.
2024-01-29libgomp.c/declare-variant-4.h: Fix used variant function for gfx1030/gfx1100Tobias Burnus
libgomp/ChangeLog: * testsuite/libgomp.c/declare-variant-4.h: Use gfx1100/gfx1030 function not gfx90a for gfx1100/gfx1030 context selector. Signed-off-by: Tobias Burnus <tburnus@baylibre.com>
2024-01-26amdgcn: config.gcc - enable gfx1030 and gfx1100 multilib; add them to the docsTobias Burnus
gcc/ChangeLog: * config.gcc (amdgcn-*-*): Add gfx1030 and gfx1100 to TM_MULTILIB_CONFIG. * doc/install.texi (Configuration amdgcn-*-*): Mention gfx1030/gfx1100. * doc/invoke.texi (AMD GCN Options): Add gfx1030 and gfx1100 to -march/-mtune. libgomp/ChangeLog: * testsuite/libgomp.c/declare-variant-4.h: Add variant functions for gfx1030 and gfx1100. * testsuite/libgomp.c/declare-variant-4-gfx1030.c: New test. * testsuite/libgomp.c/declare-variant-4-gfx1100.c: New test. Signed-off-by: Tobias Burnus <tburnus@baylibre.com>
2024-01-22xfail libgomp.c/declare-variant-4-{fiji,gfx803}.cTobias Burnus
Since r14-4734-g56ed1055b2f40ac162ae8d382280ac07a33f789f, GCC no longer builds the Fiji (alias gfx803) libraries by default as support for it was removed in ROCm 4.0 and will be removed in LLVM 18. Thus, unless gfx803 is explicitly enabled, the following testcases will fail to link as libgomp is not available for Fiji. Hence, this commit xfails those testcases. libgomp/ChangeLog: * testsuite/libgomp.c/declare-variant-4-fiji.c: Xfail as fiji support is no longer enabled by default. * testsuite/libgomp.c/declare-variant-4-gfx803.c: Likewise. Signed-off-by: Tobias Burnus <tburnus@baylibre.com>
2024-01-20Don't run libgomp.c/simd-math-1.c on hppa*-*-hpux*John David Anglin
hppa*-*-hpux* lacks necessary math functions. 2024-01-20 John David Anglin <danglin@gcc.gnu.org> libgomp/ChangeLog: * testsuite/libgomp.c/simd-math-1.c: Don't run on hppa*-*-hpux*.
2024-01-17openmp: Add OpenMP _BitInt support [PR113409]Jakub Jelinek
The following patch adds support for _BitInt iterators of OpenMP canonical loops (with the preexisting limitation that when not using compile time static scheduling the iterators in the library are at most unsigned long long or signed long, so one can't in the runtime/dynamic/guided etc. cases iterate more than what those types can represent, like is the case of e.g. __int128 iterators too) and the testcase also covers linear/reduction clauses for them. 2024-01-17 Jakub Jelinek <jakub@redhat.com> PR middle-end/113409 * omp-general.cc (omp_adjust_for_condition): Handle BITINT_TYPE like INTEGER_TYPE. (omp_extract_for_data): Use build_bitint_type rather than build_nonstandard_integer_type if either iter_type or loop->v type is BITINT_TYPE. * omp-expand.cc (expand_omp_for_generic, expand_omp_taskloop_for_outer, expand_omp_taskloop_for_inner): Handle BITINT_TYPE like INTEGER_TYPE. * testsuite/libgomp.c/bitint-1.c: New test.
2024-01-03Update copyright years.Jakub Jelinek
2023-12-18libgomp: Make libgomp.c/declare-variant-1.c test x86 specificJakub Jelinek
As written earlier, this test was written with the x86 specifics in mind and adding dg-final directives for it for other arches makes it unreadable. If a declare variant call can be resolved in gimple already as in the aarch64 or gcn cases, it can be done in gcc.dg/gomp/ and I believe we have tests like that already, the point of the test is that it is not known during gimplification time which exact call should be chosen as it depends on which declare simd clone it will be in. 2023-12-18 Jakub Jelinek <jakub@redhat.com> * testsuite/libgomp.c/declare-variant-1.c: Restrict the test to x86, drop because of that unneeded target selector from other directives and remove the aarch64 specific ones.
2023-12-15Fix tests for gompAndre Vieira
This is to fix testisms initially introduced by: commit f5fc001a84a7dbb942a6252b3162dd38b4aae311 Author: Andre Vieira <andre.simoesdiasvieira@arm.com> Date: Mon Dec 11 14:24:41 2023 +0000 aarch64: enable mixed-types for aarch64 simdclones gcc/testsuite/ChangeLog: * gcc.dg/gomp/pr87887-1.c: Fixed test. * gcc.dg/gomp/pr89246-1.c: Likewise. * gcc.dg/gomp/simd-clones-2.c: Likewise. libgomp/ChangeLog: * testsuite/libgomp.c/declare-variant-1.c: Fixed test. * testsuite/libgomp.fortran/declare-simd-1.f90: Likewise.
2023-12-13libgomp: basic pinned memory on LinuxAndrew Stubbs
Implement the OpenMP pinned memory trait on Linux hosts using the mlock syscall. Pinned allocations are performed using mmap, not malloc, to ensure that they can be unpinned safely when freed. This implementation will work OK for page-scale allocations, and finer-grained allocations will be implemented in a future patch. libgomp/ChangeLog: * allocator.c (MEMSPACE_ALLOC): Add PIN. (MEMSPACE_CALLOC): Add PIN. (MEMSPACE_REALLOC): Add PIN. (MEMSPACE_FREE): Add PIN. (MEMSPACE_VALIDATE): Add PIN. (omp_init_allocator): Use MEMSPACE_VALIDATE to check pinning. (omp_aligned_alloc): Add pinning to all MEMSPACE_* calls. (omp_aligned_calloc): Likewise. (omp_realloc): Likewise. (omp_free): Likewise. * config/linux/allocator.c: New file. * config/nvptx/allocator.c (MEMSPACE_ALLOC): Add PIN. (MEMSPACE_CALLOC): Add PIN. (MEMSPACE_REALLOC): Add PIN. (MEMSPACE_FREE): Add PIN. (MEMSPACE_VALIDATE): Add PIN. * config/gcn/allocator.c (MEMSPACE_ALLOC): Add PIN. (MEMSPACE_CALLOC): Add PIN. (MEMSPACE_REALLOC): Add PIN. (MEMSPACE_FREE): Add PIN. * libgomp.texi: Switch pinned trait to supported. (MEMSPACE_VALIDATE): Add PIN. * testsuite/libgomp.c/alloc-pinned-1.c: New test. * testsuite/libgomp.c/alloc-pinned-2.c: New test. * testsuite/libgomp.c/alloc-pinned-3.c: New test. * testsuite/libgomp.c/alloc-pinned-4.c: New test. Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2023-12-11aarch64: enable mixed-types for aarch64 simdclonesAndre Vieira
This patch enables the use of mixed-types for simd clones for AArch64, adds aarch64 as a target_vect_simd_clones and corrects the way the simdlen is chosen for non-specified simdlen clauses according to the 'Vector Function Application Binary Interface Specification for AArch64'. Additionally this patch also restricts combinations of simdlen and return/argument types that map to vectors larger than 128 bits as we currently do not have a way to represent these types in a way that is consistent internally and externally. gcc/ChangeLog: * config/aarch64/aarch64.cc (lane_size): New function. (aarch64_simd_clone_compute_vecsize_and_simdlen): Determine simdlen according to NDS rule and reject combination of simdlen and types that lead to vectors larger than 128bits. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add aarch64 targets to vect_simd_clones. * c-c++-common/gomp/declare-variant-14.c: Adapt test for aarch64. * c-c++-common/gomp/pr60823-1.c: Likewise. * c-c++-common/gomp/pr60823-2.c: Likewise. * c-c++-common/gomp/pr60823-3.c: Likewise. * g++.dg/gomp/attrs-10.C: Likewise. * g++.dg/gomp/declare-simd-1.C: Likewise. * g++.dg/gomp/declare-simd-3.C: Likewise. * g++.dg/gomp/declare-simd-4.C: Likewise. * g++.dg/gomp/declare-simd-7.C: Likewise. * g++.dg/gomp/declare-simd-8.C: Likewise. * g++.dg/gomp/pr88182.C: Likewise. * gcc.dg/declare-simd.c: Likewise. * gcc.dg/gomp/declare-simd-1.c: Likewise. * gcc.dg/gomp/declare-simd-3.c: Likewise. * gcc.dg/gomp/pr87887-1.c: Likewise. * gcc.dg/gomp/pr87895-1.c: Likewise. * gcc.dg/gomp/pr89246-1.c: Likewise. * gcc.dg/gomp/pr99542.c: Likewise. * gcc.dg/gomp/simd-clones-2.c: Likewise. * gcc.dg/vect/vect-simd-clone-1.c: Likewise. * gcc.dg/vect/vect-simd-clone-2.c: Likewise. * gcc.dg/vect/vect-simd-clone-4.c: Likewise. * gcc.dg/vect/vect-simd-clone-5.c: Likewise. * gcc.dg/vect/vect-simd-clone-6.c: Likewise. * gcc.dg/vect/vect-simd-clone-7.c: Likewise. * gcc.dg/vect/vect-simd-clone-8.c: Likewise. * gfortran.dg/gomp/declare-simd-2.f90: Likewise. * gfortran.dg/gomp/declare-simd-coarray-lib.f90: Likewise. * gfortran.dg/gomp/declare-variant-14.f90: Likewise. * gfortran.dg/gomp/pr79154-1.f90: Likewise. * gfortran.dg/gomp/pr83977.f90: Likewise. libgomp/ChangeLog: * testsuite/libgomp.c/declare-variant-1.c: Adapt test for aarch64. * testsuite/libgomp.fortran/declare-simd-1.f90: Likewise.