<feed xmlns='http://www.w3.org/2005/Atom'>
<title>llvm-project.git/openmp/runtime/src/kmp_affinity.h, branch main</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/'/>
<entry>
<title>[OpenMP] Fix preprocessor mismatches between include and usages of hwloc (#158349)</title>
<updated>2025-10-15T08:58:41+00:00</updated>
<author>
<name>Peter Arzt</name>
<email>peter@arzt-fd.de</email>
</author>
<published>2025-10-15T08:58:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=cd24d108a2c19c23c4ac80b501fa7361963cca3d'/>
<id>cd24d108a2c19c23c4ac80b501fa7361963cca3d</id>
<content type='text'>
Fix https://github.com/llvm/llvm-project/issues/156679

There is a mismatch between the preprocessor guards around the include
of `hwloc.h` and those protecting its usages, leading to build failures
on Darwin: https://github.com/spack/spack-packages/pull/1212

This change introduces `KMP_HWLOC_ENABLED`, which reflects
whether hwloc is actually used.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix https://github.com/llvm/llvm-project/issues/156679

There is a mismatch between the preprocessor guards around the include
of `hwloc.h` and those protecting its usages, leading to build failures
on Darwin: https://github.com/spack/spack-packages/pull/1212

This change introduces `KMP_HWLOC_ENABLED`, which reflects
whether hwloc is actually used.</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP] Provide __NR_sched_[gs]etaffinity on Linux/sparc64 (#138525)</title>
<updated>2025-05-06T07:13:34+00:00</updated>
<author>
<name>Rainer Orth</name>
<email>ro@gcc.gnu.org</email>
</author>
<published>2025-05-06T07:13:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=3f1eafaa04f1c04ae5c7aae3e452eb75c507584d'/>
<id>3f1eafaa04f1c04ae5c7aae3e452eb75c507584d</id>
<content type='text'>
`libomp` doesn't currently build on Linux/sparc64 due to lack of
`__NR_sched_setaffinity` and `__NR_sched_getaffinity` definitions.

This patch provides those.

Tested on `sparcv9-sun-solaris2.11`, `sparc64-unknown-linux-gnu`,
`amd64-pc-solaris2.11`, and `x86_64-pc-linux-gnu`.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
`libomp` doesn't currently build on Linux/sparc64 due to lack of
`__NR_sched_setaffinity` and `__NR_sched_getaffinity` definitions.

This patch provides those.

Tested on `sparcv9-sun-solaris2.11`, `sparc64-unknown-linux-gnu`,
`amd64-pc-solaris2.11`, and `x86_64-pc-linux-gnu`.</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP] Miscellaneous small code improvements (#95603)</title>
<updated>2024-08-15T15:42:22+00:00</updated>
<author>
<name>Hansang Bae</name>
<email>hansang.bae@intel.com</email>
</author>
<published>2024-08-15T15:42:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=598970904736f3535939f6a5525022219e4ae517'/>
<id>598970904736f3535939f6a5525022219e4ae517</id>
<content type='text'>
Removes a few uninitialized variables, possible resource leaks, and
redundant code.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Removes a few uninitialized variables, possible resource leaks, and
redundant code.</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP] Add topology and affinity changes for Meteor Lake (#91012)</title>
<updated>2024-07-29T14:51:42+00:00</updated>
<author>
<name>Jonathan Peyton</name>
<email>jonathan.l.peyton@intel.com</email>
</author>
<published>2024-07-29T14:51:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=77ff969e5d6a561606ea87fbae101195417d4d73'/>
<id>77ff969e5d6a561606ea87fbae101195417d4d73</id>
<content type='text'>
These are Intel-specific changes for the CPUID leaf 31 method for
detecting machine topology.

* Clean up known levels usage in x2apicid topology algorithm
Change to be a constant mask of all Intel topology type values.

* Take unknown ids into account when sorting them
If a hardware id is unknown, put it further down the hardware thread
list so it takes last priority when assigning to threads.

* Have sub ids printed out for hardware thread dump

* Add caches to topology
New `kmp_cache_ids_t` class helps create cache ids which are then put
into the topology table after regular topology type ids have been put
in.

* Allow empty masks in place list creation
Have enumeration information and place list generation take into account
that certain hardware threads may be lacking certain layers.

* Allow different procs to have different numbers of topology levels
Accommodates the possible situation where CPUID.1F has different depth for
different hardware threads. Each hardware thread has a topology
description which is just a small set of its topology levels. These
descriptions are tracked to see if the topology is uniform or not.

* Replace regular ids with logical ids
Instead of keeping the original sub ids that the x2apicid topology
detection algorithm gives, change each id to its logical id, which is a
number in the range [0, num_items - 1]. This makes inserting new layers into the
topology significantly simpler.

* Insert caches into topology
This change takes into account that most topologies are uniform and
therefore can use the quicker method of inserting caches as equivalent
layers into the topology.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
These are Intel-specific changes for the CPUID leaf 31 method for
detecting machine topology.

* Clean up known levels usage in x2apicid topology algorithm
Change to be a constant mask of all Intel topology type values.

* Take unknown ids into account when sorting them
If a hardware id is unknown, put it further down the hardware thread
list so it takes last priority when assigning to threads.

* Have sub ids printed out for hardware thread dump

* Add caches to topology
New `kmp_cache_ids_t` class helps create cache ids which are then put
into the topology table after regular topology type ids have been put
in.

* Allow empty masks in place list creation
Have enumeration information and place list generation take into account
that certain hardware threads may be lacking certain layers.

* Allow different procs to have different numbers of topology levels
Accommodates the possible situation where CPUID.1F has different depth for
different hardware threads. Each hardware thread has a topology
description which is just a small set of its topology levels. These
descriptions are tracked to see if the topology is uniform or not.

* Replace regular ids with logical ids
Instead of keeping the original sub ids that the x2apicid topology
detection algorithm gives, change each id to its logical id, which is a
number in the range [0, num_items - 1]. This makes inserting new layers into the
topology significantly simpler.

* Insert caches into topology
This change takes into account that most topologies are uniform and
therefore can use the quicker method of inserting caches as equivalent
layers into the topology.</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP][AIX] Use syssmt() to get the number of SMTs per physical CPU (#89985)</title>
<updated>2024-04-26T17:23:33+00:00</updated>
<author>
<name>Xing Xue</name>
<email>xingxue@outlook.com</email>
</author>
<published>2024-04-26T17:23:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=690c929b6c68b4cd0ff314a0a88d3b218d46db2d'/>
<id>690c929b6c68b4cd0ff314a0a88d3b218d46db2d</id>
<content type='text'>
This patch switches to the system call `syssmt()` instead of
`lpar_get_info()` to get the number of SMTs (logical processors) per
physical processor for AIX. `lpar_get_info()` gives the max number of
SMTs that the physical processor can support while `syssmt()` returns
the number that is currently configured.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch switches to the system call `syssmt()` instead of
`lpar_get_info()` to get the number of SMTs (logical processors) per
physical processor for AIX. `lpar_get_info()` gives the max number of
SMTs that the physical processor can support while `syssmt()` returns
the number that is currently configured.</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP] Add absolute KMP_HW_SUBSET functionality (#85326)</title>
<updated>2024-04-03T16:43:23+00:00</updated>
<author>
<name>Jonathan Peyton</name>
<email>jonathan.l.peyton@intel.com</email>
</author>
<published>2024-04-03T16:43:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=2ff3850ea19f72573d8abdf9a78e52d3dfdd90ac'/>
<id>2ff3850ea19f72573d8abdf9a78e52d3dfdd90ac</id>
<content type='text'>
Users can put a : in front of the KMP_HW_SUBSET value to indicate that the
specified subset is an "absolute" subset. Currently, when a user sets
KMP_HW_SUBSET=1t, this gets translated to KMP_HW_SUBSET="*s,*c,1t",
where * means "use all of". If a user wants only one thread as the
entire topology, they can now do KMP_HW_SUBSET=:1t.
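
A minimal sketch of the translation described above, assuming a 3-level
topology. The names `expand_hw_subset` and `LAYERS` are hypothetical and
are not part of libomp; the real parser handles many more cases.

```python
# Hypothetical sketch, not the libomp parser. Layer letters follow the
# commit message: s = socket, c = core, t = thread.
LAYERS = ["s", "c", "t"]


def expand_hw_subset(value):
    """Return the effective subset string for a KMP_HW_SUBSET value."""
    if value.startswith(":"):
        # Absolute subset: only the listed layers form the topology.
        return value[1:]
    # Index each given item by its layer letter, e.g. {"t": "1t"}.
    given = {item[-1]: item for item in value.split(",")}
    # Relative subset: unlisted layers default to "use all of" (*).
    return ",".join(given.get(layer, "*" + layer) for layer in LAYERS)


print(expand_hw_subset("1t"))   # *s,*c,1t
print(expand_hw_subset(":1t"))  # 1t
```

The leading colon is the only switch in this sketch: without it, the
missing layers are filled in with *, and with it, they are left out.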

Along with the absolute syntax comes a fix that makes newer machines
easier to use with only the 3-level topology syntax. When a user
puts KMP_HW_SUBSET=1s,4c,2t on a machine which actually has 4 layers
(say 1s,2m,3c,2t as the entire machine), the user gets an unexpected "too
many resources asked" message because KMP_HW_SUBSET currently translates
the "4c" value to mean 4 cores per module. To help users out, the
runtime can assume that these newer layers, module in this case, should
be ignored if they are not specified, but the topology should always
take into account the sockets, cores, and threads layers.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Users can put a : in front of the KMP_HW_SUBSET value to indicate that the
specified subset is an "absolute" subset. Currently, when a user sets
KMP_HW_SUBSET=1t, this gets translated to KMP_HW_SUBSET="*s,*c,1t",
where * means "use all of". If a user wants only one thread as the
entire topology, they can now do KMP_HW_SUBSET=:1t.

Along with the absolute syntax comes a fix that makes newer machines
easier to use with only the 3-level topology syntax. When a user
puts KMP_HW_SUBSET=1s,4c,2t on a machine which actually has 4 layers
(say 1s,2m,3c,2t as the entire machine), the user gets an unexpected "too
many resources asked" message because KMP_HW_SUBSET currently translates
the "4c" value to mean 4 cores per module. To help users out, the
runtime can assume that these newer layers, module in this case, should
be ignored if they are not specified, but the topology should always
take into account the sockets, cores, and threads layers.</pre>
</div>
</content>
</entry>
<entry>
<title>[OpenMP][AIX] Affinity implementation for AIX (#84984)</title>
<updated>2024-03-22T19:25:08+00:00</updated>
<author>
<name>Xing Xue</name>
<email>xingxue@outlook.com</email>
</author>
<published>2024-03-22T19:25:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=d394f3a162b871668d0c8e8bf6a94922fa8698ae'/>
<id>d394f3a162b871668d0c8e8bf6a94922fa8698ae</id>
<content type='text'>
This patch implements `affinity` for AIX, which is quite different from
platforms such as Linux.
- Setting CPU affinity through masks and related functions is not
supported. The system call `bindprocessor()` is used to bind a thread to one
CPU per call.
- There are no system routines to get the affinity info of a thread. The
implementation of `get_system_affinity()` for AIX gets the mask of all
available CPUs, to be used as the full mask only.
- Topology is not available from the file system. It is obtained through
system SRAD (Scheduler Resource Allocation Domain).

This patch has run through the libomp LIT tests successfully with
`affinity` enabled.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch implements `affinity` for AIX, which is quite different from
platforms such as Linux.
- Setting CPU affinity through masks and related functions is not
supported. The system call `bindprocessor()` is used to bind a thread to one
CPU per call.
- There are no system routines to get the affinity info of a thread. The
implementation of `get_system_affinity()` for AIX gets the mask of all
available CPUs, to be used as the full mask only.
- Topology is not available from the file system. It is obtained through
system SRAD (Scheduler Resource Allocation Domain).

This patch has run through the libomp LIT tests successfully with
`affinity` enabled.</pre>
</div>
</content>
</entry>
<entry>
<title>[openmp] adding affinity support to DragonFlyBSD. (#84672)</title>
<updated>2024-03-10T09:56:55+00:00</updated>
<author>
<name>David CARLIER</name>
<email>devnexen@gmail.com</email>
</author>
<published>2024-03-10T09:56:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=fa4cc39255767bbaf63a6a3b445dc94b43ebd447'/>
<id>fa4cc39255767bbaf63a6a3b445dc94b43ebd447</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>[openmp] porting affinity feature to netbsd. (#84618)</title>
<updated>2024-03-09T11:45:07+00:00</updated>
<author>
<name>David CARLIER</name>
<email>devnexen@gmail.com</email>
</author>
<published>2024-03-09T11:45:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=11cd2a33f1a80c1b8ad1968c1316204b172e4937'/>
<id>11cd2a33f1a80c1b8ad1968c1316204b172e4937</id>
<content type='text'>
NetBSD supports the portable hwloc layer as well. On hardware with
4 CPUs, the CPU set size is 4 and maxcpus is 256.</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
NetBSD supports the portable hwloc layer as well. On hardware with
4 CPUs, the CPU set size is 4 and maxcpus is 256.</pre>
</div>
</content>
</entry>
<entry>
<title>Add openmp support to System z (#66081)</title>
<updated>2023-11-03T11:42:55+00:00</updated>
<author>
<name>Neale Ferguson</name>
<email>neale@sinenomine.net</email>
</author>
<published>2023-11-03T11:42:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/llvm-project.git/commit/?id=1111ef025762d9b7ecc3cafc576083987ae63fe6'/>
<id>1111ef025762d9b7ecc3cafc576083987ae63fe6</id>
<content type='text'>
* openmp/README.rst
  - Add s390x to those platforms supported

* openmp/libomptarget/plugins-nextgen/CMakeLists.txt
  - Add s390x subdirectory

* openmp/libomptarget/plugins-nextgen/s390x/CMakeLists.txt
  - Add s390x definitions

* openmp/runtime/CMakeLists.txt
  - Add s390x to those platforms supported

* openmp/runtime/cmake/LibompGetArchitecture.cmake
  - Define s390x ARCHITECTURE

* openmp/runtime/cmake/LibompMicroTests.cmake
  - Add dependencies for System z (aka s390x)

* openmp/runtime/cmake/LibompUtils.cmake
  - Add S390X to the mix

* openmp/runtime/cmake/config-ix.cmake
  - Add s390x as a supported LIBOMP_ARCH

* openmp/runtime/src/kmp_affinity.h
  - Define __NR_sched_[get|set]affinity for s390x

* openmp/runtime/src/kmp_config.h.cmake
  - Define CACHE_LINE for s390x

* openmp/runtime/src/kmp_os.h
  - Add KMP_ARCH_S390X to support checks

* openmp/runtime/src/kmp_platform.h
  - Define KMP_ARCH_S390X

* openmp/runtime/src/kmp_runtime.cpp
  - Generate code when KMP_ARCH_S390X is defined

* openmp/runtime/src/kmp_tasking.cpp
  - Generate code when KMP_ARCH_S390X is defined

* openmp/runtime/src/thirdparty/ittnotify/ittnotify_config.h
  - Define ITT_ARCH_S390X

* openmp/runtime/src/z_Linux_asm.S
  - Instantiate __kmp_invoke_microtask for s390x

* openmp/runtime/src/z_Linux_util.cpp
  - Generate code when KMP_ARCH_S390X is defined

* openmp/runtime/test/ompt/callback.h
  - Define print_possible_return_addresses for s390x

* openmp/runtime/tools/lib/Platform.pm
  - Return s390x as platform and host architecture

* openmp/runtime/tools/lib/Uname.pm
  - Set hardware platform value for s390x</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* openmp/README.rst
  - Add s390x to those platforms supported

* openmp/libomptarget/plugins-nextgen/CMakeLists.txt
  - Add s390x subdirectory

* openmp/libomptarget/plugins-nextgen/s390x/CMakeLists.txt
  - Add s390x definitions

* openmp/runtime/CMakeLists.txt
  - Add s390x to those platforms supported

* openmp/runtime/cmake/LibompGetArchitecture.cmake
  - Define s390x ARCHITECTURE

* openmp/runtime/cmake/LibompMicroTests.cmake
  - Add dependencies for System z (aka s390x)

* openmp/runtime/cmake/LibompUtils.cmake
  - Add S390X to the mix

* openmp/runtime/cmake/config-ix.cmake
  - Add s390x as a supported LIBOMP_ARCH

* openmp/runtime/src/kmp_affinity.h
  - Define __NR_sched_[get|set]affinity for s390x

* openmp/runtime/src/kmp_config.h.cmake
  - Define CACHE_LINE for s390x

* openmp/runtime/src/kmp_os.h
  - Add KMP_ARCH_S390X to support checks

* openmp/runtime/src/kmp_platform.h
  - Define KMP_ARCH_S390X

* openmp/runtime/src/kmp_runtime.cpp
  - Generate code when KMP_ARCH_S390X is defined

* openmp/runtime/src/kmp_tasking.cpp
  - Generate code when KMP_ARCH_S390X is defined

* openmp/runtime/src/thirdparty/ittnotify/ittnotify_config.h
  - Define ITT_ARCH_S390X

* openmp/runtime/src/z_Linux_asm.S
  - Instantiate __kmp_invoke_microtask for s390x

* openmp/runtime/src/z_Linux_util.cpp
  - Generate code when KMP_ARCH_S390X is defined

* openmp/runtime/test/ompt/callback.h
  - Define print_possible_return_addresses for s390x

* openmp/runtime/tools/lib/Platform.pm
  - Return s390x as platform and host architecture

* openmp/runtime/tools/lib/Uname.pm
  - Set hardware platform value for s390x</pre>
</div>
</content>
</entry>
</feed>
