summaryrefslogtreecommitdiff
path: root/openmp/runtime/src/kmp_affinity.h
AgeCommit message (Collapse)Author
2025-10-15[OpenMP] Fix preprocessor mismatches between include and usages of hwloc ↵Peter Arzt
(#158349) Fix https://github.com/llvm/llvm-project/issues/156679 There is a mismatch between the preprocessor guards around the include of `hwloc.h` and those protecting its usages, leading to build failures on Darwin: https://github.com/spack/spack-packages/pull/1212 This change introduces `KMP_HWLOC_ENABLED` that reflects whether hwloc is actually used.
2025-05-06[OpenMP] Provide __NR_sched_[gs]etaffinity on Linux/sparc64 (#138525)Rainer Orth
`libomp` doesn't currently build on Linux/sparc64 due to lack of `__NR_sched_setaffinity` and `__NR_sched_getaffinity` definitions. This patch provides those. Tested on `sparcv9-sun-solaris2.11`, `sparc64-unknown-linux-gnu`, `amd64-pc-solaris2.11`, and `x86_64-pc-linux-gnu`.
2024-08-15[OpenMP] Miscellaneous small code improvements (#95603)Hansang Bae
Removes a few uninitialized variables, possible resource leaks, and redundant code.
2024-07-29[OpenMP] Add topology and affinity changes for Meteor Lake (#91012)Jonathan Peyton
These are Intel-specific changes for the CPUID leaf 31 method for detecting machine topology. * Cleanup known levels usage in x2apicid topology algorithm Change to be a constant mask of all Intel topology type values. * Take unknown ids into account when sorting them If a hardware id is unknown, then put further down the hardware thread list so it will take last priority when assigning to threads. * Have sub ids printed out for hardware thread dump * Add caches to topology New` kmp_cache_ids_t` class helps create cache ids which are then put into the topology table after regular topology type ids have been put in. * Allow empty masks in place list creation Have enumeration information and place list generation take into account that certain hardware threads may be lacking certain layers * Allow different procs to have different number of topology levels Accommodates possible situation where CPUID.1F has different depth for different hardware threads. Each hardware thread has a topology description which is just a small set of its topology levels. These descriptions are tracked to see if the topology is uniform or not. * Change regular ids with logical ids Instead of keeping the original sub ids that the x2apicid topology detection algorithm gives, change each id to its logical id which is a number: [0, num_items - 1]. This makes inserting new layers into the topology significantly simpler. * Insert caches into topology This change takes into account that most topologies are uniform and therefore can use the quicker method of inserting caches as equivalent layers into the topology.
2024-04-26[OpenMP][AIX] Use syssmt() to get the number of SMTs per physical CPU (#89985)Xing Xue
This patch changes to use system call `syssmt()` instead of `lpar_get_info()` to get the number of SMTs (logical processors) per physical processor for AIX. `lpar_get_info()` gives the max number of SMTs that the physical processor can support while `syssmt()` returns the number that is currently configured.
2024-04-03[OpenMP] Add absolute KMP_HW_SUBSET functionality (#85326)Jonathan Peyton
Users can put a : in front of KMP_HW_SUBSET to indicate that the specified subset is an "absolute" subset. Currently, when a user puts KMP_HW_SUBSET=1t. This gets translated to KMP_HW_SUBSET="*s,*c,1t", where * means "use all of". If a user wants only one thread as the entire topology they can now do KMP_HW_SUBSET=:1t. Along with the absolute syntax is a fix for newer machines and making them easier to use with only the 3-level topology syntax. When a user puts KMP_HW_SUBSET=1s,4c,2t on a machine which actually has 4 layers, (say 1s,2m,3c,2t as the entire machine) the user gets an unexpected "too many resources asked" message because KMP_HW_SUBSET currently translates the "4c" value to mean 4 cores per module. To help users out, the runtime can assume that these newer layers, module in this case, should be ignored if they are not specified, but the topology should always take into account the sockets, cores, and threads layers.
2024-03-22[OpenMP][AIX] Affinity implementation for AIX (#84984)Xing Xue
This patch implements `affinity` for AIX, which is quite different from platforms such as Linux. - Setting CPU affinity through masks and related functions are not supported. System call `bindprocessor()` is used to bind a thread to one CPU per call. - There are no system routines to get the affinity info of a thread. The implementation of `get_system_affinity()` for AIX gets the mask of all available CPUs, to be used as the full mask only. - Topology is not available from the file system. It is obtained through system SRAD (Scheduler Resource Allocation Domain). This patch has run through the libomp LIT tests successfully with `affinity` enabled.
2024-03-10[openmp] adding affinity support to DragonFlyBSD. (#84672)David CARLIER
2024-03-09[openmp] porting affinity feature to netbsd. (#84618)David CARLIER
netbsd supports the portable hwloc's layer as well. for a hardware with 4 cpus, a cpu set is 4 and maxcpus is 256.
2023-11-03Add openmp support to System z (#66081)Neale Ferguson
* openmp/README.rst - Add s390x to those platforms supported * openmp/libomptarget/plugins-nextgen/CMakeLists.txt - Add s390x subdirectory * openmp/libomptarget/plugins-nextgen/s390x/CMakeLists.txt - Add s390x definitions * openmp/runtime/CMakeLists.txt - Add s390x to those platforms supported * openmp/runtime/cmake/LibompGetArchitecture.cmake - Define s390x ARCHITECTURE * openmp/runtime/cmake/LibompMicroTests.cmake - Add dependencies for System z (aka s390x) * openmp/runtime/cmake/LibompUtils.cmake - Add S390X to the mix * openmp/runtime/cmake/config-ix.cmake - Add s390x as a supported LIPOMP_ARCH * openmp/runtime/src/kmp_affinity.h - Define __NR_sched_[get|set]addinity for s390x * openmp/runtime/src/kmp_config.h.cmake - Define CACHE_LINE for s390x * openmp/runtime/src/kmp_os.h - Add KMP_ARCH_S390X to support checks * openmp/runtime/src/kmp_platform.h - Define KMP_ARCH_S390X * openmp/runtime/src/kmp_runtime.cpp - Generate code when KMP_ARCH_S390X is defined * openmp/runtime/src/kmp_tasking.cpp - Generate code when KMP_ARCH_S390X is defined * openmp/runtime/src/thirdparty/ittnotify/ittnotify_config.h - Define ITT_ARCH_S390X * openmp/runtime/src/z_Linux_asm.S - Instantiate __kmp_invoke_microtask for s390x * openmp/runtime/src/z_Linux_util.cpp - Generate code when KMP_ARCH_S390X is defined * openmp/runtime/test/ompt/callback.h - Define print_possible_return_addresses for s390x * openmp/runtime/tools/lib/Platform.pm - Return s390x as platform and host architecture * openmp/runtime/tools/lib/Uname.pm - Set hardware platform value for s390x
2023-09-10[OpenMP][VE] Support OpenMP runtime on VEKazushi (Jam) Marukawa
Support OpenMP runtime library on VE. This patch makes OpenMP compilable for VE architecture. Almost all tests run correctly on VE. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D159401
2023-07-31[OpenMP] Introduce hybrid core attributes to OMP_PLACES and KMP_AFFINITYJonathan Peyton
* Add KMP_CPU_EQUAL and KMP_CPU_ISEMPTY to affinity mask API * Add printout of leader to hardware thread dump * Allow OMP_PLACES to restrict fullMask This change fixes an issue with the OMP_PLACES=resource(#) syntax. Before this change, specifying the number of resources did NOT change the default number of threads created by the runtime. e.g., OMP_PLACES=cores(2) would still create __kmp_avail_proc number of threads. After this change, the fullMask and __kmp_avail_proc are modified if necessary so that the final place list dictates which resources are available and how thus, how many threads are created by default. * Introduce hybrid core attributes to OMP_PLACES and KMP_AFFINITY For OMP_PLACES, two new features are added: 1) OMP_PLACES=cores:<attribute> where <attribute> is either intel_atom, intel_core, or eff# where # is 0 - number of core efficiencies-1. This syntax also supports the optional (#) number selection of resources. 2) OMP_PLACES=core_types|core_effs where this setting will create the number of core_types (or core_effs|core_efficiencies). For KMP_AFFINITY, the granularity setting is expanded to include two new keywords: core_type, and core_eff (or core_efficiency). This will set the granularity to include all cores with a particular core type (or efficiency). e.g., KMP_AFFINITY=granularity=core_type,compact will create threads which can float across a single core type. Differential Revision: https://reviews.llvm.org/D154547
2023-07-05[OpenMP] Minor improvement in error msg and fixes few coverity reported issuesNawrin Sultana
Differential Revision: https://reviews.llvm.org/D152289
2023-01-16[OpenMP][libomp] Add topology information to thread structureJonathan Peyton
Each time a thread gets a new affinity assigned, it will not only assign its mask, but also topology information including which socket, core, thread and core-attributes (if available) it is now assigned. This occurs for all non-disabled KMP_AFFINITY values as well as OMP_PLACES/OMP_PROC_BIND. The information regarding which socket, core, etc. can take on three values: 1) The actual ID of the unit (0 - (N-1)), given N units 2) UNKNOWN_ID (-1) which indicates it does not know which ID 3) MULTIPLE_ID (-2) which indicates the thread is spread across multiple of this unit (e.g., affinity mask is spread across multiple hardware threads) This new information is stored in th_topology_ids[] array. An example how to get the socket Id, one would read th_topology_ids[KMP_HW_SOCKET]. This could be expanded in the future to something more descriptive for the "multiple" case, like a range of values. For now, the single value suffices. The information regarding the core attributes can take on two values: 1) The actual core-type or core-eff 2) KMP_HW_CORE_TYPE_UNKNOWN if the core type is unknown, and UNKNOWN_CORE_EFF (-1) if the core eff is unknown. This new information is stored in th_topology_attrs. An example how to get the core type, one would read th_topology_attrs.core_type. Differential Revision: https://reviews.llvm.org/D139854
2022-11-18[OpenMP] kmp_affinity.h: add RISCV64 supportYabin Cui
In D135552 the #else is added, which causes build error when building openmp on RISCV64. This patch fixed the error: "Unknown or unsupported architecture" Reviewed By: pirama Differential Revision: https://reviews.llvm.org/D138241
2022-11-17[OpenMP] kmp_affinity.h: add LoongArch64 supportzhanglimin
In D135552 the #else is added, which causes build error when building openmp on LoongArch. This patch fixed the error: "Unknown or unsupported architecture" Reviewed By: SixWeining, MaskRay Differential Revision: https://reviews.llvm.org/D137604
2022-10-28[OpenMP][libomp] Add hidden helper affinityJonathan Peyton
Add new hidden helper affinity via the environment variable, KMP_HIDDEN_HELPER_AFFINITY, which allows users to assign thread affinity to hidden helper threads using the same syntax as KMP_AFFINITY. OMP_PLACES/OMP_PROC_BIND have no interaction with KMP_HIDDEN_HELPER_AFFINITY. Differential Revision: https://reviews.llvm.org/D135113
2022-10-28[OpenMP][libomp] Parameterize affinity functionsJonathan Peyton
This patch parameterizes the affinity initialization code to allow multiple affinity settings. Almost all global affinity settings are consolidated and put into a structure kmp_affinity_t. This is in anticipation of the addition of hidden helper affinity which will have the same syntax and semantics as KMP_AFFINITY only for the hidden helper team. Differential Revision: https://reviews.llvm.org/D135109
2022-10-25[OpenMP] kmp_affinity.h: add missing #elseYunQiang Su
When detect __NR_sched_getaffinity. the last #else is missing, which make the last platform MIPS64 failed to build with an error: "Unknown or unsupported architecture" Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D135552
2021-12-20[OpenMP][libomp] Add use-all syntax to KMP_HW_SUBSETJonathan Peyton
This patch allows the user to request all resources of a particular layer (or core-attribute). The syntax of KMP_HW_SUBSET is modified so the number of units requested is optional or can be replaced with an '*' character. e.g., KMP_HW_SUBSET=c:intel_atom@3 will use all the cores after offset 3 e.g., KMP_HW_SUBSET=*c:intel_core will use all the big cores e.g., KMP_HW_SUBSET=*s,*c,1t will use all the sockets, all cores per each socket and 1 thread per core. Differential Revision: https://reviews.llvm.org/D115826
2021-12-10[OpenMP][libomp] Add core attributes to KMP_HW_SUBSETJonathan Peyton
Allow filtering of resources based on core attributes. There are two new attributes added: 1) Core Type (intel_atom, intel_core) 2) Core Efficiency (integer) where the higher the efficiency, the more performant the core On hybrid architectures , e.g., Alder Lake, users can specify KMP_HW_SUBSET=4c:intel_atom,4c:intel_core to select the first four Atom and first four Big cores. The can also use the efficiency syntax. e.g., KMP_HW_SUBSET=2c:eff0,2c:eff1 Differential Revision: https://reviews.llvm.org/D114901
2021-11-17[OpenMP][libomp] Improve Windows Processor Group handling within topologyPeyton, Jonathan L
The current implementation of Windows Processor Groups has a separate topology method to handle them. This patch deprecates that specific method and uses the regular CPUID topology method by default and inserts the Windows Processor Group objects in the topology manually. Notes: * The preference for processor groups is lowered to a value less than socket so that the user will see sockets in the KMP_AFFINITY=verbose output instead of processor groups when sockets=processor groups. * The topology's capacity is modified to handle additional topology layers without the need for reallocation. * If a user asks for a granularity setting that is "above" the processor group layer, then the granularity is adjusted "down" to the processor group since this is the coarsest layer available for threads. Differential Revision: https://reviews.llvm.org/D112273
2021-11-17[OpenMP][libomp] Allow users to specify KMP_HW_SUBSET in any orderPeyton, Jonathan L
Remove restriction forcing users to specify the KMP_HW_SUBSET value in topology order. This patch sorts the user KMP_HW_SUBSET value before trying to apply it. For example: 1s,4c,2t is equivalent to 2t,1s,4c Differential Revision: https://reviews.llvm.org/D112027
2021-10-14[OpenMP][host runtime] Add initial hybrid CPU supportPeyton, Jonathan L
Detect, through CPUID.1A, and show user different core types through KMP_AFFINITY=verbose mechanism. Offer future runtime optimizations __kmp_is_hybrid_cpu() to know whether running on a hybrid system or not. Differential Revision: https://reviews.llvm.org/D110435
2021-05-03[OpenMP] Refactor/Rework topology discovery codePeyton, Jonathan L
This patch does the following: 1) Introduce kmp_topology_t as the runtime-friendly structure (the corresponding global variable is __kmp_topology) to determine the exact machine topology which can vary widely among current and future architectures. The current design is not easy to expand beyond the assumed three layer topology: sockets, cores, and threads so a rework capable of using the existing KMP_AFFINITY mechanisms is required. This new topology structure has: * The depth and types of the topology * Ratio count for each consecutive level (e.g., number of cores per socket, number of threads per core) * Absolute count for each level (e.g., 2 sockets, 16 cores, 32 threads) * Equivalent topology layer map (e.g., Numa domain is equivalent to socket, L1/L2 cache equivalent to core) * Whether it is uniform or not The hardware threads are represented with the kmp_hw_thread_t structure. This structure contains the ids (e.g., socket 0, core 1, thread 0) and other information grabbed from the previous Address structure. The kmp_topology_t structure contains an array of these. 2) Generalize the KMP_HW_SUBSET envirable for the new kmp_topology_t structure. The algorithm doesn't assume any order with tiles,numa domains,sockets,cores,threads. Instead it just parses the envirable, makes sure it is consistent with the detected topology (including taking into account equivalent layers) and then trims away the unneeded subset of hardware threads. To enable this, a new kmp_hw_subset_t structure is introduced which contains a vector of items (hardware type, number user wants, offset). Any keyword within __kmp_hw_get_keyword() can be used as a name and can be shortened as well. e.g., KMP_HW_SUBSET=1s,2numa,4tile,2c,3t can be used on the KNL SNC-4 machine. 3) Simplify topology detection functions so they only do the singular task of detecting the machine's topology. Printing, and all canonicalizing functionality is now done afterwards. So many lines of duplicated code are eliminated. 4) Add new ll_caches and numa_domains to OMP_PLACES, and consequently, KMP_AFFINITY's granularity setting. All the names within __kmp_hw_get_keyword() are available for use in OMP_PLACES or KMP_AFFINITY's granularity setting. 5) Simplify and future-proof code where explicit lists of allowed affinity settings keywords inside if() conditions. 6) Add x86 CPUID leaf 4 cache detection to existing x2apic id method so equivalent caches could be detected (in particular for the ll_caches place). Differential Revision: https://reviews.llvm.org/D100997
2021-02-20[OpenMP][NFC] clang-format the whole openmp projectShilei Tian
Same script as D95318. Test files are excluded. Reviewed By: AndreyChurbanov Differential Revision: https://reviews.llvm.org/D97088
2020-12-31[OpenMP] libomp: Handle implicit conversion warningsTerry Wilmarth
This patch partially prepares the runtime source code to be built with -Wconversion, which should trigger warnings if any implicit conversions can possibly change a value. For builds done with icc or gcc, all such warnings are handled in this patch. clang gives a much longer list of warnings, particularly for sign conversions, which the other compilers don't report. The -Wconversion flag is commented into cmake files, but I'm not going to turn it on. If someone thinks it is important, and wants to fix all the clang warnings, they are welcome to. Types of changes made here involve either improving the consistency of types used so that no conversion is needed, or else performing careful explicit conversions, when we're sure a problem won't arise. Patch is a combination of changes by Terry Wilmarth and Johnny Peyton. Differential Revision: https://reviews.llvm.org/D92942
2020-12-09[OpenMP] Fix norespect affinity bug for WindowsPeyton, Jonathan L
KMP_AFFINITY=norespect was triggering an error because the underlying process affinity mask was not updated to include the entire machine. The Windows documentation states that the thread affinities must be subsets of the process affinity. This patch also moves the printing (for KMP_AFFINITY=verbose) of whether the initial mask was respected out of each topology detection function and to one location where the initial affinity mask is read. Differential Revision: https://reviews.llvm.org/D92587
2020-01-20[OpenMP] affinity little fix for FreeBSDDavid Carlier
- pthread affinity np has different semantic than sched affinity counterpart. On success returns strictly 0. Reviewers: chandlerc, AndreyChurbanov, jdoerfert Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D72132
2019-10-08[OpenMP] Enable thread affinity on FreeBSDDavid Carlier
Reviewers: chandlerc, jlpeyton, jdoerfert, dim Reviewed-By: dim Differential Revision: https://reviews.llvm.org/D68580 llvm-svn: 374118
2019-01-19Update more file headers across all of the LLVM projects in the monorepoChandler Carruth
to reflect the new license. These used slightly different spellings that defeated my regular expressions. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351648
2018-12-10Support clang compiling under windows-gnu and windows-msvcAndrey Churbanov
Patch by Peiyuan Song <squallatf@gmail.com> Differential Revision: https://reviews.llvm.org/D53422 llvm-svn: 348756
2018-08-09[OpenMP] Cleanup codeJonathan Peyton
This patch cleans up unused functions, variables, sign compare issues, and addresses some -Warning flags which are now enabled including -Wcast-qual. Not all the warning flags in LibompHandleFlags.cmake are enabled, but some are with this patch. Some __kmp_gtid_from_* macros in kmp.h are switched to static inline functions which allows us to remove the awkward definition of KMP_DEBUG_ASSERT() and KMP_ASSERT() macros which used the comma operator. This had to be done for the innumerable -Wunused-value warnings related to KMP_DEBUG_ASSERT() Differential Revision: https://reviews.llvm.org/D49105 llvm-svn: 339393
2017-10-20Apply formatting changesJonathan Peyton
.clang-format's comments are removed and a (hopefully) final set of formatting changes are applied. Differential Revision: https://reviews.llvm.org/D38837 Differential Revision: https://reviews.llvm.org/D38920 llvm-svn: 316227
2017-09-05Minor code cleanup of Klocwork issuesJonathan Peyton
Minor code cleanup of Klocwork issues. Fatal messages are given no return attribute. Define and use KMP_NORETURN to work for multiple C++ versions. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D37275 llvm-svn: 312538
2017-07-18OpenMP RTL cleanup: nullify pointer after memory freeingAndrey Churbanov
Differential Revision: https://reviews.llvm.org/D35497 llvm-svn: 308274
2017-07-17OpenMP RTL cleanup: eliminated warnings with -Wcast-qual, patch 2.Andrey Churbanov
Changes are: got all atomics to accept volatile pointers that allowed to simplify many type conversions. Windows specific code fixed correspondingly. Differential Revision: https://reviews.llvm.org/D35417 llvm-svn: 308164
2017-07-03OpenMP RTL cleanup: eliminated warnings with -Wcast-qual.Andrey Churbanov
Changes are: replaced C-style casts with cons_cast and reinterpret_cast; type of several counters changed to signed; type of parameters of 32-bit and 64-bit AND and OR intrinsics changes to unsigned; changed files formatted using clang-format version 3.8.1. Differential Revision: https://reviews.llvm.org/D34759 llvm-svn: 307020
2017-05-12Clang-format and whitespace cleanup of source codeJonathan Peyton
This patch contains the clang-format and cleanup of the entire code base. Some of clang-formats changes made the code look worse in places. A best effort was made to resolve the bulk of these problems, but many remain. Most of the problems were mangling line-breaks and tabbing of comments. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D32659 llvm-svn: 302929
2016-12-08Support of mips & mips64 for openmprtlSylvestre Ledru
Summary: Implemented by Dejan Latinovic See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=790735 for more more information Reviewers: AndreyChurbanov, jlpeyton Subscribers: openmp-commits, mgorny Differential Revision: https://reviews.llvm.org/D26576 llvm-svn: 289032
2016-11-14Introduce dynamic affinity dispatch capabilitiesJonathan Peyton
This set of changes enables the affinity interface (Either the preexisting native operating system or HWLOC) to be dynamically set at runtime initialization. The point of this change is that we were seeing performance degradations when using HWLOC. This allows the user to use the old affinity mechanisms which on large machines (>64 cores) makes a large difference in initialization time. These changes mostly move affinity code under a small class hierarchy: KMPAffinity class Mask {} KMPNativeAffinity : public KMPAffinity class Mask : public KMPAffinity::Mask KMPHwlocAffinity class Mask : public KMPAffinity::Mask Since all interface functions (for both affinity and the mask implementation) are virtual, the implementation can be chosen at runtime initialization. Differential Revision: https://reviews.llvm.org/D26356 llvm-svn: 286890
2016-09-02Move function into cpp file under KMP_AFFINITY_SUPPORTED guard.Jonathan Peyton
When affinity isn't supported, __kmp_affinity_compact doesn't exist. The problem is that in kmp_affinity.h there is a function which uses it without the proper KMP_AFFINITY_SUPPORTED guard around it. The compiler was smart enough to ignore it and the function __kmp_affinity_cmp_Address_child_num which relies on it, but I think it is cleaner to have it under the proper guard. Since the function is only used in the kmp_affinity.cpp file and there aren't any plans to have it elsewhere. I have moved it there. llvm-svn: 280542
2015-11-30Adding Hwloc library option for affinity mechanismJonathan Peyton
These changes allow libhwloc to be used as the topology discovery/affinity mechanism for libomp. It is supported on Unices. The code additions: * Canonicalize KMP_CPU_* interface macros so bitmask operations are implementation independent and work with both hwloc bitmaps and libomp bitmaps. So there are new KMP_CPU_ALLOC_* and KMP_CPU_ITERATE() macros and the like. These are all in kmp.h and appropriately placed. * Hwloc topology discovery code in kmp_affinity.cpp. This uses the hwloc interface to create a libomp address2os object which the rest of libomp knows how to handle already. * To build, use -DLIBOMP_USE_HWLOC=on and -DLIBOMP_HWLOC_INSTALL_DIR=/path/to/install/dir [default /usr/local]. If CMake can't find the library or hwloc.h, then it will tell you and exit. Differential Revision: http://reviews.llvm.org/D13991 llvm-svn: 254320
2015-11-09Improvements to machine_hierarchy code for re-sizingJonathan Peyton
These changes include: 1) Machine hierarchy now uses the base_num_threads field to indicate the maximum number of threads the current hierarchy can handle without a resize. 2) In __kmp_get_hierarchy, we need to get depth after any potential resize is done. 3) Cleanup of hierarchy resize code to support 1 above. Differential Revision: http://reviews.llvm.org/D14455 llvm-svn: 252475
2015-09-10Fix depth field bug and resize() function in hierarchical barrierJonathan Peyton
This is a follow up to the hierarchy cleanup patch. Added some clarifying comments to hierarchy_info. Fixed a bug with the depth field not being updated cleanly during a resize. Fixed resize to first check capacity as determined by maxLevels before actually doing the full resize. Differential Revision: http://reviews.llvm.org/D12562 llvm-svn: 247333
2015-09-10Cleanup of affinity hierarchy code.Jonathan Peyton
Some of this is improvement to code suggested by Hal Finkel. Four changes here: 1.Cleanup of hierarchy code to handle all hierarchy cases whether affinity is available or not 2.Separated this and other classes and common functions out to a header file 3.Added a destructor-like fini function for the hierarchy (and call in __kmp_cleanup) 4.Remove some redundant code that is hopefully no longer needed Differential Revision: http://reviews.llvm.org/D12449 llvm-svn: 247326