summaryrefslogtreecommitdiff
path: root/libc/test/src/string/memory_utils/utils_test.cpp
AgeCommit message (Collapse)Author
2025-09-24[libc][NFC] Remove usage of the C keyword `I`. (#160567)lntue
2025-03-10[libc] Add `-Wno-sign-conversion` & re-attempt `-Wconversion` (#129811)Vinay Deshmukh
Relates to https://github.com/llvm/llvm-project/issues/119281#issuecomment-2699470459
2025-03-05Revert "[libc] Enable -Wconversion for tests. (#127523)"Augie Fackler
This reverts commit 1e6e845d49a336e9da7ca6c576ec45c0b419b5f6 because it changed the 1st parameter of adjust() to be unsigned, but libc itself calls adjust() with a negative argument in align_backward() in op_generic.h.
2025-03-04[libc] Enable -Wconversion for tests. (#127523)Vinay Deshmukh
Relates to: #119281
2024-07-12[libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration (#98597)Petr Hosek
This is a part of #97655.
2024-07-12Revert "[libc] Migrate to using LIBC_NAMESPACE_DECL for namespace ↵Mehdi Amini
declaration" (#98593) Reverts llvm/llvm-project#98075 bots are broken
2024-07-11[libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration (#98075)Petr Hosek
This is a part of #97655.
2023-12-05[reland][libc][NFC] Remove __support/bit.h and use __support/CPP/bit.h ↵Guillaume Chatelet
instead (#73939) (#74446) Same as #73939 but also fix `libc/src/string/memory_utils/op_aarch64.h` that was still using `deferred_static_assert`.
2023-12-05Revert "[libc][NFC] Remove __support/bit.h and use __support/CPP/bit.h ↵Guillaume Chatelet
instead" (#74444) Reverts llvm/llvm-project#73939 This broke libc-aarch64-ubuntu build bot https://lab.llvm.org/buildbot/#/builders/138/builds/56186
2023-12-05[libc][NFC] Remove __support/bit.h and use __support/CPP/bit.h instead (#73939)Guillaume Chatelet
2023-09-26[libc] Mass replace enclosing namespace (#67032)Guillaume Chatelet
This is step 4 of https://discourse.llvm.org/t/rfc-customizable-namespace-to-allow-testing-the-libc-when-the-system-libc-is-also-llvms-libc/73079
2023-06-14[libc] Dispatch memmove to memcpy when buffers are disjointGuillaume Chatelet
Most of the time `memmove` is called on buffers that are disjoint, in that case we can use `memcpy` which is faster. The additional test is branchless on x86, aarch64 and RISCV with the zbb extension (bitmanip). On x86 this patch adds a latency of 2 to 3 cycles. Before ``` -------------------------------------------------------------------------------- Benchmark Time CPU Iterations UserCounters... -------------------------------------------------------------------------------- BM_Memmove/0/0_median 5.00 ns 5.00 ns 10 bytes_per_cycle=1.25477/s bytes_per_second=2.62933G/s items_per_second=199.87M/s __llvm_libc::memmove,memmove Google A BM_Memmove/1/0_median 6.21 ns 6.21 ns 10 bytes_per_cycle=3.22173/s bytes_per_second=6.75106G/s items_per_second=160.955M/s __llvm_libc::memmove,memmove Google B BM_Memmove/2/0_median 8.09 ns 8.09 ns 10 bytes_per_cycle=5.31462/s bytes_per_second=11.1366G/s items_per_second=123.603M/s __llvm_libc::memmove,memmove Google D BM_Memmove/3/0_median 5.95 ns 5.95 ns 10 bytes_per_cycle=2.71865/s bytes_per_second=5.69687G/s items_per_second=167.967M/s __llvm_libc::memmove,memmove Google L BM_Memmove/4/0_median 5.63 ns 5.63 ns 10 bytes_per_cycle=2.28294/s bytes_per_second=4.78383G/s items_per_second=177.615M/s __llvm_libc::memmove,memmove Google M BM_Memmove/5/0_median 5.68 ns 5.68 ns 10 bytes_per_cycle=2.16798/s bytes_per_second=4.54295G/s items_per_second=176.015M/s __llvm_libc::memmove,memmove Google Q BM_Memmove/6/0_median 7.46 ns 7.46 ns 10 bytes_per_cycle=3.97619/s bytes_per_second=8.332G/s items_per_second=134.044M/s __llvm_libc::memmove,memmove Google S BM_Memmove/7/0_median 5.40 ns 5.40 ns 10 bytes_per_cycle=1.79695/s bytes_per_second=3.76546G/s items_per_second=185.211M/s __llvm_libc::memmove,memmove Google U BM_Memmove/8/0_median 5.62 ns 5.62 ns 10 bytes_per_cycle=3.18747/s bytes_per_second=6.67927G/s items_per_second=177.983M/s __llvm_libc::memmove,memmove Google W BM_Memmove/9/0_median 101 ns 101 ns 10 bytes_per_cycle=9.77359/s bytes_per_second=20.4803G/s items_per_second=9.9333M/s __llvm_libc::memmove,uniform 384 to 4096 ``` After ``` BM_Memmove/0/0_median 3.57 ns 3.57 ns 10 bytes_per_cycle=1.71375/s bytes_per_second=3.59112G/s items_per_second=280.411M/s __llvm_libc::memmove,memmove Google A BM_Memmove/1/0_median 4.52 ns 4.52 ns 10 bytes_per_cycle=4.47557/s bytes_per_second=9.37843G/s items_per_second=221.427M/s __llvm_libc::memmove,memmove Google B BM_Memmove/2/0_median 5.70 ns 5.70 ns 10 bytes_per_cycle=7.37396/s bytes_per_second=15.4519G/s items_per_second=175.399M/s __llvm_libc::memmove,memmove Google D BM_Memmove/3/0_median 4.47 ns 4.47 ns 10 bytes_per_cycle=3.4148/s bytes_per_second=7.15563G/s items_per_second=223.743M/s __llvm_libc::memmove,memmove Google L BM_Memmove/4/0_median 4.53 ns 4.53 ns 10 bytes_per_cycle=2.86071/s bytes_per_second=5.99454G/s items_per_second=220.69M/s __llvm_libc::memmove,memmove Google M BM_Memmove/5/0_median 4.19 ns 4.19 ns 10 bytes_per_cycle=2.5484/s bytes_per_second=5.3401G/s items_per_second=238.924M/s __llvm_libc::memmove,memmove Google Q BM_Memmove/6/0_median 5.02 ns 5.02 ns 10 bytes_per_cycle=5.94164/s bytes_per_second=12.4505G/s items_per_second=199.14M/s __llvm_libc::memmove,memmove Google S BM_Memmove/7/0_median 4.03 ns 4.03 ns 10 bytes_per_cycle=2.47028/s bytes_per_second=5.17641G/s items_per_second=247.906M/s __llvm_libc::memmove,memmove Google U BM_Memmove/8/0_median 4.70 ns 4.70 ns 10 bytes_per_cycle=3.84975/s bytes_per_second=8.06706G/s items_per_second=212.72M/s __llvm_libc::memmove,memmove Google W BM_Memmove/9/0_median 90.7 ns 90.7 ns 10 bytes_per_cycle=10.8681/s bytes_per_second=22.7739G/s items_per_second=11.02M/s __llvm_libc::memmove,uniform 384 to 4096 ``` Reviewed By: courbet Differential Revision: https://reviews.llvm.org/D152811
2023-05-23[libc][math] Implement double precision log2 function correctly rounded to ↵Tue Ly
all rounding modes. Implement double precision log2 function correctly rounded to all rounding modes. See https://reviews.llvm.org/D150014 for a more detail description of the algorithm. **Performance** - For `0.5 <= x <= 2`, the fast pass hitting rate is about 99.91%. - Reciprocal throughput from CORE-MATH's perf tool on Ryzen 5900X: ``` $ ./perf.sh log2 GNU libc version: 2.35 GNU libc release: stable -- CORE-MATH reciprocal throughput -- with FMA [####################] 100 % Ntrial = 20 ; Min = 15.458 + 0.204 clc/call; Median-Min = 0.224 clc/call; Max = 15.867 clc/call; -- CORE-MATH reciprocal throughput -- without FMA (-march=x86-64-v2) [####################] 100 % Ntrial = 20 ; Min = 23.711 + 0.524 clc/call; Median-Min = 0.443 clc/call; Max = 25.307 clc/call; -- System LIBC reciprocal throughput -- [####################] 100 % Ntrial = 20 ; Min = 14.807 + 0.199 clc/call; Median-Min = 0.211 clc/call; Max = 15.137 clc/call; -- LIBC reciprocal throughput -- with FMA [####################] 100 % Ntrial = 20 ; Min = 17.666 + 0.274 clc/call; Median-Min = 0.298 clc/call; Max = 18.531 clc/call; -- LIBC reciprocal throughput -- without FMA [####################] 100 % Ntrial = 20 ; Min = 26.534 + 0.418 clc/call; Median-Min = 0.462 clc/call; Max = 27.327 clc/call; ``` - Latency from CORE-MATH's perf tool on Ryzen 5900X: ``` $ ./perf.sh log2 --latency GNU libc version: 2.35 GNU libc release: stable -- CORE-MATH latency -- with FMA [####################] 100 % Ntrial = 20 ; Min = 46.048 + 1.643 clc/call; Median-Min = 1.694 clc/call; Max = 48.018 clc/call; -- CORE-MATH latency -- without FMA (-march=x86-64-v2) [####################] 100 % Ntrial = 20 ; Min = 62.333 + 0.138 clc/call; Median-Min = 0.119 clc/call; Max = 62.583 clc/call; -- System LIBC latency -- [####################] 100 % Ntrial = 20 ; Min = 45.206 + 1.503 clc/call; Median-Min = 1.467 clc/call; Max = 47.229 clc/call; -- LIBC latency -- with FMA [####################] 100 % Ntrial = 20 ; Min = 43.042 + 0.454 clc/call; Median-Min = 0.484 clc/call; Max = 43.912 clc/call; -- LIBC latency -- without FMA [####################] 100 % Ntrial = 20 ; Min = 57.016 + 1.636 clc/call; Median-Min = 1.655 clc/call; Max = 58.816 clc/call; ``` - Accurate pass latency: ``` $ ./perf.sh log2 --latency --simple_stat GNU libc version: 2.35 GNU libc release: stable -- CORE-MATH latency -- with FMA 177.632 -- CORE-MATH latency -- without FMA (-march=x86-64-v2) 231.332 -- LIBC latency -- with FMA 459.751 -- LIBC latency -- without FMA 463.850 ``` Reviewed By: zimmermann6 Differential Revision: https://reviews.llvm.org/D150374
2023-05-10[libc] Add optimized memcpy for RISCVGuillaume Chatelet
This patch adds two versions of memcpy optimized for architectures where unaligned accesses are either illegal or extremely slow. It is currently enabled for RISCV 64 and RISCV 32 but it could be used for ARM 32 architectures as well. Here is the before / after output of `libc.benchmarks.memory_functions.opt_host --benchmark_filter=BM_Memcpy` on a quad core Linux starfive RISCV 64 board running at 1.5GHz. Before: ``` Run on (4 X 1500 MHz CPU s) CPU Caches: L1 Instruction 32 KiB (x4) L1 Data 32 KiB (x4) L2 Unified 2048 KiB (x1) ------------------------------------------------------------------------ Benchmark Time CPU Iterations UserCounters... ------------------------------------------------------------------------ BM_Memcpy/0/0 474 ns 474 ns 1483776 bytes_per_cycle=0.243492/s bytes_per_second=348.318M/s items_per_second=2.11097M/s __llvm_libc::memcpy,memcpy Google A BM_Memcpy/1/0 210 ns 209 ns 3649536 bytes_per_cycle=0.233819/s bytes_per_second=334.481M/s items_per_second=4.77519M/s __llvm_libc::memcpy,memcpy Google B BM_Memcpy/2/0 1814 ns 1814 ns 396288 bytes_per_cycle=0.247899/s bytes_per_second=354.622M/s items_per_second=551.402k/s __llvm_libc::memcpy,memcpy Google D BM_Memcpy/3/0 89.3 ns 89.2 ns 7459840 bytes_per_cycle=0.217415/s bytes_per_second=311.014M/s items_per_second=11.2071M/s __llvm_libc::memcpy,memcpy Google L BM_Memcpy/4/0 134 ns 134 ns 3815424 bytes_per_cycle=0.226584/s bytes_per_second=324.131M/s items_per_second=7.44567M/s __llvm_libc::memcpy,memcpy Google M BM_Memcpy/5/0 52.8 ns 52.6 ns 11001856 bytes_per_cycle=0.194893/s bytes_per_second=278.797M/s items_per_second=19.0284M/s __llvm_libc::memcpy,memcpy Google Q BM_Memcpy/6/0 180 ns 180 ns 4101120 bytes_per_cycle=0.231884/s bytes_per_second=331.713M/s items_per_second=5.55957M/s __llvm_libc::memcpy,memcpy Google S BM_Memcpy/7/0 195 ns 195 ns 3906560 bytes_per_cycle=0.232951/s bytes_per_second=333.239M/s items_per_second=5.1217M/s __llvm_libc::memcpy,memcpy Google U BM_Memcpy/8/0 152 ns 152 ns 4789248 bytes_per_cycle=0.227507/s bytes_per_second=325.452M/s items_per_second=6.58187M/s __llvm_libc::memcpy,memcpy Google W BM_Memcpy/9/0 6036 ns 6033 ns 118784 bytes_per_cycle=0.249158/s bytes_per_second=356.423M/s items_per_second=165.75k/s __llvm_libc::memcpy,uniform 384 to 4096 ``` After: ``` BM_Memcpy/0/0 126 ns 126 ns 5770240 bytes_per_cycle=1.04707/s bytes_per_second=1.46273G/s items_per_second=7.9385M/s __llvm_libc::memcpy,memcpy Google A BM_Memcpy/1/0 75.1 ns 75.0 ns 10204160 bytes_per_cycle=0.691143/s bytes_per_second=988.687M/s items_per_second=13.3289M/s __llvm_libc::memcpy,memcpy Google B BM_Memcpy/2/0 333 ns 333 ns 2174976 bytes_per_cycle=1.39297/s bytes_per_second=1.94596G/s items_per_second=3.00002M/s __llvm_libc::memcpy,memcpy Google D BM_Memcpy/3/0 49.6 ns 49.5 ns 16092160 bytes_per_cycle=0.710161/s bytes_per_second=1015.89M/s items_per_second=20.1844M/s __llvm_libc::memcpy,memcpy Google L BM_Memcpy/4/0 57.7 ns 57.7 ns 11213824 bytes_per_cycle=0.561557/s bytes_per_second=803.314M/s items_per_second=17.3228M/s __llvm_libc::memcpy,memcpy Google M BM_Memcpy/5/0 48.0 ns 47.9 ns 16437248 bytes_per_cycle=0.346708/s bytes_per_second=495.97M/s items_per_second=20.8571M/s __llvm_libc::memcpy,memcpy Google Q BM_Memcpy/6/0 67.5 ns 67.5 ns 10616832 bytes_per_cycle=0.614173/s bytes_per_second=878.582M/s items_per_second=14.8142M/s __llvm_libc::memcpy,memcpy Google S BM_Memcpy/7/0 84.7 ns 84.6 ns 10480640 bytes_per_cycle=0.819077/s bytes_per_second=1.14424G/s items_per_second=11.8174M/s __llvm_libc::memcpy,memcpy Google U BM_Memcpy/8/0 61.7 ns 61.6 ns 11191296 bytes_per_cycle=0.550078/s bytes_per_second=786.893M/s items_per_second=16.2279M/s __llvm_libc::memcpy,memcpy Google W BM_Memcpy/9/0 981 ns 981 ns 703488 bytes_per_cycle=1.52333/s bytes_per_second=2.12807G/s items_per_second=1019.81k/s __llvm_libc::memcpy,uniform 384 to 4096 ``` It is not as good as glibc for now so there's room for improvement. I suspect a path pumping 16 bytes at once given the doubled numbers for large copies. ``` BM_Memcpy/0/1 146 ns 82.5 ns 8576000 bytes_per_cycle=1.35236/s bytes_per_second=1.88922G/s items_per_second=12.1169M/s glibc memcpy,memcpy Google A BM_Memcpy/1/1 112 ns 63.7 ns 10634240 bytes_per_cycle=0.628018/s bytes_per_second=898.387M/s items_per_second=15.702M/s glibc memcpy,memcpy Google B BM_Memcpy/2/1 315 ns 180 ns 4079616 bytes_per_cycle=2.65229/s bytes_per_second=3.7052G/s items_per_second=5.54764M/s glibc memcpy,memcpy Google D BM_Memcpy/3/1 85.3 ns 43.1 ns 15854592 bytes_per_cycle=0.774164/s bytes_per_second=1107.45M/s items_per_second=23.2249M/s glibc memcpy,memcpy Google L BM_Memcpy/4/1 105 ns 54.3 ns 13427712 bytes_per_cycle=0.7793/s bytes_per_second=1114.8M/s items_per_second=18.4109M/s glibc memcpy,memcpy Google M BM_Memcpy/5/1 77.1 ns 43.2 ns 16476160 bytes_per_cycle=0.279808/s bytes_per_second=400.269M/s items_per_second=23.1428M/s glibc memcpy,memcpy Google Q BM_Memcpy/6/1 112 ns 62.7 ns 11236352 bytes_per_cycle=0.676078/s bytes_per_second=967.137M/s items_per_second=15.9387M/s glibc memcpy,memcpy Google S BM_Memcpy/7/1 131 ns 65.5 ns 11751424 bytes_per_cycle=0.965616/s bytes_per_second=1.34895G/s items_per_second=15.2762M/s glibc memcpy,memcpy Google U BM_Memcpy/8/1 104 ns 55.0 ns 12314624 bytes_per_cycle=0.583336/s bytes_per_second=834.468M/s items_per_second=18.1937M/s glibc memcpy,memcpy Google W BM_Memcpy/9/1 932 ns 466 ns 1480704 bytes_per_cycle=3.17342/s bytes_per_second=4.43321G/s items_per_second=2.14679M/s glibc memcpy,uniform 384 to 4096 ``` Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D150202
2023-02-07[libc][NFC] Move UnitTest and IntegrationTest to the 'test' directory.Siva Chandra Reddy
This part of the effort to make all test related pieces into the `test` directory. This helps is excluding test related pieces in a straight forward manner if LLVM_INCLUDE_TESTS is OFF. Future patches will also move the MPFR wrapper and testutils into the 'test' directory.
2022-10-18[libc][NFC] Cleanup and document utils.hGuillaume Chatelet
2022-10-14Revert "[libc] New version of the mem* framework"Sterling Augustine
This reverts commit https://reviews.llvm.org/D135134 (b3f1d58a131eb546aaf1ac165c77ccb89c40d758) That revision appears to have broken Arm memcpy in some subtle ways. Am communicating with the original author to get a good reproduction.
2022-10-14[libc] New version of the mem* frameworkGuillaume Chatelet
This version is more composable and also simpler at the expense of being more explicit and more verbose. It also provides minimal implementations for ARM platforms. Codegen can be checked here https://godbolt.org/z/chf1Y6eGM Differential Revision: https://reviews.llvm.org/D135134
2022-10-14Revert "[libc] New version of the mem* framework"Guillaume Chatelet
This reverts commit 9721687835a7df5da0c9482cf684c11b8ba97f75.
2022-10-14[libc] New version of the mem* frameworkGuillaume Chatelet
This version is more composable and also simpler at the expense of being more explicit and more verbose. It also provides minimal implementations for ARM platforms. Codegen can be checked here https://godbolt.org/z/x19zvE59v Differential Revision: https://reviews.llvm.org/D135134
2022-10-14Revert "[libc] New version of the mem* framework"Guillaume Chatelet
This reverts commit 98bf836f3127a346a81da5ae3e27246935298de4.
2022-10-14[libc] New version of the mem* frameworkGuillaume Chatelet
This version is more composable and also simpler at the expense of being more explicit and more verbose. It also provides minimal implementations for ARM platforms. Codegen can be checked here https://godbolt.org/z/x19zvE59v Differential Revision: https://reviews.llvm.org/D135134
2022-10-13Revert "[libc] New version of the mem* framework"Guillaume Chatelet
This reverts commit d55f2d8ab076298cfd745c05c1b4dfd5583f8b9e.
2022-10-13[libc] New version of the mem* frameworkGuillaume Chatelet
This version is more composable and also simpler at the expense of being more explicit and more verbose. It also provides minimal implementations for ARM platforms. Codegen can be checked here https://godbolt.org/z/x19zvE59v Differential Revision: https://reviews.llvm.org/D135134
2022-10-12Revert "[libc] New version of the mem* framework"Guillaume Chatelet
This reverts commit 4c19439d249256db720e323a446e39d05496732f.
2022-10-12[libc] New version of the mem* frameworkGuillaume Chatelet
This version is more composable and also simpler at the expense of being more explicit and more verbose. This patch is not meant to be submitted but gives an idea of the change. Codegen can be checked in https://godbolt.org/z/6z1dEoWbs by removing the "static inline" before individual functions. Unittests are coming. Suggested review order: - utils - op_base - op_builtin - op_generic - op_x86 / op_aarch64 - *_implementations.h Differential Revision: https://reviews.llvm.org/D135134
2022-09-29[libc][NFC] Move alignment utils to utils.hGuillaume Chatelet
2022-08-01Reland [libc][NFC] Use STL case for arrayGuillaume Chatelet
This is a reland of https://reviews.llvm.org/D130773
2022-08-01Revert "[libc][NFC] Use STL case for array"Guillaume Chatelet
This reverts commit 7add0e5fdc5c7cb6f59f60cd436bf161cf9f9eb7.
2022-08-01[libc][NFC] Use STL case for arrayGuillaume Chatelet
Migrating all private STL code to the standard STL case but keeping it under the CPP namespace to avoid confusion. Differential Revision: https://reviews.llvm.org/D130773
2021-10-28[libc][NFC] Move utils/CPP to src/__support/CPP.Siva Chandra Reddy
The idea is to move all pieces related to the actual libc sources to the "src" directory. This allows downstream users to ship and build just the "src" directory. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D112653
2021-01-20[libc][NFC] add "LlvmLibc" as a prefix to all test namesMichael Jones
Summary: Having a consistent prefix makes selecting all of the llvm libc tests easier on any platform that is also using the gtest framework. This also modifies the TEST and TEST_F macros to enforce this change moving forward. Reviewers: sivachandra Subscribers:
2020-04-08[libc][NFC] Make all top of file comments consistent.Paula Toth
Summary: Made all header files consistent based of this documentation: https://llvm.org/docs/CodingStandards.html#file-headers. And did the same for all source files top of file comments. Reviewers: sivachandra, abrachet Reviewed By: sivachandra, abrachet Subscribers: MaskRay, tschuett, libc-commits Tags: #libc-project Differential Revision: https://reviews.llvm.org/D77533
2020-03-18[libc] Adding memcpy implementation for x86_64Guillaume Chatelet
Summary: The patch is not ready yet and is here to discuss a few options: - How do we customize the implementation? (i.e. how to define `kRepMovsBSize`), - How do we specify custom compilation flags? (We'd need `-fno-builtin-memcpy` to be passed in), - How do we build? We may want to test in debug but build the libc with `-march=native` for instance, - Clang has a brand new builtin `__builtin_memcpy_inline` which makes the implementation easy and efficient, but: - If we compile with `gcc` or `msvc` we can't use it, resorting on less efficient code generation, - With gcc we can use `__builtin_memcpy` but then we'd need a postprocess step to check that the final assembly do not contain call to `memcpy` (unlikely but allowed), - For msvc we'd need to resort on the compiler optimization passes. Reviewers: sivachandra, abrachet Subscribers: mgorny, MaskRay, tschuett, libc-commits, courbet Tags: #libc-project Differential Revision: https://reviews.llvm.org/D74397
2020-02-13Fix unneeded semi columnGuillaume Chatelet
2020-01-31[libc] Use cpp::Array instead of cpp::ArrayRef in memory/utils_test.Siva Chandra Reddy
Building with address-sanitizer shows that using ArrayRef ends up accessing a temporary outside its scope.
2020-01-31[libc] Add utils for memory functionsGuillaume Chatelet
Summary: This patch adds a few low level functions needed to build libc memory functions. Reviewers: sivachandra, jfb Subscribers: mgorny, MaskRay, tschuett, libc-commits, ckennelly Tags: #libc-project Differential Revision: https://reviews.llvm.org/D73472