summaryrefslogtreecommitdiff
path: root/libc/src/string
AgeCommit message (Collapse)Author
2025-11-10[libc] add an SVE implementation of strlen (#167259)Schrodinger ZHU Yifan
This PR creates an SVE-based implementation for strlen by translating from the AOR code in tree. Microbenchmark shows improvements against NEON when N>=64. Although both implementations fall behind glibc by a large margin, this may be a good start point to explore SVE implementations. Together with the PR: 1. Added two more tests of strlen with special nul symbols. 2. Added strlen's fuzzer and fix a typo in previous heap fuzzer. ``` === strlen(16 bytes) === libc: 1.56115 ns/call, 9.54499 GiB/s neon: 1.59393 ns/call, 9.34867 GiB/s sve: 1.66097 ns/call, 8.97134 GiB/s === strlen(64 bytes) === libc: 2.06967 ns/call, 28.7991 GiB/s neon: 2.59914 ns/call, 22.9325 GiB/s sve: 2.58628 ns/call, 23.0465 GiB/s === strlen(256 bytes) === libc: 3.74165 ns/call, 63.7202 GiB/s neon: 8.98243 ns/call, 26.5428 GiB/s sve: 7.36426 ns/call, 32.3751 GiB/s === strlen(1024 bytes) === libc: 10.5327 ns/call, 90.5438 GiB/s neon: 34.363 ns/call, 27.7529 GiB/s sve: 26.9329 ns/call, 35.4092 GiB/s === strlen(4096 bytes) === libc: 37.7304 ns/call, 101.104 GiB/s neon: 145.911 ns/call, 26.144 GiB/s sve: 103.208 ns/call, 36.9612 GiB/s === strlen(1048576 bytes) === libc: 9623.4 ns/call, 101.478 GiB/s neon: 36138.2 ns/call, 27.023 GiB/s sve: 26605.6 ns/call, 36.7051 GiB/s ```
2025-11-06[libc] Fix stale char_ptr for find_first_character_wide read (#166594)Sterling-Augustine
On exit from the loop, char_ptr had not been updated to match block_ptr, resulting in erroneous results. Moving all updates out of the loop fixes that. Adjust derefences to always be inside bounds checks.
2025-11-05[libc] Migrate ctype_utils to use char instead of int where applicable. ↵Alexey Samsonov
(#166225) Functions like isalpha / tolower can operate on chars internally. This allows us to get rid of unnecessary casts and open a way to creating wchar_t overloads with the same names (e.g. for isalpha), that would simplify templated code for conversion functions (see 315dfe5865962d8a3d60e21d1fffce5214fe54ef). Add the int->char converstion to public entrypoints implementation instead. We also need to introduce bounds check on the input argument values - these functions' behavior is unspecified if the argument is neither EOF nor fits in "unsigned char" range, but the tests we've had verified that they always return false for small negative values. To preserve this behavior, cover it explicitly.
2025-10-27Move LIBC_CONF_STRING_UNSAFE_WIDE_READ to top-level libc-configuration (#165046)Sterling-Augustine
This options sets a compile option when building sources inside the string directory, and this option affects string_utils.h. But string_utils.h is #included from more places than just the string directory (such as from __support/CPP/string.h), leading to both narrow-reads in those cases, but more seriously, ODR violations when the two different string_length implementations are included int he same program. Having this option at the top level avoids this problem.
2025-10-13Revert "[libc] Implement branchless head-tail comparison for bcmp" (#162859)Guillaume Chatelet
Reverts llvm/llvm-project#107540 This PR demonstrated improvements on micro-benchmarks but the gains did not seem to materialize in production. We are reverting this change for now to get more data. This PR might be reintegrated later once we're more confident in its effects.
2025-10-13[libc] Use UMAXV.4S to reduce bcmp result.Peter Collingbourne
We can use UMAXV.4S to reduce the comparison result in a single instruction. This improves performance by roughly 4% on Apple M1: Summary bin/libc.src.string.bcmp_benchmark3 --study-name="new bcmp" --sweep-mode --sweep-max-size=128 --output=/dev/null --num-trials=10 ran 1.01 ± 0.02 times faster than bin/libc.src.string.bcmp_benchmark3 --study-name="new bcmp" --sweep-mode --sweep-max-size=128 --output=/dev/null --num-trials=10 1.01 ± 0.03 times faster than bin/libc.src.string.bcmp_benchmark3 --study-name="new bcmp" --sweep-mode --sweep-max-size=128 --output=/dev/null --num-trials=10 1.01 ± 0.03 times faster than bin/libc.src.string.bcmp_benchmark3 --study-name="new bcmp" --sweep-mode --sweep-max-size=128 --output=/dev/null --num-trials=10 1.01 ± 0.02 times faster than bin/libc.src.string.bcmp_benchmark2 --study-name="new bcmp" --sweep-mode --sweep-max-size=128 --output=/dev/null --num-trials=10 1.02 ± 0.03 times faster than bin/libc.src.string.bcmp_benchmark2 --study-name="new bcmp" --sweep-mode --sweep-max-size=128 --output=/dev/null --num-trials=10 1.03 ± 0.03 times faster than bin/libc.src.string.bcmp_benchmark2 --study-name="new bcmp" --sweep-mode --sweep-max-size=128 --output=/dev/null --num-trials=10 1.03 ± 0.03 times faster than bin/libc.src.string.bcmp_benchmark2 --study-name="new bcmp" --sweep-mode --sweep-max-size=128 --output=/dev/null --num-trials=10 1.05 ± 0.02 times faster than bin/libc.src.string.bcmp_benchmark1 --study-name="new bcmp" --sweep-mode --sweep-max-size=128 --output=/dev/null --num-trials=10 1.05 ± 0.02 times faster than bin/libc.src.string.bcmp_benchmark1 --study-name="new bcmp" --sweep-mode --sweep-max-size=128 --output=/dev/null --num-trials=10 1.05 ± 0.03 times faster than bin/libc.src.string.bcmp_benchmark1 --study-name="new bcmp" --sweep-mode --sweep-max-size=128 --output=/dev/null --num-trials=10 1.05 ± 0.02 times faster than bin/libc.src.string.bcmp_benchmark1 --study-name="new bcmp" --sweep-mode --sweep-max-size=128 --output=/dev/null --num-trials=10 (1 = original, 2 = a variant of this patch that uses UMAXV.16B, 3 = this patch) Reviewers: michaelrj-google, gchatelet, overmighty, SchrodingerZhu Pull Request: https://github.com/llvm/llvm-project/pull/99260
2025-10-01[libc] Unify and extend no_sanitize attributes for strlen. (#161316)Alexey Samsonov
Fast strlen implementations (naive wide-reads, SIMD-based, and x86_64/aarch64-optimized versions) all may perform technically-out-of-bound reads, which leads to reports under ASan, HWASan (on ARM machines), and also TSan (which also has the capability to detect heap out-of-bound reads). So, we need to explicitly disable instrumentation in all three cases. Tragically, Clang didn't support `[[gnu::no_sanitize]]` syntax until recently, and since we're supporting both GCC and Clang, we have to revert to `__attribute__` syntax.
2025-09-29[libc][msvc] fix mathlib build on WoA (#161258)Schrodinger ZHU Yifan
Fix build errors encountered when building math library on WoA. 1. Skip FEnv equality check for MSVC 2. Provide a placeholder type for vector types.
2025-09-26[libc] Update the memory helper functions for simd types (#160174)Joseph Huber
Summary: This unifies the interface to just be a bunch of `load` and `store` functions that optionally accept a mask / indices for gathers and scatters with masks. I had to rename this from `load` and `store` because it conflicts with the other version in `op_generic`. I might just work around that with a trait instead.
2025-09-16[libc] Clean up mask helpers after allowing implicit conversions (#158681)Joseph Huber
Summary: I landed a change in clang that allows integral vectors to implicitly convert to boolean ones. This means I can simplify the interface and remove the need to cast to bool on every use. Also do some other cleanups of the traits.
2025-09-12[libc] Some MSVC compatibility changes for src/string/memory_utils. (#158393)lntue
2025-09-12[libc] Change __builtin_memcpy to inline_memcpy. (#158345)lntue
2025-09-03[libc] fixed signed char issues in strsep()/strtok()/strtok_r(). (#156705)enh-google
Also add the missing tests for all the related functions (even the ones that were already right), and add the missing bazel build rules.
2025-09-02[libc] Add more elementwise wrapper functions (#156515)Joseph Huber
Summary: Fills out some of the missing fundamental floating point operations. These just wrap the elementwise builtin of the same name.
2025-09-02 [libc] Implement generic SIMD helper 'simd.h' and implement strlen (#152605)Joseph Huber
Summary: This PR introduces a new 'simd.h' header that implements an interface similar to the proposed `stdx::simd` in C++. However, we instead wrap around the LLVM internal type. This makes heavy use of the clang vector extensions and boolean vectors, instead using primitive vector types instead of a class (many benefits to this). I use this interface to implement a generic strlen implementation, but propse we use this for math. Right now this requires a feature only introduced in clang-22.
2025-08-21[libc] fix strsep()/strtok()/strtok_r() "subsequent searches" behavior. ↵enh-google
(#154370) These functions turned out to have the same bug that was in wcstok() (fixed by 4fc9801), so add the missing tests and fix the code in a way that matches wcstok(). Also fix incorrect test expectations in existing tests. Also update the BUILD.bazel files to actually build the strsep() test.
2025-08-21Disable asan on last wide string functionJoseph Huber
2025-08-21Fix wide read defaultsJoseph Huber
2025-08-21Reapply "[libc] Enable wide-read memory operations by default on Linux ↵Joseph Huber
(#154602)" (#154640) Reland afterr the sanitizer and arm32 builds complained.
2025-08-20Revert "[libc] Enable wide-read memory operations by default on Linux (#154602)"Joseph Huber
This reverts commit c80d1483c6d787edf62ff9e86b1e97af5eb5abf9.
2025-08-20[libc] Enable wide-read memory operations by default on Linux (#154602)Joseph Huber
Summary: This patch changes the linux build to use the wide reads on the memory operations by default. These memory functions will now potentially read outside of the bounds explicitly allowed by the current function. While technically undefined behavior in the standard, plenty of C library implementations do this. it will not cause a segmentation fault on linux as long as you do not cross a page boundary, and because we are only *reading* memory it should not have atomic effects.
2025-08-19Add vector-based strlen implementation for x86_64 and aarch64 (#152389)Sterling-Augustine
These replace the default LIBC_CONF_STRING_UNSAFE_WIDE_READ implementation on x86_64 and aarch64. These are substantially faster than both the character-by-character implementation and the original unsafe_wide_read implementation. Some below I have been unable to performance-test the aarch64 version, but I suspect speedups similar to avx2. ``` Function: strlen Variant: char wide ull sse2 avx2 avx512 ============================================================================================================================================================= length=1, alignment=1: 13.18 20.47 (-55.24%) 20.21 (-53.27%) 32.50 (-146.54%) 26.05 (-97.61%) 18.03 (-36.74%) length=1, alignment=0: 12.80 34.92 (-172.89%) 20.01 (-56.39%) 17.52 (-36.86%) 17.78 (-38.92%) 18.04 (-40.94%) length=2, alignment=2: 9.91 19.02 (-91.95%) 12.64 (-27.52%) 11.06 (-11.59%) 9.48 ( 4.38%) 9.48 ( 4.34%) length=2, alignment=0: 9.56 26.88 (-181.24%) 12.64 (-32.31%) 11.06 (-15.73%) 11.06 (-15.72%) 11.83 (-23.80%) length=3, alignment=3: 8.31 10.45 (-25.84%) 8.28 ( 0.32%) 8.28 ( 0.36%) 6.21 ( 25.28%) 6.21 ( 25.24%) length=3, alignment=0: 8.39 14.53 (-73.20%) 8.28 ( 1.33%) 7.24 ( 13.69%) 7.56 ( 9.94%) 7.25 ( 13.65%) length=4, alignment=4: 9.84 21.76 (-121.24%) 15.55 (-58.11%) 6.57 ( 33.18%) 5.02 ( 48.98%) 6.00 ( 39.00%) length=4, alignment=0: 8.64 13.70 (-58.51%) 7.28 ( 15.73%) 6.37 ( 26.31%) 6.36 ( 26.36%) 6.36 ( 26.36%) length=5, alignment=5: 11.85 23.81 (-100.97%) 12.17 ( -2.67%) 5.68 ( 52.09%) 4.87 ( 58.94%) 6.48 ( 45.33%) length=5, alignment=0: 11.82 13.64 (-15.42%) 7.27 ( 38.45%) 6.36 ( 46.15%) 6.37 ( 46.11%) 6.36 ( 46.14%) length=6, alignment=6: 10.50 19.37 (-84.56%) 13.64 (-29.93%) 6.54 ( 37.71%) 6.89 ( 34.35%) 9.45 ( 10.01%) length=6, alignment=0: 14.96 14.05 ( 6.04%) 6.49 ( 56.62%) 5.68 ( 62.04%) 5.68 ( 62.04%) 13.15 ( 12.05%) length=7, alignment=7: 10.97 18.02 (-64.35%) 14.59 (-33.06%) 6.36 ( 41.96%) 5.46 ( 50.25%) 5.46 ( 50.25%) length=7, alignment=0: 10.96 15.76 (-43.77%) 15.37 (-40.15%) 6.96 ( 36.51%) 5.68 ( 48.22%) 7.04 ( 35.83%) length=4, alignment=0: 8.66 13.69 (-58.02%) 7.28 ( 16.00%) 6.37 ( 26.44%) 6.37 ( 26.52%) 6.61 ( 23.74%) length=4, alignment=7: 8.87 17.35 (-95.73%) 12.18 (-37.39%) 5.68 ( 35.94%) 4.87 ( 45.11%) 6.00 ( 32.36%) length=4, alignment=2: 8.67 10.05 (-15.91%) 7.28 ( 16.01%) 7.37 ( 15.02%) 5.46 ( 37.02%) 5.47 ( 36.89%) length=2, alignment=2: 5.64 10.01 (-77.64%) 7.29 (-29.34%) 6.37 (-13.04%) 5.46 ( 3.19%) 5.46 ( 3.19%) length=8, alignment=0: 12.78 16.52 (-29.33%) 18.27 (-43.00%) 11.82 ( 7.47%) 9.83 ( 23.03%) 11.46 ( 10.27%) length=8, alignment=7: 14.24 17.30 (-21.49%) 12.16 ( 14.59%) 5.68 ( 60.14%) 4.87 ( 65.83%) 6.23 ( 56.28%) length=8, alignment=3: 12.34 26.15 (-111.98%) 12.20 ( 1.14%) 6.50 ( 47.34%) 4.87 ( 60.54%) 6.18 ( 49.94%) length=5, alignment=3: 10.95 19.74 (-80.30%) 12.17 (-11.11%) 5.68 ( 48.16%) 4.87 ( 55.56%) 5.96 ( 45.55%) length=16, alignment=0: 20.33 29.29 (-44.08%) 36.18 (-77.97%) 5.68 ( 72.06%) 5.68 ( 72.08%) 10.60 ( 47.86%) length=16, alignment=7: 19.29 17.52 ( 9.16%) 12.98 ( 32.73%) 7.05 ( 63.47%) 4.87 ( 74.75%) 6.23 ( 67.71%) length=16, alignment=4: 20.54 25.18 (-22.56%) 15.42 ( 24.92%) 7.31 ( 64.43%) 4.87 ( 76.29%) 5.98 ( 70.88%) length=10, alignment=4: 14.59 21.26 (-45.71%) 12.17 ( 16.58%) 5.68 ( 61.07%) 4.87 ( 66.65%) 6.00 ( 58.91%) length=32, alignment=0: 35.46 22.00 ( 37.95%) 16.22 ( 54.26%) 7.32 ( 79.35%) 5.68 ( 83.98%) 7.01 ( 80.22%) length=32, alignment=7: 35.23 24.14 ( 31.48%) 16.22 ( 53.96%) 7.30 ( 79.28%) 8.76 ( 75.12%) 6.14 ( 82.58%) length=32, alignment=5: 35.16 28.56 ( 18.76%) 16.22 ( 53.87%) 7.30 ( 79.23%) 6.77 ( 80.75%) 9.82 ( 72.07%) length=21, alignment=5: 26.47 27.66 ( -4.49%) 15.04 ( 43.17%) 6.90 ( 73.95%) 4.87 ( 81.60%) 6.04 ( 77.18%) length=64, alignment=0: 66.45 25.16 ( 62.14%) 22.70 ( 65.83%) 12.99 ( 80.44%) 7.47 ( 88.77%) 8.70 ( 86.90%) length=64, alignment=7: 64.75 27.78 ( 57.10%) 22.72 ( 64.91%) 10.85 ( 83.25%) 7.46 ( 88.48%) 8.68 ( 86.60%) length=64, alignment=6: 67.26 28.58 ( 57.51%) 22.70 ( 66.24%) 11.26 ( 83.25%) 9.46 ( 85.94%) 13.90 ( 79.33%) length=42, alignment=6: 73.42 27.97 ( 61.91%) 19.46 ( 73.49%) 8.92 ( 87.84%) 6.49 ( 91.16%) 6.00 ( 91.83%) length=128, alignment=0: 172.07 39.18 ( 77.23%) 35.68 ( 79.26%) 13.02 ( 92.43%) 12.98 ( 92.46%) 9.76 ( 94.33%) length=128, alignment=7: 163.98 43.79 ( 73.30%) 36.03 ( 78.03%) 15.68 ( 90.44%) 11.35 ( 93.08%) 10.51 ( 93.59%) length=128, alignment=7: 185.86 40.27 ( 78.33%) 36.04 ( 80.61%) 13.78 ( 92.58%) 11.35 ( 93.89%) 10.49 ( 94.36%) length=85, alignment=7: 121.61 55.66 ( 54.23%) 32.34 ( 73.40%) 13.88 ( 88.59%) 7.30 ( 94.00%) 8.72 ( 92.83%) length=256, alignment=0: 295.54 66.48 ( 77.50%) 61.63 ( 79.15%) 19.54 ( 93.39%) 12.97 ( 95.61%) 12.45 ( 95.79%) length=256, alignment=7: 308.06 78.92 ( 74.38%) 61.63 ( 80.00%) 22.90 ( 92.57%) 12.97 ( 95.79%) 13.23 ( 95.71%) length=256, alignment=8: 295.32 65.83 ( 77.71%) 61.62 ( 79.13%) 23.19 ( 92.15%) 12.97 ( 95.61%) 13.50 ( 95.43%) length=170, alignment=8: 234.39 48.79 ( 79.18%) 43.79 ( 81.32%) 16.22 ( 93.08%) 13.97 ( 94.04%) 10.48 ( 95.53%) length=512, alignment=0: 563.75 116.89 ( 79.27%) 114.99 ( 79.60%) 62.71 ( 88.88%) 19.58 ( 96.53%) 17.76 ( 96.85%) length=512, alignment=7: 580.53 120.91 ( 79.17%) 114.47 ( 80.28%) 37.75 ( 93.50%) 19.55 ( 96.63%) 18.68 ( 96.78%) length=512, alignment=9: 584.05 128.35 ( 78.02%) 114.74 ( 80.35%) 39.09 ( 93.31%) 19.76 ( 96.62%) 18.71 ( 96.80%) length=341, alignment=9: 405.84 90.87 ( 77.61%) 78.79 ( 80.59%) 28.77 ( 92.91%) 14.60 ( 96.40%) 14.15 ( 96.51%) length=1024, alignment=0: 1143.61 247.03 ( 78.40%) 243.70 ( 78.69%) 75.59 ( 93.39%) 67.02 ( 94.14%) 28.99 ( 97.46%) length=1024, alignment=7: 1124.55 267.87 ( 76.18%) 259.16 ( 76.95%) 64.96 ( 94.22%) 33.05 ( 97.06%) 30.91 ( 97.25%) length=1024, alignment=10: 1459.58 257.79 ( 82.34%) 239.91 ( 83.56%) 65.00 ( 95.55%) 33.10 ( 97.73%) 30.33 ( 97.92%) length=682, alignment=10: 732.89 163.67 ( 77.67%) 170.54 ( 76.73%) 46.48 ( 93.66%) 24.32 ( 96.68%) 21.44 ( 97.07%) length=2048, alignment=0: 2141.96 451.61 ( 78.92%) 448.00 ( 79.08%) 133.24 ( 93.78%) 61.22 ( 97.14%) 80.08 ( 96.26%) length=2048, alignment=7: 2145.05 458.26 ( 78.64%) 449.99 ( 79.02%) 140.19 ( 93.46%) 60.26 ( 97.19%) 51.71 ( 97.59%) length=2048, alignment=11: 2162.61 463.37 ( 78.57%) 448.07 ( 79.28%) 140.29 ( 93.51%) 59.51 ( 97.25%) 51.59 ( 97.61%) length=1365, alignment=11: 1439.74 322.86 ( 77.58%) 310.84 ( 78.41%) 116.08 ( 91.94%) 42.43 ( 97.05%) 36.15 ( 97.49%) length=4096, alignment=0: 4278.68 871.60 ( 79.63%) 865.25 ( 79.78%) 252.50 ( 94.10%) 161.17 ( 96.23%) 94.97 ( 97.78%) length=4096, alignment=7: 4253.01 871.62 ( 79.51%) 864.21 ( 79.68%) 243.90 ( 94.27%) 171.17 ( 95.98%) 95.14 ( 97.76%) length=4096, alignment=12: 4252.18 879.66 ( 79.31%) 863.68 ( 79.69%) 244.26 ( 94.26%) 185.36 ( 95.64%) 93.61 ( 97.80%) length=2730, alignment=12: 2868.22 597.65 ( 79.16%) 586.22 ( 79.56%) 175.09 ( 93.90%) 120.35 ( 95.80%) 101.35 ( 96.47%) length=0, alignment=0: 4.87 8.11 (-66.73%) 6.49 (-33.34%) 5.80 (-19.26%) 5.68 (-16.67%) 6.86 (-40.91%) length=32, alignment=0: 33.82 22.36 ( 33.89%) 17.03 ( 49.66%) 7.30 ( 78.42%) 5.68 ( 83.22%) 7.50 ( 77.83%) length=64, alignment=0: 66.20 26.76 ( 59.58%) 23.22 ( 64.93%) 12.99 ( 80.37%) 7.34 ( 88.92%) 8.44 ( 87.25%) length=96, alignment=0: 130.26 31.62 ( 75.72%) 30.00 ( 76.97%) 11.39 ( 91.26%) 10.54 ( 91.91%) 8.68 ( 93.34%) length=128, alignment=0: 164.66 39.05 ( 76.29%) 35.68 ( 78.33%) 13.07 ( 92.07%) 12.97 ( 92.12%) 9.59 ( 94.18%) length=160, alignment=0: 196.63 45.18 ( 77.02%) 42.16 ( 78.56%) 14.65 ( 92.55%) 10.87 ( 94.47%) 9.31 ( 95.27%) length=192, alignment=0: 225.50 52.71 ( 76.63%) 49.61 ( 78.00%) 16.22 ( 92.81%) 11.36 ( 94.96%) 11.08 ( 95.09%) length=224, alignment=0: 261.08 57.57 ( 77.95%) 55.82 ( 78.62%) 17.84 ( 93.17%) 12.16 ( 95.34%) 11.51 ( 95.59%) length=256, alignment=0: 295.13 65.56 ( 77.79%) 62.59 ( 78.79%) 19.46 ( 93.41%) 13.12 ( 95.56%) 12.33 ( 95.82%) length=288, alignment=0: 325.69 72.16 ( 77.84%) 69.20 ( 78.75%) 21.08 ( 93.53%) 13.94 ( 95.72%) 12.32 ( 96.22%) length=320, alignment=0: 364.18 78.78 ( 78.37%) 75.69 ( 79.21%) 22.71 ( 93.77%) 14.70 ( 95.96%) 14.46 ( 96.03%) length=352, alignment=0: 391.40 84.87 ( 78.32%) 82.15 ( 79.01%) 24.50 ( 93.74%) 15.62 ( 96.01%) 14.27 ( 96.35%) length=384, alignment=0: 428.50 91.43 ( 78.66%) 88.70 ( 79.30%) 26.16 ( 93.90%) 17.29 ( 95.97%) 15.04 ( 96.49%) length=416, alignment=0: 457.30 98.23 ( 78.52%) 95.02 ( 79.22%) 27.81 ( 93.92%) 17.22 ( 96.23%) 15.05 ( 96.71%) length=448, alignment=0: 488.38 104.52 ( 78.60%) 101.87 ( 79.14%) 31.22 ( 93.61%) 18.07 ( 96.30%) 16.89 ( 96.54%) length=480, alignment=0: 526.44 109.61 ( 79.18%) 108.11 ( 79.46%) 31.11 ( 94.09%) 18.88 ( 96.41%) 17.10 ( 96.75%) length=512, alignment=0: 556.50 117.29 ( 78.92%) 113.78 ( 79.56%) 62.57 ( 88.76%) 19.88 ( 96.43%) 17.80 ( 96.80%) length=576, alignment=0: 622.17 152.93 ( 75.42%) 127.58 ( 79.49%) 39.34 ( 93.68%) 21.31 ( 96.58%) 19.99 ( 96.79%) length=640, alignment=0: 691.01 142.56 ( 79.37%) 161.78 ( 76.59%) 39.20 ( 94.33%) 22.98 ( 96.67%) 20.13 ( 97.09%) length=704, alignment=0: 756.90 156.31 ( 79.35%) 176.19 ( 76.72%) 45.03 ( 94.05%) 24.82 ( 96.72%) 22.33 ( 97.05%) length=768, alignment=0: 826.23 193.17 ( 76.62%) 188.41 ( 77.20%) 50.81 ( 93.85%) 27.46 ( 96.68%) 23.25 ( 97.19%) length=832, alignment=0: 890.17 204.81 ( 76.99%) 201.61 ( 77.35%) 53.77 ( 93.96%) 27.73 ( 96.88%) 25.06 ( 97.18%) length=896, alignment=0: 959.52 217.89 ( 77.29%) 213.86 ( 77.71%) 57.99 ( 93.96%) 29.53 ( 96.92%) 26.29 ( 97.26%) length=960, alignment=0: 1024.52 231.06 ( 77.45%) 227.05 ( 77.84%) 60.36 ( 94.11%) 32.29 ( 96.85%) 27.94 ( 97.27%) length=1024, alignment=0: 1086.71 244.17 ( 77.53%) 239.87 ( 77.93%) 64.72 ( 94.04%) 72.38 ( 93.34%) 28.72 ( 97.36%) length=1152, alignment=0: 1231.48 270.22 ( 78.06%) 266.47 ( 78.36%) 73.38 ( 94.04%) 40.24 ( 96.73%) 32.42 ( 97.37%) length=1280, alignment=0: 1349.29 295.45 ( 78.10%) 292.69 ( 78.31%) 111.80 ( 91.71%) 42.44 ( 96.85%) 34.59 ( 97.44%) length=1408, alignment=0: 1487.13 322.57 ( 78.31%) 318.18 ( 78.60%) 84.47 ( 94.32%) 44.35 ( 97.02%) 37.31 ( 97.49%) length=1536, alignment=0: 1623.52 347.98 ( 78.57%) 344.24 ( 78.80%) 108.31 ( 93.33%) 49.82 ( 96.93%) 39.94 ( 97.54%) length=1664, alignment=0: 1748.88 373.80 ( 78.63%) 370.03 ( 78.84%) 118.76 ( 93.21%) 52.89 ( 96.98%) 42.93 ( 97.55%) length=1792, alignment=0: 1886.22 399.59 ( 78.82%) 397.39 ( 78.93%) 127.32 ( 93.25%) 53.64 ( 97.16%) 45.39 ( 97.59%) length=1920, alignment=0: 2018.37 425.98 ( 78.89%) 422.31 ( 79.08%) 126.70 ( 93.72%) 57.08 ( 97.17%) 48.12 ( 97.62%) length=2048, alignment=0: 2167.09 451.70 ( 79.16%) 447.70 ( 79.34%) 141.68 ( 93.46%) 61.63 ( 97.16%) 79.06 ( 96.35%) length=2304, alignment=0: 2422.03 503.63 ( 79.21%) 502.23 ( 79.26%) 149.62 ( 93.82%) 73.10 ( 96.98%) 56.97 ( 97.65%) length=2560, alignment=0: 2678.68 556.84 ( 79.21%) 553.24 ( 79.35%) 161.06 ( 93.99%) 127.74 ( 95.23%) 58.81 ( 97.80%) length=2816, alignment=0: 2941.95 608.70 ( 79.31%) 604.03 ( 79.47%) 171.85 ( 94.16%) 87.11 ( 97.04%) 67.08 ( 97.72%) length=3072, alignment=0: 3229.89 660.14 ( 79.56%) 659.19 ( 79.59%) 183.85 ( 94.31%) 140.25 ( 95.66%) 73.01 ( 97.74%) length=3328, alignment=0: 3496.08 713.05 ( 79.60%) 710.00 ( 79.69%) 209.72 ( 94.00%) 138.78 ( 96.03%) 77.81 ( 97.77%) length=3584, alignment=0: 3756.52 766.19 ( 79.60%) 763.94 ( 79.66%) 214.16 ( 94.30%) 146.36 ( 96.10%) 83.43 ( 97.78%) length=3840, alignment=0: 4017.15 817.43 ( 79.65%) 819.77 ( 79.59%) 242.07 ( 93.97%) 164.56 ( 95.90%) 89.72 ( 97.77%) length=4096, alignment=0: 4281.59 867.87 ( 79.73%) 864.71 ( 79.80%) 243.33 ( 94.32%) 173.11 ( 95.96%) 95.65 ( 97.77%) length=4608, alignment=0: 4810.30 977.80 ( 79.67%) 985.03 ( 79.52%) 271.13 ( 94.36%) 190.62 ( 96.04%) 107.82 ( 97.76%) length=5120, alignment=0: 5380.16 1075.77 ( 80.00%) 1071.80 ( 80.08%) 294.27 ( 94.53%) 206.04 ( 96.17%) 141.90 ( 97.36%) length=5632, alignment=0: 5925.70 1195.61 ( 79.82%) 1193.68 ( 79.86%) 323.42 ( 94.54%) 223.55 ( 96.23%) 125.28 ( 97.89%) length=6144, alignment=0: 6402.20 1285.52 ( 79.92%) 1281.04 ( 79.99%) 342.68 ( 94.65%) 234.84 ( 96.33%) 167.01 ( 97.39%) length=6656, alignment=0: 6997.01 1387.32 ( 80.17%) 1384.21 ( 80.22%) 365.93 ( 94.77%) 269.89 ( 96.14%) 176.40 ( 97.48%) length=7168, alignment=0: 7454.76 1492.10 ( 79.98%) 1488.45 ( 80.03%) 391.92 ( 94.74%) 280.81 ( 96.23%) 187.73 ( 97.48%) length=7680, alignment=0: 8163.34 1608.43 ( 80.30%) 1615.98 ( 80.20%) 460.03 ( 94.36%) 299.86 ( 96.33%) 201.40 ( 97.53%) ```
2025-07-24[libc] Implemented wcsdup libc function (#150453)Uzair Nawaz
Implemented wcsdup by templating internal strdup function
2025-07-23[libc][NFC] Add stdint.h proxy header to fix dependency issue with ↵lntue
<stdint.h> includes. (#150303) https://github.com/llvm/llvm-project/issues/149993
2025-07-21[libc] Add dependency <stdint.h> to src/string/string_utils.h (#149849)William Huynh
string_utils.h uses uintptr_t, and there seems to be no tracking of this dependency. It seems upstream builds are unaffected but downstream this is causing a lot of flaky builds.
2025-07-17[libc] Improve Cortex `memset` and `memcpy` functions (#149044)Guillaume Chatelet
The code for `memcpy` is the same as in #148204 but it fixes the build bot error by using `static_assert(cpp::always_false<decltype(access)>)` instead of `static_assert(false)` (older compilers fails on `static_assert(false)` in `constexpr` `else` bodies). The code for `memset` is new and vastly improves performance over the current byte per byte implementation. Both `memset` and `memcpy` implementations use prefetching for sizes >= 64. This lowers a bit the performance for sizes between 64 and 256 but improves throughput for greater sizes.
2025-07-16Revert "[libc][NFC] refactor Cortex `memcpy` code" (#149035)Guillaume Chatelet
Reverts llvm/llvm-project#148204 `libc-arm32-qemu-debian-dbg` is failing, reverting and investigating
2025-07-16[libc][NFC] refactor Cortex `memcpy` code (#148204)Guillaume Chatelet
This patch is in preparation for the Cortex `memset` implementation. It improves the codegen by generating a prefetch for large sizes.
2025-06-26[libc] Improve memcpy for ARM Cortex-M supporting unaligned accesses. (#144872)Guillaume Chatelet
This implementation has been compiled with the [pigweed toolchain](https://pigweed.dev/toolchain.html) and tested on: - Raspberry Pi Pico 2 with the following options\ `--target=armv8m.main-none-eabi` `-march=armv8m.main+fp+dsp` `-mcpu=cortex-m33` - Raspberry Pi Pico with the following options\ `--target=armv6m-none-eabi` `-march=armv6m` `-mcpu=cortex-m0+` They both compile down to a little bit more than 200 bytes and are between 2 and 10 times faster than byte per byte copies. For best performance the following options can be set in the `libc/config/baremetal/arm/config.json` ``` { "codegen": { "LIBC_CONF_KEEP_FRAME_POINTER": { "value": false } }, "general": { "LIBC_ADD_NULL_CHECKS": { "value": false } } } ```
2025-06-13Fix string_length function so that it always returns. (#144148)Amy Huang
Previously setting LIBC_COPT_STRING_UNSAFE_WIDE_READ would cause a build error because there is a path in the ifdef that doesn't return anything.
2025-06-12[libc] Independent strcat/strncat/stpcpy (#142643)Michael Jones
The previous implementations called other entrypoints. This patch fixes strcat, strncat, and stpcpy to be properly independent.
2025-06-11[libc] Move libc_errno.h to libc/src/__support and make ↵lntue
LIBC_ERRNO_MODE_SYSTEM to be header-only. (#143187) This is the first step in preparation for: https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450
2025-06-06[libc] clean up string_utils memory functions (#143031)Michael Jones
The string_utils.h file previously included both memcpy and bzero. There were no uses of bzero, and only one use of memcpy which was replaced with __builtin_memcpy. Also fix strsep which was broken by this change, fix a useless assert of "sizeof(char) == sizeof(cpp::byte)", and update the bazel.
2025-06-06[libc] Correct x86_64 architecture for string(s) tests. (#143150)lntue
2025-06-04[libc] Expand usage of libc null checks. (#116262)Aly ElAshram
Fixes #111546 --------- Co-authored-by: alyyelashram <150528548+alyyelashram@users.noreply.github.com>
2025-05-02[libc] Add support for string/memory_utils functions for AArch64 without HW ↵William
FP/SIMD (#137592) Add conditional compilation to add support for AArch64 without vector registers and/or hardware FPUs by using the generic implementation. **Context:** A few functions were hard-coded to use vector registers/hardware FPUs. This meant that libc would not compile on architectures that did not support these features. This fix falls back on the generic implementation if a feature is not supported.
2025-03-14[libc] Fix memmove macros for unreocognized targetsJoseph Huber
2025-03-14[libc] Default to `byte_per_byte` instead of erroring (#131340)Joseph Huber
Summary: Right now a lot of the memory functions error if we don't have specific handling for them. This is weird because we have a generic implementation that should just be used whenever someone hasn't written a more optimized version. This allows us to use the `libc` headers with more architectures from the `shared/` directory without worrying about it breaking.
2025-03-10[libc] Add `-Wno-sign-conversion` & re-attempt `-Wconversion` (#129811)Vinay Deshmukh
Relates to https://github.com/llvm/llvm-project/issues/119281#issuecomment-2699470459
2025-03-05Revert "[libc] Enable -Wconversion for tests. (#127523)"Augie Fackler
This reverts commit 1e6e845d49a336e9da7ca6c576ec45c0b419b5f6 because it changed the 1st parameter of adjust() to be unsigned, but libc itself calls adjust() with a negative argument in align_backward() in op_generic.h.
2025-03-04[libc] Fix casts for arm32 after Wconversion (#129771)Michael Jones
Followup to #127523 There were some test failures on arm32 after enabling Wconversion. There were some tests that were failing due to missing casts. Also I changed BigInt's `safe_get_at` back to being signed since it needed the ability to be negative.
2025-03-04[libc] Enable -Wconversion for tests. (#127523)Vinay Deshmukh
Relates to: #119281
2025-02-05[libc] Fix all imports of src/string/memory_utils (#114939)Krishna Pandey
Fixed imports for all files *within* `libc/src/string/memory_utils`. Note: This doesn't include **all** files that need to be fixed. Fixes #86579
2025-01-23[libc][wchar] implement wcslen (#124150)Nick Desaulniers
Update string_utils' string_length to work with char* or wchar_t*, so that it may be reusable when implementing wmemchr, wcspbrk, wcsrchr, wcsstr. Link: #121183 Link: #124027 Co-authored-by: Nick Desaulniers <ndesaulniers@google.com> --------- Co-authored-by: Tristan Ross <tristan.ross@midstall.com>
2024-12-10[libc] move bcmp, bzero, bcopy, index, rindex, strcasecmp, strncasecmp to ↵Nick Desaulniers
strings.h (#118899) docgen relies on the convention that we have a file foo.cpp in libc/src/\<header\>/. Because the above functions weren't in libc/src/strings/ but rather libc/src/string/, docgen could not find that we had implemented these. Rather than add special carve outs to docgen, let's fix up our sources for these 7 functions to stick with the existing conventions the rest of the codebase follows. Link: #118860 Fixes: #118875
2024-11-26[libc] suppress more clang-cl warnings (#117718)Schrodinger ZHU Yifan
- migrate more `-O3` to `${libc_opt_high_flag}` - workaround a issue with `LLP64` in test. The overflow testing is guarded by a constexpr but the literal overflow itself will still trigger warnings. Notice that for math smoke test, for some reasons, the `${libc_opt_high_flag}` will be passed into `lld-link` which confuses the linker so there are still some warnings leftover there. I can investigate more when I have time.
2024-11-25[libc] suppress string warning in case intrinsics are defined as macros ↵Schrodinger ZHU Yifan
(#117640)
2024-11-21Fix typo "intead"Jay Foad
2024-11-13[libc] Rename libc/src/__support/endian.h to endian_internal.h (#115950)Daniel Thornburgh
This prevents a conflict with the Linux system endian.h when built in overlay mode for CPP files in __support. This issue appeared in PR #106259.
2024-11-01[libc] Remove the #include <stdlib.h> header (#114453)Job Henandez Lara