path: root/libc/src/string/memmove.cpp
Age | Commit message | Author
2025-06-04 | [libc] Expand usage of libc null checks. (#116262) | Aly ElAshram
Fixes #111546
Co-authored-by: alyyelashram <150528548+alyyelashram@users.noreply.github.com>
2024-07-12 | [libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration (#98597) | Petr Hosek
This is a part of #97655.
2024-07-12 | Revert "[libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration" (#98593) | Mehdi Amini
Reverts llvm/llvm-project#98075. Bots are broken.
2024-07-11 | [libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration (#98075) | Petr Hosek
This is a part of #97655.
2023-10-26 | [libc] memmove optimizations (#70043) | Dmitry Vyukov
1. Remove is_disjoint check for smaller sizes and reduce code bloat.
   inline_memmove may handle some small sizes as efficiently as inline_memcpy. For these sizes we may not do is_disjoint check. This both avoids additional code for the most frequent smaller sizes and removes code bloat (we don't need the memcpy logic for small sizes). Here we heavily rely on inlining and dead code elimination: from the first inline_memmove we should get only handling of small sizes, and from the second inline_memmove and inline_memcpy we should get only handling of larger sizes.
2. Use the memcpy thresholds for memmove.
   Memcpy thresholds were more carefully tuned. This becomes more important since we use memmove for all small sizes always now.
3. Fix boundary conditions for sizes = 16/32/64.
   See the added comment for explanations.
Memmove function size drops from 885 to 715 bytes due to removed duplication.
```
                  │  baseline   │            small-size             │
                  │   sec/op    │   sec/op      vs base             │
memmove/Google_A  3.208n ± 0%   2.911n ± 0%    -9.25% (n=100)
memmove/Google_B  4.113n ± 1%   3.428n ± 0%   -16.65% (n=100)
memmove/Google_D  5.838n ± 0%   4.158n ± 0%   -28.78% (n=100)
memmove/Google_S  4.712n ± 1%   3.899n ± 0%   -17.25% (n=100)
memmove/Google_U  3.609n ± 0%   3.247n ± 1%   -10.02% (n=100)
memmove/0         2.982n ± 0%   2.169n ± 0%   -27.26% (n=50)
memmove/1         3.253n ± 0%   2.168n ± 0%   -33.34% (n=50)
memmove/2         3.255n ± 0%   2.169n ± 0%   -33.38% (n=50)
memmove/3         3.259n ± 2%   2.175n ± 0%   -33.27% (p=0.000 n=50)
memmove/4         3.259n ± 0%   2.168n ± 5%   -33.46% (p=0.000 n=50)
memmove/5         2.488n ± 0%   1.926n ± 0%   -22.57% (p=0.000 n=50)
memmove/6         2.490n ± 0%   1.928n ± 0%   -22.59% (p=0.000 n=50)
memmove/7         2.492n ± 0%   1.927n ± 0%   -22.65% (p=0.000 n=50)
memmove/8         2.737n ± 0%   2.711n ± 0%    -0.97% (p=0.000 n=50)
memmove/9         2.736n ± 0%   2.711n ± 0%    -0.94% (p=0.000 n=50)
memmove/10        2.739n ± 0%   2.711n ± 0%    -1.04% (p=0.000 n=50)
memmove/11        2.740n ± 0%   2.711n ± 0%    -1.07% (p=0.000 n=50)
memmove/12        2.740n ± 0%   2.711n ± 0%    -1.09% (p=0.000 n=50)
memmove/13        2.744n ± 0%   2.711n ± 0%    -1.22% (p=0.000 n=50)
memmove/14        2.742n ± 0%   2.711n ± 0%    -1.14% (p=0.000 n=50)
memmove/15        2.742n ± 0%   2.711n ± 0%    -1.15% (p=0.000 n=50)
memmove/16        2.997n ± 0%   2.981n ± 0%    -0.52% (p=0.000 n=50)
memmove/17        2.998n ± 0%   2.981n ± 0%    -0.55% (p=0.000 n=50)
memmove/18        2.998n ± 0%   2.981n ± 0%    -0.55% (p=0.000 n=50)
memmove/19        2.999n ± 0%   2.982n ± 0%    -0.59% (p=0.000 n=50)
memmove/20        2.998n ± 0%   2.981n ± 0%    -0.55% (p=0.000 n=50)
memmove/21        3.000n ± 0%   2.981n ± 0%    -0.61% (p=0.000 n=50)
memmove/22        3.002n ± 0%   2.981n ± 0%    -0.68% (p=0.000 n=50)
memmove/23        3.002n ± 0%   2.981n ± 0%    -0.67% (p=0.000 n=50)
memmove/24        3.002n ± 0%   2.981n ± 0%    -0.70% (n=50)
memmove/25        3.002n ± 0%   2.981n ± 0%    -0.68% (p=0.000 n=50)
memmove/26        3.004n ± 0%   2.982n ± 0%    -0.74% (p=0.000 n=50)
memmove/27        3.005n ± 0%   2.981n ± 0%    -0.79% (n=50)
memmove/28        3.005n ± 0%   2.982n ± 0%    -0.77% (n=50)
memmove/29        3.009n ± 0%   2.981n ± 0%    -0.92% (n=50)
memmove/30        3.008n ± 0%   2.981n ± 0%    -0.89% (n=50)
memmove/31        3.007n ± 0%   2.982n ± 0%    -0.86% (n=50)
memmove/32        3.540n ± 0%   2.998n ± 0%   -15.31% (p=0.000 n=50)
memmove/33        3.544n ± 0%   2.997n ± 0%   -15.44% (p=0.000 n=50)
memmove/34        3.546n ± 0%   2.999n ± 0%   -15.42% (n=50)
memmove/35        3.545n ± 0%   2.999n ± 0%   -15.40% (n=50)
memmove/36        3.548n ± 0%   2.998n ± 0%   -15.52% (p=0.000 n=50)
memmove/37        3.546n ± 0%   3.000n ± 0%   -15.41% (n=50)
memmove/38        3.549n ± 0%   2.999n ± 0%   -15.49% (p=0.000 n=50)
memmove/39        3.549n ± 0%   2.999n ± 0%   -15.48% (p=0.000 n=50)
memmove/40        3.549n ± 0%   3.000n ± 0%   -15.46% (p=0.000 n=50)
memmove/41        3.550n ± 0%   3.001n ± 0%   -15.47% (n=50)
memmove/42        3.549n ± 0%   3.001n ± 0%   -15.43% (n=50)
memmove/43        3.552n ± 0%   3.001n ± 0%   -15.52% (p=0.000 n=50)
memmove/44        3.552n ± 0%   3.001n ± 0%   -15.51% (n=50)
memmove/45        3.552n ± 0%   3.002n ± 0%   -15.48% (n=50)
memmove/46        3.554n ± 0%   3.001n ± 0%   -15.55% (p=0.000 n=50)
memmove/47        3.556n ± 0%   3.002n ± 0%   -15.58% (p=0.000 n=50)
memmove/48        3.555n ± 0%   3.003n ± 0%   -15.54% (n=50)
memmove/49        3.557n ± 0%   3.002n ± 0%   -15.59% (p=0.000 n=50)
memmove/50        3.557n ± 0%   3.004n ± 0%   -15.55% (p=0.000 n=50)
memmove/51        3.556n ± 0%   3.004n ± 0%   -15.53% (p=0.000 n=50)
memmove/52        3.561n ± 0%   3.004n ± 0%   -15.65% (p=0.000 n=50)
memmove/53        3.558n ± 0%   3.004n ± 0%   -15.57% (p=0.000 n=50)
memmove/54        3.561n ± 0%   3.005n ± 0%   -15.62% (n=50)
memmove/55        3.560n ± 0%   3.006n ± 0%   -15.57% (n=50)
memmove/56        3.562n ± 0%   3.006n ± 0%   -15.60% (p=0.000 n=50)
memmove/57        3.563n ± 0%   3.006n ± 0%   -15.64% (n=50)
memmove/58        3.565n ± 0%   3.007n ± 0%   -15.64% (p=0.000 n=50)
memmove/59        3.564n ± 0%   3.006n ± 0%   -15.66% (p=0.000 n=50)
memmove/60        3.570n ± 0%   3.008n ± 0%   -15.74% (p=0.000 n=50)
memmove/61        3.566n ± 0%   3.009n ± 0%   -15.63% (p=0.000 n=50)
memmove/62        3.567n ± 0%   3.007n ± 0%   -15.70% (p=0.000 n=50)
memmove/63        3.568n ± 0%   3.008n ± 0%   -15.71% (p=0.000 n=50)
memmove/64        4.104n ± 0%   3.008n ± 0%   -26.70% (p=0.000 n=50)
memmove/65        4.126n ± 0%   3.662n ± 0%   -11.26% (p=0.000 n=50)
memmove/66        4.128n ± 0%   3.662n ± 0%   -11.29% (n=50)
memmove/67        4.129n ± 0%   3.662n ± 0%   -11.31% (n=50)
memmove/68        4.129n ± 0%   3.661n ± 0%   -11.33% (p=0.000 n=50)
memmove/69        4.130n ± 0%   3.662n ± 0%   -11.34% (p=0.000 n=50)
memmove/70        4.130n ± 0%   3.662n ± 0%   -11.33% (n=50)
memmove/71        4.132n ± 0%   3.662n ± 0%   -11.38% (p=0.000 n=50)
memmove/72        4.131n ± 0%   3.661n ± 0%   -11.39% (n=50)
memmove/73        4.135n ± 0%   3.661n ± 0%   -11.45% (p=0.000 n=50)
memmove/74        4.137n ± 0%   3.662n ± 0%   -11.49% (n=50)
memmove/75        4.138n ± 0%   3.662n ± 0%   -11.51% (p=0.000 n=50)
memmove/76        4.139n ± 0%   3.661n ± 0%   -11.56% (p=0.000 n=50)
memmove/77        4.136n ± 0%   3.662n ± 0%   -11.47% (p=0.000 n=50)
memmove/78        4.143n ± 0%   3.661n ± 0%   -11.62% (p=0.000 n=50)
memmove/79        4.142n ± 0%   3.661n ± 0%   -11.60% (n=50)
memmove/80        4.142n ± 0%   3.661n ± 0%   -11.62% (p=0.000 n=50)
memmove/81        4.140n ± 0%   3.661n ± 0%   -11.57% (n=50)
memmove/82        4.146n ± 0%   3.661n ± 0%   -11.69% (n=50)
memmove/83        4.143n ± 0%   3.661n ± 0%   -11.63% (p=0.000 n=50)
memmove/84        4.143n ± 0%   3.661n ± 0%   -11.63% (n=50)
memmove/85        4.147n ± 0%   3.661n ± 0%   -11.73% (p=0.000 n=50)
memmove/86        4.142n ± 0%   3.661n ± 0%   -11.62% (p=0.000 n=50)
memmove/87        4.147n ± 0%   3.661n ± 0%   -11.72% (p=0.000 n=50)
memmove/88        4.148n ± 0%   3.661n ± 0%   -11.74% (n=50)
memmove/89        4.152n ± 0%   3.661n ± 0%   -11.84% (n=50)
memmove/90        4.151n ± 0%   3.661n ± 0%   -11.81% (n=50)
memmove/91        4.150n ± 0%   3.661n ± 0%   -11.78% (n=50)
memmove/92        4.153n ± 0%   3.661n ± 0%   -11.86% (n=50)
memmove/93        4.158n ± 0%   3.661n ± 0%   -11.95% (n=50)
memmove/94        4.157n ± 0%   3.661n ± 0%   -11.95% (p=0.000 n=50)
memmove/95        4.155n ± 0%   3.661n ± 0%   -11.90% (p=0.000 n=50)
memmove/96        4.149n ± 0%   3.660n ± 0%   -11.79% (n=50)
memmove/97        4.157n ± 0%   3.661n ± 0%   -11.94% (n=50)
memmove/98        4.157n ± 0%   3.661n ± 0%   -11.94% (n=50)
memmove/99        4.168n ± 0%   3.661n ± 0%   -12.17% (p=0.000 n=50)
memmove/100       4.159n ± 0%   3.660n ± 0%   -12.00% (p=0.000 n=50)
memmove/101       4.161n ± 0%   3.660n ± 0%   -12.03% (p=0.000 n=50)
memmove/102       4.165n ± 0%   3.660n ± 0%   -12.12% (p=0.000 n=50)
memmove/103       4.164n ± 0%   3.661n ± 0%   -12.08% (n=50)
memmove/104       4.164n ± 0%   3.660n ± 0%   -12.11% (n=50)
memmove/105       4.165n ± 0%   3.660n ± 0%   -12.12% (p=0.000 n=50)
memmove/106       4.166n ± 0%   3.660n ± 0%   -12.15% (n=50)
memmove/107       4.171n ± 0%   3.660n ± 1%   -12.26% (p=0.000 n=50)
memmove/108       4.173n ± 0%   3.660n ± 0%   -12.30% (p=0.000 n=50)
memmove/109       4.170n ± 0%   3.660n ± 0%   -12.24% (n=50)
memmove/110       4.174n ± 0%   3.660n ± 0%   -12.31% (n=50)
memmove/111       4.176n ± 0%   3.660n ± 0%   -12.35% (p=0.000 n=50)
memmove/112       4.174n ± 0%   3.659n ± 0%   -12.34% (p=0.000 n=50)
memmove/113       4.176n ± 0%   3.660n ± 0%   -12.35% (n=50)
memmove/114       4.182n ± 0%   3.660n ± 0%   -12.49% (n=50)
memmove/115       4.185n ± 0%   3.660n ± 0%   -12.55% (n=50)
memmove/116       4.184n ± 0%   3.659n ± 0%   -12.54% (n=50)
memmove/117       4.182n ± 0%   3.660n ± 0%   -12.50% (n=50)
memmove/118       4.188n ± 0%   3.660n ± 0%   -12.61% (n=50)
memmove/119       4.186n ± 0%   3.660n ± 0%   -12.57% (p=0.000 n=50)
memmove/120       4.189n ± 0%   3.659n ± 0%   -12.63% (n=50)
memmove/121       4.187n ± 0%   3.660n ± 0%   -12.60% (n=50)
memmove/122       4.186n ± 0%   3.660n ± 0%   -12.58% (n=50)
memmove/123       4.187n ± 0%   3.660n ± 0%   -12.60% (n=50)
memmove/124       4.189n ± 0%   3.659n ± 0%   -12.65% (n=50)
memmove/125       4.195n ± 0%   3.659n ± 0%   -12.78% (n=50)
memmove/126       4.197n ± 0%   3.659n ± 0%   -12.81% (n=50)
memmove/127       4.194n ± 0%   3.659n ± 0%   -12.75% (n=50)
memmove/128       5.035n ± 0%   3.659n ± 0%   -27.32% (n=50)
memmove/129       5.127n ± 0%   5.164n ± 0%    +0.73% (p=0.000 n=50)
memmove/130       5.130n ± 0%   5.176n ± 0%    +0.88% (p=0.000 n=50)
memmove/131       5.127n ± 0%   5.180n ± 0%    +1.05% (p=0.000 n=50)
memmove/132       5.131n ± 0%   5.169n ± 0%    +0.75% (p=0.000 n=50)
memmove/133       5.137n ± 0%   5.179n ± 0%    +0.81% (p=0.000 n=50)
memmove/134       5.140n ± 0%   5.178n ± 0%    +0.74% (p=0.000 n=50)
memmove/135       5.141n ± 0%   5.187n ± 0%    +0.88% (p=0.000 n=50)
memmove/136       5.133n ± 0%   5.184n ± 0%    +0.99% (p=0.000 n=50)
memmove/137       5.148n ± 0%   5.186n ± 0%    +0.73% (p=0.000 n=50)
memmove/138       5.143n ± 0%   5.189n ± 0%    +0.88% (p=0.000 n=50)
memmove/139       5.142n ± 0%   5.192n ± 0%    +0.97% (p=0.000 n=50)
memmove/140       5.141n ± 0%   5.192n ± 0%    +1.01% (p=0.000 n=50)
memmove/141       5.155n ± 0%   5.188n ± 0%    +0.64% (p=0.000 n=50)
memmove/142       5.146n ± 0%   5.192n ± 0%    +0.90% (p=0.000 n=50)
memmove/143       5.142n ± 0%   5.203n ± 0%    +1.19% (p=0.000 n=50)
memmove/144       5.146n ± 0%   5.197n ± 0%    +0.99% (p=0.000 n=50)
memmove/145       5.146n ± 0%   5.196n ± 0%    +0.97% (p=0.000 n=50)
memmove/146       5.151n ± 0%   5.207n ± 0%    +1.10% (p=0.000 n=50)
memmove/147       5.151n ± 0%   5.205n ± 0%    +1.06% (p=0.000 n=50)
memmove/148       5.156n ± 0%   5.190n ± 0%    +0.66% (p=0.000 n=50)
memmove/149       5.158n ± 0%   5.212n ± 0%    +1.04% (p=0.000 n=50)
memmove/150       5.160n ± 0%   5.203n ± 0%    +0.84% (p=0.000 n=50)
memmove/151       5.167n ± 0%   5.210n ± 0%    +0.83% (p=0.000 n=50)
memmove/152       5.157n ± 0%   5.206n ± 0%    +0.94% (p=0.000 n=50)
memmove/153       5.170n ± 0%   5.211n ± 0%    +0.80% (p=0.000 n=50)
memmove/154       5.169n ± 0%   5.222n ± 0%    +1.02% (p=0.000 n=50)
memmove/155       5.171n ± 0%   5.215n ± 0%    +0.87% (p=0.000 n=50)
memmove/156       5.174n ± 0%   5.214n ± 0%    +0.78% (p=0.000 n=50)
memmove/157       5.171n ± 0%   5.218n ± 0%    +0.92% (p=0.000 n=50)
memmove/158       5.168n ± 0%   5.224n ± 0%    +1.09% (p=0.000 n=50)
memmove/159       5.179n ± 0%   5.218n ± 0%    +0.76% (p=0.000 n=50)
memmove/160       5.170n ± 0%   5.219n ± 0%    +0.95% (p=0.000 n=50)
memmove/161       5.187n ± 0%   5.220n ± 0%    +0.64% (p=0.000 n=50)
memmove/162       5.189n ± 0%   5.234n ± 0%    +0.86% (p=0.000 n=50)
memmove/163       5.199n ± 0%   5.250n ± 0%    +0.99% (p=0.000 n=50)
memmove/164       5.205n ± 0%   5.260n ± 0%    +1.04% (p=0.000 n=50)
memmove/165       5.208n ± 0%   5.261n ± 0%    +1.01% (p=0.000 n=50)
memmove/166       5.227n ± 0%   5.275n ± 0%    +0.91% (p=0.000 n=50)
memmove/167       5.233n ± 0%   5.281n ± 0%    +0.92% (p=0.000 n=50)
memmove/168       5.236n ± 0%   5.295n ± 0%    +1.12% (p=0.000 n=50)
memmove/169       5.256n ± 0%   5.297n ± 0%    +0.79% (p=0.000 n=50)
memmove/170       5.259n ± 0%   5.302n ± 0%    +0.80% (p=0.000 n=50)
memmove/171       5.269n ± 0%   5.321n ± 0%    +0.97% (p=0.000 n=50)
memmove/172       5.266n ± 0%   5.318n ± 0%    +0.98% (p=0.000 n=50)
memmove/173       5.272n ± 0%   5.330n ± 0%    +1.09% (p=0.000 n=50)
memmove/174       5.284n ± 0%   5.331n ± 0%    +0.89% (p=0.000 n=50)
memmove/175       5.284n ± 0%   5.322n ± 0%    +0.72% (p=0.000 n=50)
memmove/176       5.298n ± 0%   5.337n ± 0%    +0.74% (p=0.000 n=50)
memmove/177       5.282n ± 0%   5.338n ± 0%    +1.04% (p=0.000 n=50)
memmove/178       5.299n ± 0%   5.337n ± 0%    +0.71% (p=0.000 n=50)
memmove/179       5.296n ± 0%   5.343n ± 0%    +0.88% (p=0.000 n=50)
memmove/180       5.292n ± 0%   5.343n ± 0%    +0.97% (p=0.000 n=50)
memmove/181       5.303n ± 0%   5.335n ± 0%    +0.60% (p=0.000 n=50)
memmove/182       5.305n ± 0%   5.338n ± 0%    +0.62% (p=0.000 n=50)
memmove/183       5.298n ± 0%   5.329n ± 0%    +0.59% (p=0.000 n=50)
memmove/184       5.299n ± 0%   5.333n ± 0%    +0.64% (p=0.000 n=50)
memmove/185       5.291n ± 0%   5.330n ± 0%    +0.73% (p=0.000 n=50)
memmove/186       5.296n ± 0%   5.332n ± 0%    +0.68% (p=0.000 n=50)
memmove/187       5.297n ± 0%   5.320n ± 0%    +0.44% (p=0.000 n=50)
memmove/188       5.286n ± 0%   5.314n ± 0%    +0.53% (p=0.000 n=50)
memmove/189       5.293n ± 0%   5.318n ± 0%    +0.46% (p=0.000 n=50)
memmove/190       5.294n ± 0%   5.318n ± 0%    +0.45% (p=0.000 n=50)
memmove/191       5.292n ± 0%   5.314n ± 0%    +0.40% (p=0.032 n=50)
memmove/192       5.272n ± 0%   5.304n ± 0%    +0.60% (p=0.000 n=50)
memmove/193       5.279n ± 0%   5.310n ± 0%    +0.57% (p=0.000 n=50)
memmove/194       5.294n ± 0%   5.308n ± 0%    +0.26% (p=0.018 n=50)
memmove/195       5.302n ± 0%   5.311n ± 0%    +0.18% (p=0.010 n=50)
memmove/196       5.301n ± 0%   5.316n ± 0%    +0.28% (p=0.023 n=50)
memmove/197       5.302n ± 0%   5.327n ± 0%    +0.47% (p=0.000 n=50)
memmove/198       5.310n ± 0%   5.326n ± 0%    +0.30% (p=0.003 n=50)
memmove/199       5.303n ± 0%   5.319n ± 0%    +0.30% (p=0.009 n=50)
memmove/200       5.312n ± 0%   5.330n ± 0%    +0.35% (p=0.001 n=50)
memmove/201       5.307n ± 0%   5.333n ± 0%    +0.50% (p=0.000 n=50)
memmove/202       5.311n ± 0%   5.334n ± 0%    +0.44% (p=0.000 n=50)
memmove/203       5.313n ± 0%   5.335n ± 0%    +0.41% (p=0.006 n=50)
memmove/204       5.312n ± 0%   5.332n ± 0%    +0.36% (p=0.002 n=50)
memmove/205       5.318n ± 0%   5.345n ± 0%    +0.50% (p=0.000 n=50)
memmove/206       5.311n ± 0%   5.333n ± 0%    +0.42% (p=0.002 n=50)
memmove/207       5.310n ± 0%   5.338n ± 0%    +0.52% (p=0.000 n=50)
memmove/208       5.319n ± 0%   5.341n ± 0%    +0.40% (p=0.004 n=50)
memmove/209       5.330n ± 0%   5.346n ± 0%    +0.30% (p=0.004 n=50)
memmove/210       5.329n ± 0%   5.349n ± 0%    +0.38% (p=0.002 n=50)
memmove/211       5.318n ± 0%   5.340n ± 0%    +0.41% (p=0.000 n=50)
memmove/212       5.339n ± 0%   5.343n ± 0%         ~ (p=0.396 n=50)
memmove/213       5.329n ± 0%   5.343n ± 0%    +0.25% (p=0.017 n=50)
memmove/214       5.339n ± 0%   5.358n ± 0%    +0.35% (p=0.035 n=50)
memmove/215       5.342n ± 0%   5.346n ± 0%         ~ (p=0.063 n=50)
memmove/216       5.338n ± 0%   5.359n ± 0%    +0.39% (p=0.002 n=50)
memmove/217       5.341n ± 0%   5.362n ± 0%    +0.39% (p=0.015 n=50)
memmove/218       5.354n ± 0%   5.373n ± 0%    +0.36% (p=0.041 n=50)
memmove/219       5.352n ± 0%   5.362n ± 0%         ~ (p=0.143 n=50)
memmove/220       5.344n ± 0%   5.370n ± 0%    +0.50% (p=0.001 n=50)
memmove/221       5.345n ± 0%   5.373n ± 0%    +0.53% (p=0.000 n=50)
memmove/222       5.348n ± 0%   5.360n ± 0%    +0.23% (p=0.014 n=50)
memmove/223       5.354n ± 0%   5.377n ± 0%    +0.43% (p=0.024 n=50)
memmove/224       5.352n ± 0%   5.363n ± 0%         ~ (p=0.052 n=50)
memmove/225       5.372n ± 0%   5.380n ± 0%         ~ (p=0.481 n=50)
memmove/226       5.368n ± 0%   5.386n ± 0%    +0.34% (p=0.004 n=50)
memmove/227       5.386n ± 0%   5.402n ± 0%    +0.29% (p=0.028 n=50)
memmove/228       5.400n ± 0%   5.408n ± 0%         ~ (p=0.174 n=50)
memmove/229       5.423n ± 0%   5.427n ± 0%         ~ (p=0.444 n=50)
memmove/230       5.411n ± 0%   5.429n ± 0%    +0.33% (p=0.020 n=50)
memmove/231       5.420n ± 0%   5.433n ± 0%    +0.24% (p=0.034 n=50)
memmove/232       5.435n ± 0%   5.441n ± 0%         ~ (p=0.235 n=50)
memmove/233       5.446n ± 0%   5.462n ± 0%         ~ (p=0.590 n=50)
memmove/234       5.467n ± 0%   5.461n ± 0%         ~ (p=0.921 n=50)
memmove/235       5.472n ± 0%   5.478n ± 0%         ~ (p=0.883 n=50)
memmove/236       5.466n ± 0%   5.478n ± 0%         ~ (p=0.324 n=50)
memmove/237       5.471n ± 0%   5.489n ± 0%         ~ (p=0.132 n=50)
memmove/238       5.485n ± 0%   5.489n ± 0%         ~ (p=0.460 n=50)
memmove/239       5.484n ± 0%   5.488n ± 0%         ~ (p=0.833 n=50)
memmove/240       5.483n ± 0%   5.495n ± 0%         ~ (p=0.095 n=50)
memmove/241       5.498n ± 0%   5.514n ± 0%         ~ (p=0.077 n=50)
memmove/242       5.518n ± 0%   5.517n ± 0%         ~ (p=0.481 n=50)
memmove/243       5.514n ± 0%   5.511n ± 0%         ~ (p=0.503 n=50)
memmove/244       5.510n ± 0%   5.497n ± 0%    -0.24% (p=0.038 n=50)
memmove/245       5.516n ± 0%   5.505n ± 0%         ~ (p=0.317 n=50)
memmove/246       5.513n ± 1%   5.494n ± 0%         ~ (p=0.147 n=50)
memmove/247       5.518n ± 0%   5.499n ± 0%    -0.36% (p=0.011 n=50)
memmove/248       5.503n ± 0%   5.492n ± 0%         ~ (p=0.267 n=50)
memmove/249       5.498n ± 0%   5.497n ± 0%         ~ (p=0.765 n=50)
memmove/250       5.485n ± 0%   5.493n ± 0%         ~ (p=0.348 n=50)
memmove/251       5.503n ± 0%   5.482n ± 0%    -0.37% (p=0.013 n=50)
memmove/252       5.497n ± 0%   5.485n ± 0%         ~ (p=0.077 n=50)
memmove/253       5.489n ± 0%   5.496n ± 0%         ~ (p=0.850 n=50)
memmove/254       5.497n ± 0%   5.491n ± 0%         ~ (p=0.548 n=50)
memmove/255       5.484n ± 1%   5.494n ± 0%         ~ (p=0.888 n=50)
memmove/256       6.952n ± 0%   7.676n ± 0%   +10.41% (p=0.000 n=50)
geomean           4.406n        4.127n         -6.33%
```
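The idea in item 1 can be illustrated with a minimal sketch; this is not the code from the commit. For small sizes, every source byte is loaded into temporaries before any store happens, so the result is correct whether or not the buffers overlap, and no is_disjoint test is needed. The names `load`, `store`, `move_head_tail` and `memmove_sketch`, the 16-byte cutoff, and the `std::memmove` fallback that stands in for the larger-size path are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helpers: read or write one fixed-size block through a
// temporary, which compilers usually keep in a register.
template <typename T> inline T load(const char *p) {
  T value;
  std::memcpy(&value, p, sizeof(T));
  return value;
}

template <typename T> inline void store(char *p, T value) {
  std::memcpy(p, &value, sizeof(T));
}

// Moves `count` bytes, where sizeof(T) <= count <= 2 * sizeof(T).
// Both blocks are loaded before either store, so overlap is harmless.
template <typename T>
inline void move_head_tail(char *dst, const char *src, std::size_t count) {
  assert(count >= sizeof(T) && count <= 2 * sizeof(T));
  const std::size_t tail = count - sizeof(T);
  const T head_block = load<T>(src);
  const T tail_block = load<T>(src + tail);
  store<T>(dst, head_block);
  store<T>(dst + tail, tail_block);
}

// Hypothetical entry point: small sizes never test for overlap.
inline void *memmove_sketch(void *dst, const void *src, std::size_t count) {
  char *d = static_cast<char *>(dst);
  const char *s = static_cast<const char *>(src);
  if (count == 0)
    return dst;
  if (count == 1)
    store<std::uint8_t>(d, load<std::uint8_t>(s));
  else if (count <= 4)
    move_head_tail<std::uint16_t>(d, s, count); // 2..4 bytes
  else if (count <= 8)
    move_head_tail<std::uint32_t>(d, s, count); // 5..8 bytes
  else if (count <= 16)
    move_head_tail<std::uint64_t>(d, s, count); // 9..16 bytes
  else
    std::memmove(dst, src, count); // stand-in for the larger-size path
  return dst;
}
```

Each head/tail pair covers its upper bound inclusively (a 16-byte move is two 8-byte blocks with no gap), which is why the tests in this sketch use `<=`. Whether that matches the boundary fix in item 3 is not something the commit message states.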
2023-09-26 | [libc] Mass replace enclosing namespace (#67032) | Guillaume Chatelet
This is step 4 of https://discourse.llvm.org/t/rfc-customizable-namespace-to-allow-testing-the-libc-when-the-system-libc-is-also-llvms-libc/73079
2023-07-19 | [libc][NFC] Rename files | Guillaume Chatelet
This patch mostly renames files so that they better reflect the functions they declare.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D155607
2023-06-14 | [libc] Dispatch memmove to memcpy when buffers are disjoint | Guillaume Chatelet
Most of the time `memmove` is called on buffers that are disjoint; in that case we can use `memcpy`, which is faster. The additional test is branchless on x86, aarch64 and RISCV with the zbb extension (bitmanip). On x86 this patch adds a latency of 2 to 3 cycles.
Before
```
--------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
--------------------------------------------------------------------------------
BM_Memmove/0/0_median 5.00 ns 5.00 ns 10 bytes_per_cycle=1.25477/s bytes_per_second=2.62933G/s items_per_second=199.87M/s __llvm_libc::memmove,memmove Google A
BM_Memmove/1/0_median 6.21 ns 6.21 ns 10 bytes_per_cycle=3.22173/s bytes_per_second=6.75106G/s items_per_second=160.955M/s __llvm_libc::memmove,memmove Google B
BM_Memmove/2/0_median 8.09 ns 8.09 ns 10 bytes_per_cycle=5.31462/s bytes_per_second=11.1366G/s items_per_second=123.603M/s __llvm_libc::memmove,memmove Google D
BM_Memmove/3/0_median 5.95 ns 5.95 ns 10 bytes_per_cycle=2.71865/s bytes_per_second=5.69687G/s items_per_second=167.967M/s __llvm_libc::memmove,memmove Google L
BM_Memmove/4/0_median 5.63 ns 5.63 ns 10 bytes_per_cycle=2.28294/s bytes_per_second=4.78383G/s items_per_second=177.615M/s __llvm_libc::memmove,memmove Google M
BM_Memmove/5/0_median 5.68 ns 5.68 ns 10 bytes_per_cycle=2.16798/s bytes_per_second=4.54295G/s items_per_second=176.015M/s __llvm_libc::memmove,memmove Google Q
BM_Memmove/6/0_median 7.46 ns 7.46 ns 10 bytes_per_cycle=3.97619/s bytes_per_second=8.332G/s items_per_second=134.044M/s __llvm_libc::memmove,memmove Google S
BM_Memmove/7/0_median 5.40 ns 5.40 ns 10 bytes_per_cycle=1.79695/s bytes_per_second=3.76546G/s items_per_second=185.211M/s __llvm_libc::memmove,memmove Google U
BM_Memmove/8/0_median 5.62 ns 5.62 ns 10 bytes_per_cycle=3.18747/s bytes_per_second=6.67927G/s items_per_second=177.983M/s __llvm_libc::memmove,memmove Google W
BM_Memmove/9/0_median 101 ns 101 ns 10 bytes_per_cycle=9.77359/s bytes_per_second=20.4803G/s items_per_second=9.9333M/s __llvm_libc::memmove,uniform 384 to 4096
```
After
```
BM_Memmove/0/0_median 3.57 ns 3.57 ns 10 bytes_per_cycle=1.71375/s bytes_per_second=3.59112G/s items_per_second=280.411M/s __llvm_libc::memmove,memmove Google A
BM_Memmove/1/0_median 4.52 ns 4.52 ns 10 bytes_per_cycle=4.47557/s bytes_per_second=9.37843G/s items_per_second=221.427M/s __llvm_libc::memmove,memmove Google B
BM_Memmove/2/0_median 5.70 ns 5.70 ns 10 bytes_per_cycle=7.37396/s bytes_per_second=15.4519G/s items_per_second=175.399M/s __llvm_libc::memmove,memmove Google D
BM_Memmove/3/0_median 4.47 ns 4.47 ns 10 bytes_per_cycle=3.4148/s bytes_per_second=7.15563G/s items_per_second=223.743M/s __llvm_libc::memmove,memmove Google L
BM_Memmove/4/0_median 4.53 ns 4.53 ns 10 bytes_per_cycle=2.86071/s bytes_per_second=5.99454G/s items_per_second=220.69M/s __llvm_libc::memmove,memmove Google M
BM_Memmove/5/0_median 4.19 ns 4.19 ns 10 bytes_per_cycle=2.5484/s bytes_per_second=5.3401G/s items_per_second=238.924M/s __llvm_libc::memmove,memmove Google Q
BM_Memmove/6/0_median 5.02 ns 5.02 ns 10 bytes_per_cycle=5.94164/s bytes_per_second=12.4505G/s items_per_second=199.14M/s __llvm_libc::memmove,memmove Google S
BM_Memmove/7/0_median 4.03 ns 4.03 ns 10 bytes_per_cycle=2.47028/s bytes_per_second=5.17641G/s items_per_second=247.906M/s __llvm_libc::memmove,memmove Google U
BM_Memmove/8/0_median 4.70 ns 4.70 ns 10 bytes_per_cycle=3.84975/s bytes_per_second=8.06706G/s items_per_second=212.72M/s __llvm_libc::memmove,memmove Google W
BM_Memmove/9/0_median 90.7 ns 90.7 ns 10 bytes_per_cycle=10.8681/s bytes_per_second=22.7739G/s items_per_second=11.02M/s __llvm_libc::memmove,uniform 384 to 4096
```
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D152811
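The dispatch described in this commit can be sketched as follows; this is an illustration under stated assumptions, not the patch itself. `is_disjoint`, `memmove_dispatch` and `demo` are hypothetical names, and the standard `std::memcpy` / `std::memmove` stand in for the library's internal implementations. Two ranges of equal length are disjoint exactly when the distance between their start addresses is at least that length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when [a, a + size) and [b, b + size) share no byte.
inline bool is_disjoint(const void *a, const void *b, std::size_t size) {
  const auto ua = reinterpret_cast<std::uintptr_t>(a);
  const auto ub = reinterpret_cast<std::uintptr_t>(b);
  // Absolute distance between the two start addresses. Compilers can often
  // lower this select without a branch, but that is not guaranteed.
  const std::uintptr_t distance = ua > ub ? ua - ub : ub - ua;
  return distance >= size;
}

// Hypothetical dispatcher: take the cheaper copy when overlap is impossible.
inline void *memmove_dispatch(void *dst, const void *src, std::size_t count) {
  if (is_disjoint(dst, src, count))
    return std::memcpy(dst, src, count); // no overlap: plain copy is safe
  return std::memmove(dst, src, count);  // overlap: direction-aware move
}

// Usage: shift eight bytes right by two inside one buffer.
inline void demo() {
  char buf[16] = "abcdefghijklmno";
  assert(is_disjoint(buf, buf + 8, 8));  // adjacent halves do not overlap
  assert(!is_disjoint(buf, buf + 2, 8)); // ranges two bytes apart do overlap
  memmove_dispatch(buf + 2, buf, 8);
  assert(std::memcmp(buf, "ababcdefgh", 10) == 0);
}
```

The sketch compares addresses as unsigned integers rather than subtracting pointers, which sidesteps signed overflow and pointer arithmetic across unrelated objects.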
2022-11-16 | [libc][NFC] move memmove implementation | Guillaume Chatelet
Moving memmove implementation to its own file for symmetry with other mem functions.
Differential Revision: https://reviews.llvm.org/D136687
2022-11-02 | [reland][libc] Switch to new implementation of mem* functions | Guillaume Chatelet
The new framework makes it explicit which processor feature is being used and allows for easier per platform customization:
- ARM cpu now uses trivial implementations to reduce code size.
- Memcmp, Bcmp and Memmove have been optimized for x86
- Bcmp has been optimized for aarch64.
This is a reland of https://reviews.llvm.org/D135134 (b3f1d58, 028414881381)
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D136595
2022-10-27 | Revert D136595 "[libc] Switch to new implementation of mem* functions" | Guillaume Chatelet
This patch seems to introduce bugs on aarch64. Reverting while we investigate the root cause. This reverts commit 02841488138160f9064f334a833d4bf3e80385c6.
2022-10-25 | [libc] Switch to new implementation of mem* functions | Guillaume Chatelet
The new framework makes it explicit which processor feature is being used and allows for easier per platform customization:
- ARM cpu now uses trivial implementations to reduce code size.
- Memcmp, Bcmp and Memmove have been optimized for x86
- Bcmp has been optimized for aarch64.
This is a reland of https://reviews.llvm.org/D135134 (b3f1d58)
Differential Revision: https://reviews.llvm.org/D136595
2022-10-14 | Revert "[libc] New version of the mem* framework" | Sterling Augustine
This reverts commit https://reviews.llvm.org/D135134 (b3f1d58a131eb546aaf1ac165c77ccb89c40d758).
That revision appears to have broken Arm memcpy in some subtle ways. Am communicating with the original author to get a good reproduction.
2022-10-14 | [libc] New version of the mem* framework | Guillaume Chatelet
This version is more composable and also simpler at the expense of being more explicit and more verbose. It also provides minimal implementations for ARM platforms.
Codegen can be checked here https://godbolt.org/z/chf1Y6eGM
Differential Revision: https://reviews.llvm.org/D135134
2022-10-14 | Revert "[libc] New version of the mem* framework" | Guillaume Chatelet
This reverts commit 9721687835a7df5da0c9482cf684c11b8ba97f75.
2022-10-14 | [libc] New version of the mem* framework | Guillaume Chatelet
This version is more composable and also simpler at the expense of being more explicit and more verbose. It also provides minimal implementations for ARM platforms.
Codegen can be checked here https://godbolt.org/z/x19zvE59v
Differential Revision: https://reviews.llvm.org/D135134
2022-10-14 | Revert "[libc] New version of the mem* framework" | Guillaume Chatelet
This reverts commit 98bf836f3127a346a81da5ae3e27246935298de4.
2022-10-14 | [libc] New version of the mem* framework | Guillaume Chatelet
This version is more composable and also simpler at the expense of being more explicit and more verbose. It also provides minimal implementations for ARM platforms.
Codegen can be checked here https://godbolt.org/z/x19zvE59v
Differential Revision: https://reviews.llvm.org/D135134
2022-10-13 | Revert "[libc] New version of the mem* framework" | Guillaume Chatelet
This reverts commit d55f2d8ab076298cfd745c05c1b4dfd5583f8b9e.
2022-10-13 | [libc] New version of the mem* framework | Guillaume Chatelet
This version is more composable and also simpler at the expense of being more explicit and more verbose. It also provides minimal implementations for ARM platforms.
Codegen can be checked here https://godbolt.org/z/x19zvE59v
Differential Revision: https://reviews.llvm.org/D135134
2022-10-12 | Revert "[libc] New version of the mem* framework" | Guillaume Chatelet
This reverts commit 4c19439d249256db720e323a446e39d05496732f.
2022-10-12 | [libc] New version of the mem* framework | Guillaume Chatelet
This version is more composable and also simpler at the expense of being more explicit and more verbose. This patch is not meant to be submitted but gives an idea of the change.
Codegen can be checked in https://godbolt.org/z/6z1dEoWbs by removing the "static inline" before individual functions. Unittests are coming.
Suggested review order:
- utils
- op_base
- op_builtin
- op_generic
- op_x86 / op_aarch64
- *_implementations.h
Differential Revision: https://reviews.llvm.org/D135134
2022-02-08 | [libc] Optimized version of memmove | Guillaume Chatelet
This implementation relies on storing data in registers for sizes up to 128B. Then depending on whether `dst` is less (resp. greater) than `src` we move data forward (resp. backward) by chunks of 32B. We first make sure one of the pointers is aligned to increase performance on large move sizes.
Differential Revision: https://reviews.llvm.org/D114637
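The direction choice described above can be sketched roughly as below; this is a simplified illustration, not the implementation from D114637. `Chunk`, `move_forward`, `move_backward` and `memmove_chunked` are hypothetical names, the register path for sizes up to 128B is replaced by a call to `std::memmove`, and the pointer-alignment step is omitted entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kChunk = 32;          // chunk size named in the commit
constexpr std::size_t kRegisterLimit = 128; // sizes handled "in registers"

struct Chunk { unsigned char bytes[kChunk]; };

inline Chunk load_chunk(const unsigned char *p) {
  Chunk c;
  std::memcpy(c.bytes, p, kChunk);
  return c;
}

inline void store_chunk(unsigned char *p, const Chunk &c) {
  std::memcpy(p, c.bytes, kChunk);
}

// dst below src: walk upward. The last chunk is loaded first because the
// loop's stores may overwrite those source bytes before they are read.
inline void move_forward(unsigned char *d, const unsigned char *s,
                         std::size_t count) {
  assert(count >= kChunk);
  const Chunk tail = load_chunk(s + count - kChunk);
  for (std::size_t offset = 0; offset + kChunk <= count; offset += kChunk)
    store_chunk(d + offset, load_chunk(s + offset));
  store_chunk(d + count - kChunk, tail);
}

// dst at or above src: walk downward, saving the first chunk up front.
inline void move_backward(unsigned char *d, const unsigned char *s,
                          std::size_t count) {
  assert(count >= kChunk);
  const Chunk head = load_chunk(s);
  for (std::size_t offset = count; offset >= kChunk; offset -= kChunk)
    store_chunk(d + offset - kChunk, load_chunk(s + offset - kChunk));
  store_chunk(d, head);
}

inline void *memmove_chunked(void *dst, const void *src, std::size_t count) {
  if (count <= kRegisterLimit)
    return std::memmove(dst, src, count); // stand-in for the register path
  auto *d = static_cast<unsigned char *>(dst);
  const auto *s = static_cast<const unsigned char *>(src);
  if (reinterpret_cast<std::uintptr_t>(d) < reinterpret_cast<std::uintptr_t>(s))
    move_forward(d, s, count);
  else
    move_backward(d, s, count);
  return dst;
}
```

In both directions each chunk is fully loaded before it is stored, and the walk never writes to source bytes that a later iteration still has to read, so the chunk loop is safe for overlapping buffers.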
2021-12-07 | [libc] apply new lint rules | Michael Jones
This patch applies the lint rules described in the previous patch. There was also a significant amount of effort put into manually fixing things, since all of the templated functions, or structs defined in /spec, were not updated and had to be handled manually.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D114302
2021-11-26 | [libc] Make string entrypoints mutually exclusive. | Siva Chandra Reddy
For example, strcpy does not pull memcpy now.
Reviewed By: gchatelet
Differential Revision: https://reviews.llvm.org/D114300
2021-09-14 | [libc][Obvious] Some clean work with memmove. | Cheng Wang
2021-09-13 | Revert "[libc] Some clean work with memmove." | Guillaume Chatelet
This reverts commit b659b789c03ac339e28d7b91406b67bb887a426d.
2021-09-10 | [libc] Some clean work with memmove. | Cheng Wang
- Replace `move_byte_forward()` with `memcpy`. In `memcpy` implementation, it copies bytes forward from beginning to end. Otherwise, `memmove` unit tests will break.
- Make `memmove` unit tests work.
Reviewed By: gchatelet
Differential Revision: https://reviews.llvm.org/D109316
2021-03-11 | [libc][NFC] Move the template implementation of integer_abs to __support. | Siva Chandra Reddy
This eliminates cross-header dependency from stdlib to string.
2021-01-19 | [libc][NFC] remove dependency on non standard ssize_t | Guillaume Chatelet
`ssize_t` is from POSIX and is not standard unfortunately. Rewriting the code so it doesn't depend on it.
Differential Revision: https://reviews.llvm.org/D94760
2021-01-15 | [libc] Add memmove implementation. | Cheng Wang
Use `memcpy` rather than copying bytes one by one, for there might be large size structs to move.
Reviewed By: gchatelet, sivachandra
Differential Revision: https://reviews.llvm.org/D93195