glibc.git/include/stdlib.h, branch master

debug: Improve '%n' fortify detection (BZ 30932)

2025-03-21T18:46:48+00:00

The 7bb8045ec0 path made the '%n' fortify check ignore EMFILE errors
while trying to open /proc/self/maps, and this added a security
issue where EMFILE can be attacker-controlled thus making it
ineffective for some cases.

The EMFILE failure is reinstated but with a different error
message.  Also, to improve the false positive of the hardening for
the cases where no new files can be opened, the
_dl_readonly_area now uses  _dl_find_object to check if the
memory area is within a writable ELF segment.  The procfs method is
still used as fallback.

Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Arjun Shankar

stdlib: Make abort/_Exit AS-safe (BZ 26275)

2024-10-08T17:40:12+00:00

The recursive lock used on abort does not synchronize with a new process
creation (either by fork-like interfaces or posix_spawn ones), nor it
is reinitialized after fork().

Also, the SIGABRT unblock before raise() shows another race condition,
where a fork or posix_spawn() call by another thread, just after the
recursive lock release and before the SIGABRT signal, might create
programs with a non-expected signal mask.  With the default option
(without POSIX_SPAWN_SETSIGDEF), the process can see SIG_DFL for
SIGABRT, where it should be SIG_IGN.

To fix the AS-safe, raise() does not change the process signal mask,
and an AS-safe lock is used if a SIGABRT is installed or the process
is blocked or ignored.  With the signal mask change removal,
there is no need to use a recursive loc.  The lock is also taken on
both _Fork() and posix_spawn(), to avoid the spawn process to see the
abort handler as SIG_DFL.

A read-write lock is used to avoid serialize _Fork and posix_spawn
execution.  Both sigaction (SIGABRT) and abort() requires to lock
as writer (since both change the disposition).

The fallback is also simplified: there is no need to use a loop of
ABORT_INSTRUCTION after _exit() (if the syscall does not terminate the
process, the system is broken).

The proposed fix changes how setjmp works on a SIGABRT handler, where
glibc does not save the signal mask.  So usage like the below will now
always abort.

  static volatile int chk_fail_ok;
  static jmp_buf chk_fail_buf;

  static void
  handler (int sig)
  {
    if (chk_fail_ok)
      {
        chk_fail_ok = 0;
        longjmp (chk_fail_buf, 1);
      }
    else
      _exit (127);
  }
  [...]
  signal (SIGABRT, handler);
  [....]
  chk_fail_ok = 1;
  if (! setjmp (chk_fail_buf))
    {
      // Something that can calls abort, like a failed fortify function.
      chk_fail_ok = 0;
      printf ("FAIL\n");
    }

Such cases will need to use sigsetjmp instead.

The _dl_start_profile calls sigaction through _profil, and to avoid
pulling abort() on loader the call is replaced with __libc_sigaction.

Checked on x86_64-linux-gnu and aarch64-linux-gnu.

Reviewed-by: DJ Delorie

Refer to C23 in place of C2X in glibc

2024-02-01T11:02:01+00:00

WG14 decided to use the name C23 as the informal name of the next
revision of the C standard (notwithstanding the publication date in
2024).  Update references to C2X in glibc to use the C23 name.

This is intended to update everything *except* where it involves
renaming files (the changes involving renaming tests are intended to
be done separately).  In the case of the _ISOC2X_SOURCE feature test
macro - the only user-visible interface involved - support for that
macro is kept for backwards compatibility, while adding
_ISOC23_SOURCE.

Tested for x86_64.

stdlib: Remove use of mergesort on qsort (BZ 21719)

2023-10-31T17:18:05+00:00

This patch removes the mergesort optimization on qsort implementation
and uses the introsort instead.  The mergesort implementation has some
issues:

  - It is as-safe only for certain types sizes (if total size is less
    than 1 KB with large element sizes also forcing memory allocation)
    which contradicts the function documentation.  Although not required
    by the C standard, it is preferable and doable to have an O(1) space
    implementation.

  - The malloc for certain element size and element number adds
    arbitrary latency (might even be worse if malloc is interposed).

  - To avoid trigger swap from memory allocation the implementation
    relies on system information that might be virtualized (for instance
    VMs with overcommit memory) which might lead to potentially use of
    swap even if system advertise more memory than actually has.  The
    check also have the downside of issuing syscalls where none is
    expected (although only once per execution).

  - The mergesort is suboptimal on an already sorted array (BZ#21719).

The introsort implementation is already optimized to use constant extra
space (due to the limit of total number of elements from maximum VM
size) and thus can be used to avoid the malloc usage issues.

Resulting performance is slower due the usage of qsort, specially in the
worst-case scenario (partialy or sorted arrays) and due the fact
mergesort uses a slight improved swap operations.

This change also renders the BZ#21719 fix unrequired (since it is meant
to fix the sorted input performance degradation for mergesort).  The
manual is also updated to indicate the function is now async-cancel
safe.

Checked on x86_64-linux-gnu.
Reviewed-by: Noah Goldstein

__call_tls_dtors: Use call_function_static_weak

2023-09-04T18:03:37+00:00

chk: Add and fix hidden builtin definitions for *_chk

2023-08-03T20:46:48+00:00

Otherwise on gnu-i686 there are unwanted PLT entries in libc.so when
fortification is enabled.

Tested for i686-gnu, x86_64-gnu, i686-linux-gnu and x86_64-linux-gnu

C2x strtol binary constant handling

2023-02-16T23:02:40+00:00

C2x adds binary integer constants starting with 0b or 0B, and supports
those constants in strtol-family functions when the base passed is 0
or 2.  Implement that strtol support for glibc.

As discussed at
,
this is incompatible with previous C standard versions, in that such
an input string starting with 0b or 0B was previously required to be
parsed as 0 (with the rest of the string unprocessed).  Thus, as
proposed there, this patch adds 20 new __isoc23_* functions with
appropriate header redirection support.  This patch does *not* do
anything about scanf %i (which will need 12 new functions per long
double variant, so 12, 24 or 36 depending on the glibc configuration),
instead leaving that for a future patch.  The function names would
remain as __isoc23_* even if C2x ends up published in 2024 rather than
2023.

Making this change leads to the question of what should happen to
internal uses of these functions in glibc and its tests.  The header
redirection (which applies for _GNU_SOURCE or any other feature test
macros enabling C2x features) has the effect of redirecting internal
uses but without those uses then ending up at a hidden alias (see the
comment in include/stdio.h about interaction with libc_hidden_proto).
It seems desirable for the default for internal uses to be the same
versions used by normal code using _GNU_SOURCE, so rather than doing
anything to disable that redirection, similar macro definitions to
those in include/stdio.h are added to the include/ headers for the new
functions.

Given that the default for uses in glibc is for the redirections to
apply, the next question is whether the C2x semantics are correct for
all those uses.  Uses with the base fixed to 10, 16 or any other value
other than 0 or 2 can be ignored.  I think this leaves the following
internal uses to consider (an important consideration for review of
this patch will be both whether this list is complete and whether my
conclusions on all entries in it are correct):

benchtests/bench-malloc-simple.c
benchtests/bench-string.h
elf/sotruss-lib.c
math/libm-test-support.c
nptl/perf.c
nscd/nscd_conf.c
nss/nss_files/files-parse.c
posix/tst-fnmatch.c
posix/wordexp.c
resolv/inet_addr.c
rt/tst-mqueue7.c
soft-fp/testit.c
stdlib/fmtmsg.c
support/support_test_main.c
support/test-container.c
sysdeps/pthread/tst-mutex10.c

I think all of these places are OK with the new semantics, except for
resolv/inet_addr.c, where the POSIX semantics of inet_addr do not
allow for binary constants; thus, I changed that file (to use
__strtoul_internal, whose semantics are unchanged) and added a test
for this case.  In the case of posix/wordexp.c I think accepting
binary constants is OK since POSIX explicitly allows additional forms
of shell arithmetic expressions, and in stdlib/fmtmsg.c SEV_LEVEL is
not in POSIX so again I think accepting binary constants is OK.

Functions such as __strtol_internal, which are only exported for
compatibility with old binaries from when those were used in inline
functions in headers, have unchanged semantics; the __*_l_internal
versions (purely internal to libc and not exported) have a new
argument to specify whether to accept binary constants.

As well as for the standard functions, the header redirection also
applies to the *_l versions (GNU extensions), and to legacy functions
such as strtoq, to avoid confusing inconsistency (the *q functions
redirect to __isoc23_*ll rather than needing their own __isoc23_*
entry points).  For the functions that are only declared with
_GNU_SOURCE, this means the old versions are no longer available for
normal user programs at all.  An internal __GLIBC_USE_C2X_STRTOL macro
is used to control the redirections in the headers, and cases in glibc
that wish to avoid the redirections - the function implementations
themselves and the tests of the old versions of the GNU functions -
then undefine and redefine that macro to allow the old versions to be
accessed.  (There would of course be greater complexity should we wish
to make any of the old versions into compat symbols / avoid them being
defined at all for new glibc ABIs.)

strtol_l.c has some similarity to strtol.c in gnulib, but has already
diverged some way (and isn't listed at all at
https://sourceware.org/glibc/wiki/SharedSourceFiles unlike strtoll.c
and strtoul.c); I haven't made any attempts at gnulib compatibility in
the changes to that file.

I note incidentally that inttypes.h and wchar.h are missing the
__nonnull present on declarations of this family of functions in
stdlib.h; I didn't make any changes in that regard for the new
declarations added.

arc4random: simplify design for better safety

2022-07-27T11:58:27+00:00

Rather than buffering 16 MiB of entropy in userspace (by way of
chacha20), simply call getrandom() every time.

This approach is doubtlessly slower, for now, but trying to prematurely
optimize arc4random appears to be leading toward all sorts of nasty
properties and gotchas. Instead, this patch takes a much more
conservative approach. The interface is added as a basic loop wrapper
around getrandom(), and then later, the kernel and libc together can
work together on optimizing that.

This prevents numerous issues in which userspace is unaware of when it
really must throw away its buffer, since we avoid buffering all
together. Future improvements may include userspace learning more from
the kernel about when to do that, which might make these sorts of
chacha20-based optimizations more possible. The current heuristic of 16
MiB is meaningless garbage that doesn't correspond to anything the
kernel might know about. So for now, let's just do something
conservative that we know is correct and won't lead to cryptographic
issues for users of this function.

This patch might be considered along the lines of, "optimization is the
root of all evil," in that the much more complex implementation it
replaces moves too fast without considering security implications,
whereas the incremental approach done here is a much safer way of going
about things. Once this lands, we can take our time in optimizing this
properly using new interplay between the kernel and userspace.

getrandom(0) is used, since that's the one that ensures the bytes
returned are cryptographically secure. But on systems without it, we
fallback to using /dev/urandom. This is unfortunate because it means
opening a file descriptor, but there's not much of a choice. Secondly,
as part of the fallback, in order to get more or less the same
properties of getrandom(0), we poll on /dev/random, and if the poll
succeeds at least once, then we assume the RNG is initialized. This is a
rough approximation, as the ancient "non-blocking pool" initialized
after the "blocking pool", not before, and it may not port back to all
ancient kernels, though it does to all kernels supported by glibc
(≥3.2), so generally it's the best approximation we can do.

The motivation for including arc4random, in the first place, is to have
source-level compatibility with existing code. That means this patch
doesn't attempt to litigate the interface itself. It does, however,
choose a conservative approach for implementing it.

Cc: Adhemerval Zanella Netto 
Cc: Florian Weimer 
Cc: Cristian Rodríguez 
Cc: Paul Eggert 
Cc: Mark Harris 
Cc: Eric Biggers 
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Jason A. Donenfeld 

Reviewed-by: Adhemerval Zanella

stdlib: Add arc4random, arc4random_buf, and arc4random_uniform (BZ #4417)

2022-07-22T14:58:27+00:00

The implementation is based on scalar Chacha20 with per-thread cache.
It uses getrandom or /dev/urandom as fallback to get the initial entropy,
and reseeds the internal state on every 16MB of consumed buffer.

To improve performance and lower memory consumption the per-thread cache
is allocated lazily on first arc4random functions call, and if the
memory allocation fails getentropy or /dev/urandom is used as fallback.
The cache is also cleared on thread exit iff it was initialized (so if
arc4random is not called it is not touched).

Although it is lock-free, arc4random is still not async-signal-safe
(the per thread state is not updated atomically).

The ChaCha20 implementation is based on RFC8439 [1], omitting the final
XOR of the keystream with the plaintext because the plaintext is a
stream of zeros.  This strategy is similar to what OpenBSD arc4random
does.

The arc4random_uniform is based on previous work by Florian Weimer,
where the algorithm is based on Jérémie Lumbroso paper Optimal Discrete
Uniform Generation from Coin Flips, and Applications (2013) [2], who
credits Donald E. Knuth and Andrew C. Yao, The complexity of nonuniform
random number generation (1976), for solving the general case.

The main advantage of this method is the that the unit of randomness is not
the uniform random variable (uint32_t), but a random bit.  It optimizes the
internal buffer sampling by initially consuming a 32-bit random variable
and then sampling byte per byte.  Depending of the upper bound requested,
it might lead to better CPU utilization.

Checked on x86_64-linux-gnu, aarch64-linux, and powerpc64le-linux-gnu.

Co-authored-by: Florian Weimer 
Reviewed-by: Yann Droneaud 

[1] https://datatracker.ietf.org/doc/html/rfc8439
[2] https://arxiv.org/pdf/1304.1916.pdf

Remove morecore and default_morecore

2021-07-22T13:07:57+00:00

Make the __morecore and __default_morecore symbols compat-only and
remove their declarations from the API.  Also, include morecore.c
directly into malloc.c; this should ideally get merged into malloc in
a future cleanup.

Reviewed-by: Carlos O'Donell 
Tested-by: Carlos O'Donell