<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glibc.git/localedata/Makefile, branch master</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/glibc.git/'/>
<entry>
<title>stdio-common: Reject insufficient character data in scanf [BZ #12701]</title>
<updated>2025-08-23T00:02:46+00:00</updated>
<author>
<name>Maciej W. Rozycki</name>
<email>macro@redhat.com</email>
</author>
<published>2025-08-23T00:02:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/glibc.git/commit/?id=2b16c76609350ec887d49afd4447743da38f7fab'/>
<id>2b16c76609350ec887d49afd4447743da38f7fab</id>
<content type='text'>
Reject invalid formatted scanf character data with the 'c' conversion
where there is not enough input available to satisfy the field width
requested.  It is required by ISO C that this conversion matches a
sequence of characters of exactly the number specified by the field
width and it is also already documented as such in our own manual:

"It reads precisely the next N characters, and fails if it cannot get
that many."

Currently a matching success is instead incorrectly produced where the
EOF condition is encountered before the required number of characters
has been retrieved, and the characters actually obtained are stored in
the buffer provided.

Add test cases accordingly and remove placeholders from 'c' conversion
input data for the existing scanf tests.

Reviewed-by: Adhemerval Zanella &lt;adhemerval.zanella@linaro.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reject invalid formatted scanf character data with the 'c' conversion
where there is not enough input available to satisfy the field width
requested.  It is required by ISO C that this conversion matches a
sequence of characters of exactly the number specified by the field
width and it is also already documented as such in our own manual:

"It reads precisely the next N characters, and fails if it cannot get
that many."

Currently a matching success is instead incorrectly produced where the
EOF condition is encountered before the required number of characters
has been retrieved, and the characters actually obtained are stored in
the buffer provided.

Add test cases accordingly and remove placeholders from 'c' conversion
input data for the existing scanf tests.

Reviewed-by: Adhemerval Zanella &lt;adhemerval.zanella@linaro.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>stdio-common: Don't read real input beyond the field width in scanf</title>
<updated>2025-08-11T16:42:12+00:00</updated>
<author>
<name>Maciej W. Rozycki</name>
<email>macro@redhat.com</email>
</author>
<published>2025-08-11T16:42:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/glibc.git/commit/?id=b692181703e59174bdb3d9a5f696326f10f7a13b'/>
<id>b692181703e59174bdb3d9a5f696326f10f7a13b</id>
<content type='text'>
Fix a code pattern that repeats across '__vfscanf_internal' where the
remaining field width of 0 is incorrectly interpreted as no width limit,
which in turn results in reading input beyond the limit requested.  The
lack of width limit is indicated by the field width of -1 rather than 0,
set earlier on in the function.

The problematic code pattern is used for both integer and floating-point
conversions, but in the former case a corresponding conditional earlier
on prevents the field width from being 0 when executing the pattern.  It
does trigger in the latter case, where the decimal point is a multibyte
character or for multibyte digit characters.

Fix the code pattern by using 'width &gt; 0' comparison, and apply the fix
throughout even to code handling integer conversions so as to interpret
the field width consistently and avoid people's confusion even if width
cannot be 0 at those places.

For multibyte digit characters there is an additional issue that causes
code to push back a partially fetched multibyte character multiple times
as execution proceeds through matching data retrieved against individual
digits that have to be rejected due to the field width limit preventing
the rest of the multibyte character from being retrieved.  It is because
code relies on 'ungetc' ignoring a request to push back EOF, however in
the out-of-limit field width condition the data held is not EOF but the
previously retrieved character byte instead.

Fix this issue by artificially assigning EOF to the character byte
storage variable where the out-of-limit field width condition prevents
further processing, and also apply the fix throughout except for the
decimal point/thousands separator case, which uses different code.

Add test cases accordingly.

Reviewed-by: Adhemerval Zanella &lt;adhemerval.zanella@linaro.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix a code pattern that repeats across '__vfscanf_internal' where the
remaining field width of 0 is incorrectly interpreted as no width limit,
which in turn results in reading input beyond the limit requested.  The
lack of width limit is indicated by the field width of -1 rather than 0,
set earlier on in the function.

The problematic code pattern is used for both integer and floating-point
conversions, but in the former case a corresponding conditional earlier
on prevents the field width from being 0 when executing the pattern.  It
does trigger in the latter case, where the decimal point is a multibyte
character or for multibyte digit characters.

Fix the code pattern by using 'width &gt; 0' comparison, and apply the fix
throughout even to code handling integer conversions so as to interpret
the field width consistently and avoid people's confusion even if width
cannot be 0 at those places.

For multibyte digit characters there is an additional issue that causes
code to push back a partially fetched multibyte character multiple times
as execution proceeds through matching data retrieved against individual
digits that have to be rejected due to the field width limit preventing
the rest of the multibyte character from being retrieved.  It is because
code relies on 'ungetc' ignoring a request to push back EOF, however in
the out-of-limit field width condition the data held is not EOF but the
previously retrieved character byte instead.

Fix this issue by artificially assigning EOF to the character byte
storage variable where the out-of-limit field width condition prevents
further processing, and also apply the fix throughout except for the
decimal point/thousands separator case, which uses different code.

Add test cases accordingly.

Reviewed-by: Adhemerval Zanella &lt;adhemerval.zanella@linaro.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>stdio-common: Also reject exp char w/o significand in i18n scanf [BZ #13988]</title>
<updated>2025-03-28T12:35:53+00:00</updated>
<author>
<name>Maciej W. Rozycki</name>
<email>macro@redhat.com</email>
</author>
<published>2025-03-28T12:35:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/glibc.git/commit/?id=a26638424ffea604f7ef94d0c6f3940304698442'/>
<id>a26638424ffea604f7ef94d0c6f3940304698442</id>
<content type='text'>
Fix the handling of real 'scanf' input such as "+.e" as per BZ #13988
for the i18n case as well, complementing commit 6ecec3b616ae ("Don't
accept exp char without preceding digits in scanf float parsing"), where
the 'e' character is incorrectly consumed from input.  Add a test case
matching stdio-common/bug26.c, with bits from localedata/tst-sscanf.c.

Reviewed-by: Joseph Myers &lt;josmyers@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix the handling of real 'scanf' input such as "+.e" as per BZ #13988
for the i18n case as well, complementing commit 6ecec3b616ae ("Don't
accept exp char without preceding digits in scanf float parsing"), where
the 'e' character is incorrectly consumed from input.  Add a test case
matching stdio-common/bug26.c, with bits from localedata/tst-sscanf.c.

Reviewed-by: Joseph Myers &lt;josmyers@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Update copyright dates with scripts/update-copyrights</title>
<updated>2025-01-01T19:22:09+00:00</updated>
<author>
<name>Paul Eggert</name>
<email>eggert@cs.ucla.edu</email>
</author>
<published>2025-01-01T18:14:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/glibc.git/commit/?id=2642002380aafb71a1d3b569b6d7ebeab3284816'/>
<id>2642002380aafb71a1d3b569b6d7ebeab3284816</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Define ISO 639-3 "ltg" (Latgalian) and add ltg_LV locale</title>
<updated>2024-06-17T08:53:16+00:00</updated>
<author>
<name>Mike FABIAN</name>
<email>mfabian@redhat.com</email>
</author>
<published>2024-06-10T17:54:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/glibc.git/commit/?id=3ea79f50853afcbe17d6a4e2537e1bd5a2d17e7d'/>
<id>3ea79f50853afcbe17d6a4e2537e1bd5a2d17e7d</id>
<content type='text'>
Resolves: BZ # 31411

References:
https://iso639-3.sil.org/code/ltg
https://en.wikipedia.org/wiki/Latgalian_language
https://github.com/unicode-org/cldr/blob/main/common/main/ltg.xml
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Resolves: BZ # 31411

References:
https://iso639-3.sil.org/code/ltg
https://en.wikipedia.org/wiki/Latgalian_language
https://github.com/unicode-org/cldr/blob/main/common/main/ltg.xml
</pre>
</div>
</content>
</entry>
<entry>
<title>localedata: add mdf_RU locale</title>
<updated>2024-05-08T12:27:40+00:00</updated>
<author>
<name>Mike FABIAN</name>
<email>mfabian@redhat.com</email>
</author>
<published>2024-05-07T09:12:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/glibc.git/commit/?id=79fe4a0fa07d0fff98888965f47842251f554dd4'/>
<id>79fe4a0fa07d0fff98888965f47842251f554dd4</id>
<content type='text'>
Resolves: BZ # 31530
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Resolves: BZ # 31530
</pre>
</div>
</content>
</entry>
<entry>
<title>locale: Handle loading a missing locale twice (Bug 14247)</title>
<updated>2024-04-22T20:03:00+00:00</updated>
<author>
<name>Carlos O'Donell</name>
<email>carlos@redhat.com</email>
</author>
<published>2024-04-22T12:16:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/glibc.git/commit/?id=684fbab755e727a8c15f8b621648d66694cd1f53'/>
<id>684fbab755e727a8c15f8b621648d66694cd1f53</id>
<content type='text'>
Delay setting file-&gt;decided until the data has been successfully loaded
by _nl_load_locale().  If the function fails to load the data then we
must return and error and leave decided untouched to allow the caller to
attempt to load the data again at a later time.  We should not set
decided to 1 early in the function since doing so may prevent attempting
to load it again. We want to try loading it again because that allows an
open to fail and set errno correctly.

On the other side of this problem is that if we are called again with
the same inputs we will fetch the cached version of the object and carry
out no open syscalls and that fails to set errno so we must set errno to
ENOENT in that case.  There is a second code path that has to be handled
where the name of the locale matches but the codeset doesn't match.

These changes ensure that errno is correctly set on failure in all the
return paths in _nl_find_locale().

Adds tst-locale-loadlocale to cover the bug.

No regressions on x86_64.

Co-authored-by: Jeff Law &lt;law@redhat.com&gt;
Reviewed-by: Adhemerval Zanella  &lt;adhemerval.zanella@linaro.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Delay setting file-&gt;decided until the data has been successfully loaded
by _nl_load_locale().  If the function fails to load the data then we
must return and error and leave decided untouched to allow the caller to
attempt to load the data again at a later time.  We should not set
decided to 1 early in the function since doing so may prevent attempting
to load it again. We want to try loading it again because that allows an
open to fail and set errno correctly.

On the other side of this problem is that if we are called again with
the same inputs we will fetch the cached version of the object and carry
out no open syscalls and that fails to set errno so we must set errno to
ENOENT in that case.  There is a second code path that has to be handled
where the name of the locale matches but the codeset doesn't match.

These changes ensure that errno is correctly set on failure in all the
return paths in _nl_find_locale().

Adds tst-locale-loadlocale to cover the bug.

No regressions on x86_64.

Co-authored-by: Jeff Law &lt;law@redhat.com&gt;
Reviewed-by: Adhemerval Zanella  &lt;adhemerval.zanella@linaro.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>localedata: Sort Makefile variables.</title>
<updated>2024-01-10T19:08:26+00:00</updated>
<author>
<name>Carlos O'Donell</name>
<email>carlos@redhat.com</email>
</author>
<published>2024-01-09T13:29:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/glibc.git/commit/?id=a09b2aacd9a23969aa3d768a22fe491e1ee98cf3'/>
<id>a09b2aacd9a23969aa3d768a22fe491e1ee98cf3</id>
<content type='text'>
Sort Makefile variables using scrips/sort-makefile-lines.py.

No regressions on x86_64.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Sort Makefile variables using scrips/sort-makefile-lines.py.

No regressions on x86_64.
</pre>
</div>
</content>
</entry>
<entry>
<title>Update copyright dates with scripts/update-copyrights</title>
<updated>2024-01-01T18:53:40+00:00</updated>
<author>
<name>Paul Eggert</name>
<email>eggert@cs.ucla.edu</email>
</author>
<published>2024-01-01T18:12:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/glibc.git/commit/?id=dff8da6b3e89b986bb7f6b1ec18cf65d5972e307'/>
<id>dff8da6b3e89b986bb7f6b1ec18cf65d5972e307</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Adapt collation in th_TH locale to use the iso14651_t1_common file and sync the collation with CLDR</title>
<updated>2023-09-21T08:34:35+00:00</updated>
<author>
<name>Mike FABIAN</name>
<email>mfabian@redhat.com</email>
</author>
<published>2023-06-01T15:02:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.belthelziquor.com/glibc.git/commit/?id=aceda10bd5131cf716830827d66da9c671dec649'/>
<id>aceda10bd5131cf716830827d66da9c671dec649</id>
<content type='text'>
I made it to agree as much as possible with the rules from CLDR (see:
https://github.com/unicode-org/cldr/blob/main/common/collation/th.xml).

It seems to be impossible to follow the CLDR rules

  &amp;[before 1]๚&lt;ฯ # should be "variable"

and

  &amp;๛&lt;ๆ # should be "variable"

exactly though. These ask for a primary difference in punctuation
characters whose primary weight should be "IGNORE". But using a
secondary differnence instead still sorts the test data correctly and
the previously used collation in th_TH used tertiary differences for
these characters.

There was old localedata/th_TH.in test data in TIS-620 encoding which
was not used (it was not in the localedata/Makefile). I converted this
to UTF-8 and moved it to localedata/th_TH.UTF-8.in and added it to
localedata/Makefile.

Using the existing collation rules in the th_TH locale did not sort that
test file completely correct, I think my new collation rules based on
iso14651_t1 are better.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I made it to agree as much as possible with the rules from CLDR (see:
https://github.com/unicode-org/cldr/blob/main/common/collation/th.xml).

It seems to be impossible to follow the CLDR rules

  &amp;[before 1]๚&lt;ฯ # should be "variable"

and

  &amp;๛&lt;ๆ # should be "variable"

exactly though. These ask for a primary difference in punctuation
characters whose primary weight should be "IGNORE". But using a
secondary differnence instead still sorts the test data correctly and
the previously used collation in th_TH used tertiary differences for
these characters.

There was old localedata/th_TH.in test data in TIS-620 encoding which
was not used (it was not in the localedata/Makefile). I converted this
to UTF-8 and moved it to localedata/th_TH.UTF-8.in and added it to
localedata/Makefile.

Using the existing collation rules in the th_TH locale did not sort that
test file completely correct, I think my new collation rules based on
iso14651_t1 are better.
</pre>
</div>
</content>
</entry>
</feed>
