| author | Joseph Myers <josmyers@redhat.com> | 2024-12-03 13:01:58 +0000 |
|---|---|---|
| committer | Joseph Myers <josmyers@redhat.com> | 2024-12-03 13:01:58 +0000 |
| commit | f3b5de944ad6d1f6a10f819b816c2ba234ecd8c0 (patch) | |
| tree | 45c15b02d46a42f4ffc5167fff2fae29ebc36b0e /libcpp/charset.cc | |
| parent | af9a3fe6a52974252516b3eea4c5ab5caae47b4b (diff) | |
preprocessor: Adjust C rules on UCNs for C23 [PR117162]
As noted in bug 117162, C23 changed some rules on UCNs to match C++
(this was a late change agreed in the resolution to CD2 comment
US-032, implementing changes from N3124), which we need to implement.
Allow UCNs below 0xa0 outside identifiers for C, with a
pedwarn-if-pedantic before C23 (and a warning with -Wc11-c23-compat)
except for the always-allowed cases of UCNs for $ @ `. Also as part
of that change, do not allow \u0024 in identifiers as equivalent to $
for C23.
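The rule for UCNs outside identifiers can be modelled as a small decision function. This is only an illustrative sketch, not the libcpp code: `enum ucn_outcome`, `struct ucn_flags` and `classify_ucn_outside_identifier` are hypothetical names, the flags stand in for the corresponding `cpp_options` fields, and the model assumes a diagnostic is always emitted once requested (the real code falls back to the compat warning if the pedwarn was not issued).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome labels; libcpp reports these cases through
   cpp_error, cpp_pedwarning and cpp_warning instead.  */
enum ucn_outcome
{
  UCN_OK,       /* accepted silently */
  UCN_PEDWARN,  /* accepted, pedwarn when pedantic before C23 */
  UCN_WARN,     /* accepted, warning with -Wc11-c23-compat */
  UCN_ERROR     /* rejected: high bit set or a surrogate */
};

/* Hypothetical flag bundle standing in for the relevant cpp_options.  */
struct ucn_flags
{
  bool cplusplus;            /* compiling C++ */
  bool low_ucns;             /* assumed true for C23 and later */
  bool pedantic;             /* -pedantic or -pedantic-errors */
  bool warn_c11_c23_compat;  /* -Wc11-c23-compat */
};

/* Classify code point CP appearing as a UCN outside an identifier.  */
static enum ucn_outcome
classify_ucn_outside_identifier (uint32_t cp, const struct ucn_flags *f)
{
  assert (f != NULL);

  /* Surrogates and values with the top bit set stay hard errors.  */
  if ((cp & 0x80000000u) || (cp >= 0xD800 && cp <= 0xDFFF))
    return UCN_ERROR;

  /* Below 0xa0 in C, other than $ @ `, which were always allowed.  */
  if (cp < 0xa0 && !f->cplusplus
      && cp != 0x24 && cp != 0x40 && cp != 0x60)
    {
      if (!f->low_ucns && f->pedantic)
        return UCN_PEDWARN;
      if (f->warn_c11_c23_compat)
        return UCN_WARN;
    }

  return UCN_OK;
}
```

Under this model, `\u0041` in a string literal gives `UCN_PEDWARN` for pedantic C17, `UCN_WARN` for C23 with `-Wc11-c23-compat`, and `UCN_OK` for plain C23.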
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
PR c/117162
libcpp/
* include/cpplib.h (struct cpp_options): Add low_ucns.
* init.cc (struct lang_flags, lang_defaults): Add low_ucns.
(cpp_set_lang): Set low_ucns.
* charset.cc (_cpp_valid_ucn): For C, allow UCNs below 0xa0
outside identifiers, with a pedwarn if pedantic before C23 or a
warning with -Wc11-c23-compat. Do not allow \u0024 in identifiers
for C23.
gcc/testsuite/
* gcc.dg/cpp/c17-ucn-1.c, gcc.dg/cpp/c17-ucn-2.c,
gcc.dg/cpp/c17-ucn-3.c, gcc.dg/cpp/c17-ucn-4.c,
gcc.dg/cpp/c23-ucn-2.c, gcc.dg/cpp/c23-ucnid-2.c: New tests.
* c-c++-common/cpp/delimited-escape-seq-3.c,
c-c++-common/cpp/named-universal-char-escape-3.c,
gcc.dg/cpp/c23-ucn-1.c, gcc.dg/cpp/c2y-delimited-escape-seq-3.c:
Update expected messages.
* gcc.dg/cpp/ucs.c: Use -pedantic-errors. Update expected
messages.
Diffstat (limited to 'libcpp/charset.cc')
| -rw-r--r-- | libcpp/charset.cc | 34 |
1 files changed, 24 insertions, 10 deletions
diff --git a/libcpp/charset.cc b/libcpp/charset.cc
index 9337fbc3a7a..72049e61fa8 100644
--- a/libcpp/charset.cc
+++ b/libcpp/charset.cc
@@ -1811,14 +1811,7 @@ _cpp_valid_ucn (cpp_reader *pfile, const uchar **pstr,
                    (int) (str - base), base);
       result = 1;
     }
-  /* The C99 standard permits $, @ and ` to be specified as UCNs.  We use
-     hex escapes so that this also works with EBCDIC hosts.
-     C++0x permits everything below 0xa0 within literals;
-     ucn_valid_in_identifier will complain about identifiers.  */
-  else if ((result < 0xa0
-            && !CPP_OPTION (pfile, cplusplus)
-            && (result != 0x24 && result != 0x40 && result != 0x60))
-           || (result & 0x80000000)
+  else if ((result & 0x80000000)
            || (result >= 0xD800 && result <= 0xDFFF))
     {
       cpp_error (pfile, CPP_DL_ERROR,
@@ -1826,13 +1819,34 @@ _cpp_valid_ucn (cpp_reader *pfile, const uchar **pstr,
                  (int) (str - base), base);
       result = 1;
     }
+  /* The C99 standard permits $, @ and ` to be specified as UCNs.  We use
+     hex escapes so that this also works with EBCDIC hosts.
+     C++0x permits everything below 0xa0 within literals, as does C23;
+     ucn_valid_in_identifier will complain about identifiers.  */
+  else if (result < 0xa0
+           && !identifier_pos
+           && !CPP_OPTION (pfile, cplusplus)
+           && (result != 0x24 && result != 0x40 && result != 0x60))
+    {
+      bool warned = false;
+      if (!CPP_OPTION (pfile, low_ucns) && CPP_OPTION (pfile, cpp_pedantic))
+        warned = cpp_pedwarning (pfile, CPP_W_PEDANTIC,
+                                 "%.*s is not a valid universal character"
+                                 " name before C23", (int) (str - base), base);
+      if (!warned && CPP_OPTION (pfile, cpp_warn_c11_c23_compat) > 0)
+        warned = cpp_warning (pfile, CPP_W_C11_C23_COMPAT,
+                              "%.*s is not a valid universal character"
+                              " name before C23", (int) (str - base), base);
+    }
   else if (identifier_pos && result == 0x24
            && CPP_OPTION (pfile, dollars_in_ident)
            /* In C++26 when dollars are allowed in identifiers,
               we should still reject \u0024 as $ is part of the basic
-              character set.  */
+              character set.  C23 also does not allow \u0024 in
+              identifiers.  */
            && !(CPP_OPTION (pfile, cplusplus)
-                && CPP_OPTION (pfile, lang) > CLK_CXX23))
+                ? CPP_OPTION (pfile, lang) > CLK_CXX23
+                : CPP_OPTION (pfile, low_ucns)))
     {
       if (CPP_OPTION (pfile, warn_dollars) && !pfile->state.skipping)
         {
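The last hunk replaces the `&&` inside the negated condition with a conditional expression, so C++ and C now use different tests. The sketch below restates that condition as a standalone predicate. It is a hedged illustration: `struct ident_flags` and `dollar_ucn_taken_as_dollar` are hypothetical names, `after_cxx23` stands in for `CPP_OPTION (pfile, lang) > CLK_CXX23`, and `low_ucns` is assumed to be set for C23 and later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bundle standing in for the relevant cpp_options.  */
struct ident_flags
{
  bool cplusplus;         /* compiling C++ */
  bool after_cxx23;       /* models lang > CLK_CXX23 */
  bool low_ucns;          /* assumed true for C23 and later */
  bool dollars_in_ident;  /* $ permitted in identifiers */
};

/* Return true when a UCN with value CP at an identifier position takes
   the branch that treats \u0024 as equivalent to $.  */
static bool
dollar_ucn_taken_as_dollar (uint32_t cp, bool identifier_pos,
                            const struct ident_flags *f)
{
  assert (f != NULL);

  /* C++ keys the rejection on the language version (C++26 onwards);
     C keys it on low_ucns.  */
  bool rejected_by_standard = f->cplusplus ? f->after_cxx23 : f->low_ucns;

  return identifier_pos
         && cp == 0x24
         && f->dollars_in_ident
         && !rejected_by_standard;
}
```

With `dollars_in_ident` set, this predicate holds for C17 and C++23 but not for C23 or C++26, which matches the commit message's statement that `\u0024` is no longer accepted as `$` in C23 identifiers.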
