| author | Joshua Haberman <jhaberman@gmail.com> | 2025-11-21 17:03:40 -0800 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2025-11-21 20:03:40 -0500 |
| commit | 58e2dde45f775328b71b532e65762a9696ccccbd (patch) | |
| tree | 5b5fbf7ecabceed71eadca52a635497041c5834b /lld/MachO/SymbolTable.cpp | |
| parent | 99120bb51bf728d7ba7fad5068227f8c6e707159 (diff) | |
[lld:MachO] Allow independent override of weak symbols aliased via .set (#167825)
Currently, if multiple external weak symbols are defined at the same
address in an object file (e.g., by using the .set assembler directive
to alias them to a single weak variable), ld64.lld treats them as a
single unit. When any one of these symbols is overridden by a strong
definition, all of the original weak symbols resolve to the strong
definition.
This patch changes the behavior in `transplantSymbolsAtOffset`. When a
weak symbol is being replaced by a strong one, only non-external (local)
symbols at the same offset are moved to the new symbol's section. Other
*external* symbols are no longer transplanted.
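The filtering rule described above can be sketched in isolation. This is a minimal model, not lld's code: `Sym`, `Section`, `transplantLocalsAtOffset`, and `aliasSurvivesOverride` are hypothetical stand-ins for `Defined`, `InputSection`, and the patched function. The sketch appends to the destination instead of inserting at a sorted position, and it omits the unwind-entry handling.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct Section;

// Hypothetical stand-in for lld's Defined; the real class has far more state.
struct Sym {
  std::string name;
  uint64_t value; // offset within the owning section
  bool external;  // models Defined::isExternal()
  Section *isec;  // models Defined::originalIsec
};

// Hypothetical stand-in for lld's InputSection.
struct Section {
  std::string name;
  std::vector<Sym *> symbols;
};

// Models the patched predicate: `skip` is removed from `from`, local symbols
// at `fromOff` are moved to `to`, and everything else (including external
// aliases at the same offset) stays where it is.
void transplantLocalsAtOffset(Section &from, Section &to, Sym *skip,
                              uint64_t fromOff, uint64_t toOff) {
  assert(&from != &to && "source and destination sections must differ");
  auto newEnd = std::remove_if(
      from.symbols.begin(), from.symbols.end(), [&](Sym *d) {
        if (d == skip)
          return true; // the overridden symbol is simply dropped
        if (d->value != fromOff || d->external)
          return false; // other offsets and external aliases are untouched
        to.symbols.push_back(d); // simplification: the real code keeps order
        d->isec = &to;
        d->value = toOff;
        return true;
      });
  from.symbols.erase(newEnd, from.symbols.end());
}

// Two external weak aliases and one local label share offset 0. Overriding
// only the first alias should leave the second alias in the weak section.
bool aliasSurvivesOverride() {
  Section weak{"weak", {}}, strong{"strong", {}};
  Sym a{"_weak_a", 0, true, &weak};
  Sym b{"_weak_b", 0, true, &weak};
  Sym local{"ltmp0", 0, false, &weak};
  weak.symbols = {&a, &b, &local};

  transplantLocalsAtOffset(weak, strong, &a, /*fromOff=*/0, /*toOff=*/8);

  return b.isec == &weak && local.isec == &strong &&
         weak.symbols.size() == 1 && strong.symbols.size() == 1;
}
```

In this model, overriding `_weak_a` moves only the local label into the strong definition's section, while `_weak_b` keeps its original weak definition and can still be overridden on its own later.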
This allows each external weak symbol to be overridden independently.
This behavior is consistent with Apple's ld-classic, but diverges from
ld-prime in one case, as noted on
https://github.com/llvm/llvm-project/issues/167262 (this discrepancy has
recently been reported to Apple).
### Backward Compatibility
This change alters linker behavior for one specific scenario. Defining
multiple external weak symbols at the same address via assembler
directives is an advanced and uncommon technique, so it is unlikely that
existing builds rely on the current behavior, in which all aliases are
overridden together.
If there are concerns, this could be put behind a linker option, but the
new default seems more correct and less surprising, and it is consistent
with ld-classic.
### Testing
The new lit test `test/MachO/weak-alias-override.s` verifies this
behavior using llvm-nm.
Fixes #167262
Diffstat (limited to 'lld/MachO/SymbolTable.cpp')
| -rw-r--r-- | lld/MachO/SymbolTable.cpp | 35 |
1 files changed, 18 insertions, 17 deletions
diff --git a/lld/MachO/SymbolTable.cpp b/lld/MachO/SymbolTable.cpp
index baddddcb76fb..a7db5a3ac96e 100644
--- a/lld/MachO/SymbolTable.cpp
+++ b/lld/MachO/SymbolTable.cpp
@@ -61,8 +61,8 @@ struct DuplicateSymbolDiag {
 SmallVector<DuplicateSymbolDiag> dupSymDiags;
 } // namespace
 
-// Move symbols at \p fromOff in \p fromIsec into \p toIsec, unless that symbol
-// is \p skip.
+// Move local symbols at \p fromOff in \p fromIsec into \p toIsec, unless that
+// symbol is \p skip, in which case we just remove it.
 static void transplantSymbolsAtOffset(InputSection *fromIsec,
                                       InputSection *toIsec, Defined *skip,
                                       uint64_t fromOff, uint64_t toOff) {
@@ -78,22 +78,23 @@ static void transplantSymbolsAtOffset(InputSection *fromIsec,
   auto insertIt = llvm::upper_bound(toIsec->symbols, toOff, symSucceedsOff);
   llvm::erase_if(fromIsec->symbols, [&](Symbol *s) {
     auto *d = cast<Defined>(s);
-    if (d->value != fromOff)
+    if (d == skip)
+      return true;
+    if (d->value != fromOff || d->isExternal())
       return false;
-    if (d != skip) {
-      // This repeated insertion will be quadratic unless insertIt is the end
-      // iterator. However, that is typically the case for files that have
-      // .subsections_via_symbols set.
-      insertIt = toIsec->symbols.insert(insertIt, d);
-      d->originalIsec = toIsec;
-      d->value = toOff;
-      // We don't want to have more than one unwindEntry at a given address, so
-      // drop the redundant ones. We We can safely drop the unwindEntries of
-      // the symbols in fromIsec since we will be adding another unwindEntry as
-      // we finish parsing toIsec's file. (We can assume that toIsec has its
-      // own unwindEntry because of the ODR.)
-      d->originalUnwindEntry = nullptr;
-    }
+
+    // This repeated insertion will be quadratic unless insertIt is the end
+    // iterator. However, that is typically the case for files that have
+    // .subsections_via_symbols set.
+    insertIt = toIsec->symbols.insert(insertIt, d);
+    d->originalIsec = toIsec;
+    d->value = toOff;
+    // We don't want to have more than one unwindEntry at a given address, so
+    // drop the redundant ones. We can safely drop the unwindEntries of the
+    // symbols in fromIsec since we will be adding another unwindEntry as we
+    // finish parsing toIsec's file. (We can assume that toIsec has its own
+    // unwindEntry because of the ODR.)
+    d->originalUnwindEntry = nullptr;
     return true;
   });
 }
