path: root/lld/ELF/InputSection.cpp
Age | Commit message | Author
2025-09-29 | ELF: Use preprocessed relocations for EhInputSection scanning | Fangrui Song
.eh_frame sections require special sub-section processing, specifically, CIEs are de-duplicated and FDEs are garbage collected. Create a specialized scanEhSection() function utilizing the just-added EhInputSection::rels. OffsetGetter is moved to scanEhSection. This improves separation of concerns between InputSection and EhInputSection processing. This removes another `relsOrRelas` call using `supportsCrel=false`. DWARF.cpp now has the last call.
Pull Request: https://github.com/llvm/llvm-project/pull/161091
2025-09-29 | ELF: Store EhInputSection relocations to simplify code. NFC | Fangrui Song
Store relocations directly as `SmallVector<Relocation, 0>` within EhInputSection to avoid processing different relocation formats (REL/RELA/CREL) throughout the codebase.
Next: Refactor RelocationScanner to utilize EhInputSection::rels.
Pull Request: https://github.com/llvm/llvm-project/pull/161041
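The idea of storing relocations once in a single format can be sketched as below. This is only an illustration under stated assumptions: `RawRel`, `RawRela`, `Reloc` and `normalize` are hypothetical, simplified stand-ins and are not lld's actual types or functions, and CREL decoding is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified records; not lld's real data structures.
struct RawRel  { uint64_t offset; uint32_t type; uint32_t symIndex; };
struct RawRela { uint64_t offset; uint32_t type; uint32_t symIndex; int64_t addend; };

// Uniform record that later passes can consume without caring about the
// on-disk relocation format.
struct Reloc { uint64_t offset; uint32_t type; uint32_t symIndex; int64_t addend; };

// RELA entries carry an explicit addend, so conversion is a plain copy.
inline std::vector<Reloc> normalize(const std::vector<RawRela> &in) {
  std::vector<Reloc> out;
  out.reserve(in.size());
  for (const RawRela &r : in)
    out.push_back({r.offset, r.type, r.symIndex, r.addend});
  return out;
}

// REL entries keep the addend implicitly in the section contents, so the
// caller supplies a callback (an assumption of this sketch) that reads it.
template <class ReadAddend>
std::vector<Reloc> normalize(const std::vector<RawRel> &in, ReadAddend readAddend) {
  std::vector<Reloc> out;
  out.reserve(in.size());
  for (const RawRel &r : in)
    out.push_back({r.offset, r.type, r.symIndex, readAddend(r)});
  return out;
}
```

After this one-time conversion, code that scans or applies relocations only needs to handle `Reloc`.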
2025-09-22 | ELF: Split relocateAlloc to relocateAlloc and relocateEh. NFC | Fangrui Song
relocateAlloc can be called with either InputSection (including SyntheticSection like GotSection) or EhInputSection. Introduce relocateEh so that we can remove some boilerplate and replace relocateAlloc's parameter type with `InputSection`.
Pull Request: https://github.com/llvm/llvm-project/pull/160031
2025-09-09 | [JITLink][RISC-V] Support R_RISCV_SET_ULEB128/R_RISCV_SUB_ULEB128 (#153778) | Zhijin Zeng
Support BOLT instrumentation of ELF binaries that contain an exception handling table. Fixes #153775
2025-07-30 | [NFCI][ELF][Mips] Replace MipsMultiGotPage with new RE_MIPS_OSEC_LOCAL_PAGE | Jessica Clarke
Instead of having a special DynamicReloc::Kind, we can just use a new RelExpr for the calculation needed. The only odd thing we do that allows this is to keep a representative symbol for the OutputSection in question (the first we see for it) around to use in this relocation for the addend calculation. This reduces DynamicReloc to just AddendOnly vs AgainstSymbol, plus the internal Computed.
Reviewers: MaskRay, arichardson
Reviewed By: MaskRay, arichardson
Pull Request: https://github.com/llvm/llvm-project/pull/150810
2025-07-02 | [lld][LoongArch] Support TLSDESC GD/LD to IE/LE (#123715) | Zhaoxin Yang
Support TLSDESC to initial-exec or local-exec optimizations. Introduce a new hook RE_LOONGARCH_RELAX_TLS_GD_TO_IE_PAGE_PC and use the existing R_RELAX_TLS_GD_TO_IE_ABS to support TLSDESC => IE, and use the existing R_RELAX_TLS_GD_TO_LE to support TLSDESC => LE.
In the normal or medium code model, there are two forms of code sequences:
* pcalau12i $a0, %desc_pc_hi20(sym_desc)
* addi.d $a0, $a0, %desc_pc_lo12(sym_desc)
* ld.d $ra, $a0, %desc_ld(sym_desc)
* jirl $ra, $ra, %desc_call(sym_desc)
------
* pcaddi $a0, %desc_pcrel_20(sym_desc)
* ld.d $ra, $a0, %desc_ld(sym_desc)
* jirl $ra, $ra, %desc_call(sym_desc)
Convert to IE:
* pcalau12i $a0, %ie_pc_hi20(sym_ie)
* ld.[wd] $a0, $a0, %ie_pc_lo12(sym_ie)
Convert to LE:
* lu12i.w $a0, %le_hi20(sym_le) # le_hi20 != 0, otherwise NOP
* ori $a0, src, %le_lo12(sym_le) # le_hi20 != 0, src = $a0, otherwise src = $zero
For simplicity, in both tlsdescToIe and tlsdescToLe we always convert the preceding instructions to NOPs, because both forms of the code sequence (corresponding to the relocation combinations R_LARCH_TLS_DESC_PC_HI20+R_LARCH_TLS_DESC_PC_LO12 and R_LARCH_TLS_DESC_PCREL20_S2) are processed the same way.
TODO: When relaxation is enabled, redundant NOPs can be removed. This will be implemented in a future patch.
Note: The different forms of TLSDESC code sequences should not appear interleaved in the normal, medium or extreme code model; compilers do not generate such code and lld does not support it. This is thanks to the guard in PostRASchedulerList.cpp in llvm.
```
Calls are not scheduling boundaries before register allocation, but post-ra we don't gain anything by scheduling across calls since we don't need to worry about register pressure.
```
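The "Convert to LE" rule above can be illustrated with a small sketch. It assumes a thread-pointer-relative offset that fits in 32 bits, and it relies on `ori` taking a zero-extended 12-bit immediate, so no rounding adjustment is applied to the high part. `splitLe` and `leSequence` are hypothetical helpers that return assembly text for readability; they are not lld functions, and the NOPs that replace the earlier instructions of the TLSDESC sequence are not modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical split of a TP-relative offset into the two immediates.
struct LeSplit { uint32_t hi20; uint32_t lo12; };

inline LeSplit splitLe(uint32_t tpOffset) {
  // ori zero-extends its 12-bit immediate, so hi20 and lo12 are plain
  // bit fields of the offset.
  return {(tpOffset >> 12) & 0xfffff, tpOffset & 0xfff};
}

// Returns the final two instructions of the LE sequence as text.
inline std::vector<std::string> leSequence(uint32_t tpOffset) {
  LeSplit s = splitLe(tpOffset);
  if (s.hi20 == 0) {
    // le_hi20 == 0: lu12i.w becomes a NOP and ori reads from $zero.
    return {"nop", "ori $a0, $zero, " + std::to_string(s.lo12)};
  }
  return {"lu12i.w $a0, " + std::to_string(s.hi20),
          "ori $a0, $a0, " + std::to_string(s.lo12)};
}
```

For a small offset such as 16 the sketch yields a NOP followed by `ori $a0, $zero, 16`, which matches the "otherwise NOP" and "otherwise src = $zero" comments in the commit message.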
2025-06-24 | Reapply "ELF: Add branch-to-branch optimization." | Peter Collingbourne
Fixed assertion failure when reading .eh_frame sections, and added .eh_frame sections to tests.
This reverts commit 1e95349dbe329938d2962a78baa0ec421e9cd7d1.
Original commit message follows:
When code calls a function which then immediately tail calls another function there is no need to go via the intermediate function. By branching directly to the target function we reduce the program's working set for a slight increase in runtime performance.
Normally it is relatively uncommon to have functions that just tail call another function, but with LLVM control flow integrity we have jump tables that replace the function itself as the canonical address. As a result, when a function address is taken and called directly, for example after a compiler optimization resolves the indirect call, or if code built without control flow integrity calls the function, the call will go via the jump table.
The impact of this optimization was measured using a large internal Google benchmark. The results were as follows:
CFI enabled: +0.1% ± 0.05% queries per second
CFI disabled: +0.01% queries per second [not statistically significant]
The optimization is enabled by default at -O2 but may also be enabled or disabled individually with --{,no-}branch-to-branch.
This optimization is implemented for AArch64 and X86_64 only.
lld's runtime performance (real execution time) after adding this optimization was measured using firefox-x64 from lld-speed-test [1] with ldflags "-O2 -S" on an Apple M2 Ultra. The results are as follows:
```
    N           Min           Max        Median           Avg        Stddev
x 512     1.2264546     1.3481076     1.2970261     1.2965788   0.018620888
+ 512     1.2561196     1.3839965     1.3214632     1.3209327   0.019443971
Difference at 95.0% confidence
        0.0243538 +/- 0.00233202
        1.87831% +/- 0.179859%
        (Student's t, pooled s = 0.0190369)
```
[1] https://discourse.llvm.org/t/improving-the-reproducibility-of-linker-benchmarking/86057
Reviewers: zmodem, MaskRay
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/145579
2025-06-23 | Revert "ELF: Add branch-to-branch optimization." | Hans Wennborg
This caused assertion failures in applyBranchToBranchOpt():
llvm/include/llvm/Support/Casting.h:578: decltype(auto) llvm::cast(From*) [with To = lld::elf::InputSection; From = lld::elf::InputSectionBase]: Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
See comment on the PR (https://github.com/llvm/llvm-project/pull/138366)
This reverts commit 491b82a5ec1add78d2c93370580a2f1897b6a364.
This also reverts the follow-up "[lld] Use llvm::partition_point (NFC) (#145209)"
This reverts commit 2ac293f5ac4cf65c0c038bf75a88f1d6715e467d.
2025-06-20 | ELF: Add branch-to-branch optimization. | Peter Collingbourne
When code calls a function which then immediately tail calls another function there is no need to go via the intermediate function. By branching directly to the target function we reduce the program's working set for a slight increase in runtime performance.
Normally it is relatively uncommon to have functions that just tail call another function, but with LLVM control flow integrity we have jump tables that replace the function itself as the canonical address. As a result, when a function address is taken and called directly, for example after a compiler optimization resolves the indirect call, or if code built without control flow integrity calls the function, the call will go via the jump table.
The impact of this optimization was measured using a large internal Google benchmark. The results were as follows:
CFI enabled: +0.1% ± 0.05% queries per second
CFI disabled: +0.01% queries per second [not statistically significant]
The optimization is enabled by default at -O2 but may also be enabled or disabled individually with --{,no-}branch-to-branch.
This optimization is implemented for AArch64 and X86_64 only.
lld's runtime performance (real execution time) after adding this optimization was measured using firefox-x64 from lld-speed-test [1] with ldflags "-O2 -S" on an Apple M2 Ultra. The results are as follows:
```
    N           Min           Max        Median           Avg        Stddev
x 512     1.2264546     1.3481076     1.2970261     1.2965788   0.018620888
+ 512     1.2561196     1.3839965     1.3214632     1.3209327   0.019443971
Difference at 95.0% confidence
        0.0243538 +/- 0.00233202
        1.87831% +/- 0.179859%
        (Student's t, pooled s = 0.0190369)
```
[1] https://discourse.llvm.org/t/improving-the-reproducibility-of-linker-benchmarking/86057
Pull Request: https://github.com/llvm/llvm-project/pull/138366
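The core idea of the optimization, retargeting a branch past a function that only tail calls another, can be modelled in a few lines. This is a toy model under stated assumptions, not lld's implementation: `TailCallMap` and `redirectBranch` are hypothetical names, functions are identified by strings rather than sections and relocations, and only a single hop is followed because the commit message does not say how chains are handled.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical model: maps a function whose body is just a tail call
// (for example a CFI jump table entry) to the function it branches to.
using TailCallMap = std::unordered_map<std::string, std::string>;

// Returns the symbol a branch should target after the optimization.
// Functions that are not pure tail-call stubs are returned unchanged.
inline std::string redirectBranch(const TailCallMap &stubs,
                                  const std::string &target) {
  auto it = stubs.find(target);
  return it == stubs.end() ? target : it->second;
}
```

With a map containing `{"f.cfi_jt", "f"}` (an illustrative name), a branch to `f.cfi_jt` is retargeted to `f`, while a branch to any other function is left alone.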
2025-05-25 | [lld] Remove unused includes (NFC) (#141421) | Kazu Hirata
2025-01-25 | [ELF] MergeInputSection: replace Fatal with Err | Fangrui Song
In LLD_IN_TEST=2 mode, when a thread calls Fatal, there will be no output even if the process exits with code 1. Change a few Fatal to recoverable Err.
2025-01-25 | [ELF] Replace a few Fatal with Err | Fangrui Song
In LLD_IN_TEST=2 mode, when a thread calls Fatal, there will be no output even if the process exits with code 1. Change a few Fatal to recoverable Err.
2025-01-22 | [PAC][lld][AArch64][ELF] Support signed TLSDESC (#113817) | Daniil Kovalev
Depends on #120010
Support `R_AARCH64_AUTH_TLSDESC_ADR_PAGE21`, `R_AARCH64_AUTH_TLSDESC_LD64_LO12` and `R_AARCH64_AUTH_TLSDESC_ADD_LO12` static relocations and `R_AARCH64_AUTH_TLSDESC` dynamic relocation. IE/LE optimization is not currently supported for AUTH TLSDESC.
2024-12-18 | [PAC][lld][AArch64][ELF] Support signed GOT with tiny code model (#113816) | Daniil Kovalev
Depends on #114525
Support `R_AARCH64_AUTH_GOT_ADR_PREL_LO21` and `R_AARCH64_AUTH_GOT_LD_PREL19` GOT-generating relocations. A corresponding `RE_AARCH64_AUTH_GOT_PC` member of `RelExpr` is added, which is an AUTH-specific variant of `R_GOT_PC`.
2024-12-17 | [PAC][lld][AArch64][ELF] Support signed GOT (#113815) | Daniil Kovalev
Depends on #113811
Support `R_AARCH64_AUTH_ADR_GOT_PAGE`, `R_AARCH64_AUTH_GOT_LO12_NC` and `R_AARCH64_AUTH_GOT_ADD_LO12_NC` GOT-generating relocations. For preemptible symbols, the dynamic relocation `R_AARCH64_AUTH_GLOB_DAT` is emitted. Otherwise, we unconditionally emit an `R_AARCH64_AUTH_RELATIVE` dynamic relocation, since pointers in the signed GOT need to be signed at dynamic link time.
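The choice of dynamic relocation described above reduces to a single decision on preemptibility. A minimal sketch, assuming a hypothetical `signedGotDynReloc` helper that returns the relocation name as text (lld itself works with relocation type codes, not strings):

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: picks the dynamic relocation for a signed GOT entry.
// Preemptible symbols are resolved by the dynamic loader by name; all other
// symbols still need a relocation because the pointer must be signed at
// dynamic link time.
inline std::string signedGotDynReloc(bool isPreemptible) {
  return isPreemptible ? "R_AARCH64_AUTH_GLOB_DAT" : "R_AARCH64_AUTH_RELATIVE";
}
```

Note that, unlike an unsigned GOT entry for a non-preemptible symbol in a non-PIC link, no case here avoids a dynamic relocation.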
2024-12-03 | [ELF] Rename target-specific RelExpr enumerators | Fangrui Song
RelExpr enumerators are named `R_*`, which can be confused with ELF relocation type names. Rename the target-specific ones to `RE_*` to avoid confusion. For consistency, the target-independent ones can be renamed as well, but that's not urgent. The relocation processing mechanism with RelExpr has non-trivial overhead compared with mold's approach, and we might move more code into Arch/*.cpp files and decrease the number of enumerators.
Pull Request: https://github.com/llvm/llvm-project/pull/118424
2024-12-01 | [ELF] decompress: remove mutex | Fangrui Song
decompress() is in the parallel code path splitIntoPieces, so we should avoid a mutex there.
2024-11-29 | [ELF] Change getSrcMsg to use ELFSyncStream. NFC | Fangrui Song
2024-11-29 | [ELF] Move some ObjFile members to ELFFileBase to simplify getSrcMsg | Fangrui Song
2024-11-29 | [ELF] Use lower case offset in getObjMsg | Fangrui Song
to improve consistency with other diagnostics. While here, migrate to use ELFSyncStream to drop toStr/getCtx uses and avoid string overhead.
2024-11-25 | [LLD][ARM] Allow R_ARM_SBREL32 relocations in debug info (#116956) | Oliver Stannard
The R_ARM_SBREL32 relocation is used in debug info for ARM RWPI (read-write position independent) code. Compiler-generated DWARF info will use an expression to add the relocated value to the actual value of the static base (held in r9) at run-time, so it should be relocated as if the static base is at address 0.
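The arithmetic behind "relocated as if the static base is at address 0" can be sketched as follows. It assumes the static-base-relative value is the symbol value plus addend minus the static base, and it ignores the Thumb bit; `relocateSbrel32` is a hypothetical helper, not an lld function.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes a static-base-relative 32-bit value.
// In debug info the static base is treated as 0, so the result is simply
// symbol value + addend; the DWARF expression adds r9 at run time.
inline uint32_t relocateSbrel32(uint32_t symValue, int32_t addend,
                                uint32_t staticBase = 0) {
  // Unsigned arithmetic wraps modulo 2^32, matching a 32-bit field.
  return symValue + static_cast<uint32_t>(addend) - staticBase;
}
```

Calling `relocateSbrel32(0x1000, 8)` models the debug-info case, while passing a non-zero third argument models a section where the static base is known at link time.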
2024-11-24 | [ELF] Remove unneeded Twine in ELFSyncStream | Fangrui Song
2024-11-23 | [ELF] Reorder SectionBase/InputSectionBase members | Fangrui Song
Move `sectionKind` outside the bitfield and move bss/keepUnique to InputSectionBase.
* sizeof(InputSection) decreases from 160 to 152 on 64-bit systems.
* The numerous `sectionKind` accesses are faster.
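Why member order affects `sizeof` can be shown with a generic example. The structs below are hypothetical and unrelated to lld's real layout; they only illustrate that grouping small members together lets them share padding that would otherwise be repeated before each 8-byte member.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: each 1-byte member sits before an 8-byte member,
// so on common ABIs padding is inserted after every small member.
struct Scattered {
  uint8_t kind;
  uint64_t size;
  uint8_t flag;
  uint64_t offset;
  uint8_t keep;
};

// Same members, with the small ones grouped at the end so they can share
// a single padded tail.
struct Grouped {
  uint64_t size;
  uint64_t offset;
  uint8_t kind;
  uint8_t flag;
  uint8_t keep;
};

// Exact sizes are ABI-dependent, so only the relative order is checked.
static_assert(sizeof(Grouped) <= sizeof(Scattered),
              "grouping small members should not increase the size");
```

Keeping a frequently read member as a plain byte instead of a bitfield also lets the compiler load it directly, without the shift and mask a bitfield read needs, which is consistent with the commit's second bullet.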
2024-11-23 | [ELF] Make section member orders consistent | Fangrui Song
SectionBase, InputSectionBase, InputSection, MergeInputSection, and OutputSection have different member orders. Make them consistent and adopt the order similar to the raw Elf64_Shdr.
2024-11-17 | [ELF] Work around extra "warning: $" with MSVC 14.41.34120 | Fangrui Song
2024-11-16 | [ELF] Replace functions bAlloc/saver/uniqueSaver with member access | Fangrui Song
2024-11-16 | [ELF] Simplify relocateNonAlloc diagnostic | Fangrui Song
2024-11-16 | [ELF] Remove unneeded Twine() | Fangrui Song
2024-11-16 | [ELF] Pass ctx to bAlloc/saver/uniqueSaver | Fangrui Song
2024-11-16 | [ELF] Remove unneeded toString(Error) when using ELFSyncStream | Fangrui Song
2024-11-16 | [ELF] Remove unneeded toString(Error) when using ELFSyncStream | Fangrui Song
2024-11-16 | [ELF] Replace global ctx with getCtx() | Fangrui Song
2024-11-16 | [ELF] Replace context-less toString(x) with toStr(ctx, x) | Fangrui Song
so that we can remove the global `ctx` from toString implementations. Rename lld::toString (to lld::elf::toStr) to simplify name lookup (we have many llvm::toString and another lld::toString(const llvm::opt::Arg &)).
2024-11-16 | [ELF] Replace toString(RelType) with operator<< while using ELFSyncStream | Fangrui Song
2024-11-14 | [ELF] Migrate away from global ctx | Fangrui Song
2024-11-14 | [ELF] Migrate away from global ctx | Fangrui Song
2024-11-14 | [ELF] Migrate away from global ctx | Fangrui Song
2024-11-06 | [ELF] Replace errorOrWarn(...) with Err | Fangrui Song
2024-11-06 | [ELF] Replace warn(...) with Warn | Fangrui Song
2024-11-06 | [ELF] Replace error(...) with ErrAlways or Err | Fangrui Song
Most are migrated to ErrAlways mechanically. In the future we should change most to Err.
2024-11-06 | [ELF] Replace fatal(...) with Fatal or Err | Fangrui Song
2024-10-19 | [ELF] Pass Ctx & | Fangrui Song
2024-10-19 | [ELF] Pass Ctx & to Symbol::getVA | Fangrui Song
2024-10-11 | [ELF] Make .comment have a non-null file | Fangrui Song
This ensures that SectionBase::file is non-null except InputSection::discarded.
2024-10-11 | [ELF] Pass Ctx & to InputSection | Fangrui Song
2024-10-11 | [ELF] Pass Ctx & to OutputSection | Fangrui Song
2024-10-11 | [ELF] Pass Ctx & | Fangrui Song
2024-10-11 | [ELF] Pass Ctx & | Fangrui Song
2024-10-10 | [ELF] Revert Ctx & parameters from SyntheticSection | Fangrui Song
Since Ctx &ctx is a member variable, 1f391a75af8685e6bba89421443d72ac6a186599 7a5b9ef54eb96abd8415fd893576c42e51fd95db e2f0ec3a3a8a2981be8a1aac2004cfb9064c61e8 can be reverted.
2024-10-10 | [ELF] Move InputSectionBase::file to SectionBase | Fangrui Song
... and add getCtx (file->ctx). This allows InputSectionBase and OutputSection to access ctx without taking an extra function argument.
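The pattern of reaching the context through the owning file, rather than through an extra parameter, might look like the sketch below. The types are deliberately minimal, hypothetical stand-ins that only reuse the names from the commit message; lld's real `Ctx`, `InputFile` and `SectionBase` have many more members.

```cpp
#include <cassert>

// Hypothetical, minimal stand-ins for the real classes.
struct Ctx { int id; };

struct InputFile {
  Ctx &ctx; // every file remembers the link context it belongs to
};

struct SectionBase {
  InputFile *file; // assumed non-null in this sketch

  // Sections reach the context through their file, so callers do not
  // have to pass a Ctx & argument alongside every section.
  Ctx &getCtx() const { return file->ctx; }
};
```

The trade-off is one extra pointer dereference per access in exchange for shorter function signatures, and it relies on `file` being set for every section on which `getCtx()` is called.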