path: root/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
Age Commit message Author
2025-11-20[DebugInfo] Force early line-zero calls to have meaningful locations (#156850)Jeremy Morse
In functions that have been seriously deformed during optimisation, there can be call instructions with line-zero immediately after frame setup (see C reproducer in the test added). Our previous algorithms for prologue_end ignored these, meaning someone entering a function at prologue_end would break-in after a function call had completed. Prefer instead to place prologue_end and the function scope-line on the line zero call: this isn't false (it's the first meaningful instruction of the function) and is approximately true. Given a less than ideal function, this is an OK solution.
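A minimal sketch of the selection rule described above, assuming a toy instruction model. `MInstr`, `PrologueEnd`, and `pickPrologueEnd` are hypothetical names for illustration and are not taken from DwarfDebug.cpp; the real search also walks across blocks and handles cases this sketch ignores.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-in for a machine instruction.
struct MInstr {
  bool FrameSetup; // part of the prologue's frame setup
  bool IsCall;     // call instruction
  unsigned Line;   // source line, 0 means "line zero" (no useful location)
};

struct PrologueEnd {
  std::size_t Index; // instruction that receives prologue_end
  unsigned Line;     // line reported for it
};

// Assumption: a line-zero call right after frame setup is accepted as the
// prologue_end position and is given the function's scope line, instead of
// being skipped in favour of a later instruction.
std::optional<PrologueEnd> pickPrologueEnd(const std::vector<MInstr> &Body,
                                           unsigned ScopeLine) {
  assert(ScopeLine != 0 && "sketch assumes the scope line is a real line");
  for (std::size_t I = 0; I < Body.size(); ++I) {
    const MInstr &MI = Body[I];
    if (MI.FrameSetup)
      continue; // still inside frame setup
    if (MI.Line != 0)
      return PrologueEnd{I, MI.Line}; // ordinary case: first real location
    if (MI.IsCall)
      return PrologueEnd{I, ScopeLine}; // line-zero call: stop here anyway
    // Other line-zero instructions are skipped in this sketch.
  }
  return std::nullopt; // nothing suitable found
}
```

With this rule a debugger that stops at prologue_end halts before the early call executes rather than after it has returned.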
2025-11-12[AsmPrinter] Replace improper use of Register with MCRegUnit (NFC) (#167682)Sergei Barannikov
2025-11-10[DebugInfo] Add Verifier check for incorrectly-scoped retainedNodes (#166855)Vladislav Dzhidzhoev
These checks ensure that retained nodes of a DISubprogram belong to the subprogram. Tests with incorrect IR are fixed. We should not have variables of one subprogram present in retained nodes of other subprograms. Also, interface for accessing DISubprogram's retained nodes is slightly refactored. `DISubprogram::visitRetainedNodes` and `DISubprogram::forEachRetainedNode` are added to avoid repeating checks like ``` if (const auto *LV = dyn_cast<DILocalVariable>(N)) ... else if (const auto *L = dyn_cast<DILabel>(N)) ... else if (const auto *IE = dyn_cast<DIImportedEntity>(N)) ... ```
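The dispatch pattern that such helpers centralize can be sketched in standard C++, assuming `std::variant` as a stand-in for LLVM's metadata class hierarchy. `forEachRetainedNodeSketch` and the three node structs are hypothetical; the actual signatures of `DISubprogram::visitRetainedNodes` and `DISubprogram::forEachRetainedNode` are not shown in this log.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for DILocalVariable, DILabel and DIImportedEntity.
struct LocalVariable { std::string Name; };
struct Label { std::string Name; };
struct ImportedEntity { std::string Name; };
using RetainedNode = std::variant<LocalVariable, Label, ImportedEntity>;

// Standard "overloaded" helper that merges several lambdas into one visitor.
template <class... Fs> struct Overloaded : Fs... { using Fs::operator()...; };
template <class... Fs> Overloaded(Fs...) -> Overloaded<Fs...>;

// One place that knows how to tell the node kinds apart; callers only supply
// one callback per kind instead of repeating the if/else-if chain.
template <class VarFn, class LabelFn, class ImportFn>
void forEachRetainedNodeSketch(const std::vector<RetainedNode> &Nodes,
                               VarFn OnVar, LabelFn OnLabel, ImportFn OnImport) {
  auto Visitor = Overloaded{OnVar, OnLabel, OnImport};
  for (const RetainedNode &N : Nodes)
    std::visit(Visitor, N);
}

// Example caller: builds a textual summary of the retained nodes.
std::string summarize(const std::vector<RetainedNode> &Nodes) {
  std::string Out;
  forEachRetainedNodeSketch(
      Nodes,
      [&](const LocalVariable &V) { Out += "var:" + V.Name + ";"; },
      [&](const Label &L) { Out += "label:" + L.Name + ";"; },
      [&](const ImportedEntity &IE) { Out += "import:" + IE.Name + ";"; });
  return Out;
}
```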
2025-10-17[llvm][DebugInfo] Add support for emitting DW_AT_language_version (#163147)Michael Buch
Depends on: * https://github.com/llvm/llvm-project/pull/162632 Emit `DW_AT_language_version` (new in DWARFv6) if DICompileUnit has a `sourceLanguageVersion` field. Omit it if it has the default value of `0` (since it's the default lower bound of any language without a version scheme).
2025-10-17[DWARF] Don't leak line numbers onto frame-setup instructions (#161845)Orlando Cazalet-Hyams
Fixes issue #157887
2025-10-10[llvm][DebugInfo] Add support for emitting DW_AT_language_name (#162621)Michael Buch
Depends on: * https://github.com/llvm/llvm-project/pull/162445 * https://github.com/llvm/llvm-project/pull/162449 Emit `DW_AT_language_name` (new in DWARFv6) if `DICompileUnit` has a `sourceLanguageName` field. Emit a `DW_AT_language` otherwise.
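A rough sketch of the attribute choice, combined with the zero-version omission rule from the `DW_AT_language_version` entry above. `CompileUnitSketch` and `languageAttrs` are hypothetical, attributes are represented by their names rather than their numeric codes, and emitting the version only alongside a name is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, reduced view of the compile-unit fields involved.
struct CompileUnitSketch {
  std::optional<uint16_t> LanguageName; // DWARFv6-style language name code
  uint32_t LanguageVersion = 0;         // 0 is treated as "no version"
  uint16_t LegacyLanguage = 0;          // pre-DWARFv6 language code
};

using Attr = std::pair<std::string, uint64_t>;

std::vector<Attr> languageAttrs(const CompileUnitSketch &CU) {
  std::vector<Attr> Attrs;
  if (CU.LanguageName) {
    Attrs.emplace_back("DW_AT_language_name", *CU.LanguageName);
    // Omit the version when it has the default value of 0.
    if (CU.LanguageVersion != 0)
      Attrs.emplace_back("DW_AT_language_version", CU.LanguageVersion);
  } else {
    // No language name available: fall back to the older attribute.
    Attrs.emplace_back("DW_AT_language", CU.LegacyLanguage);
  }
  return Attrs;
}
```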
2025-10-08[llvm][DebugInfo][NFC] Abstract DICompileUnit::SourceLanguage to allow ↵Michael Buch
alternate DWARF SourceLanguage encoding (#162255) This patch sets up `DICompileUnit` to support the DWARFv6 `DW_AT_language_name` and `DW_AT_language_version` attributes (which are set to replace `DW_AT_language`). This patch changes the `DICompileUnit::SourceLanguage` field type to a `DISourceLanguageName` that encapsulates the notion of "versioned vs. unversioned name". A "versioned" name is one that has an associated version stored separately in `DISourceLanguageName::Version`. This patch just changes all the clients of the `getSourceLanguage` API to expect a `DISourceLanguageName`. Currently they all just `assert` (via `DISourceLanguageName::getUnversionedName`) that we're dealing with "unversioned names" (i.e., the pre-DWARFv6 language codes). In follow-up patches (e.g., draft is at https://github.com/llvm/llvm-project/pull/162261), when we start emitting versioned language codes, the `getUnversionedName` calls can then be adjusted to `getName`. **Implementation considerations** * We could have added a new member to `DICompileUnit` alongside the existing `SourceLanguage` field. I don't think this would have made the transition any simpler (clients would still need to be aware of "versioned" vs. "unversioned" language names). I felt that encapsulating this inside a `DISourceLanguageName` was easier to reason about for maintainers. * Currently DISourceLanguageName is a `12` byte structure. We could probably pack all the info inside a `uint64_t` (16 bits for the name, 32 bits for the version, 1 bit for answering the `hasVersionedName`). Just to keep the prototype simple I used a `std::optional`. But since the guts of the structure are hidden, we can always change the layout to a more compact representation instead. **How to review** * The new `DISourceLanguageName` structure is defined in `DebugInfoMetadata.h`. All the other changes fall out from changing the `DICompileUnit::SourceLanguage` from `unsigned` to `DISourceLanguageName`.
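To make the two layouts mentioned above concrete, here is a sketch of an optional-based wrapper and of the `uint64_t` packing the message describes as a possible alternative. `SourceLanguageNameSketch`, `pack`, and `unpack` are hypothetical; the accessor names mirror those quoted in the message, but the real class in `DebugInfoMetadata.h` may differ, and the bit positions are an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical wrapper: a language name plus an optional version.
class SourceLanguageNameSketch {
  std::optional<uint32_t> Version; // present only for "versioned" names
  uint16_t Name;

public:
  explicit SourceLanguageNameSketch(uint16_t Lang)
      : Version(std::nullopt), Name(Lang) {}
  SourceLanguageNameSketch(uint16_t Lang, uint32_t Ver)
      : Version(Ver), Name(Lang) {}

  bool hasVersionedName() const { return Version.has_value(); }
  uint16_t getName() const { return Name; }
  uint32_t getVersion() const { return Version.value_or(0); }

  // Clients that only understand pre-DWARFv6 codes assert on this path.
  uint16_t getUnversionedName() const {
    assert(!hasVersionedName() && "expected an unversioned language name");
    return Name;
  }
};

// Assumed compact layout: bits 0-15 name, bits 16-47 version, bit 48 flag.
uint64_t pack(const SourceLanguageNameSketch &L) {
  uint64_t Bits = L.getName();
  Bits |= uint64_t{L.getVersion()} << 16;
  if (L.hasVersionedName())
    Bits |= uint64_t{1} << 48;
  return Bits;
}

SourceLanguageNameSketch unpack(uint64_t Bits) {
  uint16_t Name = static_cast<uint16_t>(Bits & 0xFFFF);
  uint32_t Ver = static_cast<uint32_t>((Bits >> 16) & 0xFFFFFFFF);
  bool Versioned = ((Bits >> 48) & 1) != 0;
  return Versioned ? SourceLanguageNameSketch(Name, Ver)
                   : SourceLanguageNameSketch(Name);
}
```

Because the fields are private, switching from the optional-based layout to the packed one would not change any caller, which is the point the message makes about hidden internals.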
2025-09-29Reland "[DebugInfo][DwarfDebug] Separate creation and population of abstract ↵Vladislav Dzhidzhoev
subprogram DIEs" (#160786) This is an attempt to reland https://github.com/llvm/llvm-project/pull/159104 with the fix for https://github.com/llvm/llvm-project/issues/160197. The original patch had the following problem: when an abstract subprogram DIE is constructed from within `DwarfDebug::endFunctionImpl()`, `DwarfDebug::constructAbstractSubprogramScopeDIE()` respects the `unit:` field of the DISubprogram. But an abstract subprogram DIE constructed from `DwarfDebug::beginModule()` was put in the same compile unit to which the global variable referencing the subprogram belonged, regardless of the subprogram's `unit:`. This is fixed by adding `DwarfDebug::getOrCreateAbstractSubprogramCU()` used by both `DwarfDebug::constructAbstractSubprogramScopeDIE()` and `DwarfCompileUnit::getOrCreateSubprogramDIE()` when an abstract subprogram is queried during the creation of DIEs for globals in `DwarfDebug::beginModule()`. The fix and the already-reviewed code from https://github.com/llvm/llvm-project/pull/159104 are two separate commits in this PR. ===== The original commit message follows: With this change, construction of abstract subprogram DIEs is split in two stages/functions: creation of DIE (in DwarfCompileUnit::getOrCreateAbstractSubprogramDIE) and its population with children (in DwarfCompileUnit::constructAbstractSubprogramScopeDIE). With that, abstract subprograms can be created/referenced from DwarfDebug::beginModule, which should solve the issue with static local variables DIE creation of inlined functions with optimized-out definitions. It fixes https://github.com/llvm/llvm-project/issues/29985. LexicalScopes class now stores mapping from DISubprograms to their corresponding llvm::Function's. It is supposed to be built before processing of each function (so, now LexicalScopes class has a method for "module initialization" alongside the method for "function initialization").
It is used by DwarfCompileUnit to determine whether a DISubprogram needs an abstract DIE before DwarfDebug::beginFunction is invoked. DwarfCompileUnit::getOrCreateSubprogramDIE method is added, which can create an abstract or a concrete DIE for a subprogram. It accepts llvm::Function* argument to determine whether a concrete DIE must be created. This is a temporary fix for https://github.com/llvm/llvm-project/issues/29985. Ideally, it will be fixed by moving global variables and types emission to DwarfDebug::endModule (https://reviews.llvm.org/D144007, https://reviews.llvm.org/D144005). Some code proposed by Ellis Hoag <ellis.sparky.hoag@gmail.com> in https://github.com/llvm/llvm-project/pull/90523 was taken for this commit.
2025-09-24[TII] Split isTrivialReMaterializable into two versions [nfc] (#160377)Philip Reames
This change builds on https://github.com/llvm/llvm-project/pull/160319 which tries to clarify which *callers* (not backends) assume that the result is actually trivial. This change itself should be NFC. Essentially, I'm just renaming the existing isTrivialRematerializable to the non-trivial version and then adding a new trivial version (with the same name as the prior function) and simplifying a few callers which want that semantic. This change does *not* enable non-trivial remat any more broadly than was already done for our targets which were lying through the old APIs; that will come separately. The goal here is simply to make the code easier to follow in terms of what assumptions are being made where. --------- Co-authored-by: Luke Lau <luke_lau@icloud.com>
2025-09-23Update callers of isTriviallyReMaterializable to check trivialness (#160319)Philip Reames
This is a preparatory change for an upcoming reorganization of our rematerialization APIs. Despite the interface being documented as "trivial" (meaning no virtual register uses on the instruction being considered for remat), our actual implementation inconsistently supports non-trivial remat, and certain backends (AMDGPU and RISC-V mostly) lie about instructions being trivial to abuse that. We want to allow non-trivial remat more broadly, but first we need to do some cleanup to make it understandable what's going on. These three call sites are ones which appear to actually want the trivial definition, and appear fairly low risk to change. p.s. I'm deliberately *not* updating any APIs in this change, I'm going to do that as a followup once it's clear which category each callsite fits in.
2025-09-23Revert "[DebugInfo][DwarfDebug] Separate creation and population of abstract ↵Vladislav Dzhidzhoev
subprogram DIEs" (#160349) Reverts llvm/llvm-project#159104 due to the issues reported in https://github.com/llvm/llvm-project/issues/160197.
2025-09-18[DebugInfo] Emit skeleton to avoid mismatching inlining flags (#153568)Qiu Chaofan
This actually reverts 418120556398c01550d42500d56e6d328290185b. The original commit omits unit with all symbols inlined into current one, which leads to crash when a module using split-dwarf inlined a function from another module with mismatched split-dwarf-inlining option. This revert guarantees that DIEs are created in both DWO and the skeleton sections whenever split-dwarf is active.
2025-09-17[DebugInfo][DwarfDebug] Separate creation and population of abstract ↵Vladislav Dzhidzhoev
subprogram DIEs (#159104) With this change, construction of abstract subprogram DIEs is split in two stages/functions: creation of DIE (in DwarfCompileUnit::getOrCreateAbstractSubprogramDIE) and its population with children (in DwarfCompileUnit::constructAbstractSubprogramScopeDIE). With that, abstract subprograms can be created/referenced from DwarfDebug::beginModule, which should solve the issue with static local variables DIE creation of inlined functions with optimized-out definitions. It fixes https://github.com/llvm/llvm-project/issues/29985. LexicalScopes class now stores mapping from DISubprograms to their corresponding llvm::Function's. It is supposed to be built before processing of each function (so, now LexicalScopes class has a method for "module initialization" alongside the method for "function initialization"). It is used by DwarfCompileUnit to determine whether a DISubprogram needs an abstract DIE before DwarfDebug::beginFunction is invoked. DwarfCompileUnit::getOrCreateSubprogramDIE method is added, which can create an abstract or a concrete DIE for a subprogram. It accepts llvm::Function* argument to determine whether a concrete DIE must be created. This is a temporary fix for https://github.com/llvm/llvm-project/issues/29985. Ideally, it will be fixed by moving global variables and types emission to DwarfDebug::endModule (https://reviews.llvm.org/D144007, https://reviews.llvm.org/D144005). Some code proposed by Ellis Hoag <ellis.sparky.hoag@gmail.com> in https://github.com/llvm/llvm-project/pull/90523 was taken for this commit.
2025-09-08[llvm][DebugInfo] Emit DW_OP_lit0/1 for constant boolean values (#157167)Laxman Sole
Backends like NVPTX use -1 to indicate `true` and 0 to indicate `false` for boolean values. Machine instruction `#DBG_VALUE` also uses -1 to indicate a `true` boolean constant. However, during the DWARF generation, booleans are treated as unsigned variables, and the debug_loc expression, like `DW_OP_lit0; DW_OP_not` is emitted for the `true` value. This leads to the debugger printing `255` instead of `true` for constant boolean variables. This change emits `DW_OP_lit1` instead of `DW_OP_lit0; DW_OP_not`.
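The arithmetic behind the `255` can be reproduced in a few lines. This is only an illustration under stated assumptions: `oldTrueAsByte` models a debugger reading the low byte of the expression result, and `boolConstantOps` is a hypothetical helper that treats any nonzero machine value (including `-1`) as `true`; neither name comes from LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// What a one-byte read of the old expression yields for `true`.
uint8_t oldTrueAsByte() {
  uint64_t Stack = 0;                 // DW_OP_lit0 pushes 0
  Stack = ~Stack;                     // DW_OP_not flips every bit
  return static_cast<uint8_t>(Stack); // low byte is 0xFF
}

// Assumed rule after the change: pick the literal op directly.
std::vector<std::string> boolConstantOps(int64_t MachineValue) {
  return {std::string(MachineValue != 0 ? "DW_OP_lit1" : "DW_OP_lit0")};
}
```

Reading the low byte of the all-ones value gives 255, which is why the debugger showed a number instead of `true`; pushing the literal 1 avoids the bitwise negation entirely.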
2025-08-30Revert "Emit DW_OP_lit0/1 for constant boolean values" (#156172)Michael Buch
Reverts llvm/llvm-project#155539 Failing on buildbots with: ``` Step 7 (test-build-stage1-unified-tree-check-all) failure: test (failure) ******************** TEST 'LLVM :: DebugInfo/debug-bool-const-location.ll' FAILED ******************** Exit Code: 1 Command Output (stderr): -- /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/build/stage1/bin/llc /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/test/DebugInfo/debug-bool-const-location.ll -O3 -filetype=obj -o - | /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/build/stage1/bin/llvm-dwarfdump - | /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/build/stage1/bin/FileCheck /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/test/DebugInfo/debug-bool-const-location.ll # RUN: at line 2 + /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/build/stage1/bin/llc /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/test/DebugInfo/debug-bool-const-location.ll -O3 -filetype=obj -o - + /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/build/stage1/bin/llvm-dwarfdump - + /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/build/stage1/bin/FileCheck /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/test/DebugInfo/debug-bool-const-location.ll /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/test/DebugInfo/debug-bool-const-location.ll:7:10: error: CHECK: 
expected string not found in input ; CHECK: {{.*}} DW_OP_lit0 ^ <stdin>:27:54: note: scanning from here [0x0000000000000018, 0x0000000000000020): DW_OP_lit1, DW_OP_stack_value ^ <stdin>:28:41: note: possible intended match here [0x0000000000000020, 0x0000000000000034): DW_OP_reg3 X3) ^ Input file: <stdin> Check file: /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/test/DebugInfo/debug-bool-const-location.ll -dump-input=help explains the following input dump. Input was: <<<<<< . . . 22: DW_AT_decl_line (5) 23: DW_AT_external (true) 24: 25: 0x0000003f: DW_TAG_variable 26: DW_AT_location (0x00000000: 27: [0x0000000000000018, 0x0000000000000020): DW_OP_lit1, DW_OP_stack_value check:7'0 X~~~~~~~~~~~~~~~~~~~ error: no match found 28: [0x0000000000000020, 0x0000000000000034): DW_OP_reg3 X3) check:7'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ check:7'1 ? possible intended match 29: DW_AT_name ("arg") check:7'0 ~~~~~~~~~~~~~~~~~~~~ 30: DW_AT_decl_file ("test") check:7'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 31: DW_AT_decl_line (5) check:7'0 ~~~~~~~~~~~~~~~~~~~~~ 32: DW_AT_type (0x0000004f "bool") check:7'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 33: check:7'0 ~ . ```
2025-08-30Emit DW_OP_lit0/1 for constant boolean values (#155539)Laxman Sole
Backends like NVPTX use -1 to indicate `true` and 0 to indicate `false` for boolean values. Machine instruction `#DBG_VALUE` also uses -1 to indicate a `true` boolean constant. However, during the DWARF generation, booleans are treated as unsigned variables, and the debug_loc expression, like `DW_OP_lit0; DW_OP_not` is emitted for the `true` value. This leads to the debugger printing `255` instead of `true` for constant boolean variables. This change emits `DW_OP_lit1` instead of `DW_OP_lit0; DW_OP_not`.
2025-08-06[DebugInfo][DWARF] Add heapallocsite information (#132073)Jann Horn
LLVM currently stores heapallocsite information in CodeView debuginfo, but not in DWARF debuginfo. Plumb it into DWARF as an LLVM-specific extension. heapallocsite debug information is useful when it is combined with allocator instrumentation that stores caller addresses; I've used a previous version of this patch for: - analyzing memory usage by object type - analyzing the distributions of values of class members Other possible uses might be: - attributing memory access profiles (for example, on Intel CPUs, from PEBS records with Linear Data Address) to object types or specific object members - adding type information to crash/ASAN reports
2025-08-05[DebugInfo][DWARF] Don't emit bogus DW_AT_call_target for complex calls ↵Jann
(#151378) On X86-64, LLVM currently generates the same DWARF debug info for `call rax` and `call [rax]`; in both cases, the generated DWARF claims that the call goes to address RAX. This bug occurs because the X86 machine instructions CALL64r and CALL64m both receive register operands, but those register operands have different semantics. To fix it, change DwarfDebug::constructCallSiteEntryDIEs() to validate the callee operand's semantics (`OperandType`) and make sure it is not semantically describing a memory location. This fix will result in fewer DW_TAG_call_site and DW_AT_call_target entries being generated. There is an existing test in dwarf-callsite-related-attrs.ll that asserts the broken behavior; remove the broken check, and instead add a new test dwarf-callsite-related-attrs-indirect.ll that checks behavior for indirect calls. The existing test xray-custom-log.ll is validating something even more broken: It checks the debug info generated by a PATCHABLE_EVENT_CALL. `TII->getCalleeOperand()` assumes that the first argument of a call instruction is always the destination, but the first argument of PATCHABLE_EVENT_CALL is instead the event structure; and so we were emitting debug info claiming the callee was stored in a register that actually contains some kind of xray event descriptor, and the test validates that this happens. I am breaking and deleting this test. I guess the intent there might have been to validate that we emit debuginfo referencing the target of the direct call that LLVM emits (which we don't do)? But I'm not sure.
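The distinction can be sketched with a toy operand model: `call rax` jumps to the address held in RAX, while `call [rax]` loads the target from memory at that address, so only the first form lets the register itself describe the call target. `OperandSemantics`, `CalleeOperand`, and `callTargetFor` are hypothetical names; the real check inspects the operand's `OperandType` as the message says.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical classification of what a register operand means.
enum class OperandSemantics {
  Register, // the register holds the callee address (like `call rax`)
  Memory    // the register is part of a memory reference (like `call [rax]`)
};

struct CalleeOperand {
  std::string Reg;
  OperandSemantics Kind;
};

// Returns the register to describe as the call target, or nothing when the
// operand describes a memory location and the register is not the target.
std::optional<std::string> callTargetFor(const CalleeOperand &Op) {
  if (Op.Kind != OperandSemantics::Register)
    return std::nullopt; // emitting a target here would be bogus
  return Op.Reg;
}
```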
2025-07-27[AsmPrinter] Remove an unnecessary cast (NFC) (#150839)Kazu Hirata
getLabelAfterInsn() already returns MCSymbol *.
2025-07-01[DwarfDebug] Slightly optimize computeKeyInstructions() (NFC) (#146357)Nikita Popov
Fetch the DILocation once, instead of many times. This is pretty trivial, but goes through out-of-line code.
2025-06-30[KeyInstr] Fully support mixed key/non-key inlining modes (#144103)Orlando Cazalet-Hyams
Patch 3/4 adding bitcode support, though the final patch doesn't depend on this one. Prior to this patch, a Key Instructions function inlined into a Not-Key-Instructions function fell back to Not-Key-Instructions handling. In order to fully support inlining mixed modes we need to run `computeKeyInstructions` (in case there's a Key Instructions scope) and `findForceIsStmtInstrs` (in case there's a Not-Key-Instructions scope) on all functions. This has a slight performance cost for all configurations - see PR for details.
2025-06-30[KeyInstr] Use DISubprogram's is-key-instructions-on flag at DWARF emission ↵Orlando Cazalet-Hyams
(#144104) Patch 2/4 adding bitcode support. A non-key-instructions function inlined into a key-instructions function uses non-key-instructions is_stmt placement (without `findForceIsStmtInstrs`). A key-instructions function inlined into a non-key-instructions function currently results in falling back to non-key-instructions for the inlined scope too. Both of these concessions (not using `findForceIsStmtInstrs` in the 1st case, and not using Key Instructions for the inlined scope in the 2nd) are for performance reasons; to do the right thing we'd need to run both `findForceIsStmtInstrs` and `computeKeyInstructions` - in case that's controversial I've got a separate PR for that: PR 144103.
2025-06-27Reapply "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#… ↵Sterling-Augustine
(#145959) (#146112) Reapply "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#… (#145959) This reapplies cbf781f0bdf2f680abbe784faedeefd6f84c246e, with fixes for the shared-library build and the unconventional sanitizer-runtime build. Original Description: This is the culmination of a series of changes described in [1]. Although somewhat large by line count, it is almost entirely mechanical, creating a new library in DebugInfo/DWARF/LowLevel. This new library has very minimal dependencies, allowing it to be used from more places than the normal DebugInfo/DWARF library--in particular from MC. 1. https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2
2025-06-26Revert "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#… ↵Sterling-Augustine
(#145959) …145081)" This reverts commit cbf781f0bdf2f680abbe784faedeefd6f84c246e. Breaks a couple of buildbots.
2025-06-26[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#145081)Sterling-Augustine
This is the culmination of a series of changes described in [1]. Although somewhat large by line count, it is almost entirely mechanical, creating a new library in DebugInfo/DWARF/LowLevel. This new library has very minimal dependencies, allowing it to be used from more places than the normal DebugInfo/DWARF library--in particular from MC. I am happy to put it in another location, or to structure it differently if that makes sense. Some have suggested in BinaryFormat, but it is not a great fit there. But if that makes more sense to the reviewers, I can do that. Another possibility would be to use pass-through headers to allow clients who don't care to depend only on DebugInfo/DWARF. This would be a much less invasive change, and perhaps easier for clients. But also a system that hides details. Either way, I'm open. 1. https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2
2025-06-13[KeyInstr][NFC] Fix incorrect atomGroup/rank uint size in computeKeyInstructionsOrlando Cazalet-Hyams
2025-06-05[llvm] Use *Map::try_emplace (NFC) (#143002)Kazu Hirata
- try_emplace(Key) is shorter than insert(std::make_pair(Key, 0)). - try_emplace performs value initialization without value parameters. - We overwrite values on successful insertion anyway.
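The same rewrite can be shown with `std::unordered_map`, which has offered `try_emplace` since C++17 and is used here as a stand-in for LLVM's map types. `assignIdOld` and `assignIdNew` are hypothetical helpers written only to compare the two spellings.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

using IdMap = std::unordered_map<std::string, unsigned>;

// Before: insert a placeholder value, then overwrite it on success.
unsigned assignIdOld(IdMap &M, const std::string &Key, unsigned NextId) {
  auto Result = M.insert(std::make_pair(Key, 0u));
  if (Result.second)
    Result.first->second = NextId;
  return Result.first->second;
}

// After: try_emplace value-initializes the mapped value when no value
// arguments are given, so the placeholder does not need to be spelled out.
unsigned assignIdNew(IdMap &M, const std::string &Key, unsigned NextId) {
  auto [It, Inserted] = M.try_emplace(Key);
  if (Inserted)
    It->second = NextId;
  return It->second;
}
```

Both functions leave an existing entry untouched and return its stored value, so the change is purely a shorter spelling of the same behaviour.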
2025-05-24[CodeGen] Remove unused includes (NFC) (#141320)Kazu Hirata
These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.
2025-05-13[KeyInstr][DwarfDebug] Add is_stmt emission support (#133495)Orlando Cazalet-Hyams
Interpret Key Instructions metadata to determine is_stmt placement. The lowest rank (highest precedence) instructions in each {InlinedAt, atomGroup} set are candidates for is_stmt. Only the last instruction in each set in a given block gets is_stmt. Calls always get is_stmt. RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
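The placement rule above can be sketched for a single basic block. This is a simplification under stated assumptions: `Instr` and `pickIsStmt` are hypothetical, `InlinedAt` is reduced to an integer id, an `AtomGroup` of 0 is assumed to mean "not in any group", and cross-block handling is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical, reduced instruction record for one basic block.
struct Instr {
  int InlinedAt;      // id of the inlined-at scope (0 = not inlined)
  unsigned AtomGroup; // 0 = no Key Instructions group (assumption)
  unsigned Rank;      // lower rank = higher precedence
  bool IsCall;
};

std::vector<bool> pickIsStmt(const std::vector<Instr> &Block) {
  std::vector<bool> IsStmt(Block.size(), false);
  // {InlinedAt, AtomGroup} -> (lowest rank seen, index of its last holder)
  std::map<std::pair<int, unsigned>, std::pair<unsigned, std::size_t>> Best;
  for (std::size_t I = 0; I < Block.size(); ++I) {
    const Instr &In = Block[I];
    if (In.IsCall)
      IsStmt[I] = true; // calls always get is_stmt
    if (In.AtomGroup == 0)
      continue;
    auto Key = std::make_pair(In.InlinedAt, In.AtomGroup);
    auto [It, Inserted] = Best.try_emplace(Key, In.Rank, I);
    // A lower rank wins; an equal rank that comes later also replaces the
    // earlier candidate, so only the last one in the block is kept.
    if (!Inserted && In.Rank <= It->second.first)
      It->second = {In.Rank, I};
  }
  for (const auto &Entry : Best)
    IsStmt[Entry.second.second] = true;
  return IsStmt;
}
```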
2025-05-12[llvm][DebugInfo] Drop \01 mangling prefix when inserting linkage name into ↵Michael Buch
accelerator table (#138852) On some platforms (particularly macOS), a `\01` prefix gets added to the name in an `asm` label. This gets stripped when we emit the [`DW_AT_linkage_name`](https://github.com/llvm/llvm-project/blob/2f877c2722e882fe6aaaab44d25b7a49ba0612e1/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp#L531). But we weren't stripping this prefix when inserting the linkage name into accelerator tables. This manifested in an issue where LLDB tried to look up a name in the index by linkage name, but wasn't able to find it because we indexed it with the `\01` unstripped. This patch strips the prefix before indexing.
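A small sketch of the stripping step, assuming a `std::set` as a stand-in for the accelerator table. `dropManglingEscapeSketch` and `indexLinkageName` are hypothetical names for illustration, not the functions the patch touches.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <string_view>

// Remove a single leading '\01' byte, if present.
std::string_view dropManglingEscapeSketch(std::string_view Name) {
  if (!Name.empty() && Name.front() == '\1')
    Name.remove_prefix(1);
  return Name;
}

// Index the name the same way it is emitted in DW_AT_linkage_name, so a
// lookup by linkage name can find it.
void indexLinkageName(std::set<std::string> &Table,
                      std::string_view LinkageName) {
  Table.insert(std::string(dropManglingEscapeSketch(LinkageName)));
}
```

Without the strip, the table would contain the key with the leading `\01` byte, and a lookup for the plain name would miss.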
2025-04-30Reapply "[DLCov] Implement DebugLoc coverage tracking (#107279)"Stephen Tozer
Reapplied after fixing the config issue that was causing issues following the previous merge. This reverts commit fdbf073a86573c9ac4d595fac8e06d252ce1469f.
2025-04-26[llvm] Use llvm::copy (NFC) (#137470)Kazu Hirata
2025-04-25Revert "[DLCov] Implement DebugLoc coverage tracking (#107279)"Stephen Tozer
This reverts commit a9d93ecf1f8d2cfe3f77851e0df179b386cff353. Reverted due to the commit including a config in LLVM headers that is not available outside of the llvm source tree.
2025-04-24Revert "Revert "[DebugInfo][DWARF] Emit DW_AT_abstract_origin for ↵Vladislav Dzhidzhoev
concrete/inlined DW_TAG_lexical_blocks"" (#137243) Reverts llvm/llvm-project#137237, as the problem was fixed with 92dc18b6df043d788d77b4a98e5afa3954a44cb0.
2025-04-24Revert "[DebugInfo][DWARF] Emit DW_AT_abstract_origin for concrete/inlined ↵David Blaikie
DW_TAG_lexical_blocks" (#137237) Reverts llvm/llvm-project#136205 Breaks buildbots, probably something about needing to restrict the test to running on a specific target or the like - I haven't looked closely. Co-authored-by: Vladislav Dzhidzhoev <dzhidzhoev@gmail.com>
2025-04-24[DLCov] Implement DebugLoc coverage tracking (#107279)Stephen Tozer
This is part of a series of patches that tries to improve DILocation bug detection in Debugify; see the review for more details. This is the patch that adds the main feature, adding a set of `DebugLoc::get<Kind>` functions that can be used for instructions with intentionally empty DebugLocs to prevent Debugify from treating them as bugs, removing the currently-pervasive false positives and allowing us to use Debugify (in its original DI preservation mode) to reliably detect existing bugs and regressions. This patch does not add uses of these functions, except for once in Clang before optimizations, and in `Instruction::dropLocation()`, since that is an obvious case that immediately removes a set of false positives.
2025-04-24[DebugInfo][DWARF] Emit DW_AT_abstract_origin for concrete/inlined ↵Vladislav Dzhidzhoev
DW_TAG_lexical_blocks (#136205) During the discussion under https://github.com/llvm/llvm-project/pull/119001, it was noticed that concrete DW_TAG_lexical_blocks should refer to corresponding abstract DW_TAG_lexical_blocks by having DW_AT_abstract_origin, to avoid ambiguity. This behavior is implemented in GCC (https://godbolt.org/z/Khrzdq1Wx), but not in LLVM. Fixes https://github.com/llvm/llvm-project/issues/49297.
2025-03-23[CodeGen] Use *Set::insert_range (NFC) (#132651)Kazu Hirata
We can use *Set::insert_range to collapse: for (auto Elem : Range) Set.insert(Elem); down to: Set.insert_range(Range);
2025-03-20[llvm] Use *Set::insert_range (NFC) (#132325)Kazu Hirata
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently gained C++23-style insert_range. This patch replaces: Dest.insert(Src.begin(), Src.end()); with: Dest.insert_range(Src); This patch does not touch custom begin like succ_begin for now.
2025-03-11Reland: [MC] output inlined-at debug info (#106230) (#130306)Yaxun (Sam) Liu
Reland https://github.com/llvm/llvm-project/pull/106230 The original PR was reverted due to a compile-time regression. This PR fixes that by adding a condition OutStreamer->isVerboseAsm() to the generation of extra inlined-at debug info, so that it does not affect normal compilation time. Currently MC prints the source location of instructions in comments in assembly when debug info is available; however, it does not include inlined-at locations when a function is inlined. For example, function foo is defined in header file a.h and is called multiple times in b.cpp. If foo is inlined, current assembly will only show its instructions with their line numbers in a.h. With inlined-at locations, the assembly will also show where foo is called in b.cpp. This patch adds inlined-at locations to the comments by using DebugLoc::print. It makes the printed source location info consistent with those printed by machine passes.
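The a.h/b.cpp example can be sketched as follows. `LocSketch` and `describe` are hypothetical, and the `@[ ... ]` text is only loosely modelled on what `DebugLoc::print` produces; the exact comment format is an assumption. The `VerboseAsm` flag stands in for the `OutStreamer->isVerboseAsm()` condition mentioned above.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical source location with an optional inlined-at parent.
struct LocSketch {
  std::string File;
  unsigned Line;
  std::shared_ptr<LocSketch> InlinedAt; // call site this code was inlined at
};

// Builds the comment text; the inlined-at chain is only walked when verbose
// assembly output is requested, so normal compiles skip the extra work.
std::string describe(const LocSketch &Loc, bool VerboseAsm) {
  std::string Out = Loc.File + ":" + std::to_string(Loc.Line);
  if (VerboseAsm && Loc.InlinedAt)
    Out += " @[ " + describe(*Loc.InlinedAt, VerboseAsm) + " ]";
  return Out;
}
```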
2025-03-07Revert "[MC] output inlined-at debug info (#106230)"Nikita Popov
This reverts commit f3dc358953a13caf7521fc615a08f6317930351c. This causes a large compile-time regression: https://llvm-compile-time-tracker.com/compare.php?from=267403442264959f6b06e227ff450c385f4b3ef2&to=f3dc358953a13caf7521fc615a08f6317930351c&stat=instructions:u
2025-03-06[MC] output inlined-at debug info (#106230)Yaxun (Sam) Liu
Currently MC prints the source location of instructions in comments in assembly when debug info is available; however, it does not include inlined-at locations when a function is inlined. For example, function foo is defined in header file a.h and is called multiple times in b.cpp. If foo is inlined, current assembly will only show its instructions with their line numbers in a.h. With inlined-at locations, the assembly will also show where foo is called in b.cpp. This patch adds inlined-at locations to the comments by using DebugLoc::print. It makes the printed source location info consistent with those printed by machine passes.
2025-01-13[aarch64][win] Update Called Globals info when updating Call Site info (#122762)Daniel Paoliello
Fixes the "use after poison" issue introduced by #121516 (see <https://github.com/llvm/llvm-project/pull/121516#issuecomment-2585912395>). The root cause of this issue is that #121516 introduced "Called Global" information for call instructions modeling how "Call Site" info is stored in the machine function, HOWEVER it didn't copy the copy/move/erase operations for call site information. The fix is to rename and update the existing copy/move/erase functions so they also take care of Called Global info.
2025-01-08[LLVM][DWARF] Create debug names entry for non-tu top level DIE (#121856)Alexander Yermolovich
When creating a Type Unit (TU), LLVM attempts to do so optimistically. However, if this fails, it discards the TU state and creates the TU within the Compilation Unit (CU). In such cases, an entry for the top-level DIE is not created in the debug names table. This can cause issues when running llvm-dwarfdump --debug-names --verify, as the missing entry will result in verification failure. To address this issue, this patch adds a call to the updateAcceleratorTables when TU creation fails. This ensures that the debug names table is updated correctly, even in cases where TU creation fails.
2024-11-26[DebugInfo] Handle trailing empty blocks when seeking prologue_end spot ↵Jeremy Morse
(#117320) The optimiser will produce empty blocks that are unconditionally executed according to the CFG -- while it may not be meaningful code, and won't get a prologue_end position, we need to not crash on this input. The fault comes from assuming that there's always a next block with some instructions in it, that will eventually produce some meaningful control flow to stop at -- in the given reproducer in issue #117206 this isn't true, because the function terminates with `unreachable`. Thus, I've refactored the "get next instruction logic" into a helper that'll step through all blocks and terminate if there aren't any more. Reproducer from aeubanks
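A sketch of such a helper, assuming blocks are visited in layout order and instructions are plain integers. `Block`, `Position`, and `nextInstruction` are hypothetical; the real helper works on machine basic blocks and their successors.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

using Block = std::vector<int>;                       // toy instruction list
using Position = std::pair<std::size_t, std::size_t>; // (block, instruction)

// Returns the position after Cur, stepping over empty blocks, or nothing
// when no instruction remains (for example only empty trailing blocks).
std::optional<Position> nextInstruction(const std::vector<Block> &Blocks,
                                        Position Cur) {
  std::size_t B = Cur.first;
  std::size_t I = Cur.second + 1;
  while (B < Blocks.size()) {
    if (I < Blocks[B].size())
      return Position(B, I);
    ++B;   // current block exhausted or empty: move to the next one
    I = 0;
  }
  return std::nullopt; // ran off the end of the function
}
```

Returning an empty optional, rather than assuming a following non-empty block exists, is what keeps the search from crashing on trailing empty blocks.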
2024-11-14[DebugInfo] Don't pick prologue_end if there are no instructionsJeremy Morse
Add a filter to avoid picking prologue_end when a function is empty (it may have blocks but no instructions). This saves us from pushing more validity-checking into findPrologueEndLoc.
2024-11-14Reapply ccddb6ffad1, "Emit a worst-case prologue_end"Jeremy Morse
In 39b2979a4 Pavel has kindly refined the implementation of a test in such a way that it doesn't trip up over this patch -- the test wishes to stimulate LLDB's presentation of line0 locations, rather than wanting to always step on line-zero on entry to artificial_location.c. As that's what was tripping up this change, reapply. Original commit message follows. [DWARF] Emit a worst-case prologue_end flag for pathological inputs (#107849) prologue_end usually indicates where the end of the function-initialization lies, and is where debuggers usually choose to put the initial breakpoint for a function. Our current algorithm piggy-backs it on the first available source-location: which doesn't necessarily have anything to do with the start of the function. To avoid this in heavily-optimised code that lacks many useful source locations, pick a worst-case "if all else fails" prologue_end location, of the first instruction that appears to do meaningful computation. It'll be given the function-scope line number, which should run-on from the start of the function anyway. This means if your code is completely inverted by the optimiser, you can at least put a breakpoint at the _start_ like you expect, even if it's difficult to then step through. This patch also attempts to preserve some good behaviour we have without optimisations -- at O0, if the prologue immediately falls into a loop body without any computation happening, then prologue_end lands at the start of that loop. This is desirable; but does mean we need to do more work to detect and support those situations.
2024-11-13[DebugInfo][DWARF] Emit Per-Function Line Table Offsets and End Sequences (#110192)alx32
**Summary** This patch introduces a new compiler option `-mllvm -emit-func-debug-line-table-offsets` that enables the emission of per-function line table offsets and end sequences in DWARF debug information. This enhancement allows tools and debuggers to accurately attribute line number information to their corresponding functions, even in scenarios where functions are merged or share the same address space due to optimizations like Identical Code Folding (ICF) in the linker.
**Background** RFC: [New DWARF Attribute for Symbolication of Merged Functions](https://discourse.llvm.org/t/rfc-new-dwarf-attribute-for-symbolication-of-merged-functions/79434) Previous similar PR: [#93137](https://github.com/llvm/llvm-project/pull/93137) – This PR was very similar to the current one but at the time, the assembler had no support for emitting labels within the line table. That support was added in PR [#99710](https://github.com/llvm/llvm-project/pull/99710) - and in this PR we use some of the support added in the assembler PR. In the current implementation, Clang generates line information in the `debug_line` section without directly associating line entries with their originating `DW_TAG_subprogram` DIEs. This can lead to issues when post-compilation optimizations merge functions, resulting in overlapping address ranges and ambiguous line information. For example, when functions are merged by ICF in LLD, multiple functions may end up sharing the same address range. Without explicit linkage between functions and their line entries, tools cannot accurately attribute line information to the correct function, adversely affecting debugging and call stack resolution.
**Implementation Details** To address the above issue, the patch makes the following key changes:
* **`DW_AT_LLVM_stmt_sequence` Attribute**: Introduces a new LLVM-specific attribute `DW_AT_LLVM_stmt_sequence` to each `DW_TAG_subprogram` DIE. This attribute holds a label pointing to the offset in the line table where the function's line entries begin.
* **End-of-Sequence Markers**: Emits an explicit DW_LNE_end_sequence after each function's line entries in the line table. This marks the end of the line information for that function, ensuring that line entries are correctly delimited.
* **Assembler and Streamer Modifications**: Modifies the MCStreamer and related classes to support emitting the necessary labels and tracking the current function's line entries. A new flag GenerateFuncLineTableOffsets is added to control this behavior.
* **Compiler Option**: Introduces the `-mllvm -emit-func-debug-line-table-offsets` option to enable this functionality, allowing users to opt-in as needed.
2024-11-12Revert "[DWARF] Emit a worst-case prologue_end flag for pathological inputs (#107849)"Jeremy Morse
This reverts commit bf483ddb42065405e345393e022dc72357ec5a3a. See the PR: there's a test that checks for this behaviour (possibly adaptable), and there's a duplicate line entry too.
2024-11-12[DebugInfo] Don't apply is_stmt on MBB branches that preserve lines (#108251)Stephen Tozer
This patch follows on from the changes made in #105524, by adding an additional heuristic that prevents us from applying the start-of-MBB is_stmt flag when we can see that, for all direct branches to the MBB, the last line stepped on before the branch is the same as the first line of the MBB. This is mainly to prevent certain pathological cases, such as macros that expand to multiple basic blocks that all have the same source location, from giving us repeated steps on the same line. This approach is not comprehensive, since it relies on analyzeBranch to read edges, but the default fallback of applying is_stmt may lead only to useless steps in some cases, rather than skipping useful steps altogether.
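The heuristic above can be sketched as a small predicate. This is an illustration under stated assumptions, not the patch's code: `forceIsStmtAtBlockStart` and its parameters are hypothetical, lines are plain integers, and `AllBranchesAnalyzable` stands in for "analyzeBranch could read every incoming edge".

```cpp
#include <vector>

// Decide whether the first instruction of a block should get the
// start-of-block is_stmt flag.
//   FirstLine             - source line of the block's first instruction.
//   LastLinesBeforeBranch - for each direct branch into the block, the last
//                           line stepped on before that branch.
//   AllBranchesAnalyzable - false if any incoming edge could not be read.
bool forceIsStmtAtBlockStart(unsigned FirstLine,
                             const std::vector<unsigned> &LastLinesBeforeBranch,
                             bool AllBranchesAnalyzable) {
  // Default fallback: when the edges cannot all be read, apply is_stmt.
  // The worst outcome is a useless extra step, not a skipped one.
  if (!AllBranchesAnalyzable)
    return true;
  // If any branch arrives from a different line, keep the flag so that the
  // debugger stops on entry to this block.
  for (unsigned Line : LastLinesBeforeBranch)
    if (Line != FirstLine)
      return true;
  // Every branch preserves the line: skip the flag to avoid repeated steps.
  return false;
}
```

For example, a macro that expands to several blocks sharing line 10 would produce calls like `forceIsStmtAtBlockStart(10, {10, 10}, true)`, which return false in this sketch, so no extra stop is added on that line.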