path: root/lldb/source/Target/ThreadPlanSingleThreadTimeout.cpp
2024-08-28 Disable ThreadPlanSingleThreadTimeout during step over breakpoint (#104532) jeffreytan81
This PR fixes another race condition in https://github.com/llvm/llvm-project/pull/90930. The failure was found by @labath with this log (https://paste.debian.net/hidden/30235a5c/):

```
dotest_wrapper. < 15> send packet: $z0,224505,1#65
...
b-remote.async> < 22> send packet: $vCont;s:p1dcf.1dcf#4c
intern-state GDBRemoteClientBase::Lock::Lock sent packet: \x03
b-remote.async> < 818> read packet: $T13thread:p1dcf.1dcf;name:a.out;threads:1dcf,1dd2;jstopinfo:5b7b226e616d65223a22612e6f7574222c22726561736f6e223a227369676e616c222c227369676e616c223a31392c22746964223a373633317d2c7b226e616d65223a22612e6f7574222c22746964223a373633347d5d;thread-pcs:0000000000224505,00007f4e4302119a;00:0000000000000000;01:0000000000000000;02:0100000000000000;03:0000000000000000;04:9084997dfc7f0000;05:a8742a0000000000;06:b084997dfc7f0000;07:6084997dfc7f0000;08:0000000000000000;09:00d7e5424e7f0000;0a:d0d9e5424e7f0000;0b:0202000000000000;0c:80cc290000000000;0d:d8cc1c434e7f0000;0e:2886997dfc7f0000;0f:0100000000000000;10:0545220000000000;11:0602000000000000;12:3300000000000000;13:0000000000000000;14:0000000000000000;15:2b00000000000000;16:80fbe5424e7f0000;17:0000000000000000;18:0000000000000000;19:0000000000000000;reason:signal;#b9
```

The log shows that an async interrupt "\x03" was sent immediately after the `vCont;s` single step over the breakpoint at address `0x224505` (which was disabled before the vCont), and that the later stop was still at the original PC (0x224505) rather than further along.

The investigation shows that the failure happens when the timeout is short and the async interrupt is sent to lldb-server immediately after the vCont: ptrace() resumes the debuggee and the async interrupt stops it again right away, so the debuggee does not get a chance to execute and move the PC. It therefore enters stop mode immediately at the original PC. `ThreadPlanStepOverBreakpoint` does not expect the PC to stay put and reports a stop at the original place.
To fix this, the PR prevents `ThreadPlanSingleThreadTimeout` from being created during `ThreadPlanStepOverBreakpoint` by introducing a new `SupportsResumeOthers()` method, for which `ThreadPlanStepOverBreakpoint` returns false. This makes sense because we should never resume other threads while stepping over a breakpoint anyway; otherwise other threads might miss the breakpoint.

---------

Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
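The opt-out described above can be pictured with a small sketch. Only the method name `SupportsResumeOthers()` comes from the commit message; the `Plan` classes and the `CanPushTimeoutPlan` helper below are hypothetical stand-ins, not LLDB's actual types or call sites.

```cpp
// Hypothetical stand-in for LLDB's thread plan base class.
struct Plan {
  virtual ~Plan() = default;
  // Assumed default: a plan tolerates other threads being resumed.
  virtual bool SupportsResumeOthers() const { return true; }
};

// Hypothetical stand-in for ThreadPlanStepOverBreakpoint.
struct StepOverBreakpointPlan : Plan {
  // The breakpoint is temporarily disabled during the step, so other
  // threads must stay stopped or they could run past it.
  bool SupportsResumeOthers() const override { return false; }
};

// Hypothetical stand-in for a range-stepping plan that keeps the default.
struct StepOverRangePlan : Plan {};

// Hypothetical helper: a timeout plan is only worth creating when the
// current plan allows the other threads to be resumed later.
bool CanPushTimeoutPlan(const Plan &current) {
  return current.SupportsResumeOthers();
}
```

With this shape, the decision stays with each plan type, and the code that creates the timeout plan needs no knowledge of individual plan classes.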
2024-08-15 Fix single thread stepping timeout race condition (#104195) jeffreytan81
This PR fixes a potential race condition in https://github.com/llvm/llvm-project/pull/90930. The race can happen because the original code set `m_info->m_isAlive = true` **after** the timer thread was created. If a context switch happens and the timer thread checks `m_info->m_isAlive` before the main thread gets a chance to run `m_info->m_isAlive = true`, the timer thread may treat `ThreadPlanSingleThreadTimeout` as not alive and simply exit, so the async interrupt that resumes all threads is never sent (deadlock). The PR fixes the race by initializing all state **before** the worker timer thread is created.

Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
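A minimal sketch of the ordering fix, assuming a shared info object with an "alive" flag. `TimeoutInfo`, `TimerMain`, and `RunTimerOnce` are hypothetical names for illustration and do not mirror the LLDB source; the point is only that the flag is written before the worker thread exists.

```cpp
#include <atomic>
#include <memory>
#include <thread>

// Hypothetical stand-in for the state shared with the timer thread.
struct TimeoutInfo {
  std::atomic<bool> is_alive{false};
  std::atomic<bool> interrupt_sent{false};
};

// Worker: exits early if the plan is not alive, otherwise records that
// it would send the async interrupt.
void TimerMain(std::shared_ptr<TimeoutInfo> info) {
  if (!info->is_alive)
    return; // with the old ordering this branch could be taken by accident
  info->interrupt_sent = true;
}

// Initializes all state first, then creates the worker, so the worker
// can never observe is_alive == false at startup.
bool RunTimerOnce() {
  auto info = std::make_shared<TimeoutInfo>();
  info->is_alive = true;              // 1. initialize state
  std::thread timer(TimerMain, info); // 2. only then start the thread
  timer.join();
  return info->interrupt_sent;
}
```

Swapping steps 1 and 2 would reintroduce the window in which the worker reads the flag before it is set.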
2024-08-11 [lldb] Silence warning Alexandre Ganea
This fixes:

```
[6831/7617] Building CXX object tools\lldb\source\Target\CMakeFiles\lldbTarget.dir\ThreadPlanSingleThreadTimeout.cpp.obj
C:\src\git\llvm-project\lldb\source\Target\ThreadPlanSingleThreadTimeout.cpp(66) : warning C4715: 'lldb_private::ThreadPlanSingleThreadTimeout::StateToString': not all control paths return a value
```
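For context, this kind of warning typically appears when a function returns from every `case` of a `switch` over an enum but has nothing after the switch. The sketch below shows one common way to silence it; the enum values are hypothetical, and the actual commit may use a different mechanism (LLVM code often uses `llvm_unreachable` here).

```cpp
#include <cstdlib>
#include <string>

// Hypothetical states; the real enum in LLDB may differ.
enum class State { WaitTimeout, AsyncInterrupt, Done };

std::string StateToString(State state) {
  switch (state) {
  case State::WaitTimeout:
    return "WaitTimeout";
  case State::AsyncInterrupt:
    return "AsyncInterrupt";
  case State::Done:
    return "Done";
  }
  // Not reached for valid enum values. Ending in a noreturn call means
  // no control path falls off the end of the function.
  std::abort();
}
```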
2024-08-06 Fix ASAN failure in TestSingleThreadStepTimeout.py (#102208) jeffreytan81
This PR fixes the ASAN failure in https://github.com/llvm/llvm-project/pull/90930. The original PR assumed that the lifetime of the parent `ThreadPlanStepOverRange` would always be longer than that of the `ThreadPlanSingleThreadTimeout` leaf plan, so it passed `m_timeout_info` to the leaf plan by reference. The ASAN failure suggests that this assumption may not hold (likely because the thread stack is holding a strong reference to the leaf plan). This PR fixes the lifetime issue by using a shared pointer instead of passing by reference.

---------

Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
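The lifetime fix can be illustrated with a small sketch. `TimeoutInfo`, `ParentPlan`, and `LeafPlan` are hypothetical stand-ins rather than the LLDB classes; the sketch only shows why shared ownership stays valid when the leaf outlives its parent, where a reference would dangle.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for the shared timeout state.
struct TimeoutInfo {
  bool is_alive = false;
};

// Hypothetical parent plan: owns the info through a shared_ptr.
struct ParentPlan {
  std::shared_ptr<TimeoutInfo> info = std::make_shared<TimeoutInfo>();
};

// Hypothetical leaf plan: holds its own shared_ptr copy, so the info
// stays valid even after the parent is destroyed.
struct LeafPlan {
  explicit LeafPlan(std::shared_ptr<TimeoutInfo> i) : info(std::move(i)) {
    info->is_alive = true;
  }
  std::shared_ptr<TimeoutInfo> info;
};

// Builds a leaf whose parent goes out of scope first. With a
// TimeoutInfo& member instead, the returned leaf would dangle.
std::unique_ptr<LeafPlan> MakeLeafThatOutlivesParent() {
  ParentPlan parent;
  auto leaf = std::make_unique<LeafPlan>(parent.info);
  return leaf; // parent is destroyed here; leaf keeps the info alive
}
```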
2024-08-05 New ThreadPlanSingleThreadTimeout to resolve potential deadlock in single thread (#90930) jeffreytan81
This PR introduces a new `ThreadPlanSingleThreadTimeout` that will be used to address potential deadlock during single-thread stepping.

While debugging a target with a non-trivial number of threads (around 5000 threads in one example target), we noticed that a simple step over can take as long as 10 seconds. Enabling single-thread stepping mode significantly reduces the stepping time to around 3 seconds. However, this can introduce deadlock if we try to step over a method that depends on other threads to release a lock.

To address this issue, we introduce a new `ThreadPlanSingleThreadTimeout` that can be controlled by the `target.process.thread.single-thread-plan-timeout` setting during single-thread stepping mode. The concept involves counting the elapsed time since the last internal stop to detect overall stepping progress. Once a timeout occurs, we assume the target is not making progress due to a potential deadlock, as mentioned above. We then send a new async interrupt, resume all threads, and `ThreadPlanSingleThreadTimeout` completes its task.

To support this design, the major changes made in this PR are:
1. `ThreadPlanSingleThreadTimeout` is popped during every internal stop and reset (re-pushed) to the top of the stack (as a leaf node) during resume. This is achieved by always returning `true` from `ThreadPlanSingleThreadTimeout::DoPlanExplainsStop()` and `ThreadPlanSingleThreadTimeout::MischiefManaged()`.
2. A new thread-specific async interrupt stop is introduced, which can be detected/consumed by `ThreadPlanSingleThreadTimeout`.
3. The clearing of branch breakpoints in the range thread plan has been moved from `DoPlanExplainsStop()` to `ShouldStop()`, as `DoPlanExplainsStop()` is not guaranteed to be called.
The detailed design is discussed in the RFC below:
https://discourse.llvm.org/t/improve-single-thread-stepping/74599

---------

Co-authored-by: jeffreytan81 <jeffreytan@fb.com>