You'd probably need to pull the Unwinder in if you want backtraces, but that part shouldn't be that hard to disentangle. I don't think you'd need much else?

Basing your work on NativeProcess rather than lldb proper would also cut the number of observer processes in half and avoid the context switches between the server and the debugger. That seems more appropriate for a lightweight tool.

Jim

> On Jun 26, 2018, at 12:59 PM, Jim Ingham via lldb-dev <lldb-dev at lists.llvm.org> wrote:
>
> So you aren't planning to print values at all, just stop points (i.e. you are only interested in the line table and function symbols part of DWARF)?
>
> Given what you've described so far, I'm wondering if what you really want is the NativeProcess classes with some symbol-file reading pulled in? Is there anything that you couldn't do from there?
>
> Jim
>
>> On Jun 26, 2018, at 12:48 PM, Zachary Turner <zturner at google.com> wrote:
>>
>> no expression parser or knowledge of any specific programming language.
>>
>> Basically I just mean that the parsing of the native DWARF format itself is in scope, but anything beyond that is out of scope. For symbolication we have things like llvm-symbolizer that already just work and are built on top of LLVM's DWARF parsing code. Similarly, LLDB's type system could be built on top of it as well. Given that I think everyone mostly agrees that unifying on one DWARF parser is a good idea in principle, this would mean no functional change from LLDB's point of view; it would just continue to do exactly what it does regarding parsing C++ expressions and converting these into types that clang understands.
>>
>> It will probably be useful someday to have an expression parser and language-specific type system, but when that comes I don't think we'd want anything radically different from what LLDB already has.
>>
>> On Tue, Jun 26, 2018 at 12:26 PM Jim Ingham <jingham at apple.com> wrote:
>> Just to be clear, by "no clang integration" do you mean "no expression parser" or do you mean something more radical? For instance, adding a TypeSystem and its DWARF parser for C-family languages that uses a different underlying representation than Clang ASTs to store the results would be a lot of work that wouldn't be terribly interesting to lldb. I don't think that's what you meant, but wanted to be sure.
>>
>> Jim
>>
>>> On Jun 26, 2018, at 11:58 AM, Zachary Turner via lldb-dev <lldb-dev at lists.llvm.org> wrote:
>>>
>>> Hi all,
>>>
>>> We have been thinking internally about a lightweight llvm-based ptracer. To address one question up front: the primary way in which this differs from LLDB is that it targets a narrower use case -- there is no scripting support, no clang integration, no dynamic extensibility, no support for running jitted code in the target, and no user interface. We have several use cases internally that call for varying levels of functionality from such a utility, and being able to use as little of the library as is necessary for the given task is important for the scale at which we wish to use it.
>>>
>>> We are still in early discussions and planning, but I think this would be a good addition to the LLVM upstream. Since we’re approaching this as a set of small isolated components, my thinking is to work on this completely upstream, directly under the llvm project (as opposed to making a separate subproject), but I’m open to discussion if anyone feels differently.
>>>
>>> LLDB has solved a lot of the difficult problems needed for such a tool. So in the spirit of code reuse, we think it’s worth trying to componentize LLDB by sinking pieces into LLVM and rebasing LLDB as well as these smaller tools on top of those components, so that smaller tools can reduce code duplication and contribute to the overall health of the code base. At the same time, we think that in doing so we can break things up into more granular pieces, ultimately exposing a larger testing surface and enabling us to create exhaustive tests, giving LLDB more fine-grained testing of important subsystems.
>>>
>>> A good example of this would be LLDB’s DWARF parsing code, which is more featureful than LLVM’s but has kind of evolved in parallel. Sinking this into LLVM would be one early target of such an effort, although over time there would likely be more.
>>>
>>> Anyone have any thoughts / strong opinions on this proposal, or where the code should live? Also, does anyone have any suggestions on things they’d like to see come out of this? Whether it’s a specific new tool, new functionality for an existing tool, an architectural or design change to some existing tool or library, or something else entirely, all feedback and ideas are welcome.
>>>
>>> Thanks,
>>> Zach
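[As context for the symbolication point above: llvm-symbolizer is a thin wrapper over LLVM's symbolize/DWARF libraries, and using them directly looks roughly like the sketch below. "a.out" and the offset are placeholders, and the exact symbolizeCode signature has changed across LLVM releases, so treat this as an approximation rather than a fixed API.]

    // Rough sketch: resolve a module offset to function/file/line using the
    // symbolizer library that llvm-symbolizer is built on. "a.out" and
    // 0x401000 are placeholders; error handling is minimal.
    #include "llvm/DebugInfo/Symbolize/Symbolize.h"
    #include "llvm/Support/Error.h"
    #include "llvm/Support/raw_ostream.h"

    int main() {
      llvm::symbolize::LLVMSymbolizer Symbolizer;
      // Older releases take (module name, uint64_t offset); newer ones take an
      // object::SectionedAddress instead of the raw offset.
      auto InfoOrErr = Symbolizer.symbolizeCode("a.out", 0x401000);
      if (!InfoOrErr) {
        llvm::logAllUnhandledErrors(InfoOrErr.takeError(), llvm::errs(), "error: ");
        return 1;
      }
      llvm::outs() << InfoOrErr->FunctionName << " at " << InfoOrErr->FileName
                   << ":" << InfoOrErr->Line << "\n";
      return 0;
    }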
The various NativeProcess implementations are definitely a good starting point, and I'll probably be looking at them to understand all the ins and outs of each platform. I'm not sure the API / interface we want will be the same, so I don't think we can just copy it all over wholesale, but we can probably reuse a lot of the core logic. Depending on how much of it we end up implementing and how close we get to the current functionality of the NativeProcess classes, this could be another area for code reuse, similar to what I mentioned with the DWARF reading: we could write lots of low-level tests of the tracing functionality specifically, then update the NativeProcess implementations to use this.
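[To make the testing idea above concrete, a low-level test of the tracing layer might look something like the sketch below. The Tracer/StopEvent/StopReason API shown here is entirely hypothetical -- nothing like it exists in LLVM or LLDB today -- and is only meant to illustrate the granularity such tests could have.]

    // Hypothetical API, for illustration only: "Tracer", "StopEvent", and
    // "StopReason" are invented names, not existing LLVM or LLDB classes.
    #include "gtest/gtest.h"

    TEST(TracerTest, BreakpointStopIsReported) {
      // Launch a small test inferior, stopped at its entry point.
      auto TracerOrErr = Tracer::Launch("test-inferior", /*StopAtEntry=*/true);
      ASSERT_TRUE(static_cast<bool>(TracerOrErr));
      Tracer &T = *TracerOrErr;

      // Set a breakpoint on main, resume, and expect exactly one breakpoint stop.
      ASSERT_TRUE(T.SetBreakpointByName("main"));
      ASSERT_TRUE(T.Resume());
      StopEvent Stop = T.WaitForNextStop();
      EXPECT_EQ(Stop.Reason, StopReason::Breakpoint);
      EXPECT_EQ(Stop.SymbolName, "main");
    }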
Yes, that’s what I’ve been thinking about as well.

One thing I’ve been giving a lot of thought to is whether to serialize the handling of trace events. I want to balance the “this is a library and you should be able to get it to work for you no matter what your use case is” aspect with the “you really just don’t want to go there, we know what’s best for you” aspect. Then there’s the fact that not all platforms behave the same, but we’d like a consistent set of expectations that makes it easy to use for everyone.

So I’m leaning towards having the library serialize all trace events, because it’s a nice common denominator that every platform can implement.

To be clear though, I don’t mean that if two processes are being traced simultaneously and A stops followed by B stopping, then the tool will necessarily block before handling B’s stop. I just mean that A’s and B’s stop handlers will be invoked on a single thread (not the threads which are tracing A or B).

So A stops, posts its stop event on the blessed thread, and waits. Then B stops and does the same thing. A’s handler runs, for whatever reason decides it will continue later, saves off the event somewhere, then processes B’s. Later something happens, the tool decides to continue A and signals A’s thread, which wakes up.

I think this kind of design eliminates a large class of race conditions without sacrificing any performance.

LLDB doesn’t currently work like this, but it would be nice not to end up with another split similar to the DWARF split, so I’m curious whether you can think of any fundamental assumptions of LLDB’s architecture that this would violate. That way we’d at least know that it’s possible to use the API in LLDB (assuming it does everything LLDB needs, obviously).

Thoughts?
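[A minimal sketch -- not code from lldb or any existing tool -- of the serialization scheme described above: each traced process has a tracer thread that posts stop events to one dispatcher thread, and every stop handler runs on that single thread.]

    // Per-process tracer threads post stop events to one dispatcher
    // ("blessed") thread; all stop handlers run on that thread.
    #include <condition_variable>
    #include <deque>
    #include <functional>
    #include <mutex>

    struct StopEvent {
      int Pid;                      // which traced process stopped
      std::function<void()> Resume; // wakes the tracer thread for that process
    };

    class StopEventDispatcher {
      std::mutex M;
      std::condition_variable CV;
      std::deque<StopEvent> Queue;

    public:
      // Called from a tracer thread when its process stops.
      void Post(StopEvent E) {
        std::lock_guard<std::mutex> Lock(M);
        Queue.push_back(std::move(E));
        CV.notify_one();
      }

      // Runs forever on the single dispatcher thread, so handlers never race.
      void Run(std::function<void(StopEvent &)> Handler) {
        for (;;) {
          std::unique_lock<std::mutex> Lock(M);
          CV.wait(Lock, [&] { return !Queue.empty(); });
          StopEvent E = std::move(Queue.front());
          Queue.pop_front();
          Lock.unlock();
          Handler(E); // may resume now, or stash the event and resume later
        }
      }
    };

[A handler that doesn't want to continue a process right away can simply hold on to the event and call its Resume callback later, which corresponds to the "saves off the event somewhere" step above.]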
On Wed, 27 Jun 2018 at 01:14, Zachary Turner via lldb-dev <lldb-dev at lists.llvm.org> wrote:
>
> I think this kind of design eliminates a large class of race conditions without sacrificing any performance.

Does this mean that you will always have to have at least two threads (the one doing the tracing and the one where the stop handlers are invoked)? Because if that's true, then I'm not sure I buy the no-performance-sacrifice part. Given that with ptrace (on Linux at least, but I think that holds for some other OSs too) all debugging operations have to happen on a specific thread, if that thread is not the one where the core logic happens, you will have to do a lot of ping-pong to carry out all the debugging operations (read/write registers/memory, set breakpoints, etc.).

Of all the use cases, the one where this matters most may actually be yours -- I'm not sure I understand it fully, but if the goal is to have as little impact as possible on the traced process, then this is going to be a problem, because every microsecond you spend context-switching between these two threads is a microsecond when the target process is not executing.

In lldb-server we avoid these context switches (and race conditions!) by being single-threaded. I think it would be good to keep things this way by having the new API (the lowest layers of it?) accessible in a single-threaded manner, at least on platforms where this is possible (everything except Windows, I guess).
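[For illustration, the single-threaded model looks roughly like the sketch below, assuming Linux/x86-64 (other architectures use PTRACE_GETREGSET, and real code needs far more error handling): the thread that attached is the only one allowed to issue ptrace requests, so it both waits for stops and performs every debug operation itself, with no cross-thread hand-off.]

    // Sketch of a single-threaded ptrace loop (Linux, x86-64). The thread that
    // attached to the inferior must issue every subsequent ptrace request, so
    // the wait loop and all debug operations live on one thread.
    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/user.h>
    #include <sys/wait.h>
    #include <cstdio>

    void TraceLoop(pid_t Pid) {
      int Status = 0;
      // Assumes this thread already attached (PTRACE_ATTACH) or launched the
      // inferior with PTRACE_TRACEME.
      while (waitpid(Pid, &Status, 0) == Pid && !WIFEXITED(Status)) {
        user_regs_struct Regs;
        // All of these requests must come from this thread -- no handing the
        // stop off to another thread without paying for context switches.
        if (ptrace(PTRACE_GETREGS, Pid, nullptr, &Regs) == 0)
          printf("stopped at pc = 0x%llx\n", (unsigned long long)Regs.rip);
        // ... read/write memory, adjust breakpoints, etc. ...
        ptrace(PTRACE_CONT, Pid, nullptr, 0);
      }
    }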
> On Jun 26, 2018, at 5:14 PM, Zachary Turner <zturner at google.com> wrote:
>
> LLDB doesn’t currently work like this, but it would be nice not to end up with another split similar to the DWARF split, so I’m curious whether you can think of any fundamental assumptions of LLDB’s architecture that this would violate. That way we’d at least know that it’s possible to use the API in LLDB (assuming it does everything LLDB needs, obviously).

What you describe is actually pretty much how the lldb driver works. Every time the lower levels of the Process class (e.g. ProcessGDBRemote) notice something interesting happening to the process they are managing, they post an event to the Listener in charge of driving that process. The process is then allowed to go on its way, either remaining stopped or continuing, as appropriate (the event records whether a restart has occurred). The upper levels only find out what happened to the process when they fetch an event off the event queue. For a single process, that serializes the reporting of process state.

As for multiple processes, you can decide whether or not to serialize all the process events using the same mechanism, depending on your use case. In the lldb driver there's one Listener that waits on all processes (the Debugger's listener), and these events all get effectively serialized in its event loop. So if you were just using lldb classes directly, you could trivially implement what you want to achieve.

That being said, I don't think you want to use lldb's process event system for your ptracer. It has a lot of complexity that supports handling reactions to events (breakpoint commands and conditions) which have to operate in the same context as user commands even though they happen before the user has regained control, and which might or might not restart the process out from under you. It also manages the task of concealing the vast majority of stops from the higher-level clients -- for instance, pretending that a single "source line step over" didn't actually require lots of stops and starts. I don't think anything you have described requires handling either of these tasks.
But you could use the general event system to achieve the serialization of reporting without hooking into lldb's private/public state thread system.

Jim
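[A rough sketch of that pattern against lldb's public SB API, for illustration: "a.out" is a placeholder inferior, error checking is omitted, and a real multi-process tool would attach several processes to the same listener.]

    // One listener serializes process state-changed events on a single loop.
    #include "lldb/API/LLDB.h"
    #include <cstdio>

    int main() {
      lldb::SBDebugger::Initialize();
      lldb::SBDebugger Debugger = lldb::SBDebugger::Create();
      Debugger.SetAsync(true); // deliver process events to the listener

      // LaunchSimple hooks the new process up to the debugger's listener.
      lldb::SBTarget Target = Debugger.CreateTarget("a.out");
      lldb::SBProcess Process = Target.LaunchSimple(nullptr, nullptr, ".");
      (void)Process;

      lldb::SBListener Listener = Debugger.GetListener();
      lldb::SBEvent Event;
      bool Done = false;
      while (!Done && Listener.WaitForEvent(/*num_seconds=*/10, Event)) {
        if (!lldb::SBProcess::EventIsProcessEvent(Event))
          continue;
        lldb::StateType State = lldb::SBProcess::GetStateFromEvent(Event);
        lldb::SBProcess P = lldb::SBProcess::GetProcessFromEvent(Event);
        if (State == lldb::eStateStopped) {
          printf("process %llu stopped\n", (unsigned long long)P.GetProcessID());
          P.Continue(); // handle the stop, then resume
        } else if (State == lldb::eStateExited) {
          Done = true;
        }
      }
      lldb::SBDebugger::Terminate();
      return 0;
    }

[This only illustrates the "one listener serializes everything" shape; as noted above, the full Process event machinery carries complexity a lightweight ptracer wouldn't need.]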