Pekka Paalanen
2013-Oct-19 07:14 UTC
[Nouveau] MmioTrace: Using the Instruction Decoder, etc.
On Fri, 18 Oct 2013 00:11:15 +0400 Eugene Shatokhin <euspectre at gmail.com> wrote:

> Hi,
>
> Good to know that!
>
> Yes, it should be faster than page faulting, although I haven't done the
> benchmarking yet. And yes, it is not needed to disable all but one CPU. In
> my current implementation, I use an ordered workqueue to send the data to
> the mmapped output buffer (where they will be read from the user
> space) and that ensures the order of events is kept. May be less than ideal
> but it currently works quite well with network drivers, the performance
> overhead is acceptable there.

Ah, you are not using the ftrace framework nor relayfs? Mmiotrace used to be relayfs at one point and then converted to ftrace.

> A subtle drawback may be that the system sees the memory reads and writes
> made by the code of the driver directly but if the driver uses some other
> kernel functions, it needs to intercept these calls and determine how they
> access the memory of interest. Theoretically, it could be less accurate
> than page fault handling. A page fault happens no matter if the driver
> accesses the memory directly or via strcpy(), for example. I doubt this
> would be a big problem for tracking the accesses to ioremapped memory
> though.
>
> Nevertheless, it is manageable, the system already handles string
> functions, for example, and reports appropriate events. The handlers for
> other functions could be added as well. So this just requires a bit more
> maintenance work.

Are you saying that you intercept function calls, and *never* rely on page faulting? Does that mean that if a driver does the ugly thing and dereferences an iomem pointer directly, you won't catch that? Unfortunately, I think proprietary drivers do such uglies, since they are x86 and x86_64 only where it works. Or they might have the iomem accessor functions inlined.

What I had in mind was to still use page faulting to catch the memory accessing machine instructions, but then use emulation to execute that instruction with the memory address diverted to the real ioremapped region instead of the dummy region given to the driver. Currently for each access, on the page fault, mmiotrace uses single stepping and page table manipulation to let the instruction run for real, and immediately afterwards sets things back to page faulting.

Sorry, I see my terminology was wrong. I don't think we can avoid the page faulting, but I'd like to avoid the single-stepping and page table mangling on the fly. Heh, things are slowly coming back to me.

What do you think, would it still be interesting?

> > Unfortunately, my job exhausts my coding energy, and I haven't even
> > touched mmiotrace in years.
>
> I understand. I have many other responsibilities too. Code to write, bugs
> to fix, etc. ;-)
>
> Well, then, when time permits, I'll try to prepare a prototype so that its
> performance and reliability could be evaluated. Hard to tell what the
> numbers will be before that.
>
> Suggestions, comments and other feedback are welcome of course.
>
> And, by the way, video drivers do not use SSE and similar instructions when
> accessing ioremapped memory, do they?
> Such things are rare in the kernel and usually frowned upon so I opted not
> to handle them so far in KernelStrider.

I don't really know. I guess everything could be possible in proprietary drivers, but you can look at the instruction decoding code in mmiotrace, which digs up the type and size of access and the value. That has been enough so far.

Thanks,
pq

> 2013/10/17 Pekka Paalanen <pq at iki.fi>
>
> > On Mon, 14 Oct 2013 22:45:09 +0400
> > Eugene Shatokhin <euspectre at gmail.com> wrote:
> >
> > > Hi,
> > >
> > > There is an interesting TODO item on MmioTraceDeveloper page:
> > > "kprobes has a generic instruction decoding facility, use that instead of
> > > homebrewn (or KVM), and use emulation instead of page faulting"
> > >
> > > Actually, I have done something similar in one of my systems, KernelStrider
> > > (http://code.google.com/p/kernel-strider/). The system instruments a kernel
> > > module when that module is being loaded. The instrumented code executes
> > > instead of the original one and provides information about the memory
> > > accesses it makes and the functions it calls. These data are sent to user
> > > space for further analysis.
> > >
> > > Currently, I use this system to detect data races in the Linux kernel (and
> > > have found some). I suppose, it could probably be useful to MmioTrace as
> > > well.
> > >
> > > KernelStrider uses an enhanced version of the x86 instruction decoder that
> > > Kprobes use and relies on binary instrumentation rather than on page
> > > faults. So, it can track:
> > > - memory accesses (address and size of the accessed memory as well as the
> > >   access type are recorded)
> > > - function calls (exported functions and callbacks, one can setup pre- and
> > >   post- handlers for these)
> > >
> > > Is there any interest in trying this approach to the task of MmioTrace?
> > >
> > > If so, we can discuss it. When I have time, I could try to create a
> > > prototype based on KernelStrider's core that tracks the memory accesses
> > > Mmiotrace needs.
> > > What do you think?
> >
> > Hi Eugene,
> >
> > that is very interesting! I assume emulating the instructions is
> > not only cleaner, but also faster than page-faulting, right? Maybe
> > even more reliable, perhaps up to the point where we would not need
> > to disable all but one CPU.
> >
> > Unfortunately, my job exhausts my coding energy, and I haven't even
> > touched mmiotrace in years.
> >
> > However, let's see if there are interested people on the mailing
> > lists. I'm CC'ing nouveau, since that is where mmiotrace started,
> > and dri-devel in the hopes to catch other drivers' reverse
> > engineers.
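
The "page fault, then emulate with a diverted address" idea from the reply above can be illustrated with a small user-space sketch in C. It is only a model under stated assumptions: the instruction has already been decoded into a direction, size and value, and the "real ioremapped region" is a plain byte buffer. The names `mmio_mapping`, `decoded_access` and `emulate_access` are hypothetical and are not taken from mmiotrace or KernelStrider.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Result of decoding the faulting instruction (hypothetical layout). */
struct decoded_access {
    bool is_write;
    size_t size;     /* 1, 2, 4 or 8 bytes */
    uint64_t value;  /* value to write, or value read back */
};

/* One traced mapping: the driver only ever sees dummy_base. */
struct mmio_mapping {
    uintptr_t dummy_base;  /* range handed to the driver; every access faults */
    uint8_t *real_base;    /* stand-in for the real ioremapped region */
    size_t len;
};

/*
 * Carry out the access on the real region instead of the dummy one.
 * Returns false if the access does not lie fully inside the mapping.
 * A real implementation would use properly sized volatile MMIO accesses;
 * memcpy on the low bytes of 'value' assumes a little-endian host (x86).
 */
static bool emulate_access(const struct mmio_mapping *m, uintptr_t fault_addr,
                           struct decoded_access *acc)
{
    if (acc->size == 0 || acc->size > sizeof(acc->value))
        return false;
    if (fault_addr < m->dummy_base)
        return false;

    size_t off = fault_addr - m->dummy_base;
    if (off >= m->len || acc->size > m->len - off)
        return false;

    if (acc->is_write) {
        memcpy(m->real_base + off, &acc->value, acc->size);
    } else {
        acc->value = 0;
        memcpy(&acc->value, m->real_base + off, acc->size);
    }
    return true;
}
```

The point of the sketch is that the fault handler itself completes the access, so no single-stepping or page table flipping is needed afterwards; the handler would only have to advance the instruction pointer past the decoded instruction.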
Eugene Shatokhin
2013-Oct-19 13:12 UTC
[Nouveau] MmioTrace: Using the Instruction Decoder, etc.
Hi,

> Ah, you are not using the ftrace framework nor relayfs? Mmiotrace
> used to be relayfs at one point and then converted to ftrace.

Yes, I considered these when I started working on KernelStrider but finally borrowed ideas from Perf and implemented them. A mmapped ring buffer does its job well and has a higher throughput than Ftrace in my case.

> Are you saying that you intercept function calls, and *never* rely
> on page faulting?

The system intercepts both function calls *and* memory operations made by the driver itself. Yes, it never relies on page faulting.

> Does that mean that if a driver does the ugly thing and
> dereferences an iomem pointer directly, you won't catch that?

It will be caught.

What my system actually does is as follows.

When the target kernel module has been loaded into memory but before it has begun its initialization, KernelStrider processes it, function after function. It creates an instrumented variant of each function in the module mapping space and places a jump at the beginning of the original function to point to the instrumented one. After instrumentation is done, the target driver may start executing.

If some original function of the driver contained, say,

    mov 0xabcd(%rax), %rsi
    mov %rbx, 0xbeeffeed(%rsi)

that will be transformed to something like

    lea 0xabcd(%rax), %rbx
    mov %rbx, <local_storage1>
    mov 0xabcd(%rax), %rsi
    lea 0xbeeffeed(%rsi), %rbx
    mov %rbx, <local_storage2>
    mov %rbx, 0xbeeffeed(%rsi)
    ...
    <send the local_storage to the output system>

That is, the address which is about to be accessed is determined and stored in 'local_storage', a special memory structure. At the end of the block of instructions, the information from the local storage is sent to the output system. So the addresses and sizes of the accessed memory areas as well as the types of the accesses (read/write/update) will be available for reading from the user space.

It is actually more complex than that (KernelStrider has to deal with register allocation, relocations and other things) but the principle is as I described.

The function calls are processed too so that we can set our own handlers to execute at the beginning of a function and right before its exit.

Yes, the functions like read[bwql]() and write[bwlq]() are usually inline but they pose no problem: on x86 they compile to ordinary MOV instructions and the like which are handled as I described above.

The instrumented code will access the ioremapped area the same way as the original code would, no need for single-stepping or emulation in this case.

What I wrote in my previous letter is that there is a special case when the target driver uses some non-inline function provided by the kernel proper or by another driver and that function accesses the ioremapped memory area of interest.

KernelStrider needs to track all such functions in order not to miss some memory accesses to that ioremapped area. Perhaps, that's manageable. There are not too many such functions, are there?

> I don't really know. I guess everything could be possible in
> proprietary drivers, but you can look at the instruction decoding
> code in mmiotrace, which digs up the type and size of access and
> the value. That has been enough so far.

Yes, I will take a closer look at that part of MmioTrace, thanks for the pointer.

Regards,
Eugene
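
The special case of non-inline helper functions described above could be handled with a table of pre-handlers, one per intercepted function, each reporting the memory range that the call will touch. The following is a minimal user-space sketch in C of that idea, not KernelStrider's real interface: `call_args`, `pre_handler_fn`, `find_pre_handler` and the event layout are all hypothetical. Only the two function names in the table, `memcpy_fromio` and `memcpy_toio`, refer to real kernel helpers, and the sketch relies solely on their (dst, src, count) argument order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum access_type { ACCESS_READ, ACCESS_WRITE };

struct mem_event {
    uintptr_t addr;
    size_t size;
    enum access_type type;
};

/* Arguments of an intercepted call as a pre-handler would see them. */
struct call_args {
    uintptr_t arg[3];
};

/* A pre-handler fills 'out' and returns the number of events it produced. */
typedef size_t (*pre_handler_fn)(const struct call_args *a,
                                 struct mem_event *out);

/* memcpy_fromio(dst, src, count): reads 'count' bytes starting at src. */
static size_t pre_memcpy_fromio(const struct call_args *a,
                                struct mem_event *out)
{
    assert(out != NULL);
    out[0] = (struct mem_event){ a->arg[1], (size_t)a->arg[2], ACCESS_READ };
    return 1;
}

/* memcpy_toio(dst, src, count): writes 'count' bytes starting at dst. */
static size_t pre_memcpy_toio(const struct call_args *a,
                              struct mem_event *out)
{
    assert(out != NULL);
    out[0] = (struct mem_event){ a->arg[0], (size_t)a->arg[2], ACCESS_WRITE };
    return 1;
}

struct handler_entry {
    const char *name;
    pre_handler_fn pre;
};

/* Every function that may touch the ioremapped area needs an entry here. */
static const struct handler_entry handlers[] = {
    { "memcpy_fromio", pre_memcpy_fromio },
    { "memcpy_toio",   pre_memcpy_toio },
};

/* Returns NULL for functions nobody has written a handler for yet. */
static pre_handler_fn find_pre_handler(const char *name)
{
    for (size_t i = 0; i < sizeof(handlers) / sizeof(handlers[0]); i++)
        if (strcmp(handlers[i].name, name) == 0)
            return handlers[i].pre;
    return NULL;
}
```

The NULL result is where the maintenance burden shows up: any helper that reaches the ioremapped area without an entry in the table produces no events at all.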
Eugene Shatokhin
2013-Oct-19 13:16 UTC
[Nouveau] MmioTrace: Using the Instruction Decoder, etc.
Oh, messed up the registers in the example. Should be like this:

If some original function of the driver contained, say,

    mov 0xabcd(%rax), %rsi
    mov %rdx, 0xbeeffeed(%rsi)

that will be transformed to something like

    lea 0xabcd(%rax), %rbx
    mov %rbx, <local_storage1>
    mov 0xabcd(%rax), %rsi
    lea 0xbeeffeed(%rsi), %rbx
    mov %rbx, <local_storage2>
    mov %rdx, 0xbeeffeed(%rsi)
    ...
    <send the local_storage to the output system>

Regards,
Eugene
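
For readers who prefer C to assembly, the corrected example can be modelled in user space roughly as follows. This is a sketch under assumptions, not KernelStrider code: `local_storage`, `record_access`, `flush_block` and the `output` buffer are hypothetical names, the fixed displacements are dropped, and the first access is modelled as reading a pointer-sized slot that holds the address written by the second access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum access_type { ACCESS_READ, ACCESS_WRITE, ACCESS_UPDATE };

struct mem_event {
    uintptr_t addr;
    size_t size;
    enum access_type type;
};

#define BLOCK_MAX_EVENTS 8
#define OUTPUT_MAX_EVENTS 64

/* Per-block scratch area, filled by the inserted lea/mov pairs. */
struct local_storage {
    struct mem_event ev[BLOCK_MAX_EVENTS];
    size_t count;
};

/* Stand-in for the output system (the mmapped buffer in the real tool). */
struct output {
    struct mem_event ev[OUTPUT_MAX_EVENTS];
    size_t count;
};

/* Models "lea <operand>, %rbx; mov %rbx, <local_storageN>". */
static void record_access(struct local_storage *ls, uintptr_t addr,
                          size_t size, enum access_type type)
{
    assert(ls->count < BLOCK_MAX_EVENTS);
    ls->ev[ls->count++] = (struct mem_event){ addr, size, type };
}

/* Models "<send the local_storage to the output system>". */
static size_t flush_block(struct local_storage *ls, struct output *out)
{
    size_t sent = 0;
    for (size_t i = 0; i < ls->count && out->count < OUTPUT_MAX_EVENTS; i++) {
        out->ev[out->count++] = ls->ev[i];
        sent++;
    }
    ls->count = 0;
    return sent;
}

/* Instrumented form of: load a pointer from *slot, then store value there. */
static void instrumented_block(struct local_storage *ls, struct output *out,
                               uintptr_t *slot, uint64_t value)
{
    /* The address is noted first, then the original load runs unchanged. */
    record_access(ls, (uintptr_t)slot, sizeof(*slot), ACCESS_READ);
    uint64_t *dst = (uint64_t *)*slot;

    /* Same for the store: note the address, then perform the access. */
    record_access(ls, (uintptr_t)dst, sizeof(*dst), ACCESS_WRITE);
    *dst = value;

    /* End of the block: hand everything to the output system at once. */
    flush_block(ls, out);
}
```

As in the assembly version, the original accesses still execute directly against memory; the instrumentation only adds bookkeeping around them.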
Pekka Paalanen
2013-Oct-25 09:08 UTC
[Nouveau] MmioTrace: Using the Instruction Decoder, etc.
On Sat, 19 Oct 2013 17:12:20 +0400 Eugene Shatokhin <euspectre at gmail.com> wrote:

> Hi,
>
> > Ah, you are not using the ftrace framework nor relayfs? Mmiotrace
> > used to be relayfs at one point and then converted to ftrace.
>
> Yes, I considered these when I started working on KernelStrider but finally
> borrowed ideas from Perf and implemented them. A mmapped ring buffer does
> its job well and has a higher throughput than Ftrace in my case.
>
> > Are you saying that you intercept function calls, and *never* rely
> > on page faulting?
>
> The system intercepts both function calls *and* memory operations made by
> the driver itself. Yes, it never relies on page faulting.
>
> > Does that mean that if a driver does the ugly thing and
> > dereferences an iomem pointer directly, you won't catch that?
>
> It will be caught.
>
> What my system actually does is as follows.
>
> When the target kernel module has been loaded into memory but before it has
> begun its initialization, KernelStrider processes it, function after
> function. It creates an instrumented variant of each function in the module
> mapping space and places a jump at the beginning of the original function
> to point to the instrumented one. After instrumentation is done, the target
> driver may start executing.

Oh, that works in a completely different way than I had even imagined, a whole other level of complexity.

<...snip code you corrected in another email>

> That is, the address which is about to be accessed is determined and stored
> in 'local_storage', a special memory structure. At the end of the block of
> instructions, the information from the local storage is sent to the output
> system. So the addresses and sizes of the accessed memory areas as well as
> the types of the accesses (read/write/update) will be available for reading
> from the user space.

Just curious, how do you tell the interesting instructions to instrument apart from the uninteresting ones that do not access mmio areas?
Does it rely on post-processing, that is, you instrument practically everything and then check whether the accessed memory address actually was interesting before sending the data to user space?

> It is actually more complex than that (KernelStrider has to deal with
> register allocation, relocations and other things) but the principle is as
> I described.
>
> The function calls are processed too so that we can set our own handlers to
> execute at the beginning of a function and right before its exit.
>
> Yes, the functions like read[bwql]() and write[bwlq]() are usually inline
> but they pose no problem: on x86 they compile to ordinary MOV instructions
> and the like which are handled as I described above.
>
> The instrumented code will access the ioremapped area the same way as the
> original code would, no need for single-stepping or emulation in this case.

That is very cool, the possibility never even occurred to me.

> What I wrote in my previous letter is that there is a special case when the
> target driver uses some non-inline function provided by the kernel proper
> or by another driver and that function accesses the ioremapped memory area
> of interest.
>
> KernelStrider needs to track all such functions in order not to miss some
> memory accesses to that ioremapped area. Perhaps, that's manageable. There
> are not too many such functions, are there?

I don't really know, and personally I was never even interested, since the page faulting approach was a catch-all method. We could even detect when we hit some access we couldn't handle properly because the instruction decoding was lacking.

I guess to be sure your approach does not miss anything, we'd still need the page faulting setup as a safety net to know when or if something is missed, right? And somehow have the instrumented code circumvent it.

We could use some comments from the real reverse-engineers. I used to be mostly a tool writer.

Thanks,
pq
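
The thread does not say how KernelStrider answers the filtering question. As one possible shape of the "instrument everything, filter afterwards" variant raised above, here is a minimal C sketch that keeps only the recorded events overlapping a known ioremapped range. All names (`mmio_range`, `is_mmio_access`, `filter_events`) are hypothetical, and the list of ranges is assumed to have been collected elsewhere, for instance by a handler on the ioremap call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One ioremapped region of interest. */
struct mmio_range {
    uintptr_t start;
    size_t len;
};

/* A recorded memory access, reduced to what the filter needs. */
struct mem_event {
    uintptr_t addr;
    size_t size;
};

/* True if [addr, addr+size) overlaps the range; written to avoid overflow. */
static bool access_hits_range(const struct mmio_range *r, uintptr_t addr,
                              size_t size)
{
    if (size == 0 || r->len == 0)
        return false;
    if (addr >= r->start)
        return addr - r->start < r->len;
    return r->start - addr < size;
}

static bool is_mmio_access(const struct mmio_range *ranges, size_t nranges,
                           uintptr_t addr, size_t size)
{
    for (size_t i = 0; i < nranges; i++)
        if (access_hits_range(&ranges[i], addr, size))
            return true;
    return false;
}

/* Compacts 'ev' in place, keeping MMIO events only; returns the new count. */
static size_t filter_events(struct mem_event *ev, size_t n,
                            const struct mmio_range *ranges, size_t nranges)
{
    assert(ev != NULL || n == 0);
    size_t kept = 0;
    for (size_t i = 0; i < n; i++)
        if (is_mmio_access(ranges, nranges, ev[i].addr, ev[i].size))
            ev[kept++] = ev[i];
    return kept;
}
```

Such a filter can run either in the kernel before the data reaches the output buffer or in user space afterwards; the former saves buffer bandwidth, the latter keeps the instrumented code path shorter.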