thr3ads.net - Nouveau - [Nouveau] MmioTrace: Using the Instruction Decoder, etc. [Oct 2013]

Pekka Paalanen

2013-Oct-25 09:08 UTC

[Nouveau] MmioTrace: Using the Instruction Decoder, etc.

On Sat, 19 Oct 2013 17:12:20 +0400
Eugene Shatokhin <euspectre at gmail.com> wrote:
> Hi,
> 
> > Ah, you are not using the ftrace framework nor relayfs? Mmiotrace
> > used to be relayfs at one point and then converted to ftrace.
> 
> Yes, I considered these when I started working on KernelStrider but finally
> borrowed ideas from Perf and implemented them. A mmapped ring buffer does
> its job well and has a higher throughput than Ftrace in my case.
> 
> > Are you saying that you intercept function calls, and *never* rely
> > on page faulting?
> 
> The system intercepts both function calls *and* memory operations made by
> the driver itself. Yes, it never relies on page faulting.
> 
> > Does that mean that if a driver does the ugly thing and
> > dereferences an iomem pointer directly, you won't catch that?
> 
> It will be caught.
> 
> What my system actually does is as follows.
> 
> When the target kernel module has been loaded into memory but before it has
> begun its initialization, KernelStrider processes it, function after
> function. It creates an instrumented variant of each function in the module
> mapping space and places a jump at the beginning of the original function
> to point to the instrumented one. After instrumentation is done, the target
> driver may start executing.
Oh, that works in a completely different way than I ever imagined,
a whole other level of complexity.
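
The redirect placed at the start of each original function can be illustrated with a short sketch. It assumes the jump is the common 5-byte x86 near jump (opcode 0xE9 followed by a 32-bit relative displacement); the thread does not confirm that KernelStrider uses exactly this encoding. The function name `encode_jmp_rel32` and the addresses are hypothetical, chosen only for illustration.

```python
import struct

JMP_REL32_OPCODE = 0xE9  # x86 near jump with a 32-bit relative displacement
JMP_REL32_SIZE = 5       # 1 opcode byte + 4 displacement bytes


def encode_jmp_rel32(original_addr: int, instrumented_addr: int) -> bytes:
    """Return the 5 bytes that redirect original_addr to instrumented_addr."""
    # The displacement is relative to the address of the *next* instruction,
    # i.e. the end of the jump itself.
    displacement = instrumented_addr - (original_addr + JMP_REL32_SIZE)
    if not -(1 << 31) <= displacement < (1 << 31):
        raise ValueError("target is out of range for a rel32 jump")
    # "<Bi": little-endian, one unsigned byte, one signed 32-bit integer.
    return struct.pack("<Bi", JMP_REL32_OPCODE, displacement)


if __name__ == "__main__":
    # Hypothetical addresses inside the module mapping space.
    patch = encode_jmp_rel32(0xFFFFFFFFA0001000, 0xFFFFFFFFA0020000)
    print(patch.hex())
```

The range check matters because a rel32 jump reaches only about 2 GiB in either direction, which is presumably one reason to keep the instrumented copies in the module mapping space, close to the originals.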


<...snip code you corrected in another email>
> That is, the address which is about to be accessed is determined and stored
> in 'local_storage', a special memory structure. At the end of the block of
> instructions, the information from the local storage is sent to the output
> system. So the addresses and sizes of the accessed memory areas as well as
> the types of the accesses (read/write/update) will be available for reading
> from the user space.
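
The record-then-flush pattern described in the quoted text can be modelled roughly as below. This is a sketch and not KernelStrider's code: the class `LocalStorage`, its methods and the plain list standing in for the output system are all hypothetical names. Only the idea of storing (address, size, access type) per access and sending the batch at the end of a block comes from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# (address, size, access type); access type is "read", "write" or "update".
AccessRecord = Tuple[int, int, str]


@dataclass
class LocalStorage:
    """Hypothetical scratch area holding the accesses of one code block."""
    pending: List[AccessRecord] = field(default_factory=list)

    def record(self, address: int, size: int, kind: str) -> None:
        # Would run just before each instrumented memory access.
        if kind not in ("read", "write", "update"):
            raise ValueError(f"unknown access type: {kind}")
        self.pending.append((address, size, kind))

    def flush(self, output: List[AccessRecord]) -> None:
        # Would run once at the end of the block of instructions.
        output.extend(self.pending)
        self.pending.clear()


if __name__ == "__main__":
    output: List[AccessRecord] = []  # stands in for the output system
    storage = LocalStorage()
    storage.record(0x1000, 4, "read")
    storage.record(0x1004, 8, "write")
    storage.flush(output)
    print(output)
```

Batching per block means the output system is touched once per block rather than once per instruction.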
Just curious, how do you detect interesting instructions to
instrument from uninteresting instructions that do not access mmio
areas?

Does it rely on post-processing, in that you instrument practically
everything, and then in post-processing you check if the accessed
memory address actually was interesting before sending the data to user
space?
> It is actually more complex than that (KernelStrider has to deal with
> register allocation, relocations and other things) but the principle is as
> I described.
> 
> The function calls are processed too so that we can set our own handlers to
> execute at the beginning of a function and right before its exit.
> 
> Yes, the functions like read[bwql]() and write[bwlq]() are usually inline
> but they pose no problem: on x86 they compile to ordinary MOV instructions
> and the like which are handled as I described above.
> 
> The instrumented code will access the ioremapped area the same way as the
> original code would, no need for single-stepping or emulation in this case.
That is very cool, the possibility never even occurred to me.
> What I wrote in my previous letter is that there is a special case when the
> target driver uses some non-inline function provided by the kernel proper
> or by another driver and that function accesses the ioremapped memory area
> of interest.
> 
> KernelStrider needs to track all such functions in order not to miss some
> memory accesses to that ioremapped area. Perhaps that's manageable. There
> are not too many such functions, are there?
I don't really know, and personally I was never even interested,
since the page faulting approach was a catch-all method. We
could even detect when we hit some access we couldn't handle right
due to lacking instruction decoding.

I guess to be sure your approach does not miss anything, we'd still
need the page faulting setup as a safety net to know when or if
something is missed, right? And somehow have the instrumented code
circumvent it.

We could use some comments from the real reverse-engineers. I used
to be mostly a tool writer.


Thanks,
pq

Eugene Shatokhin

2013-Oct-25 13:19 UTC

Hi,

2013/10/25 Pekka Paalanen <pq at iki.fi>
>
> Just curious, how do you detect interesting instructions to
> instrument from uninteresting instructions that do not access mmio
> areas?
>
As I currently use this for data race detection in general, there is no
need to separate accesses to mmio areas from the accesses to other memory.
The tool just tracks all accesses except those to data on the stack (if
it can tell for sure from the address of the memory area that the data are
on the stack). These are usually not interesting for data race detection in
the kernel anyway.

So, yes, almost all the instructions that may access memory (except some
special instructions as well as MMX, SSE, AVX, ...) are instrumented. For
some instructions, it is easy to determine in advance if they access
memory, so I enhanced the decoder from Kprobes to provide that info. For
other instructions (e.g. CMPXCHG, conditional MOVs), it is determined at
runtime whether they access memory and whether this event should be
reported.

So, currently, it does not handle mmio areas in any special way. I am just
evaluating whether it could be useful to create a tool based on the same
technique for this purpose.

A driver can obtain mmio areas through a few kernel functions. The set of
areas currently obtained this way could be used to filter the accesses
and decide whether to report them or not. So, yes, basically, it is
"instrument everything, filter before reporting to user space".

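The filtering step could look roughly like the sketch below. It assumes the tool learns about each area when the driver maps it (for example via ioremap()) and forgets it when the area is unmapped. The class `MmioFilter` and its method names are hypothetical and do not come from KernelStrider.

```python
from typing import Dict


class MmioFilter:
    """Hypothetical set of currently mapped mmio areas, keyed by start."""

    def __init__(self) -> None:
        self._areas: Dict[int, int] = {}  # start address -> length in bytes

    def add_area(self, start: int, length: int) -> None:
        # Would be called when the driver obtains an mmio area.
        self._areas[start] = length

    def remove_area(self, start: int) -> None:
        # Would be called when the driver releases the area.
        self._areas.pop(start, None)

    def is_interesting(self, address: int, size: int) -> bool:
        # Report an access if it overlaps any tracked area, even partially.
        return any(
            address < start + length and address + size > start
            for start, length in self._areas.items()
        )


if __name__ == "__main__":
    mmio = MmioFilter()
    mmio.add_area(0x1000, 0x100)  # hypothetical 256-byte register window
    for addr in (0x0F00, 0x1000, 0x10FC, 0x1100):
        print(hex(addr), mmio.is_interesting(addr, 4))
```

A linear scan is enough for the handful of areas a typical driver maps; a sorted structure would only matter with many areas.
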
> I guess to be sure your approach does not miss anything, we'd still
> need the page faulting setup as a safety net to know when or if
> something is missed, right? And somehow have the instrumented code
> circumvent it.
>
Page faulting as a safety net... I haven't thought that through yet.

I suppose I'll look at the code first when I have time and try to
understand at least the common ways for a driver to access mmio areas. It
will then be clearer how to make sure we do not lose anything, and whether
that is possible with the techniques KernelStrider uses.


>
> We could use some comments from the real reverse-engineers. I used
> to be mostly a tool writer.
>
Yes, if some experts could share their knowledge of this matter, this would
be most welcome!

Regards,

Eugene

P.S. If you are interested, more info concerning KernelStrider can be found
in my recent talk at LinuxCon Europe. The slides and the notes for them are
available in the "Talks and slides" section on the project page (
https://code.google.com/p/kernel-strider/). This is mostly about data races
though.

Pekka Paalanen

2013-Oct-28 11:30 UTC

On Fri, 25 Oct 2013 17:19:56 +0400
Eugene Shatokhin <euspectre at gmail.com> wrote:
> Hi,
> 
> 2013/10/25 Pekka Paalanen <pq at iki.fi>
> 
...
> > We could use some comments from the real reverse-engineers. I used
> > to be mostly a tool writer.
> >
> 
> Yes, if some experts could share their knowledge of this matter, this would
> be most welcome!
Hi,

I got one comment in IRC, saying that a faster mmiotrace would be
nice, but probably not something he would invest time in.

Looking at it from that point of view, I guess mmiotrace works well enough,
and is "only" a reverse-engineering tool, not something used daily.
That, and the lack of replies here, indicate that rewriting mmiotrace
with your approach might not be worth it. I don't think anyone
opposes the idea, they just have better things to do.

It's all up to you, I believe.

Thanks,
pq
