Matthieu Chase Heimer
2008-Feb-05 07:40 UTC
[dtrace-discuss] DTraceToolkit bug in how inclusive times are calculated
If there is a better place to report this please let me know. When adding the inclusive method execution time it is required to use the complete method signature. The problem is likely to be present in other scripts but I''ve verified it in the Java related scripts that calculate inclusive and exclusive execution times. The problem is caused by tracing overloaded methods, a fairly common occurrence in Java. Given the following Java program: public class Problem { public static void main(String[] args) { action(42); try{ Thread.sleep(2000); } catch(Exception e) {} } public static void action(int x) { action("duke"); try{ Thread.sleep(2000); } catch(Exception e) {} } public static void action(String s) { try{ Thread.sleep(6000); } catch(Exception e) {} } } The relevant output of j_calltime.d would be: Top 10 inclusive method elapsed times (us), PID TYPE NAME TOTAL 1346 method sun/misc/Launcher.<init> 1526459 1346 method sun/misc/Launcher.<clinit> 1527443 1346 method java/lang/ClassLoader.initSystemClassLoader 1530208 1346 method java/lang/ClassLoader.getSystemClassLoader 1530537 1346 method java/net/URL.<init> 1558066 1346 method java/lang/ClassLoader.loadClass 3195875 1346 method java/security/AccessController.doPrivileged 3578758 1346 method java/lang/Thread.sleep 9942087 1346 method Problem.main 9955080 1346 method Problem.action 14025628 Notice how action is listed as having a longer execution time than it''s caller. -- This message posted from opensolaris.org
Brendan Gregg - Sun Microsystems
2008-Feb-06 19:45 UTC
[dtrace-discuss] DTraceToolkit bug in how inclusive times are calculated
G''Day, On Mon, Feb 04, 2008 at 11:40:04PM -0800, Matthieu Chase Heimer wrote:> If there is a better place to report this please let me know.This is best.> When adding the inclusive method execution time it is required to use the complete method signature. The problem is likely to be present in other scripts but I''ve verified it in the Java related scripts that calculate inclusive and exclusive execution times. The problem is caused by tracing overloaded methods, a fairly common occurrence in Java. > > > Given the following Java program: > > public class Problem { > > public static void main(String[] args) { > action(42); > try{ Thread.sleep(2000); } catch(Exception e) {} > } > > public static void action(int x) { > action("duke"); > try{ Thread.sleep(2000); } catch(Exception e) {} > } > > public static void action(String s) { > try{ Thread.sleep(6000); } catch(Exception e) {} > } > > } > > The relevant output of j_calltime.d would be: > Top 10 inclusive method elapsed times (us), > PID TYPE NAME TOTAL > 1346 method sun/misc/Launcher.<init> 1526459 > 1346 method sun/misc/Launcher.<clinit> 1527443 > 1346 method java/lang/ClassLoader.initSystemClassLoader 1530208 > 1346 method java/lang/ClassLoader.getSystemClassLoader 1530537 > 1346 method java/net/URL.<init> 1558066 > 1346 method java/lang/ClassLoader.loadClass 3195875 > 1346 method java/security/AccessController.doPrivileged 3578758 > 1346 method java/lang/Thread.sleep 9942087 > 1346 method Problem.main 9955080 > 1346 method Problem.action 14025628 > > Notice how action is listed as having a longer execution time than it''s caller.Yes - it is expected to have a longer *inclusive* elapsed time. This is one of a number of reasons why understanding inclusive times is difficult. Note that the inclusive time reports delibrately don''t include a "Total" line - since that would add to the confusion. Consider: # ./j_flow.d | grep Problem 0 140987 5335249357945 -> Problem.main 0 140987 5335249357955 -> Problem.action 0 140987 5335249357965 -> Problem.action 0 140987 5335255360139 <- Problem.action 0 140987 5335257370085 <- Problem.action 1 140987 5335259380101 <- Problem.main On line 2 Problem.action(int) begins. Inclusive time will include everything it calls, which is the 2000 ms sleep it calls directly and the 6000 ms sleep a child method calls. Tally 8000 ms for "Problem.action". On line 3 Problem.action(string) begins. It''s inclusive time is 6000 ms. Add that to "Problem.action" and we reach 14000 ms, which is correct. Using full signatures would eliminate this particular confusion, at the expense of more cluttered output. Probably best for a seperate script. It won''t solve similar confusion for recursive functions. I should update Notes/ALLinclusive_notes.txt and explain this in more detail. :) It might be handy to include your example in the notes file, if you don''t mind, since it presents a good case. cheers, Brendan -- Brendan [CA, USA]
Matthieu Chase Heimer
2008-Feb-07 04:21 UTC
[dtrace-discuss] DTraceToolkit bug in how inclusive times are calculated
> Yes - it is expected to have a longer *inclusive* > elapsed time. This is > one of a number of reasons why understanding > inclusive times is difficult. > Note that the inclusive time reports delibrately > don''t include a "Total" > line - since that would add to the confusion. > > Consider: > > # ./j_flow.d | grep Problem > 0 140987 5335249357945 -> Problem.main > 0 140987 5335249357955 -> Problem.action > 0 140987 5335249357965 -> Problem.action > 0 140987 5335255360139 <- Problem.action > 0 140987 5335257370085 <- Problem.action > 1 140987 5335259380101 <- Problem.main > > On line 2 Problem.action(int) begins. Inclusive time > will include everything > it calls, which is the 2000 ms sleep it calls > directly and the 6000 ms > sleep a child method calls. Tally 8000 ms for > "Problem.action". > > On line 3 Problem.action(string) begins. It''s > inclusive time is 6000 ms. > Add that to "Problem.action" and we reach 14000 ms, > which is correct.I get the math but a caller should never have a shorter inclusive time than a callee. The problem example has 2 inclusive times for (lacking the full signature) the same method. Since we have to consider this as a nested method call of the same method then the shorter time should be discarded to get the accurate inclusive time. Right now the second action method is getting counted twice in the aggregation.> Using full signatures would eliminate this particular > confusion, at the > expense of more cluttered output. Probably best for > a separate script.If the method signature is moved to the last column it''s not so bad but I know what you mean. I wish there was a maximum width option for a printf format string.> It won''t solve similar confusion for recursive > functions.Not just recursive, anything reentrant at all. The fix is to count method reentrance using an associative array with the full method sig as the key and only add to the inclusive aggregation when we return to a depth of zero. The only problem is I can''t get this working without getting drops, even if I could get it working the overhead probably wouldn''t be worth it. When you compare the frequency of method overloading to reentrance in the Java APIs I think it''s worth dealing with the overloading but just give a warning for the rest.> I should update Notes/ALLinclusive_notes.txt and > explain this in more > detail. :) It might be handy to include your example > in the notes file, > if you don''t mind, since it presents a good case.It''s all yours, do whatever you want with it (any license). -- This message posted from opensolaris.org
Matthieu Chase Heimer
2008-Feb-07 05:50 UTC
[dtrace-discuss] DTraceToolkit bug in how inclusive times are calculated
So I think I''m managed to make an example that is working correctly even for reentrance. It does complicate the script a little bit though. http://www.osdev.org/dtrace/hotspot_extras/j_calltime2.d Top 10 inclusive method elapsed times (us), PID TYPE TOTAL NAME 1683 method 2370475 sun/misc/Launcher$ExtClassLoader.getExtClassLoader()Lsun/misc/Launcher$ExtClassLoader; 1683 method 2568302 sun/misc/Launcher.<init>()V 1683 method 2570196 sun/misc/Launcher.<clinit>()V 1683 method 2576193 java/lang/ClassLoader.initSystemClassLoader()V 1683 method 2576928 java/lang/ClassLoader.getSystemClassLoader()Ljava/lang/ClassLoader; 1683 method 3252744 java/security/AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;)Ljava/lang/Object; 1683 method 6028414 Problem.action(Ljava/lang/String;)V 1683 method 8037978 Problem.action(I)V 1683 method 10047762 java/lang/Thread.sleep(J)V 1683 method 10068908 Problem.main([Ljava/lang/String;)V -- This message posted from opensolaris.org
Nikita Zinoviev
2008-Feb-07 11:39 UTC
[dtrace-discuss] how to get verbose stability/dependency information
Do you know if there is a way to get stability attributes for registered dtrace probes except dtrace -l -v ? Because as far as I know while registering a usdt provider there is a way to set stability attributes (name, data, dependency) for provider, module, function, name and arguments of the probe. Is there any way of getting all of this? dtrace -l -v doesn''t seem verbose enough.. For instance, how can I get these attributes for a module or function? thanks in advance, Nikita Zinoviev