Matthias Schmidt
2006-Jul-04 12:14 UTC
[dtrace-discuss] dtrace buildin "timestamp" going back in time?
Hi *, I use a rather trivial dtrace-script (see attachment) to hunt for java-monitor contentions in ISV''s apps. The handler for contentions is this:> hotspot$target:::monitor-contended-enter > /* (uintptr_t thread_id, uintptr_t monitor_id, char* obj_class_name, > uintptr_t obj_class_name_len) */ > { > printf("%d;CT;%d;%d\n", timestamp, arg0, arg1 ); > } > > hotspot$target:::monitor-contended-entered > /*(uintptr_t thread_id, uintptr_t monitor_id, char* obj_class_name, uintptr_t obj_class_name_len)*/ > { > self->str_ptr = (char*) copyin(arg2, arg3+1); > self->str_ptr[arg3] = ''\0''; > self->class_name = (string) self->str_ptr; > > printf("%d;CE;%d;%d;%s\n", timestamp, arg0, arg1, self->class_name); > } >The raw output, which will be processed offline, looks like this: 3018202136517052;BEGIN 3018229677616648;TH;6;CompilerThread1 3018229494152592;TH;2;Reference Handler 3018229495259440;WI;2;1894104 3018229495595792;TH;3;Finalizer 3018229496054740;WI;3;1894208 3018229678170532;TH;7;Low Memory Detector 3018229676688168;TH;4;Signal Dispatcher 3018229677195064;TH;5;CompilerThread0 3018244298507680;TH;9;RMI TCP Accept-0 3018244300482232;CT;1;1894416 3018244300898536;CE;1;1894416;[[I 3018244400720780;TH;10;RMI TCP Accept-17366 3018356578278832;WO;2;1894104;java/lang/ref/Reference$Lock 3018356579127120;CT;2;1894208 3018356579163260;CE;2;1894208;java/lang/ref/ReferenceQueue$Lock 3018356585860512;WI;2;1894208 3018356579098684;WO;3;1894208;java/lang/ref/ReferenceQueue$Lock 3018356603605504;WI;3;1894104 3018805004170112;WI;2;1894104 3018804998303816;WO;2;1894208;java/lang/ref/Reference$Lock 3018804998494184;CT;2;1894104 3018804998525684;CE;2;1894104;java/lang/ref/ReferenceQueue$Lock 3018804998461816;WO;3;1894104;java/lang/ref/ReferenceQueue$Lock 3018805028039052;WI;3;1894208 3018933782080160;WO;2;1894104;java/lang/ref/Reference$Lock 3018933782255112;CT;2;1894208 3018933782287476;CE;2;1894208;java/lang/ref/ReferenceQueue$Lock 3018933784279688;WI;2;1894104 3018933792155676;WI;3;1894208 3018933782225876;WO;3;1894208;java/lang/ref/ReferenceQueue$Lock 3019020966062436;WI;13;1896808 3019020975591632;WI;16;1896496 3019020977908256;WI;18;1896288 . . . 3019032896467576;CT;34;1894312 3019032897351972;CE;16;1894312;a2 3019032902037256;CT;16;1894312 3019032902225948;CE;16;1894312;a2 3019032884788452;CE;24;1894312;a2 3019032884986640;CT;24;1894312 . . . 3024598351437028;TH;8;Thread-0 3024598351789520;TH;11;Thread-1 3024598351791816;WI;36;1897744 3024598366141784;WO;36;1897744;java/util/logging/LogManager$Cleaner 3024598366204940;WI;36;1897640 3024598395898764;WO;36;1897640;ah 3024599043545840;END Too badly from time to time some records got dropped, but even worse the buildin variable "timestamp" goes back in time. Obviously this ruins my calculations ... The Box is a T2000: # cat /etc/release Solaris 10 3/05 HW2 s10s_hw2wos_05 SPARC Copyright 2005 Sun Microsystems, Inc. All Rights Reserved. Use is subject to license terms. Assembled 26 September 2005 # uname -a SunOS marco 5.10 Generic_118822-20 sun4v sparc SUNW,Sun-Fire-T200 Do I have to use Sol10U2 to get precision timers? Should I file a Bug? cheers, Matthias -- Matthias Schmidt Tel : (++49) 6227 356 236 Sun Microsystems GmbH Fax : (++49) 6227 356 222 Altrottstr. 31 SAP(WDF03,G5.30) : (++49) 6227 7 61093 D-69190 Walldorf Mobil : (++49) 171 5767209 This message posted from opensolaris.org -------------- next part -------------- A non-text attachment was scrubbed... Name: synchronizer.d Type: application/octet-stream Size: 3940 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/dtrace-discuss/attachments/20060704/2f217dba/attachment.obj>
Roch
2006-Jul-04 14:50 UTC
[dtrace-discuss] dtrace buildin "timestamp" going back in time?
Hi Matthias, Dtrace output is not sorted by time. You may want to output the curthread pointer along with the timestamp and sort out the data prior to analysis. -r Matthias Schmidt writes: > Hi *, > > I use a rather trivial dtrace-script (see attachment) to hunt for java-monitor contentions in ISV''s apps. > > The handler for contentions is this: > > > hotspot$target:::monitor-contended-enter > > /* (uintptr_t thread_id, uintptr_t monitor_id, char* obj_class_name, > > uintptr_t obj_class_name_len) */ > > { > > printf("%d;CT;%d;%d\n", timestamp, arg0, arg1 ); > > } > > > > hotspot$target:::monitor-contended-entered > > /*(uintptr_t thread_id, uintptr_t monitor_id, char* obj_class_name, uintptr_t obj_class_name_len)*/ > > { > > self->str_ptr = (char*) copyin(arg2, arg3+1); > > self->str_ptr[arg3] = ''\0''; > > self->class_name = (string) self->str_ptr; > > > > printf("%d;CE;%d;%d;%s\n", timestamp, arg0, arg1, self->class_name); > > } > > > > The raw output, which will be processed offline, looks like this: > > 3018202136517052;BEGIN > 3018229677616648;TH;6;CompilerThread1 > 3018229494152592;TH;2;Reference Handler > 3018229495259440;WI;2;1894104 > 3018229495595792;TH;3;Finalizer > 3018229496054740;WI;3;1894208 > 3018229678170532;TH;7;Low Memory Detector > 3018229676688168;TH;4;Signal Dispatcher > 3018229677195064;TH;5;CompilerThread0 > 3018244298507680;TH;9;RMI TCP Accept-0 > 3018244300482232;CT;1;1894416 > 3018244300898536;CE;1;1894416;[[I > 3018244400720780;TH;10;RMI TCP Accept-17366 > 3018356578278832;WO;2;1894104;java/lang/ref/Reference$Lock > 3018356579127120;CT;2;1894208 > 3018356579163260;CE;2;1894208;java/lang/ref/ReferenceQueue$Lock > 3018356585860512;WI;2;1894208 > 3018356579098684;WO;3;1894208;java/lang/ref/ReferenceQueue$Lock > 3018356603605504;WI;3;1894104 > 3018805004170112;WI;2;1894104 > 3018804998303816;WO;2;1894208;java/lang/ref/Reference$Lock > 3018804998494184;CT;2;1894104 > 3018804998525684;CE;2;1894104;java/lang/ref/ReferenceQueue$Lock > 3018804998461816;WO;3;1894104;java/lang/ref/ReferenceQueue$Lock > 3018805028039052;WI;3;1894208 > 3018933782080160;WO;2;1894104;java/lang/ref/Reference$Lock > 3018933782255112;CT;2;1894208 > 3018933782287476;CE;2;1894208;java/lang/ref/ReferenceQueue$Lock > 3018933784279688;WI;2;1894104 > 3018933792155676;WI;3;1894208 > 3018933782225876;WO;3;1894208;java/lang/ref/ReferenceQueue$Lock > 3019020966062436;WI;13;1896808 > 3019020975591632;WI;16;1896496 > 3019020977908256;WI;18;1896288 > . > . > . > 3019032896467576;CT;34;1894312 > 3019032897351972;CE;16;1894312;a2 > 3019032902037256;CT;16;1894312 > 3019032902225948;CE;16;1894312;a2 > 3019032884788452;CE;24;1894312;a2 > 3019032884986640;CT;24;1894312 > . > . > . > 3024598351437028;TH;8;Thread-0 > 3024598351789520;TH;11;Thread-1 > 3024598351791816;WI;36;1897744 > 3024598366141784;WO;36;1897744;java/util/logging/LogManager$Cleaner > 3024598366204940;WI;36;1897640 > 3024598395898764;WO;36;1897640;ah > 3024599043545840;END > > Too badly from time to time some records got dropped, but even worse the buildin variable "timestamp" goes back in time. > > Obviously this ruins my calculations ... > > The Box is a T2000: > > # cat /etc/release > Solaris 10 3/05 HW2 s10s_hw2wos_05 SPARC > Copyright 2005 Sun Microsystems, Inc. All Rights Reserved. > Use is subject to license terms. > Assembled 26 September 2005 > # uname -a > SunOS marco 5.10 Generic_118822-20 sun4v sparc SUNW,Sun-Fire-T200 > > Do I have to use Sol10U2 to get precision timers? Should I file a Bug? > > cheers, > > Matthias > -- > Matthias Schmidt Tel : (++49) 6227 356 236 > Sun Microsystems GmbH Fax : (++49) 6227 356 222 > Altrottstr. 31 SAP(WDF03,G5.30) : (++49) 6227 7 61093 > D-69190 Walldorf Mobil : (++49) 171 5767209 > > > This message posted from opensolaris.org > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org
Matthias Schmidt
2006-Jul-04 15:39 UTC
[dtrace-discuss] Re: dtrace buildin "timestamp" going back in time?
Hi Roch, i do print out the Java-Thread-ID (which should be OK since the Native Threads 1:1 mapping) given by the mustang-dtrace probes. Nevertheless i have duplicate and dropped records and timestamps going back in time *in the same thread*. Here''s an example, where the collums are: 1. timestamp 2. CT = Contented Try or CE = Contended Enter 3. Tread-ID 4. Object-ID 5. Resolved Name of 4. (in case of an exit) 3020788127468064;CT;24;1897640 3020788127557864;CE;24;1897640;a2 3020788252864268;CE;24;1897640;a2 So (Java-)Thread 24 want to get into object # 1897640, who''s name is the meaningfull "a2" (without package, I know ...). But he enters twice??? This is just an example of a duplicate record. I have dropped ones as well. Then I see CE records without a CT. I would love get this fixed, since the possibilities for exposing Java coding glitches by overusing the synchronized keyword (just one example) is just great with the combination of and java+dtrace. cheers, Matthias This message posted from opensolaris.org
Wee Yeh Tan
2006-Jul-05 03:08 UTC
[dtrace-discuss] Re: dtrace buildin "timestamp" going back in time?
Matthias, The trace data is collected per CPU at a specified interval (switchrate) so it is possible that the thread might have migrated within the interval. You can try to increase switchrate and bufsize to avoid the drops. -- Just me, Wire ...
Jonathan Adams
2006-Jul-05 16:50 UTC
[dtrace-discuss] Re: dtrace buildin "timestamp" going back in time?
On Tue, Jul 04, 2006 at 08:39:40AM -0700, Matthias Schmidt wrote:> Hi Roch, > > i do print out the Java-Thread-ID (which should be OK since the Native > Threads 1:1 mapping) given by the mustang-dtrace probes. > > Nevertheless i have duplicate and dropped records and timestamps > going back in time *in the same thread*. Here''s an example, where the > collums are: > > 1. timestamp > 2. CT = Contented Try or CE = Contended Enter > 3. Tread-ID > 4. Object-ID > 5. Resolved Name of 4. (in case of an exit) > > 3020788127468064;CT;24;1897640 > 3020788127557864;CE;24;1897640;a2 > 3020788252864268;CE;24;1897640;a2 > > So (Java-)Thread 24 want to get into object # 1897640, who''s name is > the meaningfull "a2" (without package, I know ...). > > But he enters twice???The way dtrace works is that all data is recorded into per-CPU buffers. Once a second, the dtrace consumer copies all of those buffers out to userland, and prints out the data from them. All the records from a given CPU come out in sequence. So your timestamps are the only way to keep order. I don''t see why the above is a "duplicate"; looking at the timestamps: 3020788252864268 -3020788127557864 = 125,306,404 = 125 milliseconds So a tenth of a second has passed. By dropped records, are you actually getting messages like "n drops"? If so, what are they, precisely?> This is just an example of a duplicate record. I have dropped ones as > well. Then I see CE records without a CT.Are you sure CE records are always preceded by CTs? Cheers, - jonathan -- Jonathan Adams, Solaris Kernel Development
Matthias Schmidt
2006-Jul-07 14:49 UTC
[dtrace-discuss] Re: dtrace buildin "timestamp" going back in time?
Hi Jonathan,> The way dtrace works is that all data is recorded into per-CPU buffers. Once > a second, the dtrace consumer copies all of those buffers out to userland, > and prints out the data from them. > > All the records from a given CPU come out in sequence. So your timestamps > are the only way to keep order.I thought that it must work that way - by somehow using the cpu-internal ticks registers. I intentionally used timestamp over walltimestamp to keep the measurement cheap. Maybe I have to change the strategy here ...> > I don''t see why the above is a "duplicate"; looking at the timestamps: > > 3020788252864268 -3020788127557864 = 125,306,404 = 125 milliseconds > > So a tenth of a second has passed. > > By dropped records, are you actually getting messages like "n drops"? If so, > what are they, precisely?nope - I don''t see any "n drops"-messages.> >> This is just an example of a duplicate record. I have dropped ones as >> well. Then I see CE records without a CT. > > Are you sure CE records are always preceded by CTs?According to the mustang dtrace-probe doc''s there can be no CT without a CE. CE (in my lingo) stands for "Contented Enter" and CT for "contended Try". If someone trys to get the monitor he must enter it sometime in the future (exept the thread dies suddely). They need to come in pairs. And they usually do. I''ve verified this with a couple of other programs, but in this case here some records get dropped under load. cheers, Matthias This message posted from opensolaris.org