Kenneth Leibowitz
2005-Nov-15 13:30 UTC
[dtrace-discuss] Oracle 9 process on Sol 10 container, doing a pollsys, using high CPU
We're running a Solaris 10 container with an Oracle 9.2.0.4 database. Every 5-10 minutes an Oracle process shoots up (using 20%+ CPU) and then drops back down, doing a pollsys (seen via dtruss). I tried some of the trace scripts in the DTraceToolkit to see what the process is doing, but without any luck. I also tried the following, but the dtrace process itself goes up to 30% CPU, and then I kill it:

    #!/usr/sbin/dtrace -s
    #pragma D option flowindent

    pid14344::select:entry
    {
            self->follow = 1;
    }

    pid14344:::entry,
    pid14344:::return
    /self->follow/
    {}

    pid14344::select:return
    /self->follow/
    {
            self->follow = 0;
            exit(0);
    }

Any help would be appreciated.

Thanks,
Kenneth

This message posted from opensolaris.org
Enda o'Connor - Sun Microsystems Ireland - Software Engineer
2005-Nov-15 13:51 UTC
[dtrace-discuss] Oracle 9 process on Sol 10 container, doing a pollsys, using high CPU
Hi,

Just out of interest, are you sure that DISM is not configured for Oracle in the local zone?

Enda
Kenneth Leibowitz
2005-Nov-16 05:38 UTC
[dtrace-discuss] Re: Oracle 9 process on Sol 10 container, doing a pollsys, using high CPU
Hi Enda,

I'm not familiar with the DISM term, but how can I check if it's configured for Oracle in the container?

Thank you for your reply.

Regds,
Kenneth
Kenneth Leibowitz
2005-Nov-16 10:33 UTC
[dtrace-discuss] Re: Oracle 9 process on Sol 10 container, doing a pollsys, using high CPU
I have checked and know for sure that DISM has not been set on Oracle.

Regds,
Kenneth
Enda o'Connor - Sun Microsystems Ireland - Software Engineer
2005-Nov-16 11:37 UTC
[dtrace-discuss] Oracle 9 process on Sol 10 container, doing a pollsys, using high CPU
Hi,

Have a look at http://www.sun.com/bigadmin/features/articles/db_in_containers.html

In particular, there would be an ora_dism process running but you would still end up with ISM, i.e. pmap on ora_pmon, for instance, would show ism instead of dism. And you'd have some serious performance problems, maybe similar to what you are seeing. But without any initial info it was just a guess!

As you don't have DISM enabled, maybe the following is a starting point. I am not very familiar with DTrace, so this is only a start; perhaps someone else here can elaborate.

    ~cat oracle.d
    #!/usr/sbin/dtrace -qs

    int x;

    BEGIN
    {
            x = -1;         /* no sqlplus seen yet */
    }

    /* The process that we are interested in */
    proc:::create
    /execname == "sqlplus"/
    {
            x = pid;
    }

    /*
     * Count syscalls made by sqlplus and its descendants. x is a global,
     * so the check also holds in the child processes; a thread-local
     * (self->) flag would only be set in the sqlplus thread itself.
     */
    syscall:::entry
    /x != -1 && progenyof(x)/
    {
            @[probefunc] = count();
    }

I have put /execname == "sqlplus"/ so this will watch all the children of sqlplus. That is, run ./oracle.d in the global zone, then in the local zone run up sqlplus and start the DB. Once you see a CPU spike, kill the script and you'll see the syscall that is generating the highest hit.

As I said, it's not much use; perhaps someone can expand on it to say how to also output the syscalls per process, i.e.

    ora_pmon_*    read      217
    ora_smon_*    read    33254

but at least it will print out the total calls to each syscall, i.e.

    lseek    124
    write    198
    read     202

In general, though, Oracle Statspack is the way to go to diagnose Oracle-specific issues, i.e. http://www.akadia.com/services/ora_statspack_survival_guide.html, and things like iostat and kstat might provide some useful info as to what is going on.

I would gather some stats via Statspack if possible and contact oracle-interest and perhaps zones-interest to get more help.

Enda
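On the open question of breaking the counts out per process: one possible approach is to key the aggregation on both execname and probefunc. The following is only a sketch under assumptions not in the original mail: it filters on the zone name instead of on progeny of sqlplus, it is run from the global zone, and the zone name is passed as the first argument (the script name oracle_by_proc.d and the zone name orazone are hypothetical).

```dtrace
#!/usr/sbin/dtrace -qs

/*
 * Count system calls per (process name, syscall) pair.
 * Assumption: run from the global zone as
 *     ./oracle_by_proc.d orazone
 * where "orazone" is a hypothetical zone name; $$1 expands the first
 * argument as a string.
 */
syscall:::entry
/zonename == $$1/
{
        @calls[execname, probefunc] = count();
}

/* Print one line per pair when the script is interrupted with Ctrl-C. */
END
{
        printa("%-20s %-20s %@d\n", @calls);
}
```

Start it before the spike and interrupt it afterwards; the output is sorted by count, so the busiest pairs appear last.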
Roch Bourbonnais - Performance Engineering
2005-Nov-16 11:52 UTC
[dtrace-discuss] Oracle 9 process on Sol 10 container, doing a pollsys, using high CPU
Did I miss a response to this? In case there was none:

What you describe seems to imply lots of time spent in the kernel. But you enabled only application-level probes, so maybe you don't see a lot of output. Try adding the kernel fbt ones:

    pid14344:::entry,
    pid14344:::return,
    fbt:::entry,
    fbt:::return
    /self->follow/
    {}

That will tell you where the kernel is going.

Lots of CPU consumption in the kernel can sometimes be diagnosed with a time-sampling approach: "lockstat -I sleep 10" or the dtrace profile provider.

Also, with the latest Sun Studio 10 (Compilers and Tools), er_kernel/analyzer is a rather cool UI to track things down.

HTH
-roch
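The profile-provider approach mentioned above could look roughly like the following. This is a minimal sketch, not something posted in the thread: the 997 Hz rate and the 10-second window are arbitrary choices, and it assumes the script is started with -p so that $target is the Oracle process.

```dtrace
#!/usr/sbin/dtrace -qs

/*
 * Sample every CPU at 997 Hz. For profile probes, arg0 is the kernel
 * program counter and is non-zero only when the CPU was running kernel
 * code at the time of the sample.
 * Assumption: run as "./kprof.d -p 14344" (kprof.d is a hypothetical
 * file name; 14344 is the pid from the original post).
 */
profile-997
/arg0 != 0 && pid == $target/
{
        @hot[stack()] = count();
}

/* Stop after 10 seconds; the aggregation is printed on exit. */
tick-10s
{
        exit(0);
}
```

The stacks are printed in ascending order of sample count, so the most frequently sampled kernel code paths are at the end of the output.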
Wee Yeh Tan
2005-Nov-17 01:23 UTC
[dtrace-discuss] Oracle 9 process on Sol 10 container, doing a pollsys, using high CPU
Kenneth,

I agree with Roch that this is likely running in the kernel. You can approach this by tracing for a potentially expensive function when the thread does "select". The following D script should tell you which function used up the most aggregated CPU time within select().

    #!/usr/sbin/dtrace -s

    pid$target::select:entry
    {
            self->trace = 1;
    }

    /* Count each kernel function and note the on-CPU time at entry. */
    fbt:::entry
    /self->trace/
    {
            @cnt[probefunc] = count();
            self->t[stackdepth] = vtimestamp;
    }

    /* Sum the on-CPU time spent in each function (callees included). */
    fbt:::return
    /self->t[stackdepth]/
    {
            @timespent[probefunc] = sum(vtimestamp - self->t[stackdepth]);
            self->t[stackdepth] = 0;
    }

    pid$target::select:return
    /self->trace/
    {
            self->trace = 0;
            printa(@cnt);
            printa(@timespent);
            exit(0);
    }

It might also be useful to see what 'mpstat' is reporting to get an overview.

--
Just me,
Wire ...
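Since dtruss already shows the process in pollsys, a lighter first check than enabling every fbt probe might be to compare elapsed time with on-CPU time for each pollsys call. This is a sketch under assumptions not in the thread: it is run with -p against the Oracle process, and the file name pollsys_time.d is hypothetical.

```dtrace
#!/usr/sbin/dtrace -qs

/*
 * Assumption: run as "./pollsys_time.d -p <pid>" so that $target is
 * the Oracle process seen spiking.
 */
syscall::pollsys:entry
/pid == $target/
{
        self->start = timestamp;        /* wall-clock time, ns */
        self->vstart = vtimestamp;      /* on-CPU time of this thread, ns */
}

syscall::pollsys:return
/self->start/
{
        @elapsed["pollsys elapsed (ns)"] = quantize(timestamp - self->start);
        @oncpu["pollsys on-CPU (ns)"] = quantize(vtimestamp - self->vstart);
        self->start = 0;
        self->vstart = 0;
}
```

Interrupt it with Ctrl-C after a spike. If the two distributions are close, the calls are burning CPU inside pollsys; if elapsed time is much larger, the thread is mostly blocked there and the CPU is probably being consumed elsewhere.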