Kenneth Leibowitz
2005-Nov-15 13:30 UTC
[dtrace-discuss] Oracle 9 process on Sol 10 container, doing a pollsys, using high CPU
We're running a Solaris 10 container with an Oracle 9.2.0.4 database.
Every 5-10 min, an Oracle process shoots up to 20%+ CPU and then drops back
down, doing a pollsys (seen via dtruss). I tried some of the trace scripts
in the DTraceToolkit to see what the process is doing, but without any luck.
I also tried the following, but the dtrace process itself goes up to 30% CPU,
so I kill it:
#!/usr/sbin/dtrace -s
#pragma D option flowindent

pid14344::select:entry
{
        self->follow = 1;
}

pid14344:::entry,
pid14344:::return
/self->follow/
{}

pid14344::select:return
/self->follow/
{
        self->follow = 0;
        exit(0);
}
- any help would be appreciated.
Thanks,
Kenneth
This message posted from opensolaris.org
Enda o'Connor - Sun Microsystems Ireland - Software Engineer
2005-Nov-15 13:51 UTC
[dtrace-discuss] Oracle 9 process on Sol 10 container, doing a pollsys, using high CPU
Hi,
Just out of interest, are you sure that DISM is not configured for Oracle in the local zone?
Enda
Kenneth Leibowitz
2005-Nov-16 05:38 UTC
[dtrace-discuss] Re: Oracle 9 process on Sol 10 container, doing a pollsys, using high CPU
Hi Enda,
I'm not familiar with the term DISM. How can I check whether it's configured for Oracle in the container?
Thank you for your reply.
Regards,
Kenneth
Kenneth Leibowitz
2005-Nov-16 10:33 UTC
[dtrace-discuss] Re: Oracle 9 process on Sol 10 container, doing a pollsys, using high CPU
I have checked and can confirm that DISM has not been set for Oracle.
Regards,
Kenneth
Enda o'Connor - Sun Microsystems Ireland - Software Engineer
2005-Nov-16 11:37 UTC
[dtrace-discuss] Oracle 9 process on Sol 10 container, doing a pollsys, using high CPU
Hi,
Have a look at
http://www.sun.com/bigadmin/features/articles/db_in_containers.html
In particular, there would be an ora_dism process running, but you would
still end up with ISM; i.e. pmap on ora_pmon, for instance, would show ism
instead of dism. And you'd have some serious performance problems, perhaps
similar to what you are seeing.
But without any initial info it was just a guess!
As you don't have DISM enabled, maybe try the following as a starting point.
I am not very familiar with DTrace, so this is only a start; perhaps
someone else here can elaborate.
~cat oracle.d
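#!/usr/sbin/dtrace -qs

/* pid of the sqlplus process whose descendants we count; -1 until seen */
int x;

BEGIN
{
        x = -1;
}

/* The process that we are interested in */
proc:::create
/execname == "sqlplus"/
{
        x = pid;
}

/*
 * Count system calls by name for sqlplus and its descendants.
 * x is a global, so the predicate also matches the child processes;
 * a thread-local flag would only be set in the sqlplus thread itself.
 */
syscall:::entry
/x != -1 && progenyof(x)/
{
        @[probefunc] = count();
}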
Note that I have put /execname == "sqlplus"/ in the predicate, so this will
watch all the children of sqlplus. I.e. run ./oracle.d in the global zone,
then in the local zone start sqlplus and start the DB. Once you see a
CPU spike, kill the script and you'll see which syscall is generating the
highest count.
But as I said, it's not much use on its own; perhaps someone can expand on
it to show how to also output the syscalls per process, i.e.
ora_pmon_*   read     217
ora_smon_*   read   33254
As it stands, it will just print the total number of calls to each
syscall, i.e.
lseek 124
write 198
read 202
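One possible way to get the per-process breakdown asked about above is to key the aggregation on execname as well as probefunc. This is a minimal, untested sketch rather than a definitive script: it reuses the global x from oracle.d, the aggregation name @calls is made up for illustration, and the limit of 20 rows is an arbitrary choice.

```d
#!/usr/sbin/dtrace -qs

/* Sketch only: x holds the sqlplus pid, as in oracle.d; -1 until seen. */
int x;

BEGIN
{
        x = -1;
}

proc:::create
/execname == "sqlplus"/
{
        x = pid;
}

/* Key on process name and syscall name to get one row per pair. */
syscall:::entry
/x != -1 && progenyof(x)/
{
        @calls[execname, probefunc] = count();
}

/* On exit, keep only the 20 busiest (process, syscall) pairs (arbitrary limit). */
END
{
        trunc(@calls, 20);
        printa("%-20s %-16s %@d\n", @calls);
}
```

Stopping the script with Ctrl-C fires the END clause, which prints one line per process name and syscall.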
In general, though, Oracle Statspack is the way to go to diagnose
Oracle-specific issues, e.g.
http://www.akadia.com/services/ora_statspack_survival_guide.html
and tools like iostat and kstat might provide some useful info as to
what is going on.
I would gather some stats via Statspack if possible and contact
oracle-interest and perhaps zones-interest to get more help.
Enda
Roch Bourbonnais - Performance Engineering
2005-Nov-16 11:52 UTC
[dtrace-discuss] Oracle 9 process on Sol 10 container, doing a pollsys, using high CPU
Did I miss a response to this?
In case there was none: what you describe seems to imply
lots of time spent in the kernel. But you enabled only
application-level probes, so maybe you don't see a lot of
output.
Try adding the kernel fbt ones:
> pid14344:::entry,
> pid14344:::return,
> fbt:::entry,
> fbt:::return
> /self->follow/
> {}
That will tell you where the kernel is going.
Lots of CPU consumption in the kernel can sometimes be
diagnosed with a time-sampling approach: "lockstat -I sleep 10"
or the dtrace profile provider. Also, with the latest Sun
Studio 10 (Compilers and Tools), er_kernel/analyzer is a
rather cool UI to track things down.
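The profile-provider approach mentioned above could look roughly like this. It is a sketch under assumptions, not a tested script: the 997 Hz sampling rate, the 10-second window, and the aggregation name @kstacks are arbitrary choices, and it expects to be run with -p 14344 so that $target is set.

```d
#!/usr/sbin/dtrace -qs

/*
 * Sample at 997 Hz. arg0 is the kernel program counter and is
 * non-zero only when the sample landed in kernel code.
 */
profile-997
/pid == $target && arg0/
{
        @kstacks[stack()] = count();
}

/* Stop after 10 seconds (arbitrary); the aggregation is printed on exit. */
tick-10s
{
        exit(0);
}
```

Since the spike only shows up every 5-10 minutes, the sampling window would need to be started during a spike or lengthened to cover one.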
HTH
-roch
Wee Yeh Tan
2005-Nov-17 01:23 UTC
[dtrace-discuss] Oracle 9 process on Sol 10 container, doing a pollsys, using high CPU
Kenneth,
I agree with Roch that this is likely running in the kernel. You can approach this by tracing for a potentially expensive function when the thread does "select". The following D script should tell you which function used up the most aggregated CPU time within select().

#!/usr/sbin/dtrace -s

/* Run with -p <pid> so that $target is set. */
pid$target::select:entry
{
        self->trace = 1;
}

/* Count each kernel function and record when it was entered. */
fbt:::entry
/self->trace/
{
        @cnt[probefunc] = count();
        self->t[stackdepth] = vtimestamp;
}

/* Sum the on-CPU time between entry and return for each function. */
fbt:::return
/self->t[stackdepth]/
{
        @timespent[probefunc] = sum(vtimestamp - self->t[stackdepth]);
        self->t[stackdepth] = 0;
}

pid$target::select:return
/self->trace/
{
        printa(@cnt);
        printa(@timespent);
        exit(0);
}

It might also be useful to see what 'mpstat' is reporting to get an overview.
--
Just me,
Wire ...