thr3ads.net - dtrace discuss - [dtrace-discuss] Solaris Internals Resource Threshold being hit [Oct 2010]

Robin Cotgrove

2010-Oct-29 16:00 UTC

[dtrace-discuss] Solaris Internals Resource Threshold being hit

I need some assistance and guidance in writing a DTrace script or, even better,
finding an example one which would help me identify what's going on on our
system. Intermittently, and we think it might be happening after about 60 days,
on an E2900 (192GB, 24 cores, Solaris 10 11/06) with a fairly new patch
cluster (Generic_142900-13), we suddenly hit a problem where processes fail to
start with the error message 'resource temporarily unavailable'. This is
leading to Oracle crash/startup issues.

I ran a simple du command at the time it was happening and got the following
response.

du: No more processes: Resource temporarily unavailable

Approximately 6500 TCP connections on the server at the time. 6000 unix
processes. The max UNIX processes per user is set to 29995. 60GB free physical
memory and no swap being used. Absolutely baffling us at the moment.

Not managed to truss a failing command when it happened yet because
it's so intermittent in its nature.

We've checked all the usual suspects, including max processes per user,
and cannot find the cause. Need a way to monitor all the internal kernel
resources to see what we're hitting. Suggestions please on a postcard.
All welcome.
 
Robin Cotgrove
-- 
This message posted from opensolaris.org

Mike Gerdts

2010-Oct-29 16:58 UTC

On Fri, Oct 29, 2010 at 11:00 AM, Robin Cotgrove <robin at rjcnet.co.uk> wrote:
> I need some assistance and guidance in writing a DTrace script or, even
> better, finding an example one which would help me identify what's
> going on on our system. Intermittently, and we think it might be happening
> after about 60 days, on an E2900 (192GB, 24 cores, Solaris 10 11/06) with a
> fairly new patch cluster (Generic_142900-13), we suddenly hit a problem where
> processes fail to start with the error message 'resource temporarily
> unavailable'. This is leading to Oracle crash/startup issues.
>
> I ran a simple du command at the time it was happening and got the following
> response.
>
> du: No more processes: Resource temporarily unavailable

Does anything get logged to /var/adm/messages?

> Approximately 6500 TCP connections on the server at the time. 6000 unix
> processes. The max UNIX processes per user is set to 29995. 60GB free
> physical memory and no swap being used. Absolutely baffling us at the moment.

Swap may not be used, but it is certainly reserved.  Note that Solaris
has multiple definitions of swap.  That disk space you allocated and
called "swap" is one thing.  The overall RAM and swap device backed
address space is another.

Unlike Linux (default config), Solaris does not allow memory to be
overcommitted.  If something tries to malloc() 1 TB (1024 * 1024 * 1024 *
1024 bytes), the call will fail on Solaris unless you have 1 TB of free
"swap" (memory + swap devices).  On Linux, the malloc would likely succeed.
Only when you actually start writing to more pages of the allocated memory
than your system has in RAM + swap devices will the Linux Out of Memory
Killer kick in and start selecting things to kill to free up memory.

We can see this with two runs of /opt/DTT/Mem/swapinfo.d on my
OpenSolaris system.  You can get this for Solaris 10 as part of the
DTraceToolkit.

# /opt/DTT/Mem/swapinfo.d
...
Swap _______Total  2496 MB
Swap         Resv   619 MB
Swap        Avail  1877 MB
Swap    (Minfree)   222 MB

# /opt/DTT/Mem/swapinfo.d
...
Swap _______Total  2224 MB
Swap         Resv  2047 MB
Swap        Avail   176 MB
Swap    (Minfree)   222 MB

One thing I just noticed - minfree does not become 176 MB as I would
have expected.  Be careful with that value!

Why was there such a big difference in Avail?  Because I ran this program:

/* Save as foo.c then compile with gcc -o foo foo.c */
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char **argv) {
	if ( malloc(1024 * 1024 * 1700) == NULL ) {
		perror("malloc");
		exit(1);
	}
	sleep(5);
	exit(0);
}

A likely scenario that would cause a database server to temporarily
reserve a lot more swap is when a new oracle process is created.  When
a process forks, memory is reserved for all of the pages of memory
that are anonymous (i.e. not an mmapped file or device), read-write,
and not shared.  This is required to support the copy-on-write
mechanism used by the virtual memory system.  You can use pmap to take
a look at the memory mappings of a process to get an idea of how much
space this takes.
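
As a rough illustration of the fork-time reservation described above, the
sketch below sums the sizes of the mappings that would need swap reserved when
a process forks. The struct mapping layout and the fork_reservation name are
hypothetical, invented for this example; they are not a Solaris API, and real
figures would have to be read off pmap output by hand or by a script.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one mapping, loosely modelled on a pmap line. */
struct mapping {
	size_t bytes;     /* size of the mapping */
	int    anonymous; /* nonzero if not backed by a file or device */
	int    writable;  /* nonzero if mapped read-write */
	int    shared;    /* nonzero if shared between processes */
};

/*
 * Sum the mappings that are anonymous, read-write and private, i.e. the
 * ones for which swap would have to be reserved to support copy-on-write.
 */
size_t fork_reservation(const struct mapping *maps, size_t n)
{
	size_t total = 0;

	assert(maps != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (maps[i].anonymous && maps[i].writable && !maps[i].shared)
			total += maps[i].bytes;
	}
	return total;
}
```

Shared memory segments and read-only text do not count towards the total in
this sketch, which is why a process with a large shared segment can still be
cheap to fork while one with a large private heap is not.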

To look at the amount of available swap that matters, refer to the
swap column of vmstat.  For things like this that are transient, you
may have trouble seeing it, even with "vmstat 1".  Note that while you
are looking at vmstat output, you should always ignore the first line
of output - it is a pretty much useless average since boot.  If you
need to get values at a higher resolution, you may want to adapt
swapinfo.d from the DTraceToolkit to use the profile provider to
quantize the available swap value.
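
To give an idea of what quantizing the samples means, here is a small C sketch
of power-of-two bucketing in the spirit of DTrace's quantize() aggregating
action. The function names and the 64-bucket layout are assumptions made for
illustration, and folding zero into the first bucket is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 64

/* Index k of the bucket covering [2^k, 2^(k+1)); 0 and 1 both land in bucket 0. */
static unsigned bucket_index(uint64_t value)
{
	unsigned k = 0;

	while (value > 1) {
		value >>= 1;
		k++;
	}
	return k;
}

/* Count how many samples (e.g. available swap in MB) fall into each bucket. */
void quantize_samples(const uint64_t *samples, size_t n,
    uint64_t counts[NBUCKETS])
{
	assert(counts != NULL);
	for (size_t i = 0; i < NBUCKETS; i++)
		counts[i] = 0;
	for (size_t i = 0; i < n; i++)
		counts[bucket_index(samples[i])]++;
}
```

Fed with frequent samples of available swap, a short dip such as the 176 MB
reading above would show up as a count in a low bucket even if it lasted for
only a sample or two.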

> Not managed to truss a failing command when it happened yet because
> it's so intermittent in its nature.
>
> We've checked all the usual suspects, including max processes per
> user, and cannot find the cause. Need a way to monitor all the internal
> kernel resources to see what we're hitting. Suggestions please on a postcard.
> All welcome.

It seems quite likely to me that you will find that the swap that is
available to reserve temporarily dips to a minuscule value.  If this
is the case, adding more swap will help.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/

Jim Mauro

2010-Oct-29 18:27 UTC

Mike is correct. Pretty much every time I've seen this, it's
VM (VM = virtual memory = swap) related.

There's a DTrace script below you can run when you hit this
problem that will show us which system call is failing with an
EAGAIN error. It is most likely fork(2) (and yes, I know printing
the errno in the return action is superfluous given we use it
in the predicate - it's me being OCD and sanity checking).

A second DTrace script further down should provide a kernel
stack trace if it is a fork(2) failure.

Or....(disk is cheap) "swap -a" (add swap space) and see if the
problem goes away.

Thanks
/jim


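#!/usr/sbin/dtrace -s

#pragma D option quiet

/* Note every system call as this thread enters it. */
syscall:::entry
{
	self->flag[probefunc] = 1;
}

/* Report calls that fail with errno 11 (EAGAIN). */
syscall:::return
/self->flag[probefunc] && errno == 11/
{
	printf("syscall: %s, arg0: %d, arg1: %d, errno: %d\n\n",
	    probefunc, arg0, arg1, errno);
}

/* Always release the thread-local flag so that it does not accumulate. */
syscall:::return
/self->flag[probefunc]/
{
	self->flag[probefunc] = 0;
}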


------------------------------------------------------------------------------------------------------------------------

#!/usr/sbin/dtrace -s

#pragma D option quiet

syscall::forksys:entry
{
	self->flag = 1;
	@ks[stack(),ustack()] = count();
}
syscall::forksys:return
/self->flag && arg0 == -1 && errno != 0/
{
	printf("fork failed, errno: %d\n",errno);
	printa(@ks);
	clear(@ks);
	exit(0);
}



Robin Cotgrove

2010-Oct-29 19:50 UTC

Sorry guys. Swap is not the issue. We've had this confirmed by Oracle
and I can clearly see there is 96GB of swap available on the system and ~50GB of
main memory.

Not everything relating to forking problems is swap. We have had a similar
forking issue in the past and solved it by adding a swap file, and in one case
shared memory was being restricted by a Solaris project setting. File
descriptor limits being hit is another good one. Max processes per user is
another common one. All are common reasons. This one is weird and
we don't know what it is.

Like the dtrace scripts though. Very useful to make things a lot clearer for
people to interpret values.

James Litchfield

2010-Oct-29 19:57 UTC

I would start with adding swap. Oracle's swap recommendations are
utterly bogus.

Jim
==

Mike Gerdts

2010-Oct-29 20:45 UTC

On Fri, Oct 29, 2010 at 2:50 PM, Robin Cotgrove <robin at rjcnet.co.uk> wrote:
> Sorry guys. Swap is not the issue. We've had this confirmed by
> Oracle and I can clearly see there is 96GB of swap available on the system
> and ~50GB of main memory.

By who at Oracle?  Not everyone is equally qualified.  I would tend to
trust Jim Mauro (who co-wrote the books[1] on Solaris internals,
performance, & dtrace) over most of the people you will get to through
normal support channels.

1. http://www.amazon.com/Jim-Mauro/e/B001ILM8NC/

How do you know that available swap doesn't momentarily drop?  I've
run into plenty of instances where a system has tens of gigabytes of
free memory but is woefully short on reservable swap (virtual memory,
as Jim approximates).  Usually "vmstat 1" is helpful in observing
spikes, but as I said before this could miss very short spikes.  If
you've already done this to see that swap is unlikely to be an issue,
that would be useful to know.  If you are measuring the amount
of reservable swap with "swap -l", you are doing it wrong.

I do agree that there can be other shortfalls that can cause this.
This may call for speculative tracing of stacks across the fork entry
and return calls, displaying results only when the fork fails with
EAGAIN.  Jim's second script is similar to what I suggest, except that
it doesn't show the code path taken between syscall::forksys:entry and
syscall::forksys:return.

Also, I would be a little careful running the second script as is for
long periods of time if you have a lot of forksys activity with unique
stacks.  I think that as it is @ks may grow rather large over time
because the successful forks are not cleared.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/

Jim Mauro

2010-Oct-29 21:01 UTC

Thanks Mike. Good point on the script.

Indeed, use of speculative tracing would be a better
fit here. I'll see if I can get something together and
send it out.

Thanks,
/jim


Robin Cotgrove

2010-Oct-29 21:23 UTC

> On Fri, Oct 29, 2010 at 2:50 PM, Robin Cotgrove
> <robin at rjcnet.co.uk> wrote:
> > Sorry guys. Swap is not the issue. We've had this confirmed by Oracle
> > and I can clearly see there is 96GB of swap available on the system
> > and ~50GB of main memory.
>
> By who at Oracle?  Not everyone is equally qualified.  I would tend to
> trust Jim Mauro (who co-wrote the books[1] on Solaris internals,
> performance, & dtrace) over most of the people you will get to through
> normal support channels.

Agreed. The normal support channel told us the GUDS script would be a better
way to capture the root cause than producing a memory dump.

> 1. http://www.amazon.com/Jim-Mauro/e/B001ILM8NC/
>
> How do you know that available swap doesn't momentarily drop?

Because I have been monitoring it with vmstat during the issues, and I also
understand the workload on the platform well enough to know that nothing with
huge memory requirements is starting suddenly. This is a VCS cluster with
Oracle Database Resource Groups. DISM is not in use by the various Oracle DBs,
as we ran into a bug with it some months ago. We've since patched the system,
but we don't need DISM on this dev/test Oracle VCS cluster.

> I've run into plenty of instances where a system has tens of gigabytes of
> free memory but is woefully short on reservable swap (virtual memory,
> as Jim approximates).  Usually "vmstat 1" is helpful in observing
> spikes, but as I said before this could miss very short spikes.  If
> you've already done this to see that swap is unlikely to be an issue,
> knowing that would be useful to know.  If you are measuring the amount
> of reservable swap with "swap -l", you are doing it wrong.

Agreed. I don't use it and I don't trust the output from the
top utility either :-)

James Litchfield

2010-Oct-29 21:29 UTC

This is what Oracle says about swap for 11gR2. The comment about
subtracting ISM is not correct. A simple test shows that ISM does consume
swap (even if it's not DISM). Think about what happens when a memory
segment is created (before it goes to ISM), if someone happens to attach
in non-ISM mode, and when everyone detaches from the segment and it
ceases to be ISM. In the first and last stages swap space is *required*,
and the VM system reserves the space needed when the segment is first
created.

I would be cautious about Oracle assurances...

Jim
---
> Go to the following for the full list of available Oracle books:
> http://www.oracle.com/pls/db112/homepage
>
> which links to the 11gR2 DB install guides:
> http://www.oracle.com/pls/db112/portal.portal_db?selected=11&frame=
>
> which links to the following section on memory:
> http://download.oracle.com/docs/cd/E11882_01/install.112/e17163/pre_install.htm#sthref62
>
> ------
> 2.2.1 Memory Requirements
>
> The following are the memory requirements for installing Oracle
> Database 11g Release 2.
>
>     * At least 4 GB of RAM
>
>       To determine the RAM size, enter the following command:
>
>       # /usr/sbin/prtconf | grep "Memory size"
>
>       If the size of the RAM is less than the required size, then you must
>       install more memory before continuing.
>
>     * The following table describes the relationship between installed
>       RAM and the configured swap space recommendation:
>
>       Note:
>       On Solaris, if you use non-swappable memory, like ISM, then you
>       should deduct the memory allocated to this space from the available
>       RAM before calculating swap space.
>
>       RAM                       Swap Space
>       Between 4 GB and 16 GB    Equal to the size of RAM
>       More than 16 GB           16 GB
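
For reference, the quoted table and note boil down to a very small
calculation, sketched here in C. The function name and the use of whole
gigabytes are assumptions made for this example, and the ISM deduction is
included only because the quoted note asks for it; as said above, that
deduction is disputed.

```c
#include <assert.h>

#define MIN_RAM_GB   4   /* minimum RAM stated in the quoted guide */
#define SWAP_CAP_GB 16   /* recommendation stops growing above 16 GB */

/*
 * Swap recommendation in GB as read from the quoted table, or -1 if the
 * machine is below the stated 4 GB minimum. ism_gb is deducted from RAM
 * first, as the quoted note instructs.
 */
long recommended_swap_gb(long ram_gb, long ism_gb)
{
	long effective_gb;

	assert(ism_gb >= 0 && ism_gb <= ram_gb);
	if (ram_gb < MIN_RAM_GB)
		return -1;
	effective_gb = ram_gb - ism_gb;
	return effective_gb > SWAP_CAP_GB ? SWAP_CAP_GB : effective_gb;
}
```

Whatever the inputs, the result never exceeds 16 GB, which is the point of
contention here: the recommendation does not scale with the amount of
anonymous memory that large processes may need reserved.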



Robin Cotgrove

2010-Oct-29 22:37 UTC

> This is what Oracle says about swap for 11gR2. The comment about
> subtracting ISM is not correct. A simple test shows that ISM does consume
> swap (even if it's not DISM). Think about what happens when a memory
> segment is created (before it goes to ISM), if someone happens to attach
> in non-ISM mode, and when everyone detaches from the segment and it
> ceases to be ISM. In the first and last stages swap space is *required*,
> and the VM system reserves the space needed when the segment is first
> created.

I agree with you. In our case disabling the use of DISM really helped to make
the platform more stable and helped with overall memory usage.

By the way, we are using Oracle 10.2.0.4. No use of Oracle 11gR2 yet.

We have 192GB of physical memory and 96GB of swap device. The SGA/PGA sizes of
all the Oracle DBs fit well within the 192GB, leaving a consistent
~50GB spare. Memory consumption stays stable on the platform and
doesn't go up and down. That is the nature of Oracle DBs
allocating memory at start-up.

> I would be cautious about Oracle assurances...

Yep
-- 
This message posted from opensolaris.org
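
Mike's suggestion quoted above (trace the code path between syscall::forksys:entry and syscall::forksys:return speculatively, and keep the output only when the fork fails with EAGAIN) might look roughly like the sketch below. This is not the script Jim posted or promised. The file name forkfail.d, the nspec and specsize values, and the use of every fbt probe are assumptions for illustration; the layout follows the usual speculation()/speculate()/commit()/discard() pattern. Enabling all fbt probes is expensive on a busy machine, so it may be wise to narrow the fbt clause once the failing area is known.

```shell
#!/bin/sh
# Sketch only: writes a hypothetical D script, forkfail.d, into the
# current directory. Only the forksys probes and the EAGAIN condition
# come from the thread; everything else is an assumption.
cat > forkfail.d <<'EOF'
#pragma D option flowindent
/* Assumed tunables: room for several forks in flight at once. */
#pragma D option nspec=32
#pragma D option specsize=1m

syscall::forksys:entry
{
        /* One speculative buffer per fork attempt on this thread. */
        self->spec = speculation();
        speculate(self->spec);
        printf("%s pid=%d", execname, pid);
}

fbt:::
/self->spec/
{
        /* Record kernel function entry/return while inside fork. */
        speculate(self->spec);
}

syscall::forksys:return
/self->spec/
{
        speculate(self->spec);
        trace(errno);
}

syscall::forksys:return
/self->spec && errno == 11/     /* 11 is EAGAIN on Solaris */
{
        /* Fork failed with EAGAIN: keep the recorded code path. */
        commit(self->spec);
        self->spec = 0;
}

syscall::forksys:return
/self->spec && errno != 11/
{
        /* Success or a different error: throw the trace away. */
        discard(self->spec);
        self->spec = 0;
}
EOF
echo "wrote forkfail.d; run it as root with: dtrace -s forkfail.d"
```

Because successful forks are discarded rather than aggregated, nothing accumulates over a long run, which addresses Mike's concern about @ks growing. If speculations are dropped under load, raising nspec or specsize is the first thing to try.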

Phil Harman

2010-Oct-29 22:39 UTC

head link

[dtrace-discuss] Solaris Internals Resource Threshold being hit

+1

I have seen many instances of this. It is trivial to add swap, but I'm
simply tired of the number of times DBAs have protested "we have
enough" or even tried FUD like "Oracle won't support us if we
add more" (yes, I had that one within the last year). Just do it!

As has already been pointed out, Solaris has a swap reservation model.
It's a bit like car insurance: you have to have it to drive on the
road, but you hope you'll never need it. Solaris won't let you
drive underinsured.

On 29 Oct 2010, at 22:29, James Litchfield <jim.litchfield at oracle.com> wrote:
> This is what Oracle says about swap for 11gR2. The comment about subtracting ISM is not
> correct. A simple test shows that ISM does consume swap (even if it's not DISM). Think
> about what happens when a memory segment is created (before it goes to ISM), if someone
> happens to attach in non-ISM mode and when everyone detaches from the segment and it
> ceases to be ISM). In the first and last stage swap space is *required* and the VM system
> reserves the space needed when the segment is first created.
> 
> I would be cautious about Oracle assurances...
> 
> Jim
> ---
> 
>> Go to the following for the full list of available Oracle books.
>> http://www.oracle.com/pls/db112/homepage
>>
>> which links to the 11gR2 install guide
>> Db install guides
>> http://www.oracle.com/pls/db112/portal.portal_db?selected=11&frame=
>>
>> which links to the following section on memory
>> http://download.oracle.com/docs/cd/E11882_01/install.112/e17163/pre_install.htm#sthref62
>>
>> ------
>> 2.2.1 Memory Requirements
>>
>> The following are the memory requirements for installing Oracle Database 11g Release 2.
>>
>>     * At least 4 GB of RAM
>>
>>       To determine the RAM size, enter the following command:
>>
>>       # /usr/sbin/prtconf | grep "Memory size"
>>
>>       If the size of the RAM is less than the required size, then you must
>>       install more memory before continuing.
>>
>>     * The following table describes the relationship between installed
>>       RAM and the configured swap space recommendation:
>>
>>       Note:
>>       On Solaris, if you use non-swappable memory, like ISM, then you
>>       should deduct the memory allocated to this space from the available
>>       RAM before calculating swap space.
>>
>>       RAM                        Swap Space
>>       Between 4 GB and 16 GB     Equal to the size of RAM
>>       More than 16 GB            16 GB
> 
> 
> 
> On 10/29/2010 2:01 PM, Jim Mauro wrote:
>> 
>> Thanks Mike. Good point on the script.
>> 
>> Indeed, use of speculative tracing would be a better
>> fit here. I'll see if I can get something together and 
>> send it out.
>> 
>> Thanks,
>> /jim
>> 
>> On Oct 29, 2010, at 4:45 PM, Mike Gerdts wrote:
>> 
>>> On Fri, Oct 29, 2010 at 2:50 PM, Robin Cotgrove <robin at rjcnet.co.uk> wrote:
>>>> Sorry guys. Swap is not the issue. We've had this confirmed by Oracle and I can
>>>> clearly see there is 96GB of swap available on the system and ~50GB of main memory.
>>> By who at Oracle?  Not everyone is equally qualified.  I would tend to
>>> trust Jim Mauro (who co-wrote the books[1] on Solaris internals,
>>> performance, & dtrace) over most of the people you will get to through
>>> normal support channels.
>>> 
>>> 1. http://www.amazon.com/Jim-Mauro/e/B001ILM8NC/
>>> 
>>> How do you know that available swap doesn't momentarily drop?  I've
>>> run into plenty of instances where a system has tens of gigabytes of
>>> free memory but is woefully short on reservable swap (virtual memory,
>>> as Jim approximates).  Usually "vmstat 1" is helpful in observing
>>> spikes, but as I said before this could miss very short spikes.  If
>>> you've already done this to see that swap is unlikely to be an issue,
>>> knowing that would be useful to know.  If you are measuring the amount
>>> of reservable swap with "swap -l", you are doing it wrong.
>>> 
>>> I do agree that there can be other shortfalls that can cause this.
>>> This may call for speculative tracing of stacks across the fork entry
>>> and return calls, displaying results only when the fork fails with
>>> EAGAIN.  Jim's second script is similar to what I suggest, except that
>>> it doesn't show the code path taken between syscall::forksys:entry and
>>> syscall::forksys:return.
>>> 
>>> Also, I would be a little careful running the second script as is for
>>> long periods of time if you have a lot of forksys activity with unique
>>> stacks.  I think that as it is @ks may grow rather large over time
>>> because the successful forks are not cleared.
>>> 
>>> -- 
>>> Mike Gerdts
>>> http://mgerdts.blogspot.com/
>>> _______________________________________________
>>> dtrace-discuss mailing list
>>> dtrace-discuss at opensolaris.org
>> _______________________________________________
>> dtrace-discuss mailing list
>> dtrace-discuss at opensolaris.org
> 
> 
> -- 
> James Litchfield | Senior Consultant
> Phone: +1 4082237059 | Mobile: +1 4082180790 
> Oracle Oracle ACS
> California 
> Oracle is committed to developing practices and products that help protect the environment

Phil Harman

2010-Oct-29 22:55 UTC

head link

[dtrace-discuss] Solaris Internals Resource Threshold being hit

Oracle often seems to recommend a 1:1 ratio of swap to RAM (which is often not
enough, especially with DISM). You don't even have 1:1.
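
To put numbers on that, here is a small sketch. The first function encodes only the 11gR2 table quoted in this thread (Robin is on 10.2.0.4, and both Jim and Phil argue the table understates what Solaris needs, so treat it as a reading of the document rather than advice). The second gives configured swap as a percentage of RAM. Both function names are made up for illustration.

```shell
#!/bin/sh
# Swap recommendation from the 11gR2 install-guide table quoted in the
# thread. Argument: installed RAM in whole GB. Hypothetical helper name.
oracle_11gr2_swap_gb() {
    ram_gb=$1
    if [ "$ram_gb" -lt 4 ]; then
        echo "below the 4 GB minimum" >&2
        return 1
    elif [ "$ram_gb" -le 16 ]; then
        # Between 4 GB and 16 GB: swap equal to the size of RAM
        echo "$ram_gb"
    else
        # More than 16 GB: 16 GB of swap
        echo 16
    fi
}

# Configured swap as an integer percentage of RAM.
# Arguments: RAM in GB, swap in GB.
swap_pct_of_ram() {
    expr "$2" \* 100 / "$1"
}
```

With the figures Robin gives (192GB of RAM, 96GB of swap device), swap_pct_of_ram 192 96 reports 50, that is, half of the 1:1 ratio mentioned above, while the quoted table would ask for only 16GB.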

Solaris also uses free memory as part of its swap space allocation. Locked
memory, such as ISM/DISM, eats free memory, and so reduces your available swap
further.

You should confirm that DISM is off by running "pmap -x" against a
process from each of your DBs (the shared memory should appear as
"ism").

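A minimal sketch of that check, assuming "pmap -x" labels shared memory segments with a bracketed "[ ism shmid=... ]" or "[ dism shmid=... ]" entry in its last column. That layout and the function name shm_kind are assumptions, so compare against real output from your system before relying on it.

```shell
#!/bin/sh
# Classify shared-memory mappings in `pmap -x` output read from stdin.
# Prints "dism" if any DISM segment is present, "ism" if only ISM
# segments are present, and "none" if neither label appears.
shm_kind() {
    awk '
        /\[ *dism / { dism++ }
        /\[ *ism /  { ism++ }
        END {
            if (dism) print "dism"
            else if (ism) print "ism"
            else print "none"
        }'
}
```

Usage would be something like "pmap -x <pid> | shm_kind" for one process per database instance, run as root or as the process owner. Any result other than "ism" deserves a closer look.
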
Commands like "swap -s" and good ol' "vmstat 5" are
useful for monitoring swap. You should also run "echo ::memstat | mdb
-k" from time to time to get a feel for how your RAM is being used
(on large machines, I've seen it take up to an hour to complete, and it
will hog a CPU for the duration, but it seems to have little other impact on the
system).
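
For logging the "swap -s" figure over time, a sketch along these lines may help. It assumes the usual one-line summary that ends in "<n>k available"; the function name swap_avail_kb is made up for illustration.

```shell
#!/bin/sh
# Extract the "available" figure (in KB) from one line of `swap -s`
# output read on stdin. Prints nothing if the line does not match.
swap_avail_kb() {
    sed -n 's/.* \([0-9][0-9]*\)k available.*/\1/p'
}

# Example polling loop, left commented out because it needs a Solaris host:
# while :; do
#     echo "`date '+%H:%M:%S'` `swap -s | swap_avail_kb`"
#     sleep 1
# done
```

As Mike notes for "vmstat 1", a one-second poll like this can still miss very short dips in available swap, so a clean log does not rule swap out.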

On 29 Oct 2010, at 23:37, Robin Cotgrove <robin at rjcnet.co.uk> wrote:
>> This is what Oracle says about swap for 11gR2. The comment about subtracting ISM is not
>> correct. A simple test shows that ISM does consume swap (even if it's not DISM). Think
>> about what happens when a memory segment is created (before it goes to ISM), if someone
>> happens to attach in non-ISM mode and when everyone detaches from the segment and it
>> ceases to be ISM). In the first and last stage swap space is *required* and the VM system
>> reserves the space needed when the segment is first created.
> 
> I agree with you. In our case disabling the use of DISM really helped to
> make the platform more stable and helped with overall memory usage.
> 
> By the way, we are using Oracle 10.2.0.4. No use of Oracle 11gR2 yet. 
> 
> We have 192GB of physical memory and a 96GB swap device. The SGA/PGA
> sizes of all the Oracle DBs fit well within the 192GB, leaving a
> consistent ~50GB spare. Memory consumption stays stable on the platform and
> doesn't go up and down. This is the nature of the Oracle DBs
> allocating memory at start-up.
> 
>> 
>> I would be cautious about Oracle assurances...
> 
> Yep
>> 
>> Jim
>> ---
> -- 
> This message posted from opensolaris.org
> _______________________________________________
> dtrace-discuss mailing list
> dtrace-discuss at opensolaris.org

James Litchfield

2010-Oct-30 06:09 UTC

head link

[dtrace-discuss] Solaris Internals Resource Threshold being hit

A recent S10 kernel patch *drastically* reduced the time consumed
by ::memstat. On large systems, it will often take just a minute
or two. I just tried it on a lightly loaded 512GB M9K and it was
less than 3 minutes.

Jim
----


On 10/29/10 03:55 PM, Phil Harman wrote:
> Oracle often seems to recommend 1:1 (which is often not enough, especially
> with DISM). You don't even have 1:1.
>
> Solaris also uses free memory as part of its swap space allocation. Locked
> memory, such as ISM/DISM, eats free memory, and so reduces your available swap
> further.
>
> You should confirm that DISM is off by running "pmap -x" against
> a process from each of your DBs (the shared memory should appear as
> "ism").
>
> Commands like "swap -s" and good ol' "vmstat 5" are useful for monitoring
> swap. You should also run "echo ::memstat | mdb -k" from time to time to get
> a feel for how your RAM is being used (on large machines, I've seen it take
> up to an hour to complete, and it will hog a CPU for the duration, but it
> seems to have little other impact on the system).
>
> On 29 Oct 2010, at 23:37, Robin Cotgrove <robin at rjcnet.co.uk> wrote:
>
>>> This is what Oracle says about swap for 11gR2. The comment about subtracting ISM is not
>>> correct. A simple test shows that ISM does consume swap (even if it's not DISM). Think
>>> about what happens when a memory segment is created (before it goes to ISM), if someone
>>> happens to attach in non-ISM mode and when everyone detaches from the segment and it
>>> ceases to be ISM). In the first and last stage swap space is *required* and the VM system
>>> reserves the space needed when the segment is first created.
>> I agree with you. In our case disabling the use of DISM really helped
>> to make the platform more stable and helped with overall memory usage.
>>
>> By the way, we are using Oracle 10.2.0.4. No use of Oracle 11gR2 yet.
>>
>> We have 192GB of physical memory and a 96GB swap device. The SGA/PGA
>> sizes of all the Oracle DBs fit well within the 192GB, leaving a
>> consistent ~50GB spare. Memory consumption stays stable on the platform and
>> doesn't go up and down. This is the nature of the Oracle DBs
>> allocating memory at start-up.
>>
>>> I would be cautious about Oracle assurances...
>> Yep
>>> Jim
>>> ---
>>> -- 
>>> Oracle<http://www.oracle.com>
>>> James Litchfield | Senior Consultant
>>> Phone: +1 4082237059<tel:+1%204082237059>  | Mobile:
>>> +1 4082180790
>>> <tel:+1%204082180790>
>>> Oracle Oracle ACS
>>> California
>>> Green Oracle<http://www.oracle.com/commitment>
>>> Oracle is committed to
>>> developing practices and products that help protect
>>> the environment
>>>
>>> This is what Oracle says about swap for 11gR2. The comment about
>>> subtracting ISM is not correct. A simple test shows that ISM does
>>> consume swap (even if it's not DISM). Think about what happens when a
>>> memory segment is created (before it goes to ISM), if someone happens
>>> to attach in non-ISM mode, and when everyone detaches from the segment
>>> and it ceases to be ISM. In the first and last stages swap space is
>>> *required*, and the VM system reserves the space needed when the
>>> segment is first created.
>>>
>>> I would be cautious about Oracle assurances...
>>>
>>> Jim

Robin Cotgrove

2010-Nov-01 15:38 UTC

[dtrace-discuss] Solaris Internals Resource Threshold being hit

Thanks Jim.

We'll run those two DTrace scripts if and when it happens again. The box has
been rebooted and the issue (which was seen two days in a row) has not
recurred. The workload has not changed. Virtual memory usage stays roughly
flat all the time on the box because of the nature of the workload. The issue
seems to recur every 60 days, so I think something is leaking, but I'm still
convinced it's not system virtual memory.
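
One way to back up the "flat-lined" observation with data is to sample `swap -s` (which reports reservable virtual swap, unlike `swap -l`) and log the "available" figure. The following is only a minimal sketch, assuming bash or ksh and the usual Solaris 10 `swap -s` line format shown in the comment; the function names and the threshold are hypothetical, chosen for illustration.

```shell
# Extract the "available" figure (in KB) from one line of `swap -s` output.
# Assumed line format (Solaris 10):
#   total: 1234k bytes allocated + 567k reserved = 1801k used, 8901k available
swap_available_kb() {
    echo "$1" | sed -n 's/.* \([0-9][0-9]*\)k available.*/\1/p'
}

# Hypothetical helper: classify one `swap -s` line against a threshold in KB.
# Prints "LOW <kb>", "OK <kb>", or "UNKNOWN" if the line could not be parsed.
check_swap_line() {
    line=$1
    threshold_kb=$2
    avail=$(swap_available_kb "$line")
    if [ -z "$avail" ]; then
        echo "UNKNOWN"
    elif [ "$avail" -lt "$threshold_kb" ]; then
        echo "LOW $avail"
    else
        echo "OK $avail"
    fi
}

# Example sampling loop (commented out; needs a Solaris host with swap(1M)).
# The 1048576 KB (1 GB) threshold is an arbitrary illustration.
# while :; do
#     echo "$(date '+%Y-%m-%d %H:%M:%S') $(check_swap_line "$(swap -s)" 1048576)"
#     sleep 1
# done
```

A one-second loop like this can still miss very short spikes, so it complements rather than replaces tracing the failing fork itself.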

Interesting what you say about the ISM/DISM stuff. The way we stopped DISM being
used for Oracle 10g was to set the SGA MAX and TARGET to equal values in the
init.ora etc., and that stopped an Oracle process per SID called
ora_dism_xxxxxxxx from starting up. Things behaved much better once we did that.
We believed we were hitting a known Solaris bug involving a DISM leak. We have
since patched the system but not re-enabled the use of DISM.
-- 
This message posted from opensolaris.org

Enda o'Connor - Sun Microsystems Ireland - Software Engineer

2010-Nov-01 16:21 UTC

[dtrace-discuss] Solaris Internals Resource Threshold being hit

On 11/01/10 15:38, Robin Cotgrove wrote:
> Thanks Jim.
>
> We'll run those 2 dtrace if and when it happens again. Box has been
> rebooted and issue (which was seen 2 days in a row) has not recurred. The
> workload has not changed. The virtual stays ~flat-lined all the time on the
> box because of the nature of the workload. The issue seems to recur every
> 60 days, so I think something is leaking, but I'm still convinced it's not
> system virtual memory.
>
> Interesting what you say about the ISM/DISM stuff. The way we stopped DISM
> being used for Oracle 10g was setting the SGA MAX and TARGET to equal values
> in the init.ora etc.. and that stopped an ORACLE process per sid called
> ora_dism_xxxxxxxx on startup. Things behaved much better once we did that.
> We believed we were experiencing a known Solaris bug around DISM leak. We
> have since patched the system but not re-enabled the use of DISM.

Hi
Just out of interest, could you check the permissions on oradism:

ls -l $OH/bin/oradism

It should be owned by root and have the setuid bit set (the "s" in the owner permissions):
-bash-3.00$ ls -l oradism
-rwsr-x---   1 root     oinstall 1320256 Sep 11 09:48 oradism
-bash-3.00$
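
The ownership and mode check above can also be scripted. This is a rough sketch, assuming bash or ksh; the function name `check_oradism_line` is made up for illustration. It inspects one line of `ls -l` output for a root owner and the setuid bit (the `s` in `-rwsr-x---`).

```shell
# Check one `ls -l` line for root ownership and the setuid bit.
# Prints "ok" or a short description of the first problem found.
check_oradism_line() {
    line=$1
    mode=$(echo "$line" | awk '{ print $1 }')
    owner=$(echo "$line" | awk '{ print $3 }')
    # Character 4 of the mode string is the owner-execute slot;
    # "s" there means setuid plus execute.
    suid=$(echo "$mode" | cut -c4)
    if [ "$owner" != "root" ]; then
        echo "not owned by root (owner: $owner)"
    elif [ "$suid" != "s" ]; then
        echo "setuid bit missing (mode: $mode)"
    else
        echo "ok"
    fi
}

# Usage on a live system ($OH is the Oracle home, as in the command above):
# check_oradism_line "$(ls -l "$OH/bin/oradism")"
```

Parsing a single line keeps the check easy to try against saved `ls -l` output from the affected box.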

I have seen the very odd case of people tarring an OH as the oracle user and 
extracting it as the oracle user, losing root ownership and the setuid bit, 
which means oradism is effectively broken. I haven't read the thread in its 
entirety, but things like performance will certainly suffer. Also, because 
DISM is not pinned entirely in memory at all times, it must be 
backed by swap.
Also, to use ISM, set SGA_MAX_SIZE to be either equal to or smaller than 
the constituents of the entire SGA, not SGA_MAX.

But for DISM, swap is vital though, very very vital in the long run.

If you have the alert logs from when it was running with DISM, grep for 
WARNING and see if any relate to DISM. Also, if trying DISM, check the 
GRANSTATE in the x$ksmge view to see if all allocations are locked.

Also, is this x86 or SPARC, what type of box is it, and are the Oracle 
binaries actually local?

Enda
