Irma Garcia
2006-Aug-11 21:41 UTC
[zfs-discuss] Question on Zones and memory usage (65120349)
Hi All,

Sun Fire V440
Solaris 10
Solaris Resource Manager

The customer wrote the following:

I have a v490 with 4 zones:

tsunami:/#->zoneadm list -iv
  ID NAME     STATUS   PATH
   0 global   running  /
   4 fmstage  running  /fmstage
  12 fmprod   running  /fmprod
  15 fmtest   running  /fmtest

fmtest has a pool assigned to it with access to 2 CPUs. When I run prstat -Z in the fmtest zone I see:

ZONEID    NPROC  SIZE   RSS MEMORY      TIME  CPU ZONE
    15      192  169G  163G   100%   0:29:55  96% fmtest

In the global zone (tsunami), prstat -Z shows:

ZONEID    NPROC  SIZE   RSS MEMORY      TIME  CPU ZONE
    15      188  169G  163G   100%   0:46:00  48% fmtest
     0       54  708M  175M   0.1%   2:23:40 0.1% global
    12       27  112M   51M   0.0%   0:02:48 0.0% fmprod
     4       27  281M   66M   0.0%   0:14:13 0.0% fmstage

Questions:

Does the 100% memory usage in each listing mean that the fmtest zone is using all the memory?
Why do I see a different result for memory usage when I run the top command?
What is the best method to tie a certain percentage of memory to certain zones? rcapd?

Thanks in advance,
Irma
Jeff Victor
2006-Aug-11 23:46 UTC
[zfs-discuss] Question on Zones and memory usage (65120349)
Irma Garcia wrote:
> Does the 100% memory usage on each mean that
> the fmtest zone is using all the memory.

Are they using rcapd? Neither the man page nor a quick skim of the prstat source code at opensolaris.org provides a useful answer. It is not clear whether "all the memory" means "all of the virtual memory" (unlikely), "all of the physical memory," or "all of the memory available to the zone."

> How come when I run the top command I see
> different result for memory usage.

A comparison of the top and prstat source code would be useful, but asking someone familiar with those two programs would probably yield an answer more quickly.

> What is the best method to tie a certain
> percentage of memory to certain zones ? rcapd ??

Yes.

--------------------------------------------------------------------------
Jeff VICTOR              Sun Microsystems            jeff.victor @ sun.com
OS Ambassador            Sr. Technical Specialist
Solaris 10 Zones FAQ:    http://www.opensolaris.org/os/community/zones/faq
--------------------------------------------------------------------------
Jeff Victor
2006-Aug-12 00:28 UTC
[zfs-discuss] Question on Zones and memory usage (65120349)
Follow-up: it looks to me like prstat displays the portion of the system's physical memory in use by the processes in that zone. How much memory does that system have? Something seems amiss, as a V490 can hold up to 32GB, and prstat is showing 163GB of physical memory just for fmtest.
Mike Gerdts
2006-Aug-12 01:24 UTC
[zfs-discuss] Question on Zones and memory usage (65120349)
On 8/11/06, Irma Garcia <Irma.Garcia at sun.com> wrote:
> ZONEID    NPROC  SIZE   RSS MEMORY      TIME  CPU ZONE
>     15      188  169G  163G   100%   0:46:00  48% fmtest
>      0       54  708M  175M   0.1%   2:23:40 0.1% global
>     12       27  112M   51M   0.0%   0:02:48 0.0% fmprod
>      4       27  281M   66M   0.0%   0:14:13 0.0% fmstage
>
> Questions?
> Does the 100% memory usage on each mean that
> the fmtest zone is using all the memory. How
> come when I run the top command I see
> different result for memory usage.

The %mem column is the sum of the %mem that each process uses. Unfortunately, that value seems to include the pages that are shared between many processes (e.g. database files, libc, etc.) without dividing by the number of processes that have that memory mapped. In other words, if you have 50 database processes that have used mmap() on the same 1 GB database, prstat will think that 50 GB of RAM is used when only 1 GB is really used.

I have seen prstat report that Oracle workloads on a 15k domain are using well over a terabyte of memory. This is kinda hard to do on a domain with ~300 GB of RAM and less than 50 GB of swap.

> What is the best method to tie a certain
> percentage of memory to certain zones ? rcapd ??

I *think* that rcapd suffers from the same problem that prstat does and may cause undesirable behavior. Because of the way that it works, I fully expect that if rcapd begins to force pages out, the paging activity for the piggy workload will cause severe performance degradation for everything on the machine. My personal opinion (not backed by extensive testing) is that rcapd is more likely to do harm than good.

If the workload that you are trying to control is Java-based, consider using the various Java flags to limit heap size. This will not protect you against memory leaks in the JVM, but it will protect against a misbehaving app. The same is likely true for the stack size.

If the workload you are trying to control is some other single process, consider using ulimit to limit the stack and heap size.

Set the size= option for all tmpfs file systems.

Bug the folks that are working on memory sets and swap sets to get this code out sooner rather than later.

If running on sun4v, consider LDOMs when they are available (November?).

Mike

--
Mike Gerdts
http://mgerdts.blogspot.com/
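The double counting described in this message can be illustrated with a small Python model. This is only a sketch of the arithmetic, not of how prstat is implemented: the `Process` class, the function names, and the "database" segment are hypothetical, and the model assumes every mapped page is resident.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass(frozen=True)
class Process:
    """Hypothetical model of one process's resident memory."""
    private_bytes: int            # pages mapped by this process only
    shared_segments: frozenset    # names of shared mappings this process has attached


def naive_rss_sum(procs, segment_sizes):
    """Sum per-process RSS, so each shared segment is counted once per process."""
    return sum(
        p.private_bytes + sum(segment_sizes[s] for s in p.shared_segments)
        for p in procs
    )


def deduplicated_usage(procs, segment_sizes):
    """Count private pages per process, but each shared segment only once."""
    private = sum(p.private_bytes for p in procs)
    attached = set().union(*(p.shared_segments for p in procs))
    return private + sum(segment_sizes[s] for s in attached)


# The example from the message: 50 processes mapping the same 1 GB database.
segment_sizes = {"database": 1 * GIB}
procs = [Process(private_bytes=0, shared_segments=frozenset({"database"}))
         for _ in range(50)]

print(naive_rss_sum(procs, segment_sizes) // GIB)       # 50
print(deduplicated_usage(procs, segment_sizes) // GIB)  # 1
```

The gap between the two numbers grows with the number of processes attached to the shared mapping, which is consistent with a per-zone RSS total that exceeds the machine's physical memory.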
Jeff Victor
2006-Aug-12 19:48 UTC
[zfs-discuss] Question on Zones and memory usage (65120349)
Mike Gerdts wrote:
> The %mem column is the sum of the %mem that each process uses.
> Unfortunately, that value seems to include the pages that are shared
> between many processes (e.g. database files, libc, etc.) without
> dividing by the number of processes that have that memory mapped. In
> other words, if you have 50 database processes that have used mmap()
> on the same 1 GB database, prstat will think that 50 GB of RAM is used
> when only 1 GB is really used.

Good observation, Mike. FYI, this is bug 4754856 ( http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=4754856 )

Irma, are the apps in fmtest using a lot of shared memory?

> I *think* that rcapd suffers from the same problem that prstat does
> and may cause undesirable behavior. Because of the way that it works,
> I fully expect that if rcapd begins to force pages out, the paging
> activity for the piggy workload will cause severe performance
> degradation for everything on the machine. My personal opinion (not
> backed by extensive testing) is that rcapd is more likely to do more
> harm than good.

It is possible, though not always practical, to measure the amount of shared pages for a particular zone during normal use, and factor that into the limits you specify to rcapd.

It *is* easier to use rcapd safely with applications that do not use much shared memory.

> Bug the folks that are working on memory sets and swap sets to get
> this code out sooner than later.

We are working very hard on those two feature sets. We have made a great deal of progress, especially on memory sets, which is the higher priority of the two. However, memory sets turned out to be more challenging than first expected.

> If running on sun4v, consider LDOMs when they are available (November?).

LDOMs will avoid the problems described above, at the cost of some flexibility in resource efficiency, the same cost paid by all consolidation solutions that use multiple OS instances. For example, less RAM is used by sparse-root zones because multiple instances of a program (e.g. /bin/ls) share common memory pages. LDOMs (and other multi-OS-instance solutions) cannot do that.
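The idea of factoring measured shared pages into an rcapd limit can be sketched as simple arithmetic in Python. This rests on an unconfirmed assumption taken from the thread: that the accounting counts a shared segment once per attached process. The function name, parameters, and the sample numbers are hypothetical and do not describe rcapd's actual behavior or interface.

```python
GIB = 1024 ** 3


def adjusted_cap(real_budget_bytes, shared_bytes, attached_procs):
    """Return the cap to configure so real usage may reach real_budget_bytes.

    ASSUMPTION: the accounting charges shared_bytes to every attached
    process, while the real budget should include the shared segment once.
    """
    if attached_procs < 0 or shared_bytes < 0 or real_budget_bytes < 0:
        raise ValueError("sizes and process counts must be non-negative")
    if shared_bytes > real_budget_bytes:
        raise ValueError("the shared segment alone exceeds the real budget")
    # The first attachment is already inside the real budget; every further
    # attachment inflates the accounted total by one more copy.
    extra_copies = max(attached_procs - 1, 0)
    return real_budget_bytes + shared_bytes * extra_copies


# Hypothetical zone: 8 GiB real budget, 4 GiB shared segment, 50 processes.
cap = adjusted_cap(8 * GIB, 4 * GIB, 50)
print(cap // GIB)  # 204
```

Under that assumption the configured cap (204 GiB here) bears little resemblance to the real budget (8 GiB) and shifts whenever the process count changes, which suggests why the adjustment is not always practical and why workloads with little shared memory are easier to cap safely.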
Irma Garcia
2006-Aug-15 16:37 UTC
[zfs-discuss] Question on Zones and memory usage (65120349)
Hello All,

Here is the customer's reply:

I guess since the zones we are working with are running/acting as Oracle 10 database servers, the 100% memory usage reported by prstat is not accurate. Also, from the earlier replies it seems that rcapd is not the way to go to segregate memory in zones, and that we should wait for LDOMs, which we cannot do.

Also, I read the following about FSS:

Q: Can I use the Solaris 10 FSS (Fair Share Scheduler) with Oracle in a Solaris Container?
A: There are currently (June 2006) two distinct concerns regarding the use of FSS in a Container when running Oracle databases: In testing - Oracle processes use internal methods to prioritize themselves to improve efficiency. It is possible that these methods might not work well in conjunction with the Solaris FSS. Although there are no known problems with non-RAC configurations, Sun and Oracle are testing this type of configuration to discover any negative interactions. This testing should be completed soon.

I am still not sure what to do to pin a certain amount of memory to my production Oracle server zone.

--
Irma Garcia
Technical Support Engineer
Phone:303-272-6420
irma.garcia at sun.com
Submit/View/Update Cases at: http://www.sun.com/service/online
Irma Garcia
2006-Aug-23 19:10 UTC
[zfs-discuss] Question on Zones and memory usage (65120349)
Hi all,

The customer has another question. I'm resending:

<snip>
I guess since the zones we are working with are running/acting as Oracle 10 database servers, the 100% memory usage reported by prstat is not accurate. Also, from the earlier replies it seems that rcapd is not the way to go to segregate memory in zones, and that we should wait for LDOMs, which we cannot do.

Also, I read the following about FSS:

Q: Can I use the Solaris 10 FSS (Fair Share Scheduler) with Oracle in a Solaris Container?
A: There are currently (June 2006) two distinct concerns regarding the use of FSS in a Container when running Oracle databases: In testing - Oracle processes use internal methods to prioritize themselves to improve efficiency. It is possible that these methods might not work well in conjunction with the Solaris FSS. Although there are no known problems with non-RAC configurations, Sun and Oracle are testing this type of configuration to discover any negative interactions. This testing should be completed soon.

I am still not sure what to do to pin a certain amount of memory to my production Oracle server zone.
<snip>