Curious if anyone has an understanding of what actually goes on with VMware memory control of a FreeBSD 10 guest when open-vm-tools is installed, and how it could affect performance.

Our typical customer environment is a largish VMware server with an appropriate amount of RAM allocated to the guest, which currently runs FreeBSD 10.0p7 + our software, UFS root, and data stored on a ZFS partition. Our software mmaps large database files, does rather largish data collection (ping, snmp, netflow, syslog, etc.) and mostly cruises along, but performance drops off a cliff in low memory situations.

We don't install open-vm-tools at the moment, so we have a known amount of memory to work with (i.e. what the customer initially configured the guest for), but our customers (or in particular, their VM guys) would really like vmware tools or open-vm-tools by default.

From what we gather, many sites choose to "over provision" the memory in their VM setups, and when memory gets low, the host takes back some of the RAM allocated to the guest.

How does this actually work? Does it only take back what FreeBSD considers to be "free" memory, or can the host start taking back "inactive", "wired", or "zfs arc" memory? We tend to rely on stuff being in inactive and the ZFS ARC. If we start swapping, we are dead.

Also, is there much of a performance hit if the host steals back free memory and then gives it back? We'd assume all memory the host gives to the guest is pre-bzero'ed, so FreeBSD wouldn't need to bzero it again.

Paul.

--
Paul Koch | Founder, CEO
AKIPS Network Monitor   http://www.akips.com
Brisbane, Australia
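A minimal sketch of one way to read the memory classes the question refers to from inside a FreeBSD guest, using sysctlbyname(3); the ARC OID is only present when ZFS is loaded, and the program name is just illustrative:

    /* memclasses.c -- print free/inactive/wired sizes and ZFS ARC size.
     * Build: cc -o memclasses memclasses.c
     */
    #include <sys/types.h>
    #include <sys/sysctl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Convert a page-count sysctl into bytes; returns 0 if the OID is missing. */
    static uint64_t class_bytes(const char *oid)
    {
        u_int count = 0;
        size_t len = sizeof(count);
        if (sysctlbyname(oid, &count, &len, NULL, 0) == -1)
            return 0;
        return (uint64_t)count * (uint64_t)getpagesize();
    }

    int main(void)
    {
        uint64_t arc = 0;
        size_t len = sizeof(arc);

        printf("free:     %ju MB\n", (uintmax_t)(class_bytes("vm.stats.vm.v_free_count") >> 20));
        printf("inactive: %ju MB\n", (uintmax_t)(class_bytes("vm.stats.vm.v_inactive_count") >> 20));
        printf("wired:    %ju MB\n", (uintmax_t)(class_bytes("vm.stats.vm.v_wire_count") >> 20));

        /* Only present when ZFS is loaded. */
        if (sysctlbyname("kstat.zfs.misc.arcstats.size", &arc, &len, NULL, 0) == 0)
            printf("zfs arc:  %ju MB\n", (uintmax_t)(arc >> 20));
        return 0;
    }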
Hi,

From what I understand of the VMware tools, they add a kernel module that communicates with the host. When the host is under memory pressure, it reclaims some of the memory used by each VM by asking the kernel module to grab RAM. The RAM grabbed this way is "reserved" by the kernel module and can then be used by the host for another VM. This mechanism increases the memory pressure inside your VM, which can lead to some swapping, or to the freeing of otherwise less used memory pages in the OS. This cooperative mode of sharing the memory pressure experienced by the hypervisor is called "ballooning" in VMware terms. The kernel module responsible for implementing the VM side of this is called vmmemctl.ko.

If the memory requirements cannot be met using this "ballooning" technique (or if none of the VMs have the vmware tools enabled), you will start to see swapping at the host level, which will be much worse than swapping at the VM level. This is the main reason why you should run the vmware tools.

Regards,
Patrick.

On 26/08/2014 09:16, Paul Koch wrote:
> How does this actually work? Does it only take back what
> FreeBSD considers to be "free" memory, or can the host start taking
> back "inactive", "wired", or "zfs arc" memory? We tend to rely on
> stuff being in inactive and the ZFS ARC. If we start swapping, we
> are dead.
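To get a feel for how the guest workload behaves under balloon-style pressure, a rough sketch that mimics the balloon driver by wiring a chunk of anonymous memory. This is only an approximation of what vmmemctl.ko does, and it needs root or a suitable RLIMIT_MEMLOCK to run:

    /* balloon-sim.c -- crude stand-in for balloon pressure: wire N MB of RAM
     * and hold it, so you can watch how the rest of the guest reacts.
     * Build: cc -o balloon-sim balloon-sim.c
     * Usage: ./balloon-sim 512     (wires 512 MB until interrupted)
     */
    #include <sys/mman.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        size_t mb = (argc > 1) ? strtoul(argv[1], NULL, 10) : 256;
        size_t len = mb << 20;

        /* Anonymous mapping, then touch and wire it so the pages cannot be
         * reclaimed -- roughly what a balloon driver does to guest RAM. */
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_ANON | MAP_PRIVATE, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        memset(p, 0xa5, len);           /* force page allocation */
        if (mlock(p, len) == -1) {      /* wire it; needs privilege/limits */
            perror("mlock");
            return 1;
        }
        printf("holding %zu MB wired; press ^C to release\n", mb);
        pause();
        return 0;
    }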
On 26/08/2014 09:16, Paul Koch wrote:
> How does this actually work? Does it only take back what
> FreeBSD considers to be "free" memory, or can the host start taking
> back "inactive", "wired", or "zfs arc" memory? We tend to rely on
> stuff being in inactive and the ZFS ARC. If we start swapping, we
> are dead.

Under memory pressure, VMware's ballooning will cause FreeBSD's internal "memory low" triggers to fire, which will release ARC memory, which will probably degrade your performance. But from what I've seen, for some reason, it's pretty hard to actually catch the VMware host activating ballooning, at least on FreeBSD servers. I've been using this combination for years and I only saw it once, for a trivial amount of memory. It's probably a last-resort measure.

Also, VMware will manage guest memory even without any guest software at all. It keeps track of recently active memory pages and may swap the unused ones out.

FWIW, I think ZFS's crazy memory footprint makes it unsuitable for VMs (or actually most non-file-server workloads...), but I'm sure most people here will not agree with me :D If you have the opportunity to try it out in production, just run a regular UFS2+SU in your VM for a couple of days and see the difference.
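One way to check, from inside the guest, how much of a given mmap'ed file is actually resident at any moment is mincore(2); a small sketch, with the file path supplied on the command line as a placeholder:

    /* residency.c -- report what fraction of an mmap'ed file is resident.
     * Build: cc -o residency residency.c
     * Usage: ./residency /path/to/database.file
     */
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s file\n", argv[0]);
            return 1;
        }

        int fd = open(argv[1], O_RDONLY);
        if (fd == -1) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) == -1) { perror("fstat"); return 1; }

        void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        size_t pagesz = (size_t)getpagesize();
        size_t npages = ((size_t)st.st_size + pagesz - 1) / pagesz;
        char *vec = malloc(npages);
        if (vec == NULL) { perror("malloc"); return 1; }

        /* mincore(2) fills one byte per page; MINCORE_INCORE means resident. */
        if (mincore(p, (size_t)st.st_size, vec) == -1) {
            perror("mincore");
            return 1;
        }

        size_t resident = 0;
        for (size_t i = 0; i < npages; i++)
            if (vec[i] & MINCORE_INCORE)
                resident++;

        printf("%zu of %zu pages resident (%.1f%%)\n",
               resident, npages, 100.0 * resident / npages);
        return 0;
    }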
On Fri, 29 Aug 2014 15:18:32 +0200, Ivan Voras <ivoras at freebsd.org> wrote:
> On 26/08/2014 09:16, Paul Koch wrote:
> > How does this actually work? Does it only take back what
> > FreeBSD considers to be "free" memory, or can the host start taking
> > back "inactive", "wired", or "zfs arc" memory? We tend to rely on
> > stuff being in inactive and the ZFS ARC. If we start swapping, we
> > are dead.
>
> Under memory pressure, VMware's ballooning will cause FreeBSD's internal
> "memory low" triggers to fire, which will release ARC memory, which will
> probably degrade your performance. But from what I've seen, for some
> reason, it's pretty hard to actually catch the VMware host activating
> ballooning, at least on FreeBSD servers. I've been using this combination
> for years and I only saw it once, for a trivial amount of memory. It's
> probably a last-resort measure.

Yer, releasing ARC memory would be tragic, because it would already contain useful data for us, and going back to disk/SAN would be a hit. We do set limits on the ARC size at install time because it appears to be very "aggressive" at consuming memory.

We also constantly monitor/graph memory usage, so the customer can get some idea of what is happening on their FreeBSD VM, e.g.:

  http://www.akips.com/gz/downloads/sys-graph.html
  http://www.akips.com/gz/downloads/poller-graph.html

On that machine, ARC has been limited to ~2G, and it appears to always hover around there. If ballooning was turned on and memory was tight enough to cause the ARC to drop, at least they'd be able to go back in time and see that something tragic happened.

> Also, VMware will manage guest memory even without any guest software at
> all. It keeps track of recently active memory pages and may swap the
> unused ones out.

In computing time, how long is "recently"? We have very few running processes, and a handful of largish mmap'ed files. Most of the mmap'ed files are read ~40 times a second, so we'd assume they are always "recently" active. Our largest mmap'ed file is only written to once a minute, with every polled statistic. Every memory page gets updated, but once a minute may not count as "recently" in computing time. If ballooning caused paging out of that mmap'ed file, we'd be toast.

> FWIW, I think ZFS's crazy memory footprint makes it unsuitable for VMs
> (or actually most non-file-server workloads...), but I'm sure most
> people here will not agree with me :D If you have the opportunity to try
> it out in production, just run a regular UFS2+SU in your VM for a couple
> of days and see the difference.

We actually started out with UFS2+SU on our data partition, but wanted a "one size fits all" FreeBSD install configuration that would work OK on bare metal and in a VM. We have zero control over the platform the customer uses, ranging from a throw-away old desktop PC to high-end dedicated bare metal, or a VM in the data centre. Since we are mostly CPU bound, ZFS doesn't appear to be a performance problem for us in a VM.

On a side note, one of the reasons we switched to ZFS is because we "thought" we had a data corruption problem with UFS2 when shutting down. It took a while to discover what we were doing wrong. Doh!!

At shutdown, running on physical hardware or in a VM, we'd get to "All buffers synced" and the machine would hang for ages before powering off or rebooting. When it came back up, the file system was dirty and hadn't been unmounted properly. Googling for 'all buffers synced' came up with various issues related to USB. But what was actually happening was this:
We have largish mmap'ed files (e.g. 2G), which we mmap with the MAP_NOSYNC flag. The memory pages are being written to constantly, and we fsync() them every 600 seconds so we can control when the disk write occurs. It appears the fsync writes out the entire mmap'ed file sequentially, because a quick calculation on the file size and raw disk write speed generally matches.

But at shutdown, we were forgetting to do a final fsync on those big files, which meant the OS had to write them out itself. That doesn't appear to happen until after the "All buffers synced" message though. On real hardware it just looks like the machine has hung, but we did notice the disk LED hard on. Running in a VirtualBox VM, we ran gstat/systat on the FreeBSD host at shutdown, which showed the disk stuck at 100% busy for ages and ages after the "All buffers synced" message. It was taking so long that the VM was being killed ungracefully by the shutdown scripts.

We use MAP_NOSYNC because, without it, the default syncing behaviour on large mmap'ed files sucks, and it seems the shutdown behaviour is similar or much worse. The problem on physical hardware was that there were no obvious messages about what the machine was doing after the "All buffers synced" message! Now we just do an fsync(1) of every mmap'ed file in our shutdown script, and the machine shuts down clean and fast.

Paul.

--
Paul Koch | Founder, CEO
AKIPS Network Monitor   http://www.akips.com
Brisbane, Australia
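A minimal sketch of the MAP_NOSYNC plus periodic-flush pattern described above; the file name, size, intervals and the dummy workload are illustrative values, not the actual AKIPS code:

    /* nosync-demo.c -- mmap a file with MAP_NOSYNC, dirty it, and flush it
     * on our own schedule plus once more before exiting, so the kernel is
     * never left to write the whole thing out at shutdown.
     * Build: cc -o nosync-demo nosync-demo.c
     */
    #include <sys/mman.h>
    #include <fcntl.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>

    #define DB_PATH   "demo.db"        /* illustrative path */
    #define DB_SIZE   (64UL << 20)     /* 64 MB for the demo; real files are ~2G */
    #define SYNC_SECS 600              /* flush interval used in the post */

    static volatile sig_atomic_t stop;
    static void on_term(int sig) { (void)sig; stop = 1; }

    int main(void)
    {
        signal(SIGINT, on_term);
        signal(SIGTERM, on_term);

        int fd = open(DB_PATH, O_RDWR | O_CREAT, 0644);
        if (fd == -1) { perror("open"); return 1; }
        if (ftruncate(fd, (off_t)DB_SIZE) == -1) { perror("ftruncate"); return 1; }

        /* MAP_NOSYNC: don't let the syncer trickle dirty pages out; we decide
         * when they hit the disk. */
        char *db = mmap(NULL, DB_SIZE, PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_NOSYNC, fd, 0);
        if (db == MAP_FAILED) { perror("mmap"); return 1; }

        time_t last_sync = time(NULL);
        while (!stop) {
            /* stand-in for the real workload: dirty a random page */
            db[(size_t)rand() % DB_SIZE] = (char)rand();

            if (time(NULL) - last_sync >= SYNC_SECS) {
                if (fsync(fd) == -1)    /* or msync(db, DB_SIZE, MS_SYNC) */
                    perror("fsync");
                last_sync = time(NULL);
            }
            usleep(25000);              /* ~40 touches a second */
        }

        /* The final flush is the part that was originally forgotten: without
         * it the kernel writes everything out after "All buffers synced". */
        fsync(fd);
        munmap(db, DB_SIZE);
        close(fd);
        return 0;
    }

Whether fsync(2) or msync(2) is the more appropriate flush call for MAP_NOSYNC mappings is worth checking against mmap(2) on the FreeBSD version in use; the post above reports that fsync works.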