Curious if anyone has an understanding of what actually goes on with VMware memory control of a FreeBSD 10 guest when open-vm-tools is installed, and how it could affect performance.

Our typical customer environment is a largish VMware server with an appropriate amount of RAM allocated to the guest, which currently runs FreeBSD 10.0p7 + our software, UFS root, and data stored on a ZFS partition. Our software mmaps large database files, does rather largish data collection (ping, snmp, netflow, syslog, etc.) and mostly cruises along, but performance drops off a cliff in low memory situations.

We don't install open-vm-tools at the moment, so we have a known amount of memory to work with (i.e. what the customer initially configured the guest for), but our customers (or in particular, their VM guys) would really like vmware tools or open-vm-tools by default.

From what we gather, many sites choose to "over provision" the memory in their VM setups, and when memory gets low, the host takes back some of the RAM allocated to the guest.

How does this actually work? Does it only take back what FreeBSD considers to be "free" memory, or can the host start taking back "inactive", "wired", or "zfs arc" memory? We tend to rely on stuff being in inactive and the ZFS ARC. If we start swapping, we are dead.

Also, is there much of a performance hit if the host steals back free memory and then gives it back? We'd assume all memory the host gives to the guest is pre-bzero'ed, so FreeBSD wouldn't need to bzero it again.

Paul.

--
Paul Koch | Founder, CEO
AKIPS Network Monitor   http://www.akips.com
Brisbane, Australia
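A minimal sketch of one way to read the memory classes the question refers to from inside a FreeBSD guest, using sysctlbyname(3); the ARC OID is only present when ZFS is loaded, and the program name is just illustrative:

    /* memclasses.c -- print free/inactive/wired sizes and ZFS ARC size.
     * Build: cc -o memclasses memclasses.c
     */
    #include <sys/types.h>
    #include <sys/sysctl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Convert a page-count sysctl into bytes; returns 0 if the OID is missing. */
    static uint64_t class_bytes(const char *oid)
    {
        u_int count = 0;
        size_t len = sizeof(count);
        if (sysctlbyname(oid, &count, &len, NULL, 0) == -1)
            return 0;
        return (uint64_t)count * (uint64_t)getpagesize();
    }

    int main(void)
    {
        uint64_t arc = 0;
        size_t len = sizeof(arc);

        printf("free:     %ju MB\n", (uintmax_t)(class_bytes("vm.stats.vm.v_free_count") >> 20));
        printf("inactive: %ju MB\n", (uintmax_t)(class_bytes("vm.stats.vm.v_inactive_count") >> 20));
        printf("wired:    %ju MB\n", (uintmax_t)(class_bytes("vm.stats.vm.v_wire_count") >> 20));

        /* Only present when ZFS is loaded. */
        if (sysctlbyname("kstat.zfs.misc.arcstats.size", &arc, &len, NULL, 0) == 0)
            printf("zfs arc:  %ju MB\n", (uintmax_t)(arc >> 20));
        return 0;
    }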
Hi,

From what I understand of the VMware tools, they add a kernel module that communicates with the host. When the host is under memory pressure, it reclaims some of the memory used by each VM by asking the kernel module to grab RAM. The RAM grabbed this way is "reserved" by the kernel module and can then be used by the host for another VM. This mechanism increases the memory pressure inside your VM, which can lead to some swapping, or to the freeing of otherwise less used memory pages in the OS. This cooperative mode of sharing the memory pressure experienced by the hypervisor is called "ballooning" in VMware terms. The kernel module responsible for implementing the VM side of this is called vmmemctl.ko.

If the memory requirements cannot be met using this "ballooning" technique (or if none of the VMs have the vmware tools enabled), you will start to see swapping at the host level, which will be much worse than swapping at the VM level. This is the main reason why you should run the vmware tools.

Regards,
Patrick.

On 26/08/2014 09:16, Paul Koch wrote:
> How does this actually work? Does it only take back what
> FreeBSD considers to be "free" memory, or can the host start taking
> back "inactive", "wired", or "zfs arc" memory? We tend to rely on
> stuff being in inactive and the ZFS ARC. If we start swapping, we
> are dead.
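To get a feel for how the guest workload behaves under balloon-style pressure, a rough sketch that mimics the balloon driver by wiring a chunk of anonymous memory. This is only an approximation of what vmmemctl.ko does, and it needs root or a suitable RLIMIT_MEMLOCK to run:

    /* balloon-sim.c -- crude stand-in for balloon pressure: wire N MB of RAM
     * and hold it, so you can watch how the rest of the guest reacts.
     * Build: cc -o balloon-sim balloon-sim.c
     * Usage: ./balloon-sim 512     (wires 512 MB until interrupted)
     */
    #include <sys/mman.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        size_t mb = (argc > 1) ? strtoul(argv[1], NULL, 10) : 256;
        size_t len = mb << 20;

        /* Anonymous mapping, then touch and wire it so the pages cannot be
         * reclaimed -- roughly what a balloon driver does to guest RAM. */
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_ANON | MAP_PRIVATE, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        memset(p, 0xa5, len);           /* force page allocation */
        if (mlock(p, len) == -1) {      /* wire it; needs privilege/limits */
            perror("mlock");
            return 1;
        }
        printf("holding %zu MB wired; press ^C to release\n", mb);
        pause();
        return 0;
    }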
On 26/08/2014 09:16, Paul Koch wrote:
> How does this actually work? Does it only take back what
> FreeBSD considers to be "free" memory, or can the host start taking
> back "inactive", "wired", or "zfs arc" memory? We tend to rely on
> stuff being in inactive and the ZFS ARC. If we start swapping, we
> are dead.

Under memory pressure, VMware's ballooning will cause FreeBSD's internal "memory low" triggers to fire, which will release ARC memory, which will probably degrade your performance. But from what I've seen, for some reason, it's pretty hard to actually catch the VMware host activating ballooning, at least on FreeBSD servers. I've been using this combination for years and I only saw it once, for a trivial amount of memory. It's probably a last-resort measure.

Also, VMware will manage guest memory even without any guest software at all. It keeps track of recently active memory pages and may swap the unused ones out.

FWIW, I think ZFS's crazy memory footprint makes it unsuitable for VMs (or actually most non-file-server workloads...), but I'm sure most people here will not agree with me :D If you have the opportunity to try it out in production, just run a regular UFS2+SU in your VM for a couple of days and see the difference.
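One way to check, from inside the guest, how much of a given mmap'ed file is actually resident at any moment is mincore(2); a small sketch, with the file path supplied on the command line as a placeholder:

    /* residency.c -- report what fraction of an mmap'ed file is resident.
     * Build: cc -o residency residency.c
     * Usage: ./residency /path/to/database.file
     */
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s file\n", argv[0]);
            return 1;
        }

        int fd = open(argv[1], O_RDONLY);
        if (fd == -1) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) == -1) { perror("fstat"); return 1; }

        void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        size_t pagesz = (size_t)getpagesize();
        size_t npages = ((size_t)st.st_size + pagesz - 1) / pagesz;
        char *vec = malloc(npages);
        if (vec == NULL) { perror("malloc"); return 1; }

        /* mincore(2) fills one byte per page; MINCORE_INCORE means resident. */
        if (mincore(p, (size_t)st.st_size, vec) == -1) {
            perror("mincore");
            return 1;
        }

        size_t resident = 0;
        for (size_t i = 0; i < npages; i++)
            if (vec[i] & MINCORE_INCORE)
                resident++;

        printf("%zu of %zu pages resident (%.1f%%)\n",
               resident, npages, 100.0 * resident / npages);
        return 0;
    }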
On Fri, 29 Aug 2014 15:18:32 +0200, Ivan Voras <ivoras at freebsd.org> wrote:
> On 26/08/2014 09:16, Paul Koch wrote:
> > How does this actually work? Does it only take back what
> > FreeBSD considers to be "free" memory, or can the host start taking
> > back "inactive", "wired", or "zfs arc" memory? We tend to rely on
> > stuff being in inactive and the ZFS ARC. If we start swapping, we
> > are dead.
>
> Under memory pressure, VMware's ballooning will cause FreeBSD's internal
> "memory low" triggers to fire, which will release ARC memory, which will
> probably degrade your performance. But from what I've seen, for some
> reason, it's pretty hard to actually catch the VMware host activating
> ballooning, at least on FreeBSD servers. I've been using this combination
> for years and I only saw it once, for a trivial amount of memory. It's
> probably a last-resort measure.

Yer, releasing ARC memory would be tragic, because it would already contain useful data for us, and going back to disk/SAN would be a hit. We do set limits on the ARC size at install time because it appears to be very "aggressive" at consuming memory.

We also constantly monitor/graph memory usage, so the customer can get some idea of what is happening on their FreeBSD VM, e.g.:

  http://www.akips.com/gz/downloads/sys-graph.html
  http://www.akips.com/gz/downloads/poller-graph.html

On that machine, ARC has been limited to ~2G, and it appears to always hover around there. If ballooning was turned on and memory was tight enough to cause the ARC to drop, at least they'd be able to go back in time and see that something tragic happened.

> Also, VMware will manage guest memory even without any guest software at
> all. It keeps track of recently active memory pages and may swap the
> unused ones out.

In computing time, how long is "recently"? We have very few running processes, and a handful of largish mmap'ed files. Most of the mmap'ed files are read ~40 times a second, so we'd assume they are always "recently" active. Our largest mmap'ed file is only written to once a minute, with every polled statistic. Every memory page gets updated, but once a minute may not count as "recently" in computing time. If ballooning caused paging out of that mmap'ed file, we'd be toast.

> FWIW, I think ZFS's crazy memory footprint makes it unsuitable for VMs
> (or actually most non-file-server workloads...), but I'm sure most
> people here will not agree with me :D If you have the opportunity to try
> it out in production, just run a regular UFS2+SU in your VM for a couple
> of days and see the difference.

We actually started out with UFS2+SU on our data partition, but wanted a "one size fits all" FreeBSD install configuration that would work OK on bare metal and in a VM. We have zero control over the platform the customer uses, ranging from a throw-away old desktop PC to high-end dedicated bare metal, or a VM in the data centre. Since we are mostly CPU bound, ZFS doesn't appear to be a performance problem for us in a VM.

On a side note, one of the reasons we switched to ZFS is because we "thought" we had a data corruption problem with UFS2 when shutting down. It took a while to discover what we were doing wrong. Doh!!

At shutdown, running on physical hardware or in a VM, we'd get to "All buffers synced" and the machine would hang for ages before powering off or rebooting. When it came back up, the file system was dirty and hadn't been unmounted properly. Googling for 'all buffers synced' came up with various issues related to USB. But what was actually happening was this:
We have largish mmap'ed files (e.g. 2G), which we mmap with the MAP_NOSYNC flag. The memory pages are being written to constantly, and we fsync() them every 600 seconds so we can control when the disk write occurs. It appears the fsync writes out the entire mmap'ed file sequentially, because a quick calculation on the file size and raw disk write speed generally matches.

But at shutdown, we were forgetting to do a final fsync on those big files, which meant the OS had to write them out itself. That doesn't appear to happen until after the "All buffers synced" message though. On real hardware it just looks like the machine has hung, but we did notice the disk LED hard on. Running in a VirtualBox VM, we ran gstat/systat on the FreeBSD host at shutdown, which showed the disk stuck at 100% busy for ages and ages after the "All buffers synced" message. It was taking so long that the VM was being killed ungracefully by the shutdown scripts.

We use MAP_NOSYNC because, without it, the default syncing behaviour on large mmap'ed files sucks, and it seems the shutdown behaviour is similar or much worse. The problem on physical hardware was that there were no obvious messages about what the machine was doing after the "All buffers synced" message! Now we just do an fsync(1) of every mmap'ed file in our shutdown script, and the machine shuts down clean and fast.

Paul.

--
Paul Koch | Founder, CEO
AKIPS Network Monitor   http://www.akips.com
Brisbane, Australia
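A minimal sketch of the MAP_NOSYNC plus periodic-flush pattern described above; the file name, size, intervals and the dummy workload are illustrative values, not the actual AKIPS code:

    /* nosync-demo.c -- mmap a file with MAP_NOSYNC, dirty it, and flush it
     * on our own schedule plus once more before exiting, so the kernel is
     * never left to write the whole thing out at shutdown.
     * Build: cc -o nosync-demo nosync-demo.c
     */
    #include <sys/mman.h>
    #include <fcntl.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>

    #define DB_PATH   "demo.db"        /* illustrative path */
    #define DB_SIZE   (64UL << 20)     /* 64 MB for the demo; real files are ~2G */
    #define SYNC_SECS 600              /* flush interval used in the post */

    static volatile sig_atomic_t stop;
    static void on_term(int sig) { (void)sig; stop = 1; }

    int main(void)
    {
        signal(SIGINT, on_term);
        signal(SIGTERM, on_term);

        int fd = open(DB_PATH, O_RDWR | O_CREAT, 0644);
        if (fd == -1) { perror("open"); return 1; }
        if (ftruncate(fd, (off_t)DB_SIZE) == -1) { perror("ftruncate"); return 1; }

        /* MAP_NOSYNC: don't let the syncer trickle dirty pages out; we decide
         * when they hit the disk. */
        char *db = mmap(NULL, DB_SIZE, PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_NOSYNC, fd, 0);
        if (db == MAP_FAILED) { perror("mmap"); return 1; }

        time_t last_sync = time(NULL);
        while (!stop) {
            /* stand-in for the real workload: dirty a random page */
            db[(size_t)rand() % DB_SIZE] = (char)rand();

            if (time(NULL) - last_sync >= SYNC_SECS) {
                if (fsync(fd) == -1)    /* or msync(db, DB_SIZE, MS_SYNC) */
                    perror("fsync");
                last_sync = time(NULL);
            }
            usleep(25000);              /* ~40 touches a second */
        }

        /* The final flush is the part that was originally forgotten: without
         * it the kernel writes everything out after "All buffers synced". */
        fsync(fd);
        munmap(db, DB_SIZE);
        close(fd);
        return 0;
    }

Whether fsync(2) or msync(2) is the more appropriate flush call for MAP_NOSYNC mappings is worth checking against mmap(2) on the FreeBSD version in use; the post above reports that fsync works.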