Glen Pitt-Pladdy :: Blog
I seem to spend a lot of time using VMWare in one form or another, or supporting others using it. Currently I directly or indirectly look after 3 VMWare Server installs and 5 VMWare Workstation or Fusion installs.
Business at the speed of treacle
My current machine at the office is a quad core Intel with 4GB of memory (developers get 8GB as they are running many VMs and building on them) running Ubuntu Hardy. I use VMWare for running Windows and other VMs for testing various things (like bare-metal restores from backup - which is one of the things I have been doing today in the background while doing a load of other work).
A few days ago, while running an intensive long-term process under the Windows VM, I got frustrated at the speed of both the VM and the host - there were times where I could hardly work. After some investigation, I discovered that the hard drive on the host was thrashing even when the VM wasn't accessing disk, and traced it down to a .vmem file in the VM directory. My understanding is that this is a page file for the VM and mirrors the VM memory - though I'm not sure why it is needed (if at all).
Less is more
After some googling, I discovered a magic bit of config for all Unix-like (Linux / OSX) VMWare systems to disable the .vmem file. This may have some unpleasant side-effects, so do it at your own risk, but so far I am doing fine.
To do this you need to edit (with your favourite editor - vim in my case), or create if it does not exist, the VMWare config file, and add a single line to it:
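The setting commonly used for this is VMWare's mainMem.useNamedFile option; a sketch, assuming the host-wide config file (the same line can also go in an individual VM's .vmx file to change just that VM):

```
# /etc/vmware/config on the host (or a single VM's .vmx file)
mainMem.useNamedFile = "FALSE"
```

With this set, VMWare backs guest memory from ordinary host memory rather than the named .vmem file in the VM directory.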
All you need to do now is shut down the VM and then boot it up again. This time it should start without the .vmem file.
My experience is that this now runs much faster as it does not hit the disk all the time keeping the page file up to date. On servers this seems to also keep them smoother when running a bunch of VMs simultaneously.
A quick test (booting the Ubuntu Hardy Desktop LiveCD in a VM with 768M of memory, then loading Firefox - measured from when the "Loading" message appears to Firefox having rendered the default page): default config (with vmem file) - 118 seconds; the above tweaked config (without vmem file) - 93 seconds, roughly a 21% improvement. Before people start talking about this being disk cache related - I had already booted it a few times before running the tests, so everything should already have been cached.
The other thing that I do with the developers' PCs is to stripe (RAID level 0) the drives (3 of 'em) that their VMs are on. This does increase the chance of disk failure (roughly by the number of disks in the array), but can provide big increases in speed.
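To put a rough number on that risk: RAID 0 has no redundancy, so any single disk failure loses the whole array, and with independent failures the risk scales roughly with the disk count. A quick illustration, using an assumed (purely illustrative) 2% annual per-disk failure rate:

```python
# Probability a RAID-0 array loses data, given each disk independently
# fails with probability p. RAID 0 has no redundancy: one failure is fatal.
def raid0_failure_prob(p, disks=3):
    # The array survives only if every disk survives.
    return 1 - (1 - p) ** disks

p = 0.02  # assumed 2% annual failure rate per disk (illustrative only)
print(round(raid0_failure_prob(p, disks=3), 4))  # ~0.0588, about 3x one disk
```

For small p the result is close to disks * p, which is the "by the number of disks in the array" rule of thumb above.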
One thing I have found is that the way the stripes are done matters a lot. I think the main bottleneck is seek time rather than raw IO throughput. To improve this, I use large stripes so that the chances are the 3 disks will be seeking at different times. This means lower raw throughput, but better performance when accessing lots of files (like when running a lot of VMs). When I did one machine with default (64k) stripes and 2 disks, there were performance issues.
I am a big fan of the Linux RAID & LVM system, which in my experience has been very robust and performs far better than any built-in hardware solution - in fact I have measured more than double the performance of an LSI hardware controller on a server with the same physical disks, and far better still against older controllers.
To create the array:
mdadm --create <MD device> --chunk=512 --level=0 --raid-devices=3 <partitions>
This uses 512k stripes, which seems to work quite nicely for me. I use JFS on them, and add the "noatime" option in /etc/fstab.
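For reference, the follow-on steps, using assumed names (array /dev/md0, mount point /vm - both hypothetical) would be to format with mkfs.jfs /dev/md0 and then mount with noatime via an /etc/fstab entry along these lines:

```
# hypothetical /etc/fstab line for the striped VM volume
/dev/md0    /vm    jfs    noatime    0    2
```

The noatime option stops every file read from triggering an access-time write back to disk, which saves a lot of small writes when VMs are constantly reading their disk images.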
The overall result seems quite respectable - none of the developers running this config are complaining which has to be a good sign! :)
Copyright Glen Pitt-Pladdy 2008-2013