Two of my colleagues and I went on a day trip to this year’s IPEXPO at Earls Court in London, where we were shown a demo of esXpress at PHD Virtual Technology’s booth. We were very impressed by the presentation and by some of the features of esXpress. We were especially interested in the File Level Recovery features of their data de-duplication appliance.
When I got back home from IPEXPO on Thursday, I decided to test the product for myself. I downloaded a 30-day evaluation from http://www.phdvirtual.com/ and have been putting PHD Virtual esXpress 3.6 through some vigorous testing for the past few days. To start with, I found esXpress easy to deploy and was quite impressed with it. I must confess, though, that once I really started to dig into how esXpress goes about its business, I began to have some concerns. I have decided not to make any of my concerns public until I have had a good chat with some people at PHD Virtual, as I think they will have an answer to most of my questions.
I’ve been trying to install VMware vCenter Server on Windows Server 2008 R2 Enterprise Edition, because I am working on a few blog articles on protecting vCenter Server against hardware failures. At the moment, I’m busy working on two blog posts:
1. Protecting vCenter with VMware vCenter Server Heartbeat;
2. Protecting vCenter with Microsoft Cluster Services (MSCS).
Whilst trying to install vCenter on Windows Server 2008 R2, I ran into some issues I had to resolve before I could do anything useful.
In ESX 3.5, when trying to create a new snapshot, the following error is reported in the VI Client:
An invalid snapshot configuration was detected
When reading the virtual machine log file (vmware.log), which sits in the same location as the VM, you may find references to delta files such as "vm_name-000001.vmdk", but when browsing the datastore, the delta file does not exist. Also, running vmware-cmd hassnapshot returns "hassnapshot ()= ", which means that the VM has no snapshots in place.
If this is truly the case, why is "An invalid snapshot configuration was detected" returned when trying to create a new snapshot?
VMware records snapshot information about the current VM in a .vmsd file. This file is located with the rest of the virtual machine configuration and VMDK files on the datastore, and is normally named after the VM, with a .vmsd extension. The file will contain information even if your VM has no snapshots in place.
It is possible that, in some cases, ESX fails to clean up properly after previous snapshots were removed. The information for previous snapshots may still be recorded in the .vmsd file. The file may indicate that you still have snapshots in place, although all previous snapshots were removed and the delta files have been merged. When you then try to create a snapshot, the .vmsd file tells the ESX host that there is already a delta file in place and that it has to create a second or third delta file. When the ESX host interrogates the VMFS file system, it is unable to find the snapshot delta files specified in the .vmsd file and therefore fails with "An invalid snapshot configuration was detected".
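The mismatch described above can be checked from the service console. The following is only a rough sketch: the function name find_missing_snapshot_disks is made up for illustration, and it assumes that the .vmsd file records disk references on lines of the form snapshot0.disk0.fileName = "vm_name-000001.vmdk" with file names relative to the VM's directory. It prints every disk file that the .vmsd file names but that is not present on the datastore.

```shell
#!/bin/sh
# Sketch only: list disk files that a .vmsd file refers to but that are
# missing from the VM's directory.
# Assumption: disk references look like
#   snapshot0.disk0.fileName = "vm_name-000001.vmdk"
# and the file names are relative to the VM directory.
find_missing_snapshot_disks() {
    vm_dir=$1
    for vmsd in "$vm_dir"/*.vmsd; do
        # Skip the unexpanded pattern when the directory has no .vmsd file.
        [ -f "$vmsd" ] || continue
        # Extract the quoted value from every fileName line.
        sed -n 's/^snapshot[0-9]*\.disk[0-9]*\.fileName *= *"\(.*\)"$/\1/p' "$vmsd" |
        while IFS= read -r disk; do
            # Report only the files that do not exist on the datastore.
            [ -e "$vm_dir/$disk" ] || echo "$disk"
        done
    done
}
```

Run it against the VM's folder, for example find_missing_snapshot_disks /vmfs/volumes/<datastore>/<vm_name>. Any file name it prints is one that the .vmsd file still expects but the datastore no longer holds.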
Today I was planning on testing the 16 new patches released by VMware for vSphere 4. I wanted to apply these to my second ESX host. I normally place my ESX hosts in maintenance mode before I remediate updates. As I placed esx2 in maintenance mode, the VMs, as expected, started to migrate over to the other hosts in the cluster with VMotion. The VMotion migration of two of my VMs running Windows XP failed with the following error message:
A general system error occurred: Failed to write checkpoint data (offset 33558328, size 16384): Limit exceeded
It turns out that a VM must have less than 30MB of video RAM (VRAM) assigned in order to be compatible with VMotion. As I normally run these two VMs at 1680 x 1050 resolution, I went all out and assigned the maximum amount of VRAM allowed, which is 128MB, hence the VMotion failure.
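To spot affected VMs before a migration, the VRAM setting can be read straight from the VM's .vmx file. This is a sketch under assumptions: check_vram is a hypothetical helper, and it assumes the video memory is stored under the key svga.vramSize as a value in bytes. The 30MB threshold is the figure quoted above.

```shell
#!/bin/sh
# Sketch only: report the video RAM configured in a .vmx file and flag
# values that are not below the 30MB VMotion limit mentioned above.
# Assumption: the setting is stored as  svga.vramSize = "<bytes>".
check_vram() {
    vmx=$1
    limit=$((30 * 1024 * 1024))
    # Take the first svga.vramSize entry and keep only its numeric value.
    bytes=$(sed -n 's/^svga\.vramSize *= *"\([0-9][0-9]*\)".*$/\1/p' "$vmx" | head -n 1)
    if [ -z "$bytes" ]; then
        echo "no svga.vramSize entry"
    elif [ "$bytes" -lt "$limit" ]; then
        echo "ok $bytes"
    else
        echo "too large $bytes"
    fi
}
```

For a VM set to 128MB of VRAM, check_vram should report "too large" followed by the byte count, provided the key name assumption holds for your version.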
VMware ESX / ESXi 4 does work out of the box on the HP ProLiant ML110 G5. I know, because I'm using ML110s with ESX 4. However, I still see posts like this one, http://communities.vmware.com/thread/163029, where people have problems with the combination of the ML110 and ESX 4, or even ESX 3.
Now, if you look at the link to VMware Communities, you will notice in one of the attached screenshots that the CPU is a 1.8GHz dual core. The ML110 G5 ships with more than one CPU option, so if you decide to get one of these HP ProLiant ML110 G5 servers for running ESX 4, make sure you get one with the Intel Xeon 3065 CPU that runs at 2.3GHz. The reason is that not all Intel CPUs shipped with the ML110 support Intel VT, and this is a requirement for ESX 4.
Now I know you're probably saying: "The guy in the post is not using ESX 4." My point is simple: It doesn't matter what version of ESX the post refers to. If you want ESX 4 to run on the ML110, make sure you have the correct processor.
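If you are unsure whether a particular CPU supports Intel VT, one way to check before installing ESX is to boot the server from any Linux live CD and look for the vmx flag in /proc/cpuinfo. The helper below is a small sketch (the name has_intel_vt is made up for illustration). It reads cpuinfo-style text on standard input and succeeds if a flags line lists vmx. It says nothing about whether VT is also enabled in the BIOS.

```shell
#!/bin/sh
# Sketch only: succeed if cpuinfo-style text on stdin shows the Intel VT
# (vmx) CPU flag on a "flags" line.
has_intel_vt() {
    # Keep the flags lines, then look for vmx as a whole word.
    grep -E '^flags[[:space:]]*:' | grep -qw 'vmx'
}
```

Usage from a Linux live environment: has_intel_vt < /proc/cpuinfo && echo "Intel VT flag present".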
This is by no means a new issue. However, I still get support calls regarding tasks that get stuck on VMs. What do I mean by "stuck tasks"? Well, I've seen cases where a snapshot task initiated by VCB got stuck in the state of "Creating Virtual Machine Snapshot". The VM then goes down: it cannot be accessed via the console, does not respond to pings, and the status of VMware Tools turns to "Unknown". You cannot power on the VM either, as the "Creating Virtual Machine Snapshot" task is still showing as an active task. You can wait, but if nothing has changed after 30 minutes, chances are that it won't sort itself out, so user intervention is required!
This is normally the approach I take to sort this out:
2. Browse the datastore where the VM is located (best to do this via the CLI on the service console with "ls -lh") and check the time stamps of the files to see how long the snapshots, if any, have been sitting there.
3. In VirtualCenter (or vCenter), the VM will probably still be showing as powered on. Check which of your ESX hosts it is running on.
4. Log on to the service console of the ESX host that is running the VM. Elevate your privileges to root.
5. Now, as the VM has an active task, you won't be able to send any other commands to the VM. You won't be able to use vmware-cmd to change the state of the VM either. Until the task that's stuck in progress has completed, the ESX host will not be able to send any power commands to the VM. The only way to release the VM from its sorry state and get rid of the active task is to kill the VM's running process from the service console. In order to do so, you need to find the PID for the "running" VM. To get the PID, run ps and filter on the VM name.
The syntax is:
ps -auxwww |grep <VM-NAME>
Suppose you have a VM called WKSTNL01. The command will be:
ps -auxwww |grep WKSTNL01
This should return something like this:
root 12322 0.0 0.4 3140 1320 ? S<s 13:32 0:03 /usr/lib/vmware/bin/vmkload_app --sched.group=host/user/pool1 /usr/lib/vmware/bin/vmware-vmx -ssched.group=host/user/pool1 -# name=VMware ESX;version=4.0.0;buildnumber=164009;licensename=VMware ESX Server;licenseversion=4.0 build-164009; -@ pipe=/tmp/vmhsdaemon-0/vmx673aca8b7403868b; /vmfs/volumes/489a1228-2bfd25b5-6a2c-000e0cc41e52/WKSTNL01/WKSTNL01.vmx
The PID in this instance is 12322. This is what we need to kill.
6. Kill the process ID with kill -9:
kill -9 12322
7. Delete any snapshots that were created.
8. Power on the VM.
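The PID lookup in step 5 can be wrapped in a small helper so that the grep process itself is never picked up by mistake. This is a sketch, not a supported tool: vm_pid_from_ps is a made-up name, and it assumes the .vmx file is named after the VM's folder, as in the WKSTNL01 example above. It only prints the PID. The kill in step 6 is still left for you to run by hand once you have checked the result.

```shell
#!/bin/sh
# Sketch only: read ps output on stdin and print the PID of the vmware-vmx
# process belonging to the given VM.
# Assumption: the command line contains /<vm_name>/<vm_name>.vmx.
vm_pid_from_ps() {
    vm_name=$1
    awk -v vm="$vm_name" '
        # Match only vmware-vmx processes for this VM, so the line for
        # "grep <vm_name>" or for other VMs is ignored; field 2 is the PID.
        index($0, "vmware-vmx") && index($0, "/" vm "/" vm ".vmx") { print $2 }
    '
}
```

Usage on the service console: ps -auxwww | vm_pid_from_ps WKSTNL01. With the sample ps output shown in step 5, this prints the PID from the second column.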