Author Archive

August 27, 2008

Breaking Through the Confusion about Disaster Recovery and High Availability

Virtually every company we talk to needs both disaster recovery solutions to recover their systems and data after a major disruption, and high availability to keep key applications always available. In my discussions with companies considering our everRun software, I’ve heard a lot of them say that they are confused by many vendors’ claims and counter-claims for DR and HA. One of the biggest sources of confusion is that some vendors with solid products for disaster recovery are trying to pass off their DR solutions as reliable HA solutions. If the feedback I’m getting is any indication, these DR solutions posing as HA solutions just don’t work.

It’s not hard to see why a DR solution doesn’t make a good HA solution. With a product that is good at DR, in most cases getting the data across to the other location is pretty straightforward. But when you try to use the same solution to get both the application and the data across to use it for HA, well that’s where it breaks down. Let’s look at why.

A good DR product is usually fairly easy to set up for data replication to another site. But setting up the same product to restart the whole thing, application and data, when a failover occurs is complex and prone to errors. To set it up, you have to script all the pieces to make it happen: fault detection, client redirection to the DR site, application reset, and the list goes on. No wonder we so often hear that scripted-DR-for-HA doesn’t work consistently: there are too many moving parts that have to be managed and monitored. In addition, no matter how minor a failure is, failover to the remote site is required. Not every failure you face is a disaster, so not every failure should be treated as one.

Based on these horror stories, we thought it was a good idea to put together this webinar, Breaking Through the Confusion about DR and HA. I hope to help you better understand when, how, and why DR is the best fit to meet your requirements, when to use an HA solution, and how to combine the two for optimal protection.

Interested? You can register here.
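
A quick back-of-the-envelope sketch shows why scripted failover gets fragile as pieces are added. The step names come from the post above; the per-step success rates are invented for illustration, and the steps are assumed to fail independently of each other.

```python
from math import prod

# Step names are from the post; the success rates are made-up
# assumptions for illustration, not measurements of any product.
STEP_SUCCESS = {
    "fault detection": 0.99,
    "client redirection": 0.98,
    "application reset": 0.97,
}

def chain_success(rates):
    """Probability that every scripted step succeeds,
    assuming the steps fail independently."""
    return prod(rates)

overall = chain_success(STEP_SUCCESS.values())
print(f"{len(STEP_SUCCESS)} scripted steps, overall success: {overall:.1%}")
```

With these assumed rates the whole chain completes only about 94% of the time, even though no single step is worse than 97%. Every extra scripted step pulls the number down further.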

July 30, 2008

Preventing Disaster Rather than Recovering from It

We all like to think that we will be prepared in the event of an emergency, or a disaster. Hospitals exist if we fall sick; fire stations surround us if flames break loose; we are constantly preparing so if a catastrophe strikes, we are ready.

Preparing for a system disaster is no different. However, knowing how to prepare for an event like this can be confusing. There are many options out there when it comes to protecting your system, each best suited to specific requirements. Unfortunately, many vendors use terms like disaster recovery and high availability interchangeably to describe their solutions when in fact they are usually designed for one or the other.

Disaster Recovery (DR) is the way to recover applications and data from a system failure. DR is a reactive solution: if a failure occurs, IT relocates the data, rebuilds the system, and brings everything back up to working order. This takes time, a precious commodity that businesses relying on critical applications typically don’t have. In addition, recovering applications can bring a number of side effects that you really don’t want to endure every time some minor failure happens.

But what if I could tell you that instead of worrying about how to recover from a computer system failing, you could simply prevent it from occurring at all?

Disaster tolerance (DT) is a proactive way to prevent system failure from impacting application and data availability. A disaster tolerant solution isn’t going to recover the data if there’s a disaster. Instead it will tolerate the fault if a disaster occurs – keeping an organization’s critical applications up and running at all times. It is not recovery, but rather prevention. And with solutions like our everRun SplitSite, separate servers don’t even need to be in the same building – they can be up to 100 miles apart with fault-tolerant protection between the two locations.

DR solutions are good for applications that can afford some downtime while you recover them. But for essential applications like Microsoft Exchange, SQL, and SharePoint, which need to be available all the time, disaster tolerance is often the best way to go.

So what combination of DT and DR protection would work best for your company’s applications?

June 30, 2008

Virtualization and Availability Webinar Q&A Continued

Following last week’s discussion, event attendees had additional questions that we didn’t get to answer, even though we went ten minutes over. We wanted to continue the discussion here on our blog, so we are posting the remaining questions and answers for everyone to see. As we mentioned before, if you would like to view the presentation delivered last week by John Humphreys (IDC), Simon Crosby (Citrix) and Jerry Melnick (Marathon), download the presentation here.

Are there any performance limitations with everRun VM?

everRun VM supports any guest environment created by XenServer, including multi-CPU VMs.

Effect of losing inter-server link?

As a best practice, we recommend two Availability Links for redundancy. If one is lost, operation continues unaffected on the remaining one. If both are lost, we will take action to prevent complete loss of the VM or a split-brain condition.
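
For readers new to the split-brain problem, here is a minimal sketch of the decision each host faces. It is a generic illustration, not everRun’s actual logic: the answer above does not say what action is taken when both links are lost, so the `holds_tiebreak` flag (a lock that some third party grants to at most one host) is purely an assumption.

```python
from enum import Enum

class Action(Enum):
    CONTINUE = "continue, fully redundant"
    CONTINUE_DEGRADED = "continue on the remaining link"
    STAY_ACTIVE = "peer unreachable: this host keeps running the VM"
    STAND_DOWN = "peer unreachable: this host stops its copy"

def decide(link_a_up: bool, link_b_up: bool, holds_tiebreak: bool) -> Action:
    """Hypothetical per-host policy. `holds_tiebreak` is an assumed
    lock granted to at most one host; it is not a documented feature."""
    if link_a_up and link_b_up:
        return Action.CONTINUE
    if link_a_up or link_b_up:
        return Action.CONTINUE_DEGRADED
    # Both links down: the peer may be dead or merely unreachable.
    # Because only one host can hold the tie-break, at most one host
    # keeps the VM active, which avoids two diverging copies.
    return Action.STAY_ACTIVE if holds_tiebreak else Action.STAND_DOWN

print(decide(True, False, holds_tiebreak=False).value)
print(decide(False, False, holds_tiebreak=True).value)
```

The point of the sketch is the last branch: once both links are gone, a host cannot tell a dead peer from an unreachable one, so some outside rule has to pick a single survivor.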

How far apart can the two machines be – i.e. is there a propagation delay issue?

Host separation is limited by network latency, which must be under 10 ms round trip. Current deployments have exceeded 100 miles.
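
To put the distance and the latency figure side by side, here is a rough calculation of propagation delay alone. It assumes light travels through optical fiber at about 200,000 km/s (roughly two thirds of its speed in a vacuum) and that the fiber runs in a straight line. Real links follow longer routes and add switching and protocol delay, so treat the result as a lower bound.

```python
# Assumption: signal speed in fiber is about 200,000 km/s,
# which is 200 km per millisecond.
FIBER_KM_PER_MS = 200.0
KM_PER_MILE = 1.609344
RTT_BUDGET_MS = 10.0  # round-trip limit quoted in the answer above

def propagation_rtt_ms(distance_miles: float) -> float:
    """Round-trip propagation delay over a straight fiber run."""
    one_way_km = distance_miles * KM_PER_MILE
    return 2 * one_way_km / FIBER_KM_PER_MS

rtt = propagation_rtt_ms(100)
print(f"100 miles: {rtt:.2f} ms propagation round trip, "
      f"{RTT_BUDGET_MS - rtt:.2f} ms left for equipment and routing")
# prints: 100 miles: 1.61 ms propagation round trip, 8.39 ms left for equipment and routing
```

Under these assumptions, distance itself uses only a small share of the 10 ms budget at 100 miles. The rest is consumed by the network gear and the path the link actually takes.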

In case of a disk failure, does everRun rebuild the disk from the good physical host to the bad one?

Correct. Recovery of storage is handled as a background task so as not to require downtime or otherwise impact the running VM and application.

When will level 3 of everRun VM be available?

Level-3, System-Level fault tolerance is scheduled for later this year.

What requirements are associated with the everRun Level 3 Protection? (Bandwidth, latency, etc.)

Network and configuration requirements are the same for level-2 and level-3 protection.

Is StorServer a similar or competitive product to everRun?

StorServer is a backup appliance, not a fault-tolerant availability solution, and addresses very different requirements. It would be more complementary than competitive.

What virtual machines (VMware, Parallel, etc) are supported by Marathon?

Currently only Citrix XenServer; however, we plan to expand on this in the future.

Are there certain applications that are not suited for everRun, such as I/O or compute intensive apps? How do DR configurations affect performance?

This is very dependent on the configuration of the server, the VM, the storage and all other components. Appropriate best practices should be followed to ensure optimal performance for all applications.

Can Marathon support physical to VM HA? Does Marathon’s product fully support FC/iSCSI SAN shared storage between protected physical and/or VM pairs? Does Marathon’s product support a local site HA server pair with a third node at a remote site in the event of site failure? Does Marathon’s product have latency limitations?

Marathon offers solutions for physical and virtual servers. These solutions use the same proven fault-tolerant technologies but are independent of each other. everRun VM supports any type of storage that is supported by XenServer. Fault tolerance is configured using two VMs. However, we will soon be releasing an asynchronous solution that will allow a third replicated system at a local or remote site. Because everRun VM is a synchronous solution, there is a latency requirement of 10 ms round trip between hosts. Our asynchronous solution will not have any latency requirements.

What is the pricing of everRun VM?

everRun VM lists at $4500 when bundled with XenServer Enterprise, and $2000 if you already have XenServer.

Thanks for all of your interest and questions.