This week I have been paying close attention to the development of Hurricane Irene. In the beginning, Hurricane Irene looked like she was going to visit Florida on her journey north. Even though it looked like Florida was going to get hit by this storm, it was still early and there was time for the storm to change course. It was also time to go out and make sure my Hurricane Supply Kit at least had the basics, like batteries and flashlights, and to fill up the gas tanks of the cars. I have different levels of preparedness that depend on how close the storm is and on its projected path. Just as I have steps in place to be prepared for the storm, most companies that I have worked for in Florida have a storm plan in place and, like me, do not sound the real alarm until the storm is 48 to 72 hours away from a hit, but start preparing for the alarm in case it is needed.
Disaster planning and testing have really changed over the last decade. I can remember when I was working at a hospital and we took the annual pilgrimage to the SunGard facility to restore the environment. The restore would take days, and we had quite a few issues restoring to unlike hardware. During this test I got to meet and talk with people from New Orleans who were at SunGard for the long haul. That was the way it was before virtualization. By the next year’s test, things were a lot different, because we had redesigned the Disaster Recovery plan and really took advantage of the power of virtualization. What we ended up with was a one-way OC12 SRDF connection between two Symmetrix storage arrays separated by about 1,000 miles. A few stand-by VMware ESX servers would remain running at the remote site awaiting their call to duty. Once the State of Disaster call had been made, the stand-by hosts would scan for the replicated LUNs presented to them, search the LUNs for all the .vmx files, then register and start the virtual machines. A very slick automated plan that could work in minutes instead of days. But this design was still very early in the virtualization explosion.
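For readers who want to picture the scan, register and start step, here is a minimal Python sketch of the idea. It is not the script we actually ran. It assumes the replicated LUNs are already mounted as ordinary directories, and the names `find_vmx_files`, `recover_vms`, `register_vm` and `power_on` are made up for illustration; the last two are callbacks you would wire to whatever registration and power-on tooling your hosts provide.

```python
import os
from typing import Callable, Iterable, List


def find_vmx_files(datastore_roots: Iterable[str]) -> List[str]:
    """Walk each mounted datastore and collect every .vmx file path."""
    found = []
    for root in datastore_roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                # Match the extension case-insensitively.
                if name.lower().endswith(".vmx"):
                    found.append(os.path.join(dirpath, name))
    return sorted(found)


def recover_vms(
    datastore_roots: Iterable[str],
    register_vm: Callable[[str], object],
    power_on: Callable[[object], None],
) -> List[str]:
    """Register and start every VM found on the replicated datastores.

    register_vm and power_on are hypothetical hooks supplied by the
    caller; register_vm takes a .vmx path and returns a VM identifier,
    which is then handed to power_on.
    """
    started = []
    for vmx_path in find_vmx_files(datastore_roots):
        vm_id = register_vm(vmx_path)
        power_on(vm_id)
        started.append(vmx_path)
    return started
```

On a stand-by host you might call `recover_vms(["/vmfs/volumes"], register_vm, power_on)`, assuming that is where the replicated datastores show up. Keeping the two hooks separate from the scan also means you can run the scan on its own during a DR test and simply print what would have been started.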
Since then, VMware has come out with SRM as its automated disaster recovery tool, and almost all storage providers have some kind of replication in place as well. This is a real game changer as far as disaster recovery goes in my book. Living in Florida, we have tourist season, love bug season, hurricane season and holiday season. Every year from June 1st through November 30th, we accept the fact that we could have a hurricane or two, and we make plans for the big one that just might hit. One of the good things about hurricanes is that you have time to put your plans in place. Enough time, in fact, that with site-to-site replication to a location outside the storm’s path, we could fail over the site before the storm hits and, once the all clear is given, make plans to fail the site back.
Since Hurricane Irene has decided not to visit Florida, but to take a vacation in the Bahamas on its way to Washington DC, New York City and/or Boston, I have to wonder about my peers in these northern states. It is not like these cities get hurricanes all the time and, unfortunately, sometimes we do not learn from others until something really happens to us. Hurricane Katrina was a prime example of that. I hope my friends and colleagues have plans in place and have started to implement them, because it looks like you have a big one coming your way.
Since the storm is going to arrive during the weekend before VMworld in Vegas, I hope no real disaster comes to you or your Data Center. Be safe and see you in Vegas.
Hello,
Another item to add to the list: remember to check your emergency generator’s fuel delivery schedule far in advance. Mine is every Thursday, so we may be out of luck if they cannot route a truck this way. I believe we have enough fuel, but you never know. One more thing to add to the list of monthly/quarterly disaster recovery checks.
— Edward aka @Texiwill
Good article, Steve. Something I always emphasize when talking DR: it’s not enough to have a disaster plan! You HAVE to TEST the plan at least once a year. I’ve seen far too many instances of a great plan that wasn’t tested (a.k.a. shelfware) and that, when it came time to actually use it, proved worthless. It could be anything from something simple, like an update to the data protection software that made some of the steps a little different, to something serious, like a change from fiber storage to NFS. In any case, during an emergency is not the time to discover the problem!