bamed.org | chown -R bamed. ~/base

Disaster Recovery and Business Continuity Plan on a Budget

I’ve been working on our Disaster Recovery Plan (backups) lately, and I want more than just disaster recovery: I also want to do what I can to ensure business continuity (things keep working).  Of course, the big problem is that I have very little money to do this with.  If you’ve been following my blog, you know that we use virtual machines and that I now have iSCSI capability.  So after a lot of thought, here is an overview of the Disaster Recovery and Business Continuity Plan I’ve come up with.

First, I need to get a little more hardware (under $1,000 worth), but when it's all said and done I should have:

Server1:
2x Intel Xeon 2.4GHz
4GB DDR RAM
1x 80GB HDD - for the host OS and for VMs to run on
2x 250GB HDDs, striped - to host all data
2x Gigabit NICs

Server2:
AMD Athlon64 AM2 3800+
4GB DDR RAM
1x 80GB HDD - for the host OS and to back up VMs
2x 250GB HDDs, striped - to back up data
1x removable HDD bay with 2 trays
2x 500GB SATA drives for the removable HDD trays - for offsite backup
2x Gigabit NICs

Both servers will run Ubuntu and VMware.  Server2 will be set up as an iSCSI target, and Server1 will have an iSCSI initiator.  Server1 will then access the HDDs in Server2 through iSCSI, and I'll set up software RAID1 between the drives of the same size.  So the 500GB striped RAID0 set on Server1 will be mirrored to the 500GB striped RAID0 set on Server2.  The 80GB HDDs on both servers will be partitioned so that the host OS is on its own partition and the guest OSes are on another.  The partition with the guest OSes will also be mirrored between the two servers using the same combination of iSCSI and RAID1.

Server1 and Server2 will be directly connected to each other through their extra Gigabit NICs on their own subnet, and all iSCSI traffic will travel over this direct connection.  Each server will then connect to the rest of the network through a separate switch.

Finally, I'll set up a heartbeat monitor between the two servers and write a script so that if Server1 is unreachable for a set amount of time, Server2 will automagically start up the VMs that were mirrored from Server1, and all the data mirrored from Server1 will still be available.  I am a little concerned about the performance of RAID1 mirroring over iSCSI, so I'll have to do some testing to see how well that works.  As far as I can tell, this setup, if done properly, should keep us running in the event of a hardware failure on either server.
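
Here's a minimal sketch of what the "unreachable for a set amount of time" check could look like in Python.  It is not a finished failover script, and everything named in it is an assumption for illustration: the TCP probe on port 22, the number of misses, the interval, and the address in the usage comment.  The part that actually starts the VMs is deliberately left as a placeholder.

```python
import socket
import time
from typing import Callable


def is_reachable(host: str, port: int = 22, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Probing port 22 (SSH) is an assumption; any service that is always
    running on Server1 would do.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_failure(check: Callable[[], bool],
                     max_misses: int = 5,
                     interval: float = 10.0,
                     sleep: Callable[[float], None] = time.sleep) -> int:
    """Block until `check` has failed `max_misses` times in a row.

    A single successful check resets the counter, so a brief network
    blip does not trigger a failover.  Returns the total number of
    checks performed.
    """
    misses = 0
    checks = 0
    while misses < max_misses:
        checks += 1
        if check():
            misses = 0          # Server1 answered, start counting again
        else:
            misses += 1
        if misses < max_misses:
            sleep(interval)     # wait before the next probe
    return checks


# Hypothetical usage on Server2 (the address is a placeholder for the
# direct-link subnet):
#
#   wait_for_failure(lambda: is_reachable("192.168.100.1"))
#   start_mirrored_vms()   # placeholder: whatever starts the VMs on Server2
```

With the defaults above, Server1 has to miss five probes in a row, roughly 40 seconds plus connection timeouts, before the function returns.  One thing a real script would also need to handle is the case where only the link between the servers has failed and Server1 is still running, since starting the same VMs on both machines at once would be bad news for the mirrored data.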

The second part of this plan is disaster recovery.  That's where the removable SATA HDDs come into play.  I'm going to run a standard backup to these HDDs nightly.  I'll probably just tar and gzip all files from all the drives to the 500GB removable drive, then encrypt the archive using openssl.  I've done this before, in a script I posted on this blog a while back.  That script mounted a NAS drive using smbfs, so there was a 2GB file size limit.  There is no such limit here.  I believe the script in my blog was for a full backup, but I have another one that handles differential backups as well.  Right now I’ll probably only be able to afford two 500GB HDDs for the offsite backup, but I plan to eventually get one for every day of the week.  I will also be getting a SATA controller card that supports hot-swapping so I can swap the removable drives while the computer is still on.
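
To make the tar, gzip, and openssl idea a bit more concrete, here is a rough Python sketch of a differential backup.  This is not the script from my earlier post; the function names, the choice of AES-256-CBC, and the password-file approach are all assumptions made for the example.  "Differential" here simply means "every file modified since the last full backup".

```python
import os
import subprocess
import tarfile
from typing import Iterable, List


def files_changed_since(root: str, since: float) -> List[str]:
    """Return paths under `root` whose mtime is newer than `since`.

    `since` is a Unix timestamp, e.g. the time of the last full backup.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_mtime > since:
                    changed.append(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return sorted(changed)


def write_archive(paths: Iterable[str], archive_path: str) -> None:
    """Write the given files into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in paths:
            tar.add(path, recursive=False)


def encrypt_with_openssl(src: str, dst: str, password_file: str) -> None:
    """Encrypt `src` to `dst` by calling the openssl command-line tool.

    The cipher and the password-file approach are assumptions; the
    password file must be readable only by the backup user.
    """
    subprocess.run(
        ["openssl", "enc", "-aes-256-cbc", "-salt",
         "-in", src, "-out", dst,
         "-pass", "file:" + password_file],
        check=True,  # raise if openssl exits with an error
    )
```

A nightly run would call `files_changed_since()` with the timestamp of the last full backup, pass the result to `write_archive()` with a path on the removable drive, and then call `encrypt_with_openssl()` on the archive.  A full backup is the same thing with `since` set to 0.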

Well, that’s the basic overview of my DR & BC plan.  I still have to get the hardware and set up all the software.  I’ll probably write a tutorial when I do, in case anyone is interested in doing something similar.  I still need to research the heartbeat monitor.  I also need to figure out where a safe and convenient offsite location might be.  I live within 1 mile of the church, and (worst-case scenario) a big tornado could take out both the church and my house.  I also need to figure out who our backup personnel will be.  Since I’m the only IT person on staff, (again, worst-case scenario) a big tornado that took out the church and my house could take me out as well, so somebody else needs to know where the offsite backups are and what to do with them.
