Both natural and man-made disasters can have a significant impact on all the servers and storage at your data center. When that happens, do you know what your next steps are? Here is a guide to help you get through the critical period immediately following any kind of disaster that impacts your data flow.
Outline a Plan and Anticipate Problems
Before discussing the steps to take after a disaster, it’s important to review what you should do long before any problems occur. Nobody wants to have problems in their data center, but the list of potential issues is long and varied, and it depends on several different factors. Understanding the types of threats you might face and creating a plan that includes contingencies for each of them is critical. Some common threats to anticipate include:
- Malicious data center attacks (cyber-attacks)
- Weather events related to your location, such as tornadoes, hurricanes, floods, earthquakes, or wildfires
- Power failure or power surges
- Fire or flooding from within the building
- Security concerns
- Limited resource availability, such as droughts that restrict water usage
Set Up and Run the Data Center Efficiently
While the setup of your data center will likely take place long before a disaster strikes, the way that you have your servers, cables, backup, and cooling system laid out can have a big impact on how well and how quickly you can recover following a problem. Avoid common problems that could impact your ability to get to your servers quickly following a disaster scenario.
- Map out the data center so that all machines are accessible without tripping over wires and cables.
- Keep the wiring clean and easy to follow. If you need to unplug or re-route just one cable, you shouldn’t have to dig through a tangled mess of wires, guess which one is the right intranet cable, and hope for the best when you unplug it.
- Keep entry and exit doors clear and ensure proper security so only employees with proper clearance can access the room.
- Enforce a zero-tolerance policy banning food and drink inside the data center to prevent unnecessary disasters from spilled drinks or food.
- Spread out the electrical controls and keep them covered so it is difficult or impossible to “accidentally” shut off the power.
- Have a secure off-site facility with backups for your most critical data in case all the servers and information in your data center are destroyed.
Remain Calm and Execute Your Plan
Even with plenty of safeguards in place and a plan that anticipates potential problems, every data center is still at risk. When the servers go down and the room is dark following any type of natural or man-made disaster, it can be difficult not to panic. Instead of rushing around, take a deep breath and work through the recovery process methodically.
1: Get Your Backup Power Systems Running
Every data center should have a reliable uninterruptible power supply (UPS) system so the servers can continue running when the power goes out. Use dependable UPS batteries, and check and replace them regularly. The aftermath of a disaster is the wrong time to discover that your batteries should have been replaced.
2: Start with Mission-Critical Servers
Your data center might be filled with several rows of servers, so you need a map of which ones are considered “mission critical,” meaning that your business cannot operate without them. When these servers go down and information is unavailable to your employees and customers, the cost to your business is high, and every minute counts. It is helpful to have mission-critical servers clearly labeled and located in the same area of the data center so you know you’re working on the right ones and you won’t spend extra time running back and forth through rows of servers to find them. Keep a printed version of the data center map somewhere that is easily accessible (if it’s stored on a hard drive and the computers are down it won’t be much use).
3: Recover Second-Level Systems
After mission-critical servers are restored, move on to the systems that support your day-to-day business operations and make work easier and more convenient for employees, such as reporting, forecasting, and other similar tools. These systems are not mission-critical, and a short outage will have minimal long-term financial impact on the company, but they should still be restored as quickly as possible to keep operations running smoothly.
4: Consider Interdependent Systems
If you have interdependent systems that rely on each other to function, you will need to restore the entire system before you move on to another part of the recovery plan. Even some modular software systems must be completely intact to function correctly, so knowing which systems are interdependent and looking at a big-picture recovery plan can help you focus on the ones that need your attention first.
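One way to make interdependencies explicit is to record which systems each system relies on and derive a restore order from that record. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later). The system names and their dependencies are hypothetical examples, not a recommended layout, so treat this as an illustration rather than a finished tool.

```python
from graphlib import TopologicalSorter

# Hypothetical systems. Each key maps to the set of systems it depends on,
# which must be restored before it.
dependencies = {
    "storage": set(),
    "database": {"storage"},
    "auth": {"database"},
    "order_api": {"database", "auth"},
    "web_frontend": {"order_api"},
}


def restore_order(deps):
    """Return a sequence in which every system comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = restore_order(dependencies)
print(" -> ".join(order))
```

If two systems depend on each other in a loop, `static_order()` raises `graphlib.CycleError`, which is a useful signal that those systems have to be restored together as one unit.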
5: Avoid Unnecessary Work in the Recovery Process
As you begin rebuilding after a disaster, it can be difficult to sort out what is necessary and what is not. Spending hours restoring non-critical data slows down the entire recovery process and frustrates management and customers. Your disaster plan should include a list of non-essential systems, such as historical data, test systems, the employee intranet, and other non-critical libraries, that can be omitted from the initial restoration process to save time.
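The tiers described in steps 2, 3, and 5 can be written down as a simple inventory so the initial restore list is produced the same way every time. This is a minimal sketch: the `System` class, the tier numbers, and the system names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    name: str
    tier: int  # assumed scheme: 1 = mission-critical, 2 = second-level, 3 = deferred


# Hypothetical inventory, listed in no particular order.
inventory = [
    System("reporting", 2),
    System("test_environment", 3),
    System("order_processing", 1),
    System("historical_archive", 3),
    System("customer_database", 1),
    System("forecasting", 2),
]


def initial_restore_plan(systems, max_tier=2):
    """Keep tier 1 and tier 2 systems, ordered so tier 1 comes first."""
    included = [s for s in systems if s.tier <= max_tier]
    return sorted(included, key=lambda s: s.tier)


plan = [s.name for s in initial_restore_plan(inventory)]
deferred = [s.name for s in inventory if s.tier > 2]

print("Restore now:", plan)
print("Restore later:", deferred)
```

Keeping the deferred list explicit makes it clear that those systems were postponed on purpose and not forgotten.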
6: Spend Your Resources Wisely
If your backup systems can support only a portion of your data center, make sure the most important equipment is the equipment connected to the UPS. Don’t waste limited capacity on non-essential systems.
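Deciding what goes on limited backup power is essentially a budgeting exercise. The sketch below walks a list of loads from most to least important and assigns each one to the UPS only if it still fits. The wattages, priorities, and capacity figure are invented for illustration, and real sizing also has to account for factors such as battery runtime and safety headroom that this example ignores.

```python
def allocate_ups(loads, capacity_watts):
    """Assign loads to the UPS in priority order until capacity runs out.

    loads: list of (name, priority, watts); a lower priority number means
    more important. Returns (powered, skipped, remaining_watts).
    """
    powered, skipped = [], []
    remaining = capacity_watts
    for name, _priority, watts in sorted(loads, key=lambda load: load[1]):
        if watts <= remaining:
            powered.append(name)
            remaining -= watts
        else:
            skipped.append(name)
    return powered, skipped, remaining


# Hypothetical loads and UPS capacity.
loads = [
    ("reporting", 2, 700),
    ("customer_database", 1, 1200),
    ("test_environment", 3, 500),
    ("order_processing", 1, 900),
]

powered, skipped, remaining = allocate_ups(loads, capacity_watts=2500)
print("On UPS:", powered)
print("Not on UPS:", skipped)
print("Spare capacity (W):", remaining)
```

Note that this simple approach will still place a smaller, lower-priority load on the UPS if a larger one did not fit. Whether that is desirable depends on your plan, so the rule should be agreed on before a disaster, not during one.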
7: Cross-Train Other Employees to Start the Recovery
In some cases a natural disaster, illness, injury, or other problem may prevent your IT people from getting to the data center to begin the recovery process. For this reason your business should have several other individuals outside of IT who are cross-trained on the basics for restoring critical systems so they can get the process rolling even without an IT person available.
You can’t predict exactly when a disaster will strike, which is why it’s so important to have a plan in place and to review and practice that plan regularly, so that you are ready when one does happen. Talk to Titan Power today to find out more about creating and maintaining your data center so it’s ready for any disaster.