Ask any IT or data center manager what the most frustrating aspect of running a data center is, and they will most likely tell you it is downtime. Downtime is an outage of computer service, network connectivity or online service for a period of time. That period can be as short as a few seconds, and there is no upper limit on how long it can last. While a few seconds of downtime may sound like no big deal to most people, data center managers and those who have worked in IT departments know differently. Downtime is frustrating for businesses, for customers, for IT departments and for data centers.
Beyond frustration, downtime is costly. A recent report from Gartner noted the high cost of downtime: “Based on industry surveys, the number we typically cite is $5,600 p/minute, which extrapolates to well over $300K p/hour.” To prevent downtime, data center and IT department managers must understand its root causes. The trouble is that downtime can have a variety of causes. Interestingly, though, the most common cause of downtime is human error. The good news is that human error is, in many cases, preventable.
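To put the Gartner figure quoted above in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name `downtime_cost` is hypothetical, the default rate is simply the $5,600 per minute from the quote, and actual costs will vary widely from one business to another.

```python
# Illustrative sketch only. The default rate is the Gartner figure quoted
# above; real per-minute costs differ from business to business.
COST_PER_MINUTE_USD = 5_600


def downtime_cost(minutes, cost_per_minute=COST_PER_MINUTE_USD):
    """Return the estimated cost in USD of an outage lasting `minutes`."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * cost_per_minute


if __name__ == "__main__":
    # A few hypothetical outage lengths, from seconds to hours.
    for label, minutes in [("30 seconds", 0.5), ("1 hour", 60), ("4 hours", 240)]:
        print(f"{label}: ${downtime_cost(minutes):,.0f}")
```

At the quoted rate, one hour works out to 60 × $5,600 = $336,000, which matches the "well over $300K" per hour in the quote, and even a 30-second outage comes to $2,800.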
Just how much downtime does human error account for? Data Center Knowledge describes its influence on data centers: “Data center downtime is often the result of equipment failure, or a chain reaction of unexpected events. But one of the leading causes of data center downtime is human error, as ComputerWorld reminds us in Stupid Data Center Tricks, which relays anecdotes of data center mishaps. The story notes a study by The Uptime Institute, which estimates that human error causes roughly 70 percent of the problems that plague data centers today. How can this problem be mitigated? ‘There is no doubt that human errors in the data center causes a great deal of downtime and some of these can be avoided by adhering to some simple steps,’ said Ahmad Moshiri, director of power technical support for Emerson Network Power’s Liebert Services business.” Seventy percent is a staggering figure. Much of that human error is avoidable, and Data Center Knowledge offers eight practical tips that any data center can implement:
1. Shielding Emergency OFF Buttons
2. Documented Method of Procedure
3. Correct Component Labeling
4. Consistent Operating Practices
5. Ongoing Personnel Training
6. Secure Access Policies
7. Enforcing Food/Drinks Policies
8. Avoiding Contaminants
Minimizing downtime caused by human error takes a lot of discipline. Practical steps are great in theory, but they only work when they are implemented with dedication and consistent procedure. With patience, persistence and time, downtime resulting from human error can be greatly reduced.