
Power Loss, Backup Power and the Healthcare Industry

There are arguably few industries in which reliable, sufficient backup power is more important than healthcare.  While backup power is important across a myriad of industries, in healthcare it can be the difference between life and death.  For the healthcare industry, just like any other, if power is interrupted or down for a prolonged period of time, critical operations are interrupted and mission-critical data can be lost or compromised.  But additionally, if power is lost and a sufficient backup power supply is not in place, critical life-saving devices and technologies could be compromised and the health of patients put at risk.

For many people, when a citywide power crisis or other natural disaster knocks out power to their homes, hospitals become a place of safety and refuge.  City residents rely on hospitals to be there when they need them, regardless of what other crisis may be happening in the surrounding city or state.  Imagine the chaos if a hospital loses power – lights out, medical devices not working, elevators not operating – a frightening situation for everyone involved, including medical professionals, hospital staff and patients.  Though no hospital thinks it will happen to them, as we have seen, it happens more often than it should.  Consumer Reports points out some notable hospital power failures and highlights why many failures happen, “When Superstorm Sandy knocked out power throughout coastal New Jersey and New York on Oct. 29, 2012, nowhere was the terror more palpable than in the darkened hallways and stairways of NYU Langone Medical Center and Bellevue Hospital. As both Manhattan hospitals lost power and their backup systems failed, nearly 1,000 patients had to be evacuated to other facilities. Most of us assume our local hospital will be a safe haven when disaster strikes. But that isn’t necessarily the case. In fact, hospital generators were known to be vulnerable long before Sandy. In 2005, after Hurricane Katrina struck New Orleans, about 215 patients died in hospitals and nursing homes, partly because generators stopped working. In 2011, a Connecticut hospital evacuated more than 40 patients when its generator failed during Hurricane Irene… A Consumer Reports investigation finds that while extended power outages in hospitals are uncommon, there are reasons for concern:

 

1.  Many generators are 50 or more years old.

2.  Many are housed in basements, and need to be protected from floods.

3.  Most hospitals can’t afford to move generators to safer locations, and federal and state governments aren’t likely to pick up the tab.

4.  There is no national standard for the installation of backup generators.

5.  Information about hospitals that fail generator tests isn’t available to the public.

 

To operate properly and comply with national guidelines, hospitals must have sufficient backup power in place.  Unfortunately, common risks often go overlooked, so it is important to understand what threatens a steady, reliable hospital power supply: flood, fire, power outages, extreme weather, natural disasters, terrorist attacks, and more.  Because many life-saving machines and other critical medical equipment require constant power, generators must be able to supply it throughout an emergency or outage.  A sufficient backup power supply must be readily available at all times, and redundancy is not just the cherry on the ice cream sundae – it is vitally important as well.  Both are only as good as their condition when called upon, which means maintenance and frequent testing must be part of the equation to ensure they do not fail when needed most.

Many hospitals have long relied on generators to supply their backup power.  But, as Healthcare Facilities Today points out, generators are not necessarily the ideal option and should not be the only source of backup power for healthcare facilities, “Healthcare-related facilities could secure seamless operations using power management technology. Organizations relying on onsite generators for backup power are well aware that generators take time to get power to the critical equipment — interruptions in healthcare service delivery is not an option. Power outages, no matter how brief, can be dire risking patients’ lives undergoing surgery or under critical care at an ICU. While some healthcare organizations rely on smaller static UPS systems to support small, non-motor production loads, the most reliable UPS system for healthcare is the rotary UPS technology option.  The rotary UPS provides uninterrupted power for the entire infrastructure and eliminates risks of costly downtime while securing power for critical healthcare services… Healthcare organizations needing a 24×7 power supply can find the best value-add in a UPS system.  Most business-critical organizations select rotary-based UPS systems over costly static-based battery systems because battery systems require continued maintenance (consuming CAPEX funds) and must be regularly updated.”

To reduce vulnerability to power failure, healthcare facilities must implement a sufficient backup power supply and ensure that the load needing support never exceeds the supply available.  In fact, it is best not to push the limits: keep the load at approximately 70-90% of what the backup power supply can deliver in an emergency.  Healthcare facilities must have backup power for their servers so that patient information and other critical data remain protected and accessible during an emergency.  Further, they must deliver steady, reliable power to medically critical devices.  For hospitals and other healthcare facilities, there should be virtually no lag time when a power loss occurs.  Consider a life-support machine – even a few seconds of lag between primary power failing and the backup supply kicking in could mean a patient loses their life.
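
As a rough illustration of that 70-90% guideline, the sketch below (with entirely hypothetical figures, not from any standard) checks whether a critical load leaves adequate headroom on a backup supply:

```python
# Illustrative sketch of the headroom guideline above; the 0.80 default and
# the example figures are hypothetical, not from any standard.

def backup_headroom_ok(critical_load_kw: float,
                       backup_capacity_kw: float,
                       max_load_fraction: float = 0.80) -> bool:
    """True if the load stays within the chosen fraction of backup capacity."""
    return critical_load_kw <= backup_capacity_kw * max_load_fraction

# Example: an 800 kW generator carrying a 600 kW critical load
print(backup_headroom_ok(600, 800))   # True  (75% of capacity)
print(backup_headroom_ok(760, 800))   # False (95% leaves no headroom)
```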

Hospital Power Outage

Image Via: Consumer Reports

Redundancy and a variety of backup power options are the name of the game for healthcare.  Never rely on just one type of backup power, or you are sure to find yourself in a world of trouble when a power failure happens.  First, there should be off-site data storage to ensure that no data is lost if a power failure, fire, flood or other natural disaster occurs at the healthcare facility site.  That off-site data storage facility should also have a sufficient backup power supply and redundancy.  Safely and securely utilizing the cloud is ideal because, while it can be subject to cyber risk, it is unlikely to be impacted by a power failure or natural disaster.

Hospitals often take advantage of two primary UPS configurations: double-conversion and parallel-redundant.  24×7 explains how a double-conversion UPS works, “As the name suggests, these devices convert power twice. First, an input rectifier converts AC power into DC and feeds it to an output inverter. The output inverter then processes the power back to AC before sending it on to electronic equipment. This double-conversion process isolates critical loads from raw utility power, ensuring that the equipment receives only clean, reliable electricity.”  And Critical Power explains how parallel UPS systems operate, “A parallel-redundant UPS system is one in which two or more modules are installed on the same system in what is termed an N+X arrangement (N being the power capacity required by the connected loads and X being the number of modules in addition to that capacity). Parallel-redundancy allows for the failure of one single UPS module in the configuration without the need for the protected load to be transferred to mains power. In such an event, the other UPS modules (all of which have spare capacity) can take over the total load.”
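
To make the N+X arithmetic from that quote concrete, here is a small sketch; the module rating and load are hypothetical examples, not vendor specifications:

```python
# Hedged sketch of the N+X sizing described above: N modules carry the load,
# X extra modules provide redundancy. All figures are hypothetical.

import math

def modules_required(load_kva: float, module_kva: float, redundancy: int = 1) -> int:
    """Modules needed so the load survives 'redundancy' module failures."""
    n = math.ceil(load_kva / module_kva)   # N: modules needed for the load alone
    return n + redundancy                  # N+X: add spare modules

# Example: a 450 kVA load on 200 kVA modules in an N+1 arrangement
print(modules_required(450, 200, redundancy=1))
# 4 -- any single module can fail and the 3 survivors still carry 450 kVA
```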

When choosing where to install a backup power supply such as a generator or other switchgear, think about the most secure location.  Whether it is a new installation or you are relocating a generator for optimal safety, place the backup power supply where it is best protected to minimize its risk of being compromised.  Keep it in a location protected from hazards such as fire and flood – avoid the basement and, if possible, the roof.  This will help reduce the odds that an extended utility outage also takes down your backup power.

When it comes to backup power supplies in the healthcare industry, you can never test them too much.  Prepare for the most likely causes of power loss, as well as the unlikely and even absurd scenarios, and exhaustively test your backup power supply to ensure that it will perform when it is needed most.  In addition, backup power supplies should be consistently and routinely maintained and tested to ensure they are not only functioning but that they can support the full load if needed.  If a hospital experiences an outage that lasts even a few seconds, it can be chaotic, frightening and life-threatening.  All hospitals and medical facilities must implement proper backup power supplies, properly maintain them, prepare for a variety of scenarios, and ensure that the supply can support the full power load if necessary to give their facility the best chance of avoiding a power interruption or outage.

 


The Dangers of Arc Flash in a Data Center and How to Avoid It

A well-planned and well-managed data center anticipates potential hazards and works to minimize their risk.  By anticipating potential hazards, a data center can also have appropriate response plans ready in case a potential hazard becomes reality.  At times, the level of preparation and planning can feel like overkill.  After all, what are the odds that the majority of these hazards or disasters will actually happen?  Probably very slim.  But what happens if one happens to your data center?  It is far better to plan ahead than to play catch up, because playing catch up when disaster strikes usually means the disaster will be more expensive, damage more property, and possibly injure more people.  Arc flash is one potential hazard a data center may face – and it is not an unlikely, far-fetched hazard; it is very real.  Arc flash injuries are among the most common on-the-job electrical injuries and can be incredibly harmful, even fatal.

A data center cannot simply turn off electricity to mission-critical servers if they need to be accessed or serviced.  Doing so would lead to downtime, frustrated clients, and lost revenue.  To understand why arc flash is concerning, it is important to have a clear picture of what arc flash is and how it can happen in a data center.  Consulting-Specifying Engineer elaborates on what arc-flash is and why it is such a significant hazard, “An arc flash is the result of an electric current passing through air as the result of conductor failure, equipment failure, or the accidental connection between voltage sources such as dropping a tool across buses in distribution equipment. The flash is immediate but the resultant release of energy can cause severe injury, and possibly death. There is a potential for a tremendous amount of heat to be released, which can result in overpressures, as well as flying debris. The energy released can cause temperatures exceeding 35,000 F, which can vaporize steel, copper, and aluminum. Inhaling these vaporized metals could be fatal. Injuries or fatalities could occur if personnel are in the area in front of an arc flash, which could send projectiles such as parts of metal buses away from the blast. Also, molten metal can cause significant burns, and the sudden air pressure increase can knock personnel off their feet. Each year, more than 2,000 people are treated in burn centers for injuries from arc flash incidents.”

What is truly frightening about an arc flash (as if it were not frightening enough already) is that it can set off a chain of harmful and dangerous events.  When an arc flash occurs in a data center, a significant amount of electrical energy is present.  As the arc flash happens, it can melt and even vaporize conductive material and, with enough energy, can lead to an arc blast – the explosive expansion of metal as it vaporizes.  An arc blast can cause additional injury and equipment damage, as well as downtime.


Arc Flash Protective Gloves – Image via: OSHA

There are many things that can lead to arc flash.  Often, arc flash is the result of human error.  Storing tools or other items on top of electrical components can cause one, as can improper equipment installation.  Lack of preventative maintenance is another common cause – preventative maintenance and visual inspections are among the easiest things data center personnel can do to prevent a myriad of problems, arc flash included.

As you can see, arc flash is highly preventable with proper planning, correct installation of equipment, and consistent maintenance.  It is important to take every precaution – not only because you can prevent equipment damage and downtime, but because OSHA (the Occupational Safety and Health Administration) sets safety standards for individuals who work with electrical equipment.  If you do not take proper precautions, you could be liable for on-the-job injuries that could have been prevented.  To remain OSHA compliant, you must have a proper safety program with defined responsibilities in place.  Your personnel must wear personal protective equipment (PPE) and be trained on the hazards of arc flash and how to use appropriate tools to create a safe working environment.  All equipment should be properly labeled with warnings.  And it is important that you calculate the degree of arc flash hazard.


Image via: OSHA

Many data centers think that simply installing equipment correctly and completing preventative maintenance is enough to prevent arc flash.  It is important that a data center does not skip the arc flash analysis.  Calculating your degree of arc flash hazard is essentially a risk assessment that tells you whether you are doing enough or need to improve.  TechTarget explains why arc flash hazard assessments are best left to expert contractors, “But there is no substitute for a formal arc-flash analysis. It’s not a simple job, but if it’s done right — especially for a new electrical design — it might identify ways to mitigate arc-flash hazards and identify work that could be done without either a shutdown or a hazard suit…A thorough arc-flash analysis requires a comprehensive short-circuit study first. While this is a standard requirement of the electrical design process, an arc-flash analysis must evaluate the electrical design from a different perspective to mitigate arc-flash hazards. A breaker coordination study must also be done with the same goal in mind, even though it may have been done before. Breaker coordination ensures that the breaker nearest the fault condition trips first. If done with arc flash in mind, a breaker coordination study could help you select different circuit breakers or fuses than might normally have been chosen.”
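
As one illustration of what such a study feeds into, the sketch below maps a calculated incident energy to a minimum PPE category using the commonly cited NFPA 70E arc-rating thresholds of 4, 8, 25, and 40 cal/cm². Treat it as illustrative only; defer to the current edition of the standard and your site's engineering study:

```python
# Illustrative only: mapping a calculated incident energy (cal/cm^2) to the
# minimum NFPA 70E PPE category, using the commonly cited 4/8/25/40 cal/cm^2
# arc ratings. Verify against the current edition of the standard before
# relying on any of this.

def minimum_ppe_category(incident_energy_cal_cm2: float) -> int:
    thresholds = [(4, 1), (8, 2), (25, 3), (40, 4)]  # (arc rating, PPE category)
    for rating, category in thresholds:
        if incident_energy_cal_cm2 <= rating:
            return category
    raise ValueError("Above 40 cal/cm^2: de-energize; no PPE category applies")

print(minimum_ppe_category(6.2))   # 2 -- e.g. arc-rated shirt/pants and face shield
```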

Once the degree of arc flash hazard has been assessed, a data center can begin to implement arc flash risk mitigation measures.

 

Arc Flash Risk Mitigation Measures:

 

–   Complete assessment of data center facility and all of its electrical equipment. By completing a comprehensive assessment of data center equipment and, in particular, the electrical system, you can begin to identify potential hazards or problems within your system.

 

–   Documentation of assessment in one place, where all past and future assessments can be maintained for reference. Make any notes about the condition of the equipment, as well as when it was last maintained so that proper maintenance is never neglected.

 

–   Implementation of protective devices to prevent arc flash. After assessments have been made, protective devices, such as arc-resistant switchgear, can be installed to prevent arc flash.  Electrical Contractor explains how arc-resistant switchgear works, “Arc-resistant switchgear is designed to redirect arc energy up and out of the equipment through ducts/vents outdoors away from equipment operators. The system is designed with vent flaps that will open under the pressure of an arcing fault and redirect the super heated gases and arc flash energy up and out of the equipment, away from personnel.”

 

–   All equipment must be properly labeled with any necessary diagrams or safety warnings provided in plain sight.

 

–   Only qualified, trained personnel should work on electrical conductors or other electrical equipment.

 

–   Implementation of defined protection boundaries according to NFPA 70E guidelines.

 

–   Use of protective safety equipment, which minimizes arc flash exposure should an incident occur.

 

–   Grounding is another option to reduce the risk of arc flash, as Schneider Electric outlines, “The method of system grounding can have an impact on arc-flash hazards.  High-resistance grounded (HRG) systems are not new, but recently they have been promoted as an arc-flash mitigation means.  The idea is that HRG systems inherently limit the energy delivered to a ground fault by limiting the available fault current to just a few amperes, providing a great deal of protection.”

 

Implementing proper arc flash risk mitigation measures is important for any data center.  Not only does it significantly reduce the risk of equipment damage and downtime, it also dramatically reduces the risk of injury to personnel.  OSHA has strict guidelines in place for a reason – to protect worker safety through reasonable safety expectations – and if you do not comply with NFPA 70E guidelines for arc flash protection, you can be cited by OSHA.  Protect your facility, your equipment, your customers, and your personnel from the hazards of arc flash by calculating your risk of an arc flash occurring and by implementing proper risk mitigation measures.

 

 

 


How to Know When It is Time to Upgrade Your UPS & How to Do So Effectively


One of the most important components of any data center is its Uninterruptible Power Supply (UPS).  The UPS is tasked with maintaining uptime should there ever be a power interruption – it literally provides an uninterruptible supply of power to mission-critical infrastructure.  With proper maintenance, a UPS system can save a data center from downtime that is not only incredibly costly but frustrating and very problematic.  Data center UPS systems are not new; they have been used for decades, and like any technology that has been around that long, they have evolved and improved over time.  A data center UPS is a long-term investment, and transitioning to a new UPS system could mean potential downtime (among other things).  For this reason, many data centers avoid upgrading their UPS system, if for no other reason than to “avoid the headache.”  But eventually, every data center must upgrade its UPS system – so how do you know when it is time, and how do you do it in a smooth and successful way?

For data centers, it can be tempting to “leave good enough alone” with their UPS system.  The UPS seems to be doing an adequate job, it is still working, still providing its essential job duty, so what is the harm in leaving it alone?  Well, a primary reason that many data centers decide to upgrade their UPS system is that it will give them increased power capacity when they need it the most.  Data centers are increasing their infrastructure and rack density to accommodate growing server demands and accommodate various other needs.  If a power interruption occurs and the UPS kicks in to provide backup assistance, it has to be able to actually provide adequate power.  The UPS system you had in place 5 years ago may have been more than enough for your needs at the time, but is it really adequate now?  Have you really evaluated your power needs and what your current UPS system can supply?  If not, now is the time to do so.

As you likely did in the past, you need to determine what your current power needs are and anticipate what your future needs may be when choosing to upgrade your UPS.  TechTarget provides some helpful insight for determining power capacity needs for your new UPS system, ““Increased resiliency and MGE’s un-paralleled load protection will benefit our clients the most,” said Yaeger, “while increased power capacity and improved energy help us the most.” Beware of using the nameplate. This is a legality rating, and will usually give a much higher volt-ampere rating than the unit will ever draw. For example, consider a unit with a nameplate that reads 90 – 240 volts at 4 – 8 amps with a 500 watt (W) power supply. First, the numbers are backward. The larger amperage goes with the lower voltage. If you assume a nominal 120 volts at 8 amps, you get 960 VA. A PF of 0.95 would yield 912 watts. No power supply is that inefficient, and a power supply almost never runs at full power. Therefore, it is highly unlikely that this device will ever draw more than 500 watts of power, but if you want to be really conservative, multiply by 1.1 and figure 550 Ws of input power…Once you have a realistic load estimate, plan to run a UPS around 80% of actual rated capacity. That provides headroom for peak operating conditions, gives you capacity to install a duplicate system before you decommission an old one, or lets you absorb a little growth before you outgrow the unit.”
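
Working through the quote's arithmetic as a quick sketch (the values come from the quote itself; only the 80% planning rule is applied at the end):

```python
# A worked version of the sizing logic in the quote above. Values are taken
# from the quote; the arithmetic itself is just illustrative.

nameplate_va = 120 * 8               # 960 VA -- nameplate worst case
nameplate_w  = nameplate_va * 0.95   # 912 W at PF 0.95 (implausibly high draw)

psu_rating_w = 500                   # the power supply can never deliver more
realistic_w  = psu_rating_w * 1.1    # 550 W with a conservative 10% margin

ups_capacity_needed = realistic_w / 0.80   # run the UPS at ~80% of rated capacity
print(round(ups_capacity_needed))          # ~688 W of UPS capacity for this device
```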

In addition to increasing UPS capacity to meet growing power demands, many data centers opt to upgrade their UPS system in an effort to improve energy efficiency.  UPS technology has evolved to be far more intelligent than it was even a few years ago.  Today’s UPS systems have more sophisticated monitoring capabilities that can be integrated with your data center infrastructure management and monitoring for a more comprehensive picture of what is going on in your data center.  Data centers are all looking for ways to improve energy efficiency, and better monitoring allows data center managers to make more accurate and timely decisions about power, dramatically improving efficiency.  Even small changes can lead to significant savings over time.  EnergyStar reports that small improvements in energy efficiency lead to big savings, “DOE estimates that a 15,000-square-foot data center operating at 100W/square foot would save $90,000 by increasing UPS efficiency from 90% to 95%.”
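
A back-of-envelope check of that DOE figure, assuming a 24×7 load and an electricity rate of roughly $0.12/kWh (the rate is our assumption; the other numbers come from the quote):

```python
# Rough sanity check of the DOE estimate quoted above. Only the $/kWh rate
# is an assumption; the rest follows from the quoted figures.

load_w = 15_000 * 100          # 15,000 sq ft at 100 W/sq ft = 1.5 MW of IT load
input_90 = load_w / 0.90       # utility draw at 90% UPS efficiency
input_95 = load_w / 0.95       # utility draw at 95% UPS efficiency

saved_kwh = (input_90 - input_95) / 1000 * 8760   # kWh saved per year, running 24x7
print(round(saved_kwh))                           # ~768,000 kWh
print(round(saved_kwh * 0.12))                    # ~$92,000/yr at an assumed $0.12/kWh
```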

Once you have determined the best UPS system for your needs, you need to determine how to safely and effectively carry out the upgrade so that you either avoid downtime entirely or any downtime is anticipated and well planned for.  Schneider Electric describes how to successfully transition to a new UPS system, “Replacing an older UPS system with a new one may be more complex and time consuming than upgrading especially if the UPS to be upgraded is already modular in design. Careful planning and execution are required in order to minimize UPS downtime during the swap. Some vendors offer a service to do this work as a turnkey project. If the owner’s operations team does not have the availability or expertise, ask the UPS OEM vendor if they can perform every task associated with this effort including remove/dispose the old system, install the new, startup and commission the system, as well as transition an existing service contract (if one exists) all under one order…At the electrical input of the UPS, verification that the feeder breakers and conductors powering the UPS will support a specific replacement UPS is essential…Verification at a minimum includes: visual inspection of breakers and conductors, confirmation of breaker maintenance, as well as a review of electrical system studies (load flow, short circuit analysis, protection coordination, and arc flash) using electrical characteristics of the replacement UPS as a basis for the study. Operational interaction of the replacement UPS with standby generator(s) should also be included in this analysis.”

When it is time to upgrade your data center’s UPS system, preparation is the name of the game.  You will want to assemble a diverse team including the data center manager, facilities manager, electrical contractor, and other relevant engineers, so that you have representatives with varied knowledge bases who can provide critical assistance and information during the transition.  If you cannot tolerate any downtime whatsoever, you will need a temporary UPS while you transition to the new one.  If a short outage is acceptable, you can plan exactly when it will occur and prepare appropriately: install as many components of the new UPS system as possible in advance, leaving only the last steps of the installation for the outage window.  Whatever is best for your data center, plan for a smooth transition, anticipate anything that could arise, and arrange contingency plans just in case.

If your data center’s UPS is outdated, inefficient, or incapable of managing your data center’s power needs, it is a clear sign that it is time to act.  The longer you wait, the more money you waste on inefficiency and the higher the chance your data center will experience downtime due to a power interruption.  By assessing current needs and anticipating future scaling requirements, you can choose the right modern UPS system for your unique data center.  Making the transition to a new UPS can dramatically improve efficiency, and the investment will more than pay for itself over time.

 

 


Hyperconvergence Solutions in the Data Center

Hyperconvergence is not just a buzzword; it is the future of data center operations.  First there were converged data centers, and now there are hyperconverged data center infrastructures that address the silos that previously posed a significant problem.  In a hyperconverged data center, storage, compute, and network components are optimized for better collaboration that bypasses silos, and scaling infrastructure as needed is far easier to achieve.  For this reason, data centers with their eyes on future needs and scalability must be focused on hyperconvergence.

 

What is Hyperconvergence?

Because technology and needs are constantly changing, data centers traditionally have a mixture of components and infrastructure.  Though each component of infrastructure served its purpose, the components did not communicate well with each other, silos were created, and data center operations became more complex and overwrought than necessary.  BizTech expands on how hyperconvergence is changing data centers for the better, “Gartner reports that by 2018 hyperconverged integrated systems will represent as much as 35 percent of total converged infrastructure shipments by revenue, up from a low-single-digit base in 2015…Hyperconverged solutions are clusters that typically leverage commodity components to support software-defined environments and related functions,” King observes. “Hyperconvergence delivers a radical simplification of the IT infrastructure,” says Jeff Ready, chief executive officer of Scale Computing, a virtualization and convergence provider. The approach consolidates all required functionality into a single infrastructure stack running on an efficient, elastic pool of processing resources, leading to a data center that is largely software-defined with tightly-integrated computing, storage, networking and virtualization resources. Hyperconvergence stands in contrast to a traditional converged infrastructure, where each of these resources is typically handled by a discrete component that serves only a single purpose. “Hyperconvergence takes the headache out of managing the infrastructure so that IT can focus on running the applications,” Ready says.”

For data center managers, hyperconverged infrastructure may sound like a major change, and maybe even overkill, but with the majority of data centers making the switch, it is becoming the industry standard.  And once data center managers implement hyperconverged infrastructure, they quickly realize not just that their resistance was futile but that they should have made the switch sooner.  Once hyperconvergence is integrated, configuration and deployment actually reduce headaches and streamline operations.

 

Advantages of Hyperconvergence

No data center would make the change to a hyperconverged infrastructure if it were without advantages.  Fortunately, the advantages are many!  Though the upfront cost may seem significant, the investment will more than pay for itself.  One of the most significant advantages is scalability.  Data centers’ technology and storage needs are constantly changing and growing, and many data centers are scrambling just to avoid “outgrowing” their current facility and infrastructure.  With traditional infrastructure, scaling is very, very difficult: each component or piece of infrastructure added may serve as a temporary Band-Aid, but make no mistake, it is not a permanent solution.  Hyperconverged infrastructure makes it far easier to scale.  EdTech underscores just how beneficial easy scaling of storage and infrastructure is, “For many institutions, a chief benefit of hyperconverged solutions is the high degree of scalability they offer, a benefit that has led some tech observers to compare hyperconvergence with the public cloud. “We can add nodes to the initial cluster and leverage all of those resources across the cluster,” Ferguson says. “I can add another node and add more storage. Or, if I really want a compute-focused piece, I can add a very compute-centric node. You can mix and match all of those.” Western Washington University deployed a Nutanix hyperconverged solution in 2015. Jon Junell, assistant director of enterprise infrastructure services at WWU, agrees that the ability to easily add capacity is a valuable time-saver. “A few clicks, and it sees the other nodes, and away you go,” he says. ‘The management gets out of the way, so you can go back to doing the true value-add work.’”

In addition to scalability, another advantage is having everything under one umbrella.  Because everything is in “one box,” interoperability is a breeze.  Without hyperconverged infrastructure, components have a harder time working together, which dramatically increases the potential that something will break down or go wrong.  With hyperconverged infrastructure, this is not the case.  Having everything streamlined reduces operating costs and dramatically improves data center management through ease of operation.

Further, a major advantage of hyperconvergence is how easy it is to deploy.  As mentioned, many data center managers may bristle at implementing something so significant and new, but deployment really is outstandingly easy.  Rather than trying to connect and communicate with multiple subsystems, there are hyperconverged options that are literally “plug and play”: they come in a single box and can be deployed by plugging everything in and going.  And the plug and play doesn’t stop there – adding storage is as simple as plugging more in.  The time and money it would typically take to deploy or expand infrastructure is dramatically decreased and significantly streamlined by hyperconvergence.

 

Different Hyperconvergence Solutions

Hyperconvergence is being deployed in different ways depending on data center size and specific needs.  Different hyperconvergence solutions exist to meet those needs, and each can be used in a variety of ways and scaled as required.  Data Center Knowledge takes a closer look at how data centers are using hyperconverged infrastructure, “Today, it’s used primarily to deploy general-purpose workloads, virtual desktop infrastructure, analytics (Hadoop clusters for example), and for remote or branch office workloads. In fewer cases, companies use it to run mission critical applications, server virtualization, or high-performance storage. In yet fewer instances, hyperconverged infrastructure underlies private or hybrid cloud or those agile environments that support rapid software-release cycles.”  Though this may be the case for now, industry experts believe it will change as more and more data centers adopt hyperconverged infrastructure.

Though there are similarities between converged and hyperconverged data centers, there are some important differences.  While both converge critical resources and allow data centers to increase density, they are managed in very different ways.  Hyperconverged infrastructure brings one difference that is a significant game changer: everything is managed virtually, which makes cloud setup easy and reduces system complexity for more streamlined, efficient management of operations.  No hyperconvergence solution is necessarily “better” than another, but certain solutions are better suited to your specific data center.  There are a variety of data center hyperconvergence solutions, and Data Center Knowledge emphasizes the importance of choosing carefully for your unique needs, “Still, make sure you do your research and know which hyperconverged infrastructure technology you’re deploying. Each hyperconverged infrastructure system is unique and has its benefits and drawbacks. For example, make sure your hyperconverged infrastructure system doesn’t only work with one hypervisor. A converged system that only supports a single hypervisor, such as VMware vSphere, unnecessarily locks an organization into a single vendor and associated policies, including licensing fees and migration complexities. Similarly, if you deploy a hyperconverged infrastructure solution which comes with in-line deduplication and compression enabled (and you can’t turn it off), you need to make sure the workloads you host on that infrastructure can work well in that kind of environment.”

Hyperconvergence is not just a compelling notion; by all industry expert predictions, it is the future of data centers.  A hyperconverged infrastructure is not just beneficial for massive data centers like Yahoo’s or Apple’s – it works for data centers of all sizes.  Not only does hyperconvergence change how data centers can scale and streamline operations management, it also increases profit and return on investment by increasing capabilities and decreasing waste.  For small to mid-size organizations, it is a cost-effective and efficient way to leverage IT investments and get the most bang for your buck in the long term.  The growing size and volume of data, and the ways it is used, make clear that easy scalability is not a luxury but a necessity.  Hyperconvergence will make growing and changing data needs manageable and improve overall data center operations going forward.

 

 


Server Room Fire Suppression Best Practices

Data centers must delicately balance the need for infrastructure and equipment that runs all day and maximizes uptime with the need to manage the heat and fire risk associated with electronic equipment.  This is particularly true in server rooms.  Server rooms are the heart of a data center, the hub of information, and if a server room experiences a disaster of any kind, it typically leads to downtime.  Proper air conditioning is not enough; server rooms must also have appropriate fire suppression measures in place to reduce the risk of damage, injury, and downtime.  Of the many threats to data center operations, fire is perhaps the most significant: most other threats are likely to result only in moments of downtime, while fire can cause permanent damage to equipment, injury to personnel, and prolonged downtime as a result.  When it comes to fire suppression in data centers, neglecting to implement it is simply unacceptable – a true recipe for disaster.

The statistics surrounding fires in data centers may not sound all that scary at first glance – according to NetworksAsia, only about 6% of infrastructure failures are caused by fires.  That kind of statistic may make you feel comfortable, like you do not really need to worry much about the risk of fire since you take appropriate precautions to ensure that your server rooms are cooled correctly.  But, make no mistake, the risk is real and if it happens to you, it may not just lead to downtime, but to your data center closing its doors.  Data Center Knowledge provides a wakeup call to all data centers about the very real risk of data center fires, “A small data center in Green Bay, Wisconsin was wiped out by a fire earlier this month, leaving a number of local business web sites offline. The March 19 fire destroyed 75 servers, routers and switches in the data center at Camera Corner/Connecting Point, a Green Bay business offering IT services and web site hosting…But it took 10 days to get customer web sites back online, indicating the company had no live backup plan…While the company discussed the usefulness of its fire alarms, it didn’t address whether the data center had a fire suppression system. But it doesn’t sound like it. The Green Bay Press Gazette describes “racks of blackened, melted plastic and steel.” We’ve previously looked at data center fire suppression tools and how they have evolved with the industry’s recent focus on environmental considerations.”


Image of Server Room Fire via: Bangkok Post

Fire prevention and fire suppression should be part of any data center disaster recovery plan.  It is important to consider what types of fire your data center is most at risk of, as well as the size of your data center, to determine the appropriate fire suppression system.  Your data center’s form of backup and the specific strategies in your disaster recovery plan will heavily influence the type of fire suppression system you use.  If you have a minimal, “bare bones” disaster recovery plan, you may want the most elaborate and effective fire suppression system, because you need it to work as effectively and quickly as possible.  If you have a comprehensive disaster recovery plan and robust backup/redundancy, uptime is less dependent on your fire suppression system.  But in the end, every single server room must have a fire suppression system more effective and comprehensive than “calling 911.”

To understand fire suppression needs and make an informed decision when choosing a fire suppression method, it is important that you understand what types of fires can occur in a server room or data center.  TechTarget explains the types of fires data centers are at risk of:

“In North America, there are five fire classes:

  • Class A: Fire with combustible materials as its fuel source, such as wood, cloth, paper, rubber and many plastics
  • Class B: Fire in flammable liquids, oils, greases, tars, oil-base paints, lacquers and flammable gases
  • Class C: Fire that involves electrical equipment
  • Class D: Fire with ignitable metals as its fuel source
  • Class K: Fire with cooking materials such as oil and fat at its fuel source

No matter where your data center is located, fire can be considered a potential disaster. Data center environments are typically at risk to Class A, B or C fires.”

There are two primary types of fire suppression systems: water sprinklers and gaseous agent solutions.  Water sprinklers are the traditional and most common type.  They are particularly popular because they are low cost, may already exist in the server room, and are effective; once activated, they continue to expel water until shut off.  The main problem with water sprinklers is that they can cause significant damage to equipment.  With the goal of remaining operational and maximizing uptime while preventing catastrophic fire, dramatic water damage could still lead to downtime.  Additionally, water sprinklers can be activated accidentally, causing unnecessary damage.  And while sprinkler systems are inexpensive, the water damage they cause is not.

For this reason, many data centers and server rooms implement pre-action water sprinklers.  Pre-action sprinklers work in a similar way but take extra steps to prevent accidental activation and the ensuing damage.  In a traditional sprinkler system, water sits in the pipes right at the nozzle, awaiting activation; in a pre-action system, the pipes are kept dry until the system triggers.  Pre-action systems also require two events/alarms to activate, rather than one, significantly reducing the risk of accidental discharge.  The upside is that it is still a low-cost system, and traditional water sprinklers can be converted to pre-action systems.
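
A minimal sketch of that two-trigger idea (the event names are illustrative, not from any particular controller):

```python
# Pre-action logic in miniature: water is released only when two independent
# events agree, never on a single trigger. Event names are illustrative.

def preaction_release(detector_alarm: bool, sprinkler_head_open: bool) -> bool:
    """Pre-action: BOTH the detection system and a fused sprinkler head
    must activate before the pipes are charged and water flows."""
    return detector_alarm and sprinkler_head_open

print(preaction_release(True, False))  # False -- smoke alone won't flood the room
print(preaction_release(True, True))   # True  -- confirmed fire, release water
```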

Gaseous agent fire suppression solutions are a newer technology and are effective against a wider, more significant range of fires.  Gaseous agents are delivered in a similar fashion to water sprinklers – the agent is stored in a tank, piped to overhead nozzles, and administered when activated.  This is the preferred method of fire suppression for server rooms because it is more effective when electrical equipment is involved.  The Data Center Journal describes exactly how gaseous agent fire suppressant systems work, “The Inert Gas Fire Suppression System (IGFSS) is comprised of Argon (Ar) or Nitrogen (N) gas or a blend of those gases. Argon is an inert gas, and nitrogen is also unreactive. These gases present no danger to electronics, hardware or human occupants. The systems extinguish a fire by quickly flooding the area to be protected and effectively diluting the oxygen level to about 13–15%. Combustion requires at least 16% oxygen. The reduced oxygen level is still sufficient for personnel to function and safely evacuate the area. Since their debut in the mid 1990s, these systems have proven to be safe for information technology equipment application.”  In essence, they suppress fires while minimizing risk to electronic equipment.  One caveat: Halon, an older gaseous agent, is no longer in production because it poses health and environmental dangers, but Halon replacement agents are available that work in a similar fashion without those risks.  Though more effective than sprinklers for certain types of fires, and though they carry less risk of damage, gaseous systems are more expensive and cannot run continuously until shut off – they run only as long as agent remains, and once the tank is empty, fire suppression stops.
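
To get a feel for how much agent that dilution implies, here is a hedged sketch using a simple free-efflux mixing model; real designs follow NFPA 2001 and the manufacturer's flow calculations, not this approximation:

```python
# Hedged estimate of the inert-gas volume needed to dilute a room's oxygen
# from ~20.9% down to a target level, using the simple free-efflux mixing
# model O2_final = 20.9 * exp(-V_agent / V_room). Illustrative only -- real
# designs follow NFPA 2001 and vendor calculations.

import math

def agent_volume_needed(room_m3: float, target_o2_pct: float = 14.0) -> float:
    return room_m3 * math.log(20.9 / target_o2_pct)

# Example: a 500 m^3 server room diluted to 14% O2 -- below the ~16% needed
# for combustion (per the quote above) yet survivable for evacuation
print(round(agent_volume_needed(500)))   # ~200 m^3 of inert agent
```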

Server rooms pose the most significant fire risk in a data center because they typically have the highest concentration of electricity and contain combustible materials.  It is absolutely imperative that, should a fire be sensed, fire suppression begins immediately and alarms sound, alerting personnel that it is time to evacuate and take disaster recovery action.  A server room is, at its core, the heart of a company’s information structure.  If the server room experiences a fire, downtime is highly likely.  But if suppression methods are activated effectively and efficiently, downtime and damage may be avoidable.


Flywheel vs. Battery UPS

Every data center utilizes a UPS – uninterruptible power supply – to ensure that power is always available, even if there is a power interruption.  Minimizing downtime while maximizing energy efficiency is a primary goal of any data center or enterprise, which is why choosing the right UPS is so important.  The UPS begins supplying power immediately upon sensing that the primary power source has stopped functioning.  This maximizes uptime, which helps prevent frustration and financial loss, as well as the loss of data.  The UPS stores power and sits in waiting until it is needed, but it requires maintenance and testing to ensure it is ready when called upon.  There are two primary types of UPS – flywheel and battery – and there are pros and cons to each that a data center must carefully weigh.

A flywheel UPS (sometimes referred to as a “rotary” UPS) is an older type of UPS but is still a viable option for modern data centers.  Flywheel and battery UPS systems provide the same essential function, but the way that function is achieved – the way energy is stored – differs.  A flywheel stores kinetic energy in a spinning rotor that waits until it is needed, packing a large energy density into a small package.
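
The physics behind that kinetic storage is simple: the rotor holds E = ½Iω², and runtime at a given load is E/P. The sketch below uses entirely hypothetical rotor numbers just to show the scale involved:

```python
# Illustrative physics only -- the rotor inertia, speed, and load below are
# hypothetical, not specifications of any real flywheel UPS.

import math

inertia = 15.0                        # kg*m^2 -- hypothetical rotor
rpm = 7_700
omega = rpm * 2 * math.pi / 60        # angular speed in rad/s

energy_j = 0.5 * inertia * omega**2   # stored kinetic energy, joules
load_w = 250_000                      # 250 kW protected load

print(round(energy_j / 1e6, 1), "MJ")                       # ~4.9 MJ stored
print(round(energy_j / load_w), "seconds of ride-through")  # ~20 s at full draw
# Real units extract only part of this (they cannot spin down to zero),
# which is why seconds-scale ride-through figures are typical.
```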

Flywheel UPS systems tend to be significantly smaller than battery UPS systems, which can be an advantage when data center square footage is at a premium.  Further, flywheel systems are easier to house – they do not need as much ventilation, require less maintenance, and do not need special disposal arrangements at the end of their lifespan.  A flywheel UPS can literally last decades with a minimal amount of maintenance, a stark contrast to battery UPS systems.

One of the most significant drawbacks of a flywheel UPS system is its power output capacity compared with battery UPS systems.  TechTarget explains this key difference, “The UPS reserve energy source must support the UPS output load, while UPS input power is unavailable or substandard. This situation normally occurs after the electrical utility has failed and before the standby power system is online. As you determine whether flywheels are appropriate for a project, the amount of time that the reserve energy must supply the UPS output is key. For comparable installed cost, a flywheel will provide about 15 seconds of reserve energy at full UPS output load, while a storage battery will provide at least 10 minutes. Given 15 seconds of flywheel reserve energy, the UPS capacity must be limited to what one standby generator can supply.”  Though flywheels cannot deliver power for as long as battery UPS systems can, multiple flywheels can be installed in parallel so that they all supply backup power when needed.

Something important to consider is the type of data center.  If your data center is part of a larger network of data centers, then when a power failure occurs, another data center could take over the data load and support yours for a short time until you are back online.  Many data centers are employing this network structure as a better means of maximizing uptime and efficiency.  In that case, something like a flywheel UPS system may be ideal because you do not need a prolonged power supply in an emergency – a shorter UPS runtime is all that is needed.  But make no mistake: many data center managers still want the maximum runtime possible when it comes to UPS capacity.  Further, some data centers are opting for a hybrid UPS system that employs both battery and flywheel.  While the initial investment in a hybrid UPS system may be greater, it should pay for itself within a few years.

Another important consideration is energy efficiency, since many data centers are trying to become more “green.”  Though flywheel UPS systems are often thought of as the green option, Schneider Electric points out that this common assumption may be incorrect, “The results may come as a surprise to many. In almost all cases, VRLA batteries had a lower overall carbon footprint, primarily because the energy consumed to operate the flywheel over its lifetime is greater than that of the equivalent VRLA battery solution, and the carbon emissions from this energy outweighs any carbon emissions savings in raw materials or cooling. Of course, the tool lets users conduct their own comparison to see for themselves. This analysis and tool are a good reminder that decisions around energy storage needs to factor in a number of variables.”  It is more apparent than ever that every data center must evaluate its unique, individual needs, as well as its energy and uptime goals, when choosing which type of UPS system is best.

A battery UPS system supplies electrical power through a chemical reaction within the battery, unlike a flywheel system that uses kinetic energy.  Battery UPS systems are often favored by data centers because they can provide a much longer supply of power than a flywheel UPS.  The exact runtime depends heavily on the battery’s age, how well it has been maintained, and other factors, but for reference, a battery UPS may be able to provide 5+ minutes of power (and sometimes much more), versus a flywheel UPS that may provide less than a minute of backup power.

Though a battery UPS provides a longer power supply when it is needed, it is not without its drawbacks.  UPS batteries must be routinely maintained – visual inspection, ensuring adequate cooling and ventilation, cleaning, and more – to ensure they will work properly in the event that they are needed.  Additionally, UPS batteries have a shorter lifespan than flywheel UPS systems, because the chemicals within the batteries diminish over time and ultimately lead to battery failure.  For this reason, UPS batteries must be not just routinely maintained but frequently checked to ensure that they are still working and capable of supplying power.

Further, a UPS battery has a limited number of discharge cycles.  Though it can recharge, frequent discharging and recharging will diminish its expected capacity and lifespan over time.  For flywheel UPS systems, this is not a problem (flywheels can only discharge a limited number of times within a short time frame, but many discharges spread over a long period are not problematic).  Additionally, UPS batteries contain hazardous materials that must be safely and correctly disposed of when no longer needed – a special disposal requirement that flywheel UPS systems do not share.

As we discussed earlier, because there are advantages and drawbacks to both flywheel and battery UPS systems, many data centers are opting for a hybrid approach.  Data Center Knowledge explains the advantages of having a hybrid system that employs the use of both flywheel and battery power, “According to Kiehn, while the general trend is toward lower-cost systems with shorter runtimes, the size of the market that still wants 5 minutes or more shouldn’t be underestimated. “A lot of customers are still asking for 5 minutes,” he said. They include colocation providers, financial services companies, as well as some enterprises…There are also reliability and TCO benefits to having both flywheel and batteries in the data center power backup chain. When utility power drops, the flywheel will react first and in most cases will never transfer the load to batteries, since the flywheel’s runtime is enough for a typical generator set to kick into gear, Anderson Hungria, senior UPS product manager at Active Power, explained. Because the batteries are rarely used, initial and replacement battery costs are lower. Theoretically, it may also extend the life of the battery, but the vendor has not yet tested for that. As two alternative energy storage solutions, the flywheel and the batteries act as backup for each other, making the overall system more reliable.”
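
A hedged sketch of the sequencing that quote describes – the flywheel rides through first, and batteries are touched only if the generator is late (the timings are hypothetical):

```python
# Illustrative hybrid sequencing: flywheel first, generator next, batteries
# only as a fallback. Runtimes and timings are hypothetical examples.

def backup_source(seconds_since_outage: float,
                  flywheel_runtime_s: float = 15.0,
                  generator_online: bool = False) -> str:
    if generator_online:
        return "generator"
    if seconds_since_outage <= flywheel_runtime_s:
        return "flywheel"   # absorbs the outage; batteries stay untouched
    return "battery"        # fallback only if the genset is late

print(backup_source(5))                          # flywheel
print(backup_source(12, generator_online=True))  # generator has taken over
print(backup_source(30))                         # battery (the rare case)
```

Because the batteries are reached only in that rare third case, they cycle far less often, which is the source of the lower replacement cost and (potentially) longer battery life the quote mentions.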

In the technology world, the “old” ways of doing things tend to give way quickly to the latest and greatest advancements.  But flywheel UPS systems are getting a new life, particularly as part of hybrid UPS systems.  Flywheels do not beat UPS batteries on energy efficiency or length of power supply – but that does not mean they are not a viable option for many data centers.  Depending on unique data center needs, they should be considered both standalone and as part of a hybrid UPS system to ensure a better backup power supply that maximizes uptime and efficiency.

 


Proper Maintenance and Service of UPS System is Critical to Preventing Failure



There are few things more important to a data center than continuous power.  Without it, a data center will experience prolonged downtime, significant financial loss, a damaged reputation, and other harmful effects.  It is for this reason that data centers focus so much time and energy on power redundancy and on ensuring there is a properly functioning uninterruptible power supply (UPS).  A UPS sits waiting and, should it be needed due to a power failure, supplies the power necessary to keep data center infrastructure up and running.  UPS systems come in a variety of sizes to accommodate assorted power loads, and many data centers implement multiple UPS systems to protect against downtime.  A UPS must be prepared to function at a moment’s notice so that there is no significant loss of data.  The problem is that many data centers experience UPS failure, and the majority of the time a UPS fails, it is due to lack of proper maintenance and servicing.

A power failure can occur for a variety of reasons – power outage, power surge, power sag, and more.  Whatever causes the fluctuation or outage, even a few moments of downtime can bring severe costs.  Should any power fluctuation or outage occur, a UPS picks up right where the primary supply left off, preventing downtime, data loss, and damage to infrastructure.  A UPS is often thought of as a “dependable” power supply in case of emergency but, if it is not properly maintained and serviced, it may not be particularly dependable.

To be able to determine how to best maintain your data center UPS system, you must first understand why UPS systems fail.  Just like that 10-year-old battery in your junk drawer may not have much life left in it, UPS batteries diminish over time.  Even if you have not needed to use your UPS, the battery that powers it will lose capacity over time and not have as much life as originally intended.  UPS battery deterioration is often accelerated by the high temperatures inside data centers.  Fans occasionally fail because components such as ball bearings dry out, or simply wear out from continuous use.  Additionally, power surges, such as those caused by lightning or other transient spikes, can diminish a UPS battery.  Dust accumulation on UPS components can diminish UPS efficacy.  Further, the UPS battery discharge cycle (how many times the battery has been discharged and recharged) shortens the overall life of a UPS battery.  A typical 3-phase UPS has an average lifespan of 10 years, and without proper maintenance it could be much shorter.

If you think you are doing enough by occasionally checking your UPS battery, you may be leaving your data center exposed to an outage and downtime.  Government Technology explains just how many data centers are experiencing downtime due to UPS failure and preventable human errors, “Data center outages remain common and three major factors — uninterruptable power supply (UPS) battery failure, human error and exceeding UPS capacity — are the root causes, according to a new study released earlier this month. Study of Data Center Outages, released by the Ponemon Institute on Sept. 10, and sponsored by Emerson Network Power, revealed that 91 percent of respondents experienced an unplanned data center outage within the last 24 months, a slight dip from the 2010 survey results, when 95 percent of respondents had reported an outage…Fifty-five percent of the survey’s respondents claimed that UPS battery failure was the top root cause for data center outages, while 48 percent felt human error was the root cause.”  By correcting human error and properly maintaining your UPS system, you can dramatically decrease your data center’s risk of downtime.

To prevent UPS failure, it is imperative that you regularly maintain and service your UPS as part of your Data Center Infrastructure Management (DCIM) plan.  There are a few key components of proper UPS maintenance and service, but physical inspection is at the core.  If you are not physically checking on your UPS system on a regular basis, there is no way to know whether something visibly wrong or problematic could lead to a failure.  The best thing you can do is create a UPS maintenance and service checklist and keep a detailed log of all maintenance and service so that maintenance does not fall behind.  At a minimum, your checklist should include:

1.  Checking and testing the UPS battery to ensure it is working.

2.  Inspecting the UPS capacitors.

3.  Checking the ambient temperature around the UPS.

4.  Calibrating equipment.

5.  Performing any required service (checking air filters, cleaning and removing dust).

6.  Verifying load share and making any necessary adjustments.
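As a rough illustration of what such a checklist and log might look like in practice, the sketch below pairs each task with a service interval and flags anything overdue.  The task names, intervals and dates are hypothetical placeholders, not a recommended schedule; follow your vendor's guidance.

```python
# A minimal, hypothetical UPS maintenance log. Task names, intervals and
# dates are illustrative placeholders, not a recommended schedule.
from datetime import date, timedelta

CHECKLIST = {
    "test battery / check discharge":        timedelta(days=30),
    "inspect AC/DC filter capacitors":       timedelta(days=90),
    "check ambient temperature at the UPS":  timedelta(days=7),
    "clean or replace air filters":          timedelta(days=90),
    "verify load share and calibration":     timedelta(days=180),
}

last_done = {
    "test battery / check discharge":        date(2017, 2, 1),
    "inspect AC/DC filter capacitors":       date(2016, 11, 20),
    "check ambient temperature at the UPS":  date(2017, 2, 10),
    "clean or replace air filters":          date(2016, 9, 15),
    "verify load share and calibration":     date(2016, 8, 1),
}

def overdue(today: date) -> list[str]:
    """Return tasks whose interval has elapsed since they were last logged."""
    return [task for task, interval in CHECKLIST.items()
            if today - last_done[task] > interval]

for task in overdue(date(2017, 2, 15)):
    print("OVERDUE:", task)   # flags the filters and load-share checks
```

Keeping the log in a structured form like this is what makes "has maintenance fallen behind?" a question you can answer instantly rather than by digging through paperwork.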

If UPS battery failure is one of the most common causes of UPS failure and thus downtime, it is only logical that the battery should be one of the most important parts of your UPS maintenance checklist.  Battery discharge should be routinely checked to ensure that capacity is not so diminished that the battery cannot handle the necessary power load in the event of a failure.  It is also important to visually inspect the area around the UPS and the battery itself for any obvious obstructions, dust collection or other things that may prevent adequate cooling.  If you see a warning that the battery is near discharge, perform the necessary maintenance.  Further, the AC input filter capacitors should be checked, along with the DC filter capacitors and AC output capacitors, for open fuses, swelling or leakage.  Next, you should visually inspect all components for any obvious problems: the major assemblies, wiring, circuit breakers, contacts, switchgear components, and more.  Should you see obvious damage, perform the necessary maintenance and service.

Next, because the energy output of data center infrastructure keeps ambient temperatures high, it is important to check the temperature around the UPS system; excessive heat diminishes battery capacity.  Schneider Electric explains best practices for maintaining ambient temperature around a UPS for maximum battery life, “It is recommended that the UPS be installed in a temperature controlled environment similar to the intended application.  The UPS should not be placed near open windows or areas that contain high amounts of moisture; and the environment should be free of excessive dust and corrosive fumes.  Do not operate the UPS where the temperature and humidity are outside the specified limits.  The ventilation openings at the front, side or rear of the unit must not be blocked… All batteries have a rated capacity which is determined based on specified conditions.  The rated capacity of a UPS battery is based on an ambient temperature of 25°C (77°F).  Operating the UPS under these conditions will maximize the life of the UPS and result in optimal performance.  While a UPS will continue to operate in varying temperatures, it is important to note that this will likely result in diminishing the performance and lifespan of your battery.  A general rule to remember is that for every 8.3°C (15°F) above the ambient temperature of 25°C (77°F), the life of the battery will be reduced by 50 percent.  Therefore, keeping a UPS at a comfortable temperature is crucial to maximizing UPS life and capabilities.”
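That rule of thumb translates directly into a simple derating formula: expected life halves for every 8.3°C above 25°C.  The short sketch below applies it, assuming a hypothetical nominal battery life, which in reality varies by battery type and vendor.

```python
# Battery life derating per the quoted rule of thumb: life halves for
# every 8.3 C above a 25 C ambient. The nominal life is an assumption.

NOMINAL_LIFE_YEARS = 5.0   # hypothetical rated battery life at 25 C

def expected_life_years(ambient_c: float) -> float:
    excess = max(0.0, ambient_c - 25.0)
    return NOMINAL_LIFE_YEARS * 0.5 ** (excess / 8.3)

for temp in (25, 33.3, 41.6):
    print(f"{temp:>5} C -> {expected_life_years(temp):.2f} years")
# 25 C -> 5.00 years, 33.3 C -> 2.50 years, 41.6 C -> 1.25 years
```

The exponential shape is the point: a UPS room that runs just one "step" too warm quietly cuts the battery budget in half.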

Visual inspection should include dust and dirt removal on the UPS system.  A UPS will sit and accumulate dust over time, and because dust can interfere with proper heat transfer, it should be promptly removed to ensure the UPS will function properly when needed.  Further, check all air filters for dust accumulation; clogged filters can lead to inefficiency and even overheating.  Clean and replace filters as needed.  Capacitors are also an integral component of UPS systems: they aid in the transition of power in the event of an outage, so if they fail, the UPS will likely fail.  Capacitors dry out from wear and tear, so they need to be routinely checked and replaced every few years to ensure proper UPS function.

Though much of the suggested UPS maintenance and service strategy may sound basic, even obvious, the fact of the matter is that UPS failure remains a primary source of data center downtime.  And when you couple that with human error, it is easy to see that many data centers simply are not properly maintaining their UPS systems.  Not all of these tasks need to be completed every day or even every week; certain tasks can be performed weekly while others can be done monthly, quarterly, semi-annually or annually.  By breaking the work up, you ensure that your UPS system is checked frequently and routinely while making maintenance a far more achievable task.  Additionally, by maintaining a detailed log, you can see whether UPS maintenance and service has fallen behind and immediately address any concerns.  When data center technicians routinely check the UPS system, they become familiar with what looks normal and what looks concerning, so anything problematic can be addressed and remedied immediately, giving you peace of mind that your UPS will be there when you need it to prevent costly downtime.


The Convergence of IT & OT in Data Centers

Though IT and OT are two different things, the old tendency to “divide and conquer” when it came to strategy, management and solutions is going away.  The worlds of IT and OT are colliding inside data centers: operating as two separate entities without communication and collaboration is not effective, efficient or ideal.  Though not all data centers are operating with IT/OT convergence, the transition has begun – it is already happening in healthcare, energy, aviation, manufacturing, transportation, defense, mining, oil and gas, utilities, natural resources and more – and it is only a matter of time until convergence is simply the data center industry standard.

OT (operational technology) has a few primary focuses: maximizing uptime, ensuring the proper function of equipment and infrastructure, and securing the availability of operational assets and processes.  OT blends hardware and software to monitor and maintain the physical environment.  Though some are not even familiar with the name “OT,” it is essential to the day-to-day operations of a data center.  The convergence of IT and OT is happening because the technology involved in operational technology (such as communications, software and security) is evolving, and OT is integrating more information technology (IT) into its operations.

IT (information technology) focuses on the use and integrity of data and intellectual property.  Its emphasis is on things like storage, networking devices, computers, and the infrastructure that facilitates information storage and security.  In contrast to OT, IT’s security focus is the protection and preservation of confidential information.  Though they are two different disciplines, they are not mutually exclusive, and what data centers are finding is that there is more than just overlap – a convergence is happening.  Schneider Electric elaborates on why the IT and OT worlds are colliding, “Security systems are needed to protect facilities. IT is needed to run security systems. Apply a bit of basic math theory to these statements, and it is easy to conclude that IT is then needed to protect facilities. If you are thinking this sounds like OT and IT convergence, you’re right; but security requirements push the boundaries even further to compel departmental collaboration between OT and IT. At the core, lies the need for reliable delivery of clean and continuous power.”

To maintain uptime and maximize security, IT and OT must work together.  Think about the factors that could lead to downtime or a security breach – problems with infrastructure management, equipment overheating, fire, flood, problems with lighting or the security system, a physical breach of security, a cyber-attack, and more.  Many of these fall under the OT umbrella and some under the IT umbrella but, in reality, managing and mitigating them involves both IT and OT.  To properly manage a data center remotely and maintain RTOI (real-time operational intelligence), a proper DCIM strategy must be in place and IT must be able to communicate with monitoring systems so that accurate information is received.  As we have previously discussed, when this information is received in real time, downtime can be significantly reduced.  TechTarget elaborates on why IT and OT are converging in the way that they are now, and how it will improve efficiency and maximize data center operations, “While IT inherently covers communications as a part of its information scope, OT has not traditionally been networked technology. Many devices for monitoring or adjustment were not computerized and those with compute resources generally used closed, proprietary protocols and programmable logic controllers (PLC) rather than technologies that afford full computer control. The systems involved often relied on air gapping for security. Increasingly, sensors and connected systems like wireless sensor and actuator networks (WSANs) are being integrated into the management of industrial environments, such as those for water treatment, electric power and factories. The integration of automation, communications and networking in industrial environments is an integral part of the growing Internet of Things (IOT). IT/OT convergence enables more direct control and more complete monitoring, with easier analysis of data from these complex systems from anywhere in the world.”
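One concrete expression of that convergence is OT sensor data flowing into IT monitoring systems in a shared, structured format.  The sketch below is a minimal illustration of the idea; the field names and thresholds are invented, and a real DCIM pipeline would be far richer.

```python
# Hypothetical shape of an OT sensor reading normalized for IT-side
# monitoring. Field names and thresholds are illustrative only.
from dataclasses import dataclass
import json
import time

@dataclass
class SensorReading:
    sensor_id: str     # e.g., a rack inlet temperature probe
    metric: str        # "temperature_c", "humidity_pct", "kw_load", ...
    value: float
    timestamp: float

THRESHOLDS = {"temperature_c": 27.0, "humidity_pct": 60.0}  # assumed limits

def to_event(reading: SensorReading) -> str:
    """Serialize a reading as JSON so IT systems can consume OT data."""
    alert = reading.value > THRESHOLDS.get(reading.metric, float("inf"))
    return json.dumps({**reading.__dict__, "alert": alert})

r = SensorReading("rack12-inlet", "temperature_c", 29.4, time.time())
print(to_event(r))  # {"sensor_id": "rack12-inlet", ..., "alert": true}
```

Once facility readings arrive as ordinary structured events, the IT side can apply the same alerting, dashboards and analytics it already uses for servers, which is the practical payoff of convergence.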

When you integrate infrastructure management systems, your data center information will flow between departments with ease.  Data from IT can and should be an indispensable tool in providing the information OT needs to formulate strategy and make decisions.  The result is increased productivity, improved efficiency, decreased downtime, and enhanced security.  With integration, information about what your data center needs will be timely and accurate, making effective maintenance far easier.  Your RTOI will be accurate so that, should you need to make a quick adjustment – whether large or small – you will ideally know before you experience any problems or catastrophic events.

So it seems like a simple solution, right?  Given the advantages of working together, surely any data center would jump all over it?  Though IT/OT convergence is certainly the future of data centers, bringing the two together is not necessarily an easy task.  GE elaborates on the challenges of IT/OT convergence, “Many cultural and technological impediments make IT/OT convergence challenging. From the perspective of culture, IT and OT have traditionally been well-separated domains. When smart assets and infrastructure are introduced, it becomes necessary to figure out new ways to divide ownership and responsibility for the management and maintenance of that infrastructure. This can potentially lead to turf wars and blame games. On top of that, OT standards have generally been proprietary and vendor specific, optimized exclusively for specialized tasks. Unifying IT and OT requires implementing well-defined standards that scale all the way from assets to data centers and back. These standards also need to account for enhanced security, since operational assets that were previously disconnected from widespread communication networks could now be vulnerable. It’s all about the enterprise. All that daunting work can be made easier, however, by the concept of “enterprise architecture.” Enterprise architecture is a top-down methodology for developing architecture by focusing first on organizational goals, strategy, vision, and business before delving into the technological specifics. This approach could keep IT/OT deployment aligned with achieving Industrial Internet goals. Going through the process of integrating IT and OT might require some initial effort, but the payoffs are worth it.”

With any change in data centers there are growing pains: logistical intricacies to fine-tune, security challenges, and more.  There will always be a list of challenges in implementing change.  But the convergence of information technology and operational technology is a value-added change.  The specific value will vary among industries but, make no mistake, convergence will have a payoff.  Though there will be challenges in converging IT and OT, success is very achievable with thorough planning, proper execution and full implementation of an IT/OT strategy.  All data center team members must be fully educated and on board to be properly prepared for the change.  Make no mistake: IT and OT are not the same.  Though they are converging, they remain distinct disciplines that must operate as joint structures.  If strategies can be harmonized and aligned, IT and OT convergence can be a stunning success.

By converging IT and OT, there will be shared technology, and this overlap will allow the two to work together synergistically.  This will be beneficial in a variety of ways, one of the most prominent being cost savings: not only because costly downtime will be reduced, but because IT and OT teams can, in some ways, be combined and redundant roles pruned for efficiency.  In addition, convergence will reduce risk because overlapping security issues can be addressed simultaneously.  And, perhaps most significantly, data centers will enjoy enhanced performance from IT/OT integration: bad redundancies (similar but separate operations that could sit under one umbrella) can be eliminated, while good redundancies (ways in which IT and OT can synergistically back each other up) can be strengthened.  Further, convergence will improve performance in the form of enhanced system availability – more uptime because of a reduced risk of things like cyber-attack, poor infrastructure management and power failure.  Through a collaborative effort, a focus on future technologies, a drive toward maximizing uptime and minimizing security risk, and a desire for improved efficiency, data centers will successfully achieve IT/OT convergence and step into the future of data centers.

Data Center RTOI

Technology is evolving minute by minute and data centers must work to keep up with the lightning-paced evolution.  We have discussed the Internet of Things (IOT) before – the world is becoming increasingly dependent on the internet and everyday processes are being digitized for efficiency and savings.  But as more of the world becomes digitized, technology advances and data grows, and that data must be effectively and efficiently stored.  Data centers invest in infrastructure, backup power, security and more so that they can adequately store that growing, evolving data, but when things move so quickly, constant monitoring is needed to ensure that data is stored not just properly but safely and efficiently.  Old methods of collecting and analyzing operational data are simply not practical anymore.  Analyzing what went wrong after the fact, or realizing something is about to go wrong when there is not enough time to fix it, is useless.  Ultimately, these traditional methods are responsible for a great deal of downtime in data centers.  Accurate, actionable information in real time is the only way data centers can effectively operate moving forward.

Data centers are notoriously energy-inefficient, but most today are making efforts to improve.  The undertaking is not simple or straightforward because every data center is different and has unique needs.  A data center cannot run at full capacity because, should demand change, it will be ill-equipped; at the same time, it should not run far beyond what is necessary, because that wastes energy.  More and more data center managers are realizing the need for Real Time Operational Intelligence (RTOI).  Having access to current, accurate information is the only way to make intelligent, informed decisions about how to best manage the infrastructure of a data center.  What does RTOI look like in practice?  TechTarget provides a brief explanation, “Real-time operational intelligence (RtOI) is an emerging discipline that allows businesses to intelligently transform vast amounts of operational data into actionable information that is accessible anywhere, anytime and across many devices, including tablets and smartphones. RtOI products turn immense amounts of raw data into simple, actionable knowledge. RtOI pulls together existing infrastructure to manage all of the data that is being pulled from a variety of sources, such as HMI/SCADA, Lab, Historian, MES, and other servers/systems. It then has the ability to organize and connect the data so it’s meaningful to the users. By integrating directly with existing systems and workflows, it can help assets perform better and help workers share more information.”  As more people, businesses and data centers utilize the cloud, and the cloud’s complexity continues to change, data management needs change and data centers struggle just to keep pace.

RTOI can greatly reduce waste and improve energy efficiency by helping identify what is in use and what is not, so that equipment can be turned off strategically for energy savings.  Just think of all the infrastructure in a data center that consumes power even though it is not mission critical or even in use.  For example, determining which servers are in use and which can be, at least temporarily, powered down will yield significant energy savings.

One of the most significant advantages of well-executed RTOI is immediate knowledge of potential threats and the ability to deal with them before they cause downtime.  As we have often discussed, downtime is incredibly costly (costing, on average, thousands of dollars per minute).  No data center wants to experience downtime but, unfortunately, the vast majority will face it at one point or another.  Data centers can significantly reduce their risk of downtime with current, accurate, actionable information about what is happening in the data center.  As we have seen, anticipation of problems can only go so far.  Data centers simply cannot properly manage what they do not see or have knowledge about.  That is where RTOI comes in.
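As a toy example of actionable information in real time, the sketch below compares each new reading against a rolling baseline so that a drift, such as a climbing UPS room temperature, raises a flag before it becomes an outage.  The window size, tolerance and sample values are invented for illustration.

```python
# Toy real-time drift detector: flag a reading that deviates from the
# rolling average of recent samples. The window size and tolerance are
# arbitrary illustrative choices.
from collections import deque

class DriftDetector:
    def __init__(self, window: int = 12, tolerance: float = 0.10):
        self.samples = deque(maxlen=window)
        self.tolerance = tolerance  # fraction above baseline that triggers

    def observe(self, value: float) -> bool:
        """Return True if value sits more than `tolerance` above baseline."""
        if len(self.samples) == self.samples.maxlen:
            baseline = sum(self.samples) / len(self.samples)
            if value > baseline * (1 + self.tolerance):
                return True
        self.samples.append(value)
        return False

detector = DriftDetector()
stream = [24.0] * 12 + [24.5, 25.0, 27.1]   # slow climb, then a jump
for temp in stream:
    if detector.observe(temp):
        print(f"early warning: {temp} C is well above recent baseline")
```

The point is not the specific math but the timing: the alert fires while there is still time to act, instead of in a post-mortem report.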

RTOI not only aggregates data but measures and tracks it and, if well executed, puts it in easy-to-understand terms and statistics so that you can use the information to make informed decisions and properly manage assets going forward.  RTOI can assist data centers in improving capacity planning, anticipating asset lifecycles and planning their management, maintaining and continuously meeting regulatory compliance, optimizing energy efficiency and more.

Planning for data center capacity is far easier at the building stage; once a data center has been built and is in operation, anticipating capacity needs, particularly as new technology drives big data storage, is very challenging.  In fact, it is one of the biggest challenges data centers face today.  Panduit explains why capacity management is such a challenge in data centers, “Proactive capacity management ensures optimal availability of four critical data center resources: rack space, power, cooling and network connectivity. All four of these must be in balance for the data center to function most efficiently in terms of operations, resources and associated costs. Putting in place a holistic capacity plan prior to building a data center is a best practice that goes far to ensure optimal operations. Unfortunately, once the data center is in operation, it is all too common for it to fall out of balance over time due to organic growth and ad hoc decisions on factors like power, cooling or network management, or equipment selection and placement. The result is inefficiency and in the worst-case scenario, data center downtime. For example, carrying out asset moves, adds and changes (MACs) without full insight into the impact of asset power consumption, heat dissipation and network connectivity changes can create an imbalance that can seriously compromise the data center’s overall resilience and, in turn, its stability and uptime…Leveraging real-time infrastructure data and analytics provided by DCIM software helps maximize capacity utilization (whether for a greenfield or existing data center) and reduce fragmentation, saving the cost of retrofitting a data center or building a new one. Automating data collection via sensors and instrumentation throughout the data center generates positive return on investment (ROI) when combined with DCIM software to yield insights for better decision making.”
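The four resources Panduit names, rack space, power, cooling and network connectivity, can be tracked as a simple headroom calculation.  The sketch below, with made-up capacity and usage figures, reports which resource will be the first constraint.

```python
# Hypothetical capacity-headroom check across the four resources that
# must stay in balance; all capacity and usage figures are invented.

capacity = {"rack_units": 4200, "power_kw": 900.0,
            "cooling_kw": 950.0, "network_ports": 9600}
in_use   = {"rack_units": 3650, "power_kw": 810.0,
            "cooling_kw": 780.0, "network_ports": 8400}

# Fraction of each resource still available.
free = {k: 1 - in_use[k] / capacity[k] for k in capacity}

for resource, frac in sorted(free.items(), key=lambda kv: kv[1]):
    print(f"{resource:14s} {frac:6.1%} free")
print("first constraint:", min(free, key=free.get))  # power_kw here
```

Fed with live DCIM data instead of static numbers, the same calculation becomes an early warning that one resource, often power or cooling, will run out long before the racks do.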

With accurate information in real time you can manage capacity needs and, at a moment's notice, add capacity so that there are no problems.  That kind of historical information is also useful for predicting the need for data center expansion.  For example, data centers often have orphan servers that sit doing nothing but collecting dust and consuming resources like cooling and power.  Without careful, accurate management, these orphan servers could sit like this for weeks, months or even years, wasting resources that could be better allocated.  With real-time statistics about exactly what is going on in your data center, you can find these orphan servers and clean them out, freeing up capacity for other infrastructure.  In fact, carefully managing your data center's capacity needs and more accurately anticipating future needs can save millions of dollars in the long run.
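Finding those orphan servers is, at its core, a filtering problem over utilization telemetry.  A minimal sketch follows; the utilization figures and cutoffs are invented, and a real implementation would pull this data from your DCIM or monitoring system.

```python
# Hypothetical orphan-server finder: servers with negligible CPU and
# network activity over an extended window are decommission candidates.
# The thresholds and sample data are illustrative assumptions.

servers = [
    {"name": "app-01",    "avg_cpu_pct": 42.0, "avg_net_mbps": 180.0, "idle_days": 0},
    {"name": "db-legacy", "avg_cpu_pct": 1.2,  "avg_net_mbps": 0.1,   "idle_days": 210},
    {"name": "batch-07",  "avg_cpu_pct": 3.9,  "avg_net_mbps": 0.4,   "idle_days": 45},
]

def orphan_candidates(fleet, cpu_max=5.0, net_max=1.0, min_idle_days=30):
    """Return names of servers that look idle enough to reclaim."""
    return [s["name"] for s in fleet
            if s["avg_cpu_pct"] < cpu_max
            and s["avg_net_mbps"] < net_max
            and s["idle_days"] >= min_idle_days]

print(orphan_candidates(servers))  # ['db-legacy', 'batch-07']
```

Candidates flagged this way should of course be verified with their owners before decommissioning; the value of RTOI is that the list is generated continuously instead of discovered years later.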

DCIM and RTOI go hand in hand.  Without a proper plan for data center infrastructure management, and sophisticated monitoring software, RTOI is not achievable.  DCIM tools are necessary to measure, monitor and manage data center operations, including energy consumption, all IT equipment and the facility infrastructure.  Fortunately, there are DCIM software products available that will track detailed information all the way down to the rack level so that monitoring is made easy, even remotely.  As mentioned, it is critical to leave behind old, archaic forms of DCIM; there is simply no way for them to keep up.  Data centers, regardless of size, must focus on real-time operational intelligence as a means of accuracy.  TechTarget explains why it is critical to focus on RTOI as a way of staying ahead of potential problems, “Taking a new big data approach to IT analytics can provide insights not readily achievable with traditional monitoring and management tools, Volk said…For example, particularly with cloud resources, it can be difficult to anticipate how applications and data movement will affect each other. Cloud Physics allows cross-checking of logs and other indicators in real time to achieve that. This new approach is “leading edge, not bleeding edge,” Volk said. Its value to an organization will depend on the maturity and complexity of a given data center. Small and medium-sized businesses and organizations without much complexity will benefit, he said, “but companies with large and heterogeneous data centers will benefit even more.”  RTOI helps data centers provide better service to their customers, minimize downtime, improve efficiency, protect their reputation and, ultimately, save money through vastly improved operations.


Data Center Business Continuity

Whether you operate a data center or any other business, business continuity is incredibly important.  We all think we are immune to disaster but the reality is that if you have not formed a business continuity plan, you are leaving your data center at severe risk.  Imagine what it would be like if a disaster struck (flood, fire, etc.) and you could not get into your data center for a few hours – problematic, right?  What if the disaster was really bad and you could not get in for a few days or weeks?  Huge problem.  Business cannot come to a screeching halt, so a strategy for maintaining business continuity is a must.  A strategically formed, well-thought-through business continuity plan should be part of any data center's disaster recovery program.  A disaster recovery plan is the big umbrella under which we will talk about business continuity because the two are inextricably related: disaster recovery focuses heavily on data recovery and management but, beyond maintaining and protecting data in the event of a disaster, a data center and the businesses it serves must be able to continue to meet their most basic objectives.  During a disaster a data center may experience downtime in which all business operations come to a halt.  This is not a small problem – downtime may cost as much as $7,900 per minute.  A disaster recovery plan, along with a business continuity plan, will help a data center reduce downtime in the event of a disaster as well as operate continuously to meet business objectives.

To formulate a business continuity plan we must first outline what makes one successful.  A data center's business continuity plan functions as a roadmap: if a disaster strikes, you should be able to find the type of disaster in your plan and then follow the “map” to the solution that restores your data center to business as usual.  First and foremost, a proper business continuity plan focuses on what can be done to prevent disasters so that business continuity is never interrupted in the first place.  Data centers must consider their unique needs because there is no such thing as a generic data center business continuity plan – it would never work.  Data centers must identify and assess all mission critical assets and risks.  Once they have been identified, it is far easier to formulate a business continuity plan with specific goals in mind.  You can prioritize your most problematic risks by focusing on the threat they pose to mission critical assets.  In considering individual needs, it is imperative that data centers determine which applications and processes are mission critical.  For example, can your mission critical systems be maintained remotely?  Additionally, in today's data center world where security is a top concern, maintaining data security should be an important part of your business continuity plan.
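A common way to do that prioritization, sketched below with invented risks, likelihoods and impact scores, is to score each identified risk by likelihood times impact on mission critical assets and work the list from the top.

```python
# Hypothetical likelihood-x-impact risk scoring for business continuity
# planning. The risks, probabilities and impact scores are illustrative.

risks = [
    {"risk": "regional flood",  "likelihood": 0.05, "impact": 10},
    {"risk": "utility outage",  "likelihood": 0.40, "impact": 8},
    {"risk": "cooling failure", "likelihood": 0.20, "impact": 9},
    {"risk": "security breach", "likelihood": 0.15, "impact": 9},
]

# Highest-scoring risks get plans, budget and drills first.
for r in sorted(risks, key=lambda r: r["likelihood"] * r["impact"],
                reverse=True):
    print(f'{r["risk"]:16s} score = {r["likelihood"] * r["impact"]:.2f}')
# utility outage first (3.20), then cooling failure (1.80), ...
```

Even a crude scoring exercise like this forces the specificity the plan needs: each high-scoring risk gets its own page in the roadmap rather than a generic "in case of disaster" section.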

Disaster prevention is a central part of your data center's business continuity plan.  Identifying business continuity goals and potential problem areas will help you lay out a proper disaster prevention plan.  Depending on your unique data center, certain measures may be beneficial, such as increased inspections of infrastructure, better surveillance, enhanced security in various areas (including grounds security and rack-based security), increased redundancy, and more.  Think in terms of real problems and real consequences; be specific so that you can make specific business continuity plans and strategies.

Some operators may want to relocate their data center if a disaster is large enough, but the logistics of this are far from simple.  Relocating safely, rapidly and securely is no easy task.  Beyond that, it is expensive, which is why many data centers – even large enterprise data centers – do not do it.  To do this properly as part of a business continuity plan, a detailed data center migration plan must accompany the business continuity plan.  Some enterprises may instead utilize regionally diverse data centers that mirror each other, but this is also expensive and exceptionally complex to implement – though it can be very effective at maintaining uptime, maximizing security and optimizing business continuity.

As mentioned, redundancy is an important part of maximizing uptime and maintaining business continuity in a data center.  As part of your data center's business continuity plan, you may want to implement server load balancing and link load balancing, two strategies that can help prevent the loss of data from an overload or outage.  Continuity Central Archive explains how these two strategies can be used in data centers, “Server load balancing ensures application availability, facilitates tighter application integration, and intelligently and adaptively load balances user traffic based on a suite of application metrics and health checks. It also load balances IPS/IDS devices and composite IP-based applications, and distributes HTTP(S) traffic based on headers and SSL certificate fields. The primary function of server load balancing is to provide availability for applications running within traditional data centers, public cloud infrastructure or a private cloud. Should a server or other networking device become over-utilized or cease to function properly, the server load balancer redistributes traffic to healthy systems based on IT-defined parameters to ensure a seamless experience for end users…Link load balancing addresses WAN reliability by directing traffic to the best performing links. Should one link become inaccessible due to a bottleneck or outage, the ADC takes that link out of service, automatically directing traffic to other functioning links. Where server load balancing provides availability and business continuity for applications and infrastructure running within the data center, link load balancing ensures uninterrupted connectivity from the data center to the Internet and telecommunications networks. Link load balancing may be used to send traffic over whichever link or links prove to be most cost-effective for a given time period. What’s more, link load balancing may be used to direct select user groups and applications to specific links to ensure bandwidth and availability for business critical functions.”
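At its simplest, the server load balancing behavior described above, routing traffic only to healthy systems, reduces to a health-checked rotation.  The sketch below shows the core idea; the server names and health results are stand-ins for real probes.

```python
# Minimal health-checked round-robin server load balancer. The server
# names and health-check results are illustrative stand-ins for probes.
from itertools import cycle

servers = ["web-01", "web-02", "web-03"]
healthy = {"web-01": True, "web-02": False, "web-03": True}  # probe results

rotation = cycle(servers)

def next_server() -> str:
    """Skip unhealthy servers, redistributing traffic to healthy ones."""
    for _ in range(len(servers)):
        candidate = next(rotation)
        if healthy[candidate]:
            return candidate
    raise RuntimeError("no healthy servers available")

for _ in range(4):
    print(next_server())   # web-01, web-03, web-01, web-03
```

Production load balancers layer on application metrics, session persistence and weighted algorithms, but the continuity benefit is already visible here: the failed server simply stops receiving traffic, and end users never notice.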

Data centers are also utilizing the cloud for their business continuity plans because it is cost-efficient and highly effective.  The cloud platform is exceptionally effective for business continuity, particularly as data centers move more and more toward virtualization.  A cloud service with a proper SLA (service level agreement) can ensure that data will be continuously saved and protected even in the event of a disaster.  This is where identifying mission critical applications and information is important: the entirety of the data center's workload does not need to be recovered in an instant, only that which has been determined mission critical.

In addition to the cloud, many data centers opt to implement image-based backup for continuity.  Data Center Knowledge provides a helpful description of what image-based backup is and how it can be used uniquely in data centers, “Hybrid, image-based backup is at the core of successful business continuity solutions today. A hybrid solution combines the quick restoration benefits of local backup with the off-site, economic advantages of a cloud resource. Data is first copied and stored on a local device, so that enterprises can do fast and easy restores from that device. At the same time, the data is replicated in the cloud, creating off-site copies that don’t have to be moved physically. Channel partners are also helping enterprises make a critical shift from file-based backup to image-based. With file-based backup, the IT team chooses which files to back up, and only those files are saved. If the team overlooks an essential file and a disaster occurs, that file is gone. With image-based backup, the enterprise can capture an image of the data in its environment. You can get exact replications of what is stored on a server — including the operating system, configurations and settings, and preferences. Make sure to look for a solution that automatically saves each image-based backup as a virtual machine disk (VMDK), both in the local device and the cloud. This will ensure a faster virtualization process.”
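The hybrid pattern described, a local copy first for fast restores and then an off-site replica, can be expressed as a two-step routine.  In the sketch below the cloud upload is a deliberate placeholder, since the real call depends entirely on your provider's SDK; the file paths are throwaway demo values.

```python
# Sketch of the hybrid backup flow: copy an image to a local device for
# fast restores, then replicate the same copy off-site. upload_to_cloud
# is a placeholder, not a real provider API.
import shutil
import tempfile
from pathlib import Path

def upload_to_cloud(path: Path) -> None:
    """Placeholder: replace with your cloud provider's SDK call."""
    print(f"replicating {path.name} off-site...")

def hybrid_backup(image: Path, local_dir: Path) -> Path:
    local_copy = local_dir / image.name
    shutil.copy2(image, local_copy)   # step 1: fast local restore point
    upload_to_cloud(local_copy)       # step 2: off-site cloud replica
    return local_copy

# Demo with throwaway files standing in for a server image and the
# local backup appliance.
with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "server-image.vmdk"
    image.write_bytes(b"...image contents...")
    appliance = Path(tmp) / "backup-appliance"
    appliance.mkdir()
    print(hybrid_backup(image, appliance))
```

The ordering matters: the local copy exists before replication begins, so a fast restore point is always available even if the off-site transfer is slow or interrupted.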

While not every data center will experience a “major” disaster in which staff cannot get into the facility for weeks, many data centers will experience some type of disaster.  And, as mentioned, mere minutes of downtime can cost tens of thousands of dollars.  Beyond the bottom line, the inability to maintain continuous operations may damage your reputation irreparably.  An effective business continuity plan is capable of pivoting around both people and processes depending on the specific circumstances.  Rapidly restoring data and operations is the goal; data centers should take that goal and work backwards from there to determine the best path to maintaining business continuity.
