Cloud data storage needs are growing at an enormous rate. A research paper from Google gives an example: YouTube users upload 400 hours of video every minute, requiring the addition of 1 million GB of data center storage per day. Currently, nearly all of this storage is on spinning disks, a decades-old but reliable format. But this format was designed for traditional servers, not the high-volume data centers of today and tomorrow. Researchers in the field are looking for answers to this problem: a collaboration between Microsoft and the University of Washington proposes exploring DNA-type encoding for data storage, and Google has a proposal of its own.
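To put the YouTube figures in perspective, here is a small back-of-the-envelope calculation. It uses only the two numbers quoted above; the variable names are illustrative, and the per-hour figure is simply what those two numbers imply when divided, not a value stated in the paper.

```python
# Figures quoted in the article (from the Google paper).
HOURS_UPLOADED_PER_MINUTE = 400
STORAGE_ADDED_GB_PER_DAY = 1_000_000

MINUTES_PER_DAY = 60 * 24

# Total hours of video uploaded in one day.
hours_per_day = HOURS_UPLOADED_PER_MINUTE * MINUTES_PER_DAY

# Storage added per uploaded hour of video, implied by the two figures.
# Assumption: this may include replication or other overhead; the
# article does not break it down.
gb_per_video_hour = STORAGE_ADDED_GB_PER_DAY / hours_per_day

# 1 million GB expressed in petabytes (decimal units: 1 PB = 1,000,000 GB).
petabytes_per_day = STORAGE_ADDED_GB_PER_DAY / 1_000_000

print(f"{hours_per_day:,} hours of video per day")
print(f"{gb_per_video_hour:.2f} GB of new storage per uploaded hour")
print(f"{petabytes_per_day:.0f} PB of new storage per day")
```

In other words, the quoted rate works out to 576,000 hours of video and about one petabyte of new storage every day.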
View a Collection of Data Storage Devices as a Single Entity
Google’s first proposed change is to treat a collection of individual data storage devices as a single storage system and to consider its properties in aggregate. This approach calls for higher-level maintenance, re-balancing of data to make use of more disks (including new ones), and higher-level data backup and repair functions. It requires up-front effort to redistribute data and implement new processes, followed by periodic updates and redistributions as new hardware is added and legacy machines are retired, but it creates a more robust data storage system in the long run.
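As a rough illustration of what re-balancing across a collection of disks could look like, here is a minimal sketch. It is not Google's method: the `Disk` class, the `rebalance` function, the chunk size, and the tolerance are all hypothetical, and the greedy strategy (move data from the fullest disk to the emptiest until they are roughly even) is just one simple way to do it.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    """Hypothetical model of one disk in the collection."""
    name: str
    capacity_gb: float
    used_gb: float = 0.0

    @property
    def utilization(self) -> float:
        return self.used_gb / self.capacity_gb


def rebalance(disks, chunk_gb=10.0, tolerance=0.05, max_moves=100_000):
    """Greedily move fixed-size chunks from the fullest disk to the emptiest
    one until the utilization gap is within `tolerance`.

    Returns the list of (source, destination, gb) moves that were made.
    `max_moves` is a safety bound in case the chunk size is too coarse
    for the gap to settle within the tolerance.
    """
    moves = []
    for _ in range(max_moves):
        fullest = max(disks, key=lambda d: d.utilization)
        emptiest = min(disks, key=lambda d: d.utilization)
        if fullest.utilization - emptiest.utilization <= tolerance:
            break
        free_gb = emptiest.capacity_gb - emptiest.used_gb
        amount = min(chunk_gb, fullest.used_gb, free_gb)
        if amount <= 0:
            break
        fullest.used_gb -= amount
        emptiest.used_gb += amount
        moves.append((fullest.name, emptiest.name, amount))
    return moves


# Two nearly full legacy disks plus one newly added, empty disk.
disks = [
    Disk("old-1", capacity_gb=1000, used_gb=900),
    Disk("old-2", capacity_gb=1000, used_gb=800),
    Disk("new-1", capacity_gb=1000, used_gb=0),
]
moves = rebalance(disks)
for d in disks:
    print(f"{d.name}: {d.utilization:.0%} full")
```

After the run, the new disk carries a share of the data comparable to the legacy disks. A real system would also have to account for replication, failure domains, and I/O load, which this sketch ignores.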
Redesigning the Disk
Google proposes designing a new storage format specifically for data centers. It suggests redesigning the disk itself to optimize for weight, heat, vibration, and potential handling by robotic automation systems. It also seeks to engage the entire industry in a conversation to develop agreed-upon specifications for a new industry-standard data center storage format.
Mixing Old Tech with New Tech, Changing the Mixture over Time in Legacy Systems
After a new disk format is decided upon and implemented, data centers will probably phase it in slowly, adding new-style disks when more storage is needed and replacing legacy disks as they near end of life. This gradual rollout will benefit data centers: it spares them from paying to replace hardware that is still fully functional, and it provides time to work out the bugs in the new format. Because data will be distributed over many disks of varying ages and designs, it should also help prevent catastrophic failure of any one data center.
Google’s proposal to update the storage format currently used by data centers around the world involves a change in the way data is distributed and managed, along with the development of a completely new technology. It is a bold suggestion, and it will require buy-in from most of the industry in order to be implemented.