The Future of Data Storage is in You

Dr. Mussaad M. Al-Razouki
Jul 21, 2017


In the spring of 1872, Leland Stanford, a New Yorker turned California prospector and railroad tycoon, made a bet with a few of his friends: when a horse is at full gallop, do all four hooves ever leave the ground at the same time? Do horses, in essence, briefly fly?

Stanford, the eighth governor of California, had recently retired to pursue his passion for thoroughbred racing. To settle the bet, he hired the renowned English photographer Eadweard Muybridge, who had achieved worldwide fame with his 1868 photographs of the Yosemite Valley.

Muybridge set up 24 cameras in a row alongside the race track, all pointing in the same direction, perpendicular to the track. He carefully attached a string to the shutter of each camera and stretched the strings across the track like trip wires, so that as the horse galloped past and touched each string in turn, the cameras took their pictures one at a time and in sequence.

Not only did the sequence show that horses really do “take temporary flight” while galloping, but Muybridge’s “The Horse in Motion” is also considered by many to be the birth of the motion picture industry. Thomas Edison is usually credited with creating the first movies in 1889, and the Lumière Brothers tend to take credit for the first public cinema screening on December 28th, 1895. Many nonetheless consider the work Muybridge did to settle Stanford’s bet to be the cornerstone of Edison’s invention, which in turn inspired the Lumière Brothers and the evolution of motion pictures.

As for Stanford, just over a decade after that fateful experiment, he and his wife Jane donated the majority of their wealth, 40 million USD (approximately 1 billion dollars in today’s purchasing power), to fund the prestigious university that bears the family name, in memory of their late son, Leland Jr.

Almost 150 years later, on the other coast of the American continent, researchers from Harvard Medical School and the Wyss Institute for Biologically Inspired Engineering this summer stored a digital version of that very same Muybridge sequence, as a GIF file, in a different sort of sequence altogether: DNA, the biological sequence at the root of all living matter. Not only did the scientists store the short video, they were also able to retrieve it and play it back.

Working with bacteria, the researchers employed the CRISPR-Cas system, a powerful gene-editing tool, which essentially allowed them to copy and paste pieces of DNA. In this ground-breaking experiment, they broke each frame down into single-colored pixels, created DNA codes corresponding to each color, and strung several codes together. Each bacterium then took up specific snippets of the video and stored them in its DNA. By reading the snippets back out of many bacteria and combining them, the researchers were able to piece the frames back together and play the video.
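To make the idea concrete, here is a minimal sketch of the pixel-to-DNA round trip. It is an illustration only, not the researchers' actual scheme: the four-color palette, the one-base-per-pixel mapping, the explicit snippet index, and the names `encode_frame` and `decode_frame` are all assumptions made for this example.

```python
import random

# Hypothetical mapping: four pixel shades, one nucleotide each.
# (An assumption for illustration; the study's real code is more involved.)
COLOR_TO_BASE = {0: "A", 1: "C", 2: "G", 3: "T"}
BASE_TO_COLOR = {base: color for color, base in COLOR_TO_BASE.items()}


def encode_frame(pixels, chunk_size=4):
    """Split a flat list of pixel values into (index, DNA string) snippets."""
    snippets = []
    for start in range(0, len(pixels), chunk_size):
        chunk = pixels[start:start + chunk_size]
        dna = "".join(COLOR_TO_BASE[p] for p in chunk)
        snippets.append((start // chunk_size, dna))
    return snippets


def decode_frame(snippets):
    """Sort snippets by index and translate bases back into pixel values."""
    pixels = []
    for _, dna in sorted(snippets):
        pixels.extend(BASE_TO_COLOR[base] for base in dna)
    return pixels


frame = [0, 1, 2, 3, 3, 2, 1, 0, 1, 1, 2, 2]
snippets = encode_frame(frame)
random.shuffle(snippets)  # snippets are scattered across many cells, in no fixed order
restored = decode_frame(snippets)
print(restored == frame)
```

The shuffle stands in for the fact that no single bacterium holds the whole picture: the frame can only be rebuilt once the snippets from many cells are collected and put back in order.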

Building on this work, scientists believe we will eventually create living cells that operate as “molecular recorders”: cells that sense things in their environment, such as toxins or heavy metals, and record that information in their DNA.

In fact, DNA as a storage medium had already come to popular attention a year earlier, in 2016, when software pioneer Microsoft announced the purchase of ten million long oligonucleotides (strands of DNA) from Twist Bioscience to encode digital data.

Using DNA to store digital information is a great example of applying an ancient solution from nature to a modern problem: the growing volume of data produced by technologies such as the internet of things and all the non-ephemeral social media accounts of the world.

DNA is close to ideal as a medium for data storage. Its data density is orders of magnitude higher than that of conventional storage systems: 1 gram of DNA can represent close to 1 zettabyte (1 billion terabytes, or 1 trillion gigabytes) of data. That is as much as the smartphones it would take to fill an entire football stadium. DNA is also remarkably robust; DNA fragments that are thousands of years old have been successfully sequenced, which suggests that information stored in DNA could itself last for thousands of years.

Assuming an average 65 kg human body houses about 60 grams of DNA, each person on earth could store as much as 60 zettabytes, or 60,000 exabytes, of data. For comparison's sake, IBM estimated back in 2012 that the entire world generates a total of 2.5 exabytes of data per day. In 2015, Cisco Systems estimated that some 50 billion devices would be connected to the Internet by 2020, generating 44 zettabytes, or 44 trillion gigabytes, of data annually. Thankfully, the rapid adoption of DNA as a data storage medium should easily allow us to squirrel away these vast quantities of future digital data…and then some.
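The arithmetic behind these figures can be checked in a few lines. This is a back-of-the-envelope sketch that assumes decimal (SI) units and takes the article's estimates at face value: 1 zettabyte per gram, 60 grams of DNA per person, 2.5 exabytes per day in 2012, and 44 zettabytes per year by 2020. The variable names are made up for the example.

```python
# Decimal (SI) storage units, in bytes.
GB = 10**9
TB = 10**12
EB = 10**18
ZB = 10**21

# Estimates quoted in the article (assumed, not independently verified).
BYTES_PER_GRAM_OF_DNA = 1 * ZB
GRAMS_OF_DNA_PER_PERSON = 60
WORLD_BYTES_PER_DAY_2012 = 25 * EB // 10   # 2.5 exabytes, kept as an integer
WORLD_BYTES_PER_YEAR_2020 = 44 * ZB

# One zettabyte expressed in smaller units.
print(ZB // TB)  # terabytes in a zettabyte
print(ZB // GB)  # gigabytes in a zettabyte

# Capacity of one person's DNA.
per_person_bytes = GRAMS_OF_DNA_PER_PERSON * BYTES_PER_GRAM_OF_DNA
per_person_exabytes = per_person_bytes // EB
print(per_person_exabytes)

# How long the world would take to fill one person's DNA at each rate.
days_at_2012_rate = per_person_bytes // WORLD_BYTES_PER_DAY_2012
years_at_2020_rate = per_person_bytes / WORLD_BYTES_PER_YEAR_2020
print(days_at_2012_rate, round(years_at_2020_rate, 2))
```

At the 2012 rate, one person's DNA would hold decades of the world's output; at the projected 2020 rate, it would hold a little over a year's worth.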

Another interesting consideration is that all the words ever spoken by humankind prior to the digital revolution are estimated to require a mere 5 exabytes of storage, meaning that in the future you could store the entire pre-Internet compendium of human knowledge on the nail of your little finger.

There is no doubt that your DNA could one day store all the world’s past, present and future data.
