
Can You Record Your Favourite Playlist on DNA?

It may be possible soon. A collaborative effort by geneticists and computer scientists has uncovered the potential of our genetic material for use as a digital storage medium.

We do billions of things on the internet, and they all generate data – a lot of data. To store all that data, both now and in the future, we are going to need more and more magnetic tapes, hard drives and flash memory. Not to mention all the extra data centres and warehouses that will be required to hold all those storage media. It doesn’t sound technically or economically feasible. Scientists are therefore looking for different and better storage methods.

DNA to the rescue

The solution to our storage problem could well be found in our DNA. It has been referred to as “nature's oldest storage device”, containing all the information that is needed to build and maintain an organism.

Almost every one of our more than 30 trillion cells contains DNA, which stores its information as a sequence of base pairs. The human genome is about 3 billion base pairs long. An immense number! In our bodies, this huge amount of data is packed into just one part of each cell: the cell nucleus. In other words, it is pretty compact.

A shoebox of data

DNA also lasts for an incredibly long time. Its half-life has been estimated at 521 years – in other words, it takes about five centuries for half of the DNA to break down. Under the right conditions it can even survive for hundreds of thousands of years. We know this because DNA has already been recovered and reconstructed from Neanderthals (about 400,000 years old), a prehistoric horse (about 700,000 years old) and even mammoths (about 1.2 million years old).

In other words: DNA has the potential to store a huge amount of information in a very small space for a very long time. Some people say that, if we succeed in storing data on DNA, all the information in a data centre would easily fit into a shoebox, and we would still be able to access it 100 years from now.
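To get a feel for what a half-life of 521 years means over those 100 years, here is a minimal Python sketch. It assumes simple exponential decay, which is our simplification for illustration; real DNA degrades faster or slower depending on storage conditions. The function name `fraction_intact` is made up for this example.

```python
HALF_LIFE_YEARS = 521  # half-life figure quoted in the article


def fraction_intact(years, half_life=HALF_LIFE_YEARS):
    """Fraction of the material still intact after `years`.

    Assumes plain exponential decay (an illustrative simplification).
    """
    return 0.5 ** (years / half_life)


for years in (100, 521, 1042):
    print(f"{years:>5} years: {fraction_intact(years):.1%} intact")
```

Under this assumption, roughly 87.5% would still be intact after a century, half after 521 years and a quarter after 1,042 years.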

How can we do this?

We know that everything in a computer is binary (made up of ones and zeroes), while DNA consists of four bases (A, T, C & G). Four bases are exactly enough to represent every possible pair of bits, so we can translate all binary values into a "base code," for example, where A = 00, T = 11, C = 01 and G = 10. A binary code like “110001010010” is then split into pairs (11 00 01 01 00 10) and becomes TACCAG.
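The translation above can be sketched in a few lines of Python. This is only an illustration using the example mapping from this article; the function names `bits_to_dna` and `text_to_dna` are our own, and real DNA storage schemes use more elaborate encodings.

```python
# Example mapping from the article: two bits per base.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def bits_to_dna(bits):
    """Translate a string of 0s and 1s into a string of bases."""
    if len(bits) % 2 != 0:
        raise ValueError("need an even number of bits (two bits per base)")
    if set(bits) - {"0", "1"}:
        raise ValueError("bit string may only contain '0' and '1'")
    # Walk through the bit string two characters at a time.
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def text_to_dna(text):
    """Encode text as UTF-8 bytes, then as bases (four bases per byte)."""
    bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
    return bits_to_dna(bits)


print(bits_to_dna("110001010010"))  # TACCAG
print(text_to_dna("Hi"))            # CAGACGGC
```

Because every byte is eight bits, it always turns into exactly four bases with this mapping.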

Once it has been translated, you can "print out" the bases to make synthetic DNA. This can be safely stored (in test tubes, on glass slides etc.) and later on you can retrieve it and convert it back to the original binary values, allowing a computer to process it again.
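Converting the bases back into binary values is the same lookup in reverse. The sketch below again assumes the example mapping from this article (A = 00, C = 01, G = 10, T = 11) and a perfectly read strand; the names `dna_to_bits` and `dna_to_bytes` are hypothetical, and a real system would also have to cope with read errors.

```python
# Reverse of the article's example mapping: one base gives two bits.
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}


def dna_to_bits(strand):
    """Translate a string of bases back into a string of 0s and 1s."""
    try:
        return "".join(BASE_TO_BITS[base] for base in strand.upper())
    except KeyError as err:
        raise ValueError(f"not a DNA base: {err.args[0]!r}") from None


def dna_to_bytes(strand):
    """Translate a strand back into raw bytes (four bases per byte)."""
    bits = dna_to_bits(strand)
    if len(bits) % 8 != 0:
        raise ValueError("strand length must be a multiple of four bases")
    # Read the bit string eight characters at a time, one byte each.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


print(dna_to_bits("TACCAG"))     # 110001010010
print(dna_to_bytes("CAGACGGC"))  # b'Hi'
```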

Does it actually work?

In 2017, scientists managed to store Deep Purple's song "Smoke on the Water" in the form of artificial strands of DNA. They reported that the strands were read back with 100% accuracy.

Don't get too excited though: we are not quite there yet. The strands can be read correctly, but the next challenge for bioengineers is to locate one specific piece of information, for example the solo, without having to read everything. Current DNA synthesis methods also have drawbacks: the process is slow and very expensive.

Like most technologies, however, it is likely to become cheaper and faster through continuing research and further technological advances. So we are keeping a shoebox ready!
