DNA storage expands its alphabet to store unlimited digital data

A group of scientists has artificially extended the “alphabet” used in DNA storage, with the aim of making its capacity for digital data practically unlimited.

The amount of digital data the world generates keeps growing, and there will come a time when it is difficult to handle all of it by traditional means. Biological storage is one of the technologies being developed in response, and the most advanced, since it would allow files, photos, documents and any other digital data to be stored using nature’s own information database: DNA.

Until recently it was science fiction, but the latest scientific advances aim to bring it closer to reality. The general idea is to treat DNA like any other digital storage device, except that the data is written by synthesizing strands of DNA instead of being encoded as magnetic regions on a hard drive or in the NAND flash memory of an SSD. The notable advantages are DNA’s incredible density and its ability to survive unaltered for thousands of years or, some say, for as long as life has existed on Earth.

DNA storage: even more capacity

DNA encodes genetic information with four molecules called nucleotides: adenine, guanine, cytosine and thymine or, for our purposes, A, G, C and T. In biological storage, DNA strands are synthesized that each store 96 bits, with each base representing a binary value (T and G = 1, A and C = 0). To read the information stored in DNA, you just have to sequence it, as is done with a human genome, and convert each of the bases back to binary.
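The mapping just described can be sketched in a few lines of Python. This is only an illustration, not the researchers' software: the function names are hypothetical, and the rule for choosing between the two bases that stand for the same bit (here, avoiding an immediate repeat of the previous base) is an assumption the article does not specify.

```python
# Bit-to-base mapping from the text: T and G encode 1, A and C encode 0.
BASES_FOR_BIT = {"0": "AC", "1": "GT"}
BIT_FOR_BASE = {"A": "0", "C": "0", "G": "1", "T": "1"}


def encode_bits(bits: str) -> str:
    """Turn a string of '0'/'1' characters into a DNA sequence.

    Assumption: if the previous base equals the first candidate for this
    bit, the second candidate is used, so no base is repeated back to back.
    """
    strand = []
    for bit in bits:
        first, second = BASES_FOR_BIT[bit]
        base = second if strand and strand[-1] == first else first
        strand.append(base)
    return "".join(strand)


def decode_bases(strand: str) -> str:
    """Convert each sequenced base back to the bit it represents."""
    return "".join(BIT_FOR_BASE[base] for base in strand)


def text_to_bits(text: str) -> str:
    """Helper: UTF-8 text to a bit string, 8 bits per byte."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_text(bits: str) -> str:
    """Helper: inverse of text_to_bits."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


strand = encode_bits(text_to_bits("DNA"))
print(strand)                              # 24 bases, one per bit
print(bits_to_text(decode_bases(strand)))  # DNA
```

Because two bases stand for each bit, decoding does not depend on which one the encoder picked, which is what leaves room for a rule like the one assumed here.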

To help with sequencing, each strand of DNA has a 19-bit address block at the beginning so the DNA can be sequenced out of order and then sorted into usable data using the addresses. With just those four letters, deoxyribonucleic acid is capable of storing the genetic instructions used in the development and functioning of all living organisms on the planet.
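The role of the address block can be sketched as follows. This is a minimal illustration under stated assumptions: the 19-bit address is taken to sit in front of the 96 data bits (115 bits per strand), addresses are simple sequential counters, and the last strand is padded with zeros. The article confirms none of these details, and the function names are hypothetical.

```python
import random

ADDRESS_BITS = 19  # address block length given in the text
PAYLOAD_BITS = 96  # data bits per strand given in the text


def split_into_strands(bits: str) -> list:
    """Cut a bit string into strands of 19 address bits + 96 payload bits."""
    strands = []
    for index, start in enumerate(range(0, len(bits), PAYLOAD_BITS)):
        if index >= 2 ** ADDRESS_BITS:
            raise ValueError("too much data for a 19-bit address space")
        # Assumption: the final, shorter chunk is padded with zeros.
        payload = bits[start:start + PAYLOAD_BITS].ljust(PAYLOAD_BITS, "0")
        address = format(index, f"0{ADDRESS_BITS}b")
        strands.append(address + payload)
    return strands


def reassemble(strands: list, total_bits: int) -> str:
    """Sort strands by address, join the payloads, and drop the padding."""
    ordered = sorted(strands, key=lambda s: int(s[:ADDRESS_BITS], 2))
    payload = "".join(s[ADDRESS_BITS:] for s in ordered)
    return payload[:total_bits]


# Simulate reading the strands back in arbitrary order.
message = "".join(random.Random(0).choice("01") for _ in range(250))
strands = split_into_strands(message)
random.Random(1).shuffle(strands)
assert reassemble(strands, len(message)) == message
```

Under these assumptions, 19 address bits allow at most 2^19 = 524,288 distinct strands, so a single address space would cover roughly 50 million payload bits.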

But what if we had a longer alphabet? Taking that approach, a group of scientists has artificially added seven new letters to those of DNA. Instead of converting 0s and 1s into just A, G, C and T, they intend to use the new letters to potentially achieve unlimited digital data storage.
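A rough way to see what a longer alphabet buys is to compute how many bits a single letter can carry at most, which is the base-2 logarithm of the alphabet size. The short calculation below is a back-of-the-envelope sketch, not a figure from the researchers. It assumes every letter is equally usable and read without error, and the function name is hypothetical.

```python
import math


def max_bits_per_letter(alphabet_size: int) -> float:
    """Upper bound on the information carried by one letter, in bits."""
    return math.log2(alphabet_size)


natural = max_bits_per_letter(4)    # A, G, C, T
expanded = max_bits_per_letter(11)  # four natural letters plus seven new ones

print(f"4 letters:  {natural:.2f} bits per letter")
print(f"11 letters: {expanded:.2f} bits per letter")
print(f"gain:       {expanded / natural:.2f}x")
```

By this measure, four letters carry at most 2 bits each and eleven letters about 3.46 bits each, so the same number of nucleotides could hold roughly 1.7 times as much data. For comparison, the mapping described earlier (T and G = 1, A and C = 0) uses only 1 bit per base.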

The researchers have also devised a novel mechanism that accurately reads the data from the synthetic DNA. The system uses deep learning algorithms and artificial intelligence to tell the seven artificial letters apart from the four natural ones, as well as from each other. “We tested 77 different combinations of the 11 nucleotides, and our method was able to differentiate each of them perfectly,” they say.

We’ll see what comes of all this. There will come a time when we lack the capacity to handle all the digital data generated in the world and will need another type of technology. DNA storage is the most daring option, but also the most natural one we know of, with enormous capacity and the ability to last a lifetime unaltered. There are still decades to go before it becomes a viable storage option on a commercial scale. Two key steps are missing: using encoding software and a DNA synthesizer to translate digital bits (ones and zeros) into synthetic DNA strands that represent them, and reading and decoding those strands back into bits to retrieve the information in digital form.
