I am as guilty of crowding and cluttering the digital space as the 2,267,233,742 other Internet users worldwide. This blog is just one example of the useless content that occupies an ever-increasing amount of digital space. After all, what I write is stored digitally somewhere.
This information overload is beginning to pose a serious challenge to the digital storage of data. It is in this context that I note with great interest the remarkable new technology of storing data in DNA. A team of three scientists, George M. Church, Yuan Gao and Sriram Kosuri, reports on this new technology in the journal Science.
An extract on the journal’s website says, “Digital information is accumulating at an astounding rate, straining our ability to store and archive it. DNA is among the most dense and stable information media known. The development of new technologies in both DNA synthesis and sequencing make DNA an increasingly feasible digital storage medium. Here, we develop a strategy to encode arbitrary digital information in DNA, write a 5.27-megabit book using DNA microchips, and read the book using next-generation DNA sequencing.”
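The extract does not spell out the encoding strategy itself. As a rough illustration only, here is a minimal Python sketch that assumes a one-bit-per-base mapping (A or C for 0, G or T for 1). The function names `encode` and `decode` are hypothetical, and the sketch leaves out everything else a real pipeline would need, such as splitting the data into short fragments and handling synthesis or sequencing errors.

```python
import random

# Assumed mapping (not taken from the extract): one bit per base,
# with two interchangeable bases available for each bit value.
BASES_FOR_BIT = {"0": "AC", "1": "GT"}
BIT_FOR_BASE = {"A": "0", "C": "0", "G": "1", "T": "1"}


def encode(data: bytes, rng: random.Random) -> str:
    """Turn bytes into a string of bases, one base per bit."""
    bits = "".join(f"{byte:08b}" for byte in data)
    # Either base of the pair is valid, so pick one at random.
    return "".join(rng.choice(BASES_FOR_BIT[bit]) for bit in bits)


def decode(dna: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BIT_FOR_BASE[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"DNA", random.Random(0))
    print(strand)          # 24 bases, since 3 bytes = 24 bits
    print(decode(strand))  # recovers the original bytes
```

Under this assumption, a 5.27-megabit book would need roughly 5.27 million bases of payload, before any addressing or redundancy is counted.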
Intrigued by the idea, I reached out to Dr. Kosuri, a co-author of the Science piece, who works at Harvard’s Wyss Institute. I emailed him this morning about my intention to report the story for the IANS wire. Once I have spoken with him, I will report our conversation on the wire as well as here.
In writing about the new technology, Geraint Jones of The Guardian touches upon a thought that instantly crossed my mind on reading the extract in Science: could a living organism be used for encoding data? While I wait for Dr. Kosuri’s response to that question, here is what Jones says: “The work did not involve living organisms, which would have introduced unnecessary complications and some risks. The biological function of a cell could be affected and portions of DNA not used by the cell could be removed or mutated. ‘If the goal is information storage, there's no need to use a cell,’ said Kosuri.”
Perhaps I am getting ahead of myself, but the idea of encoding data onto a living organism, me for instance, carries more than a hint of the technological singularity.
Although I will carry Dr. Kosuri’s full interview tomorrow, here is the part that relates to whether it is theoretically possible to encode data in oneself. He says, “It is possible, but there are a lot of disadvantages. First, some of the sequences could be detrimental to the cell and could be lost. Second, because the information is extraneous to the cell's functioning, it's likely to be mutated or deleted wholesale in the long term. Finally, there is no real use to keep it inside the cell, as the cell or a body wouldn't really know what to do with it. It's not like we can read our own genomes inside ourselves and learn from it.”
Speaking of data and living organisms, one little nugget that has been widely circulated on the Net lately is about how much data a single sperm carries. I do not at all vouch for its accuracy, but I have received several emails from friends containing this bit of information. According to it, one sperm carries 37.5 MB of DNA data, which in turn means that one normal ejaculation represents a data transfer of 1,587 GB in about 3 seconds. A massive amount of data is being flushed away every minute of every day.
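For what it is worth, the figures in that email can at least be checked against each other. The short Python sketch below takes the quoted numbers at face value, assumes decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes), and works out what sperm count and transfer rate they imply. It says nothing about whether the underlying figures are biologically accurate.

```python
# Figures exactly as quoted in the email; none of them are verified.
MB = 10**6  # assumption: decimal megabytes
GB = 10**9  # assumption: decimal gigabytes

bytes_per_sperm = 37.5 * MB
total_bytes = 1587 * GB
duration_seconds = 3

# Number of sperm the quoted total would imply.
implied_sperm_count = total_bytes / bytes_per_sperm

# Transfer rate the quoted total and duration would imply, in GB per second.
implied_rate_gb_per_s = total_bytes / duration_seconds / GB

print(f"Implied sperm count: {implied_sperm_count:,.0f}")
print(f"Implied transfer rate: {implied_rate_gb_per_s:,.0f} GB/s")
```

With these inputs, the quoted total corresponds to 42,320 sperm and a rate of 529 GB per second.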