A few weeks ago my wife and I went to the Clark Art Institute in Williamstown, in the northwest corner of Massachusetts. This is a fabulous, small art gallery with many of the paintings you learned about in your Art Appreciation or Art History course. But from September 6 through November 2 it also has one of the four original copies of the 1215 Magna Carta. This is the first time one of these original copies has been in the US – the copy in the National Archives in Washington, DC is from 1297.

The document was copied by hand in very small letters with a quill pen dipped in ink. The letters have faded and the parchment has discolored, and while my rusty Medieval Latin limits my ability to comprehend the script, the document is still readable 799 years later.

In another case of amazing longevity for saved data, a friend of mine was able to get data from computer tapes from the 1960s. The story of finding the tapes, finding a tape drive that would read them, and a company that had the technology and process to make it all work makes that data recovery remarkable. If you have data on floppy disks (remember them?), try to figure out how you would access it.

Is it even possible to save today’s data for 800 years? Maybe, but not easily.

You need four things in order to save data for the long term:

  1. A digital copy of the data.
    For digital data, that is fairly easy; just copy it. For analog data, like vinyl records, magnetic tape, or paper, you need to first get it into a digital electronic form. In the case of the 1215 Magna Carta, the four existing copies are not identical. Since it was copied by hand, sometimes by monks who could not read, there are accidental differences among the copies. The same thing happens with analog data – every time you read it you damage it, and any copy is modified from the original.
  2. A medium that will last for the time period you want.
    CDs and DVDs are probably good for up to 20 years, and thumb drives probably for longer. For a thumb drive, the more critical factor is how many times you write to it, not how often you read it or even how you treat it while stored. Even an inexpensive thumb drive will support 3,000 to 5,000 erase/write cycles. Potentially the weakest part is the physical connector that you plug into your computer: it is only specified to withstand about 1,500 insert/removal cycles. For archival purposes, these limitations are not significant.
  3. A device to read the media later.
    The latest Macintosh desktop I have has no optical drive. While I could still purchase one, it is likely that ten years from now it will be difficult to find a drive to read CDs or DVDs. At some point, USB ports will also disappear, to be replaced by some newer, better, faster, cheaper connection mechanism. For a while there will be gadgets that will still accept that thumb drive, but quicker than you can imagine it will be very difficult, and expensive, to read a thumb drive.
  4. A program to read the data.
    Perhaps the most significant long-term risk is not having a program that can interpret the data on the media. With the 1215 Magna Carta, all I would need is my eyes, a magnifying glass, and a refresher course in Medieval Latin. Try to find a program that can read a Microsoft Word document created in 1983, or worse, a document created by a program published by a company that no longer exists. I lost some drawings I had created in an extinct Macintosh program that does not run on existing hardware and operating systems. Fortunately, I didn’t really care, but it was annoying. For long-term storage, I suggest not using the native program format (e.g., .docx) but creating PDF files instead. I expect that PDF, standard picture formats like .jpg, and iTunes-compatible formats for music will still be readable for decades, or will at least give you time to convert the file formats. If you do need to keep the native formats, plan on running a test before you completely move to a new version of a program, a new platform (e.g., Macintosh to Windows or vice versa), or a new major operating system release. If it looks like there may be a problem, convert to a newer or different native format before you make the jump. A good rule of thumb is to update the native format files at least every five years anyway.
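
If you want to check your own archive against the rules of thumb in item 4, a short script can do the looking for you. The following is only a sketch in Python, under a few assumptions of mine: the set of "durable" extensions is my guess based on the formats mentioned above, the function name `audit_archive` is made up for illustration, and a file's modification time is used as a rough stand-in for when it was last updated or converted.

```python
import time
from pathlib import Path

# Assumption: formats expected to stay readable for decades
# (PDF, JPEG pictures, common iTunes-compatible music formats).
DURABLE_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".mp3", ".m4a"}
FIVE_YEARS_SECONDS = 5 * 365 * 24 * 60 * 60


def audit_archive(root, now=None):
    """Return (native_files, stale_files) found under root.

    native_files: files whose extension is not in DURABLE_EXTENSIONS.
    stale_files:  the native files not modified in the last five years.
    """
    now = time.time() if now is None else now
    native, stale = [], []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if path.suffix.lower() not in DURABLE_EXTENSIONS:
            native.append(path)
            # Modification time is only a proxy for "last updated".
            if now - path.stat().st_mtime > FIVE_YEARS_SECONDS:
                stale.append(path)
    return native, stale
```

Calling `audit_archive` on the top folder of an archive returns two lists: everything still in a native format, and the subset of those files that has gone more than five years without being touched. The second list is where I would start converting.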

In general, you should not expect to successfully get data from stored electronic media after ten years, and you should plan to refresh your long-term data storage every five years or so. So you could endow an organization to do the refresh every five years and have some expectation that your data would still be accessible in 800 years.
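
A refresh is only worth doing if the new copy matches the old one exactly, which is where digital data has the advantage over the monks. As a minimal sketch, assuming the old and new media both show up as ordinary folders, the Python below copies every file across and compares checksums afterward. The names `refresh` and `sha256_of` are hypothetical, and SHA-256 is simply my choice of checksum, not the only reasonable one. The last line works out how many refreshes the 800-year plan implies.

```python
import hashlib
import shutil
from pathlib import Path

REFRESH_INTERVAL_YEARS = 5
TARGET_YEARS = 800


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh(old_root, new_root):
    """Copy every file to the new medium; return files whose copies differ."""
    old_root, new_root = Path(old_root), Path(new_root)
    mismatched = []
    for source in sorted(old_root.rglob("*")):
        if not source.is_file():
            continue
        target = new_root / source.relative_to(old_root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copies contents and timestamps
        if sha256_of(source) != sha256_of(target):
            mismatched.append(source)
    return mismatched


# 800 years at one refresh every 5 years
refreshes_needed = TARGET_YEARS // REFRESH_INTERVAL_YEARS  # 160
```

An empty list back from `refresh` means every file arrived intact. The sketch assumes the new folder is not inside the old one, and it says nothing about whether a program will still exist to open the files, which is the harder problem.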

Or you could print a dozen copies on cotton paper and give one to each of a dozen monasteries or cathedrals in England.

The last word:

That monk who copied the Magna Carta would, other than the language, be pretty much at home in England for the first 600 years of the document’s existence. After that, with changes such as the indoor plumbing that first appeared in London around 1890, he would be more and more lost. He would, however, have to find a different line of work, maybe typesetting, after about 225 years.

He, like many of us, would be baffled by a world where almost everything changes every 20 years.

