Letter Re: Preserving a Digital Library

Dear Mr Rawles,
Since I have worked with computers for a few decades now as a programmer, installing systems and building/repairing computers, I read last week’s articles/letters on digital libraries with interest. Though most of the information provided is correct, some possibilities weren’t discussed, while others may be unclear or confusing to the uninitiated.
So, in addition to the previous postings, here is my take on ‘digital libraries for dummies’:

Putting together a digital library is a good idea and I have one too. It contains everything from books to reference diagrams, user manuals and SurvivalBlog archives. However, it can become a needless burden on (possibly scarce) resources if not done correctly. So before you run out to buy things you may not need, let’s take a look at whether your intended approach fits with your other preparations.

– How much storage is required? As much as you need/can afford/deem necessary. I know that doesn’t say much but it is really what it comes down to. For example, I have scanned all the old family pictures I could find and stored the scans along with newer digital pictures. They are part of the library, together with copies of music CDs and vinyl records, a few movies and family videos. And some games in case people get really bored. The computer says the library has grown to over 150,000 files, most of which are compressed by lossless algorithms, bringing the total required storage space to around 100GB. That’s a small hard drive, an average SSD, four 32GB SDHC cards, 20+ DVDs or 150 CDs.
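
The media counts above are simple division, and a few lines of Python make it easy to redo the sum for a different library size. This is only a sketch: the capacities are assumed nominal figures (4.7GB for a single-layer DVD, 0.7GB for a CD), and real usable space is a little lower once the file system takes its share.

```python
import math

# Nominal capacities in GB. These are assumed round figures, not measured ones.
MEDIA_GB = {
    "32GB SDHC card": 32,
    "single-layer DVD": 4.7,
    "CD": 0.7,
}

def media_needed(library_gb, capacity_gb):
    """Return how many units of one medium are needed to hold library_gb."""
    return math.ceil(library_gb / capacity_gb)

# Count the units of each medium for a 100GB library.
for name, capacity in MEDIA_GB.items():
    print(f"{name}: {media_needed(100, capacity)}")
```

For 100GB this gives 4 cards, 22 DVDs and 143 CDs, which matches the rough figures quoted above.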

– How do you manage this much information? I do not use a program to manage the library but simply use a folder structure to keep everything in a place where I can find it. For instance there is a ‘books’ folder, a ‘documents’ folder, a ‘pictures’ folder, etc. Each of these folders contains a tree of subfolders to quickly find items. I know: I’m old fashioned but it works and saves me the trouble of having to learn another piece of software that may or may not work (I still remember losing a number of pictures due to buggy picture management software that came with a camera). Besides, if I really can’t find something, Linux has built-in commands to find (the path to) files and to scan any and all documents for keywords.
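
On Linux the built-in commands in question are presumably tools such as find and grep. For readers on another system, the same two searches can be sketched in Python. The function names below are made up for illustration, and the keyword search only works on plain-text files; compressed formats such as most PDFs would need a proper text extractor first.

```python
import fnmatch
import os

def find_files(root, pattern):
    """Yield paths under root whose file name matches a shell-style pattern."""
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            # Compare in lower case so '*.txt' also matches 'NOTES.TXT'.
            if fnmatch.fnmatch(name.lower(), pattern.lower()):
                yield os.path.join(folder, name)

def files_containing(root, keyword):
    """Yield paths of files under root whose raw bytes contain keyword."""
    needle = keyword.lower().encode("utf-8")
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                with open(path, "rb") as handle:
                    # Reads the whole file at once; fine for documents,
                    # wasteful for multi-gigabyte videos.
                    if needle in handle.read().lower():
                        yield path
            except OSError:
                continue  # unreadable file: skip it and carry on
```

Calling list(find_files("library", "*.pdf")) lists every PDF under a (hypothetical) library folder, and list(files_containing("library/documents", "chicken coop")) lists the plain-text files that mention the phrase.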

– So do you need encryption? Well that depends but, realistically, the answer is probably not. If your library consists of KJV, Moby Dick and chicken coop blueprints published by the government in 1922, then you are better off without encryption since that won’t raise anyone’s suspicions. On the other hand, if you are carrying around guerrilla warfare planning documents … you are probably in way over your head if you are looking for advice here. Please keep in mind that weak encryption is worse than no encryption, because you may rely on the encryption to keep your secrets whereas un-encrypted info won’t give you that false sense of security. FWIW I don’t use encryption on my library except for folders containing personal info and password vaults.

– Should you rely on CD and/or DVD disks for your library? As H335 pointed out, you will be dealing with bit rot. This can be somewhat alleviated by storing archival-type (= relatively expensive) disks in a cool, dark, dry place, but even that is not foolproof according to test data available on the internet. Do I use disks as backup? Yes, but I keep three copies of all documents in my library on three different media: a very reliable old hard drive, DVD disks and SDHC cards. Surely something will survive!
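
Keeping three copies only helps if you notice when one of them has gone bad. One way to check, not described in the letters so far and offered here purely as a sketch, is to compare SHA-256 checksums of every file on two copies. The function names are illustrative.

```python
import hashlib
import os

def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def manifest(root):
    """Map each file's path relative to root to its SHA-256 digest."""
    result = {}
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            result[os.path.relpath(path, root)] = sha256_of(path)
    return result

def compare_copies(master_root, copy_root):
    """Return (missing, changed): files absent from the copy, and files that differ."""
    master = manifest(master_root)
    copy = manifest(copy_root)
    missing = sorted(p for p in master if p not in copy)
    changed = sorted(p for p in master if p in copy and master[p] != copy[p])
    return missing, changed
```

A mismatch between two copies does not say which one rotted. With three copies, the file whose checksum disagrees with the other two is the likely casualty.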

– Why SDHC cards? They are small (=easy to hide), cheap and reliable. All you need for them to work is a good quality USB reader. Don’t buy any reader that costs less than $10-$15 or you *will* regret it. For the cards themselves, try to buy units that carry a lifetime warranty [for the best price you can find]. The really nice thing about the cards is that they are re-programmable. Apart from being able to delete unwanted documents, this greatly enhances their longevity. Here is how that works: their data retention is usually specified as 5-20 years depending on the quality of the parts used. They should also allow a minimum of 3000 write cycles before the cells wear out. So to be on the safe side, I refresh the data (=copy to another card) once every year or so and can, conservatively, do so 1,000 times. I think they will outlast my needs … Because of their small size I am not really worried about EMP damage, but it doesn’t take much to protect them properly. If you want/need something really tiny, get a microSD card. They are about half the size of a fingernail and just as thin, and have the same storage capacity as regular SDHC cards. Easy to lose but might come in handy if you want to sew them into your coat. If you don’t mind something bigger than SDHC cards, USB sticks (in many disguises) can be used the same way.

– Do you really need printers, paper, toner, etc.? You might if you plan to be holed up in your fortress and expect to be without power for extended periods of time. In that case I suggest you start printing now while supplies and power are still cheap and plentiful. My philosophy is that I may need to leave in a hurry without the possibility of dragging paper around, so I have made no provisions for printing large quantities of documents. Nor do I care to leaf through hundreds of printed pages looking for a passage or table when the computer can find it much quicker. However, if you plan to be teaching a community group, for example, there are legitimate reasons to stock up on supplies.

– By going fully paperless I will need something that can read and display the stored information. A full desktop computer will do nicely, especially on your retreat, but may not be the best solution. Laptops and tablets use less space and energy.

– Laptops. I usually keep two of them around. They are identical so if one dies I can use the other one and have spare parts for it. My personal preference is to use Dell Latitudes because they are plentiful (=cheap) and have worked well over the years for me. I also know how to take them apart and fix them, which helps. ThinkPads (formerly IBM, now Lenovo) also have a good reputation. If you go shopping for a laptop: look for an off-lease business laptop – they are made with premium grade components and all the bad apples have been weeded out long before they come off-lease. Do *not* buy a pallet full of laptops for $50; chances are none of them will work when you plug them in for the first time. 30%-50% of the units should be salvageable but only if you know how. Your $50 is better used for buying a laptop that has been tested and is guaranteed not to be DOA (plenty of those listings on eBay). As a rule these laptops have their hard drives wiped and a fresh install of the OS. If the hard drive wasn’t wiped there is no real reason to go out and have it professionally wiped. This was a good idea in years past when we had low capacity drives. However, hard drives that were built in the last 3-4 years use very narrow magnetic tracks that can be effectively wiped by simply having your computer overwrite them once with new data, as shown by blind testing in data recovery labs. Of course there is a downside to this: your own data can be lost that much more easily too … Don’t go for the latest and the greatest. Older laptops are built better and have sturdier electronics because they are built on larger process nodes. Single core machines are just fine for what you will likely use them for. I still have a laptop that is over 10 years old. I only use it for programming micro-controllers, which means it gets lugged around all over the place, but it’s doing just fine and I am less afraid of breaking it than the newer ones. It even gets 4 hours run time out of today’s higher capacity batteries. The downside is that I need to run Windows 98 or something like Puppy Linux because it’s underpowered for almost any other OS.

– Tablets. I have been thinking of getting one but have a hard time justifying the purchase. Their big attractions are small size, light weight and energy efficiency, which is important if you don’t have too much power available. But … they are throw-away electronics. Especially the ones where you cannot replace the battery. Under normal daily use/nightly charge cycles the battery should give out in about a year (you might still get 1-2 hrs run time on a charge but nowhere near advertised spec.). So you are either tethered to your charger or can go buy a new one. That’s assuming you haven’t run into any of the wear-and-tear issues associated with today’s high performance/small footprint/passive cooling designs. So if I need to keep laptops around as backup in (the somewhat likely) case that the tablet fails, why not just stick with the laptops? The second thing I am not too keen on is that most tablets (and smart phones for that matter) work as personal tracking devices in their default configurations. And they are really good at it. Having said that, if you already own one and it has an SD card reader or accessible USB port, there is nothing wrong with using it with your library. Just don’t depend on it as your only reader.

– Windows XP. I noticed it mentioned in some posts. This product is fine to use as the operating system for your library reader provided you understand the risks. From April 8, 2014 onward Microsoft will no longer support it. Without security updates you will be a sitting duck for viruses and other types of attacks. So you should only use it on computers that are not connected to the internet, which may not be a problem when SHTF. However, SHTF also means you will not be able to re-activate your copy should your computer crash or need a new hard drive, CPU, etc. For these situations there is a solution. Make sure you have downloaded and stored a piece of software called AntiWPA. You install this right after you install Windows XP. It works by starting Windows in safe mode and switching to normal mode once you are past the activation code check. Your Windows license is not tied to your activation code but to your machine. Assuming you bought your machine with a retail copy of Windows or the machine came with a COA sticker, you are not doing anything illegal by using AntiWPA to start your machine. If your machine came with a COA sticker (likely if it’s an off-lease business laptop), make sure you make or download your own CD with a copy of Windows (or any other OS) and know how to install it or know someone who does. Just adding a how-to document to your library will lead to some very unpleasant moments/thoughts when the computer tells you it can’t find a bootable hard drive. As for me, I still use Windows XP occasionally to reliably run some older programs and create my tax returns. But it lives inside a Virtual Machine (VM) without access to the internet. Its universe is restricted to the 10GB file on a hard drive in which it resides. If you are really concerned with (internet) security, take a look at a program called VirtualBox. It is surprisingly stable and easy to use and comes with sane defaults, so you can just click your way through the initial setup wizard to get started. And if you mess up, you delete the file and start over again till you get it right … which works great for practicing OS installs too.

– What about data security? There are many aspects to this question, most of which you either can’t or needn’t deal with. Here I will highlight three: local data storage, cloud and internet use.

– Local storage security. Data security of your locally stored information can be achieved to a reasonable degree if you wish to do so. If you want to add a digital layer of protection to your locally stored information, the most important aspect is your password. It needs to be long, unusual and contain numbers and punctuation marks. Password cracking software tends to incorporate lists of often used passwords or even a dictionary because trying those first yields far better results than applying brute force techniques due to people’s common password choices. It also needs to be long because top-of-the-line graphics cards (think Radeon HD7970 @$350-$400) can find any password of less than 9 characters via brute force in 2 days or less. The next model (due out in October) is expected to do it about 30% faster. At any rate a 12-15 character password should be safe for the next few years. In case the government confiscates your disks to look at them, I doubt any type of encryption available to you will stand up against their attempts. And please give sufficient thought to how and where you store your backups. Under a slab of concrete is far more secure than in a kitchen cupboard.
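
Why length matters so much is easiest to see as arithmetic: every extra character multiplies the number of candidate passwords by the size of the alphabet. The sketch below assumes 95 printable ASCII characters and a guess rate of 4e10 per second. That rate is a hypothetical figure, chosen only so that the 8-character case lands near the two-day estimate quoted above; it is not a measured benchmark of any particular card or hash.

```python
def keyspace(max_length, alphabet_size=95):
    """Number of candidate passwords of length 1 up to max_length."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))

def days_to_exhaust(max_length, guesses_per_second, alphabet_size=95):
    """Worst-case days needed to try every candidate up to max_length."""
    seconds = keyspace(max_length, alphabet_size) / guesses_per_second
    return seconds / 86400

ASSUMED_RATE = 4e10  # hypothetical guesses per second, not a measured figure

for length in (8, 9, 12, 15):
    print(f"up to {length} characters: {days_to_exhaust(length, ASSUMED_RATE):.3g} days")
```

Each step from one length to the next multiplies the search time by about 95, which is why a handful of extra characters buys far more than a faster graphics card takes away. None of this helps against a dictionary attack if the password is a common word or phrase.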

– Cloud security. Assume it doesn’t exist and that the cloud is as transparent as glass. This goes for both data storage and information processing in the cloud. If you don’t believe me, read the fine print in the ‘terms of use’ you are agreeing to. Some companies use OSS cloud software, which lowers your risk somewhat, but you still have to traverse the internet. For example, I saw someone touting the virtues of removing EXIF data from pictures before posting or emailing them. He had the right idea: I never send any picture out without stripping all its EXIF data. Then he mentioned this could be easily done in the cloud: all you have to do is send your picture over and it would come back to you in stripped format. You just have no idea how many copies were made before the EXIF data was stripped. For real OPSEC you want to download something like ExifTool and process your image at home.
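
To show that stripping EXIF data at home needs nothing exotic, here is a minimal sketch in plain Python that walks the segments of a JPEG file and drops the ones carrying Exif data. It is an illustration under stated assumptions, not a replacement for a dedicated tool: it handles only ordinary JPEG files, and it leaves other kinds of metadata (XMP, IPTC, comments) untouched.

```python
import struct

EXIF_HEADER = b"Exif\x00\x00"

def strip_exif(jpeg_bytes):
    """Return a copy of a JPEG byte string with its Exif APP1 segments removed."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file (missing start-of-image marker)")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            raise ValueError("unexpected byte where a segment marker should be")
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, copy the rest as-is.
            out += jpeg_bytes[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        if length < 2:
            raise ValueError("corrupt segment length")
        segment = jpeg_bytes[pos:pos + 2 + length]
        is_exif = marker == 0xE1 and segment[4:].startswith(EXIF_HEADER)
        if not is_exif:
            out += segment  # keep every segment except Exif
        pos += 2 + length
    raise ValueError("truncated JPEG: no start-of-scan marker found")
```

Usage would be along the lines of reading the picture with open("photo.jpg", "rb"), passing the bytes through strip_exif, and writing the result to a new file (the file name is just an example). Always check the output with a metadata viewer before sending it anywhere.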

– Data security during transmission outside your computer should not be assumed, as has been documented by Mr. Snowden et al. However, there are a few things you can do to lower your risk because not all software is created equal. Running DOS might be fine because it pre-dates the time that the internet was a household word. It’s just that it’s kind of useless in that it won’t run any program that is capable of rendering today’s web pages. All other Microsoft operating systems, MS Internet Explorer and Apple products are suspect and I don’t use them to get on the internet if I can help it. Unfortunately I feel I also have to put Google’s Android and Chrome OS on this list. So what’s left to lower your risk? Basically something called Open Source Software (OSS); this means that the source code of the programs that you run is freely available for download by anyone interested in improving the code, looking for bugs, back doors, etc.

The premier OSS operating system is Linux. But Linux by itself isn’t much fun: you will also need a desktop environment and apps to do something useful. Examples of Linux-based systems are Android, Ubuntu, Fedora, Mint, etc. (The reason I mentioned Android as suspect is that its user interface comes with [closed source] binaries that cannot be inspected). If you want to try a Linux flavor for the first time: download a free copy of Linux Mint 13 (codename Maya and supported till 2017) because it has the most Windows-like user interface of all Linux distros. It even has the familiar ‘start’ button, though they call it ‘menu’. Burn the ISO image to a DVD and start your computer from that DVD – guaranteed virus free for the lifetime of the DVD and also a very useful approach should your system crash after the grid goes down. Alternatively, you can use a program called UNetbootin to load the image on a USB stick and start your computer from there. Mint comes with Firefox as the default browser and includes media players (including VLC), document viewers, PDF readers and an office suite out of the box. It also has a software center for additional app downloads.

I would be remiss if I didn’t explicitly point out that ‘lowering your risk’ is not the same as ‘taking away your risk’. For example, using Linux will lower your risk of running into a virus or giving easy access to your documents via a backdoor. Sending an encrypted email lowers your risk of people other than the recipient reading it. The stronger the encryption, the lower your risk. However, in the last few weeks a number of valid concerns have been raised that the NSA has spent a lot of effort making sure that various internet encryption protocols were designed in such a way that their implementations would be easy for them to crack. In such a scenario a properly written OSS app without a known backdoor still would not provide adequate protection against NSA efforts. In layman’s terms: depending on the contents of your encrypted messages you may want to consider using carrier pigeons instead of the Internet. – D.P.