Since I’ve spent a lot of time this week updating the backup system for my photo archive, I figured it would be a great time to share how I handle my backups. Every photographer handles the backup of his or her files differently, so keep that in mind when reading this. Just because I do things a certain way doesn’t mean you should do it the exact same way.
When I used to shoot film, things were easy; I had one copy of each negative locked away in a filing cabinet and that was it, end of story. Sure, I might have had some prints of my favorite pictures stored elsewhere, but not prints of all the negatives. Once I started scanning my film, life got even better. Now I had a digital copy of the image as well as the negative in the filing cabinet as a backup. There was no such thing as multiple copies of a negative, or having an off-site backup of the negatives. If the building where I kept my negatives burned down, they were lost forever.
Life in the digital photography realm is different. From the moment of their creation, images live in a world of zeros and ones on some form of digital media. I usually shoot tethered to my computer, so my images go straight from the camera to a hard drive, and on those drives they will live forever. I have always been a fan of keeping images on hard drives instead of on CDs, DVDs, Blu-rays, or any other writeable media. The idea of hunting through pages and pages of DVDs in a case to find the images I need never sounded appealing to me. If I had a full-time archivist maybe it would be fine, but I don’t. Likewise, having a closet full of hard drives that I need to dig through when looking for an old job seems inefficient and slow to me. And what happens if I try plugging in that hard drive in a year and it doesn’t work?
The solution that works for me is keeping my entire archive “live” on servers that are accessible over the network at any moment. Is this more expensive than using books and books of DVDs to store my archive? Yes and no. The servers, upkeep, hard drives, and electricity do cost more than a pile of blank DVDs, but when you consider how much I would need to pay someone to burn all those DVDs and keep them organized, suddenly my method becomes much cheaper.
What if a server crashes? What if a hard drive fails? Some people can become downright obsessed with backup and go over the top in how they back up their images. I don’t blame them. As a photographer, my images are my living, and if I lose them, it causes some big problems. One of the reasons some photographers go way overboard with backup is that it’s really easy to do so. When I shot film, there was one copy of that piece of film, but with digital, I can make as many identical copies of my originals as I want. The real question is: how many copies are enough?
I treat new image files differently than old archive image files. From the instant a new image is created, I want to be sure that it is safely protected from hardware failure. I often see photographers shooting to a compact flash card or tethered to a laptop computer’s internal drive. If they have a good assistant, those images will be backed up to an external drive throughout the day, so at least they have a second copy. In the event no backup is made and that computer fails, that photographer is in big trouble. This happens all the time!
This is where RAID comes in. Get ready; things are going to get real techie real quick. I’m going to try to simplify things as much as I can so that they’re easy to understand. First things first: everyone has heard the word, but what is RAID? It stands for Redundant Array of Independent Disks; basically a drive made up of multiple drives. Why would someone want to bother with RAID? Configuring multiple hard drives together as a single RAID drive can make that drive faster, more tolerant of faults, or both. As photographers we want our images to be safe in case of a disk failure, so it’s great that RAID allows us to do this. Since I shoot with an 80-megapixel camera and deal with huge files, increased speed is also a big plus for me. As you can see, RAID drives are incredibly useful.
Since RAID drives combine multiple drives into a single drive, you need some sort of device or software to create this new drive. This is referred to as the RAID controller; it is basically the brain that decides where and how data is stored across the group of drives. There are two kinds of RAID controllers: software RAID and hardware RAID. For example, when you buy an external hard drive or server that supports certain RAID levels, that hardware is creating the RAID. If you plug a few drives into your computer and use Disk Utility or another piece of software to turn those disks into a RAID drive, then that would be software RAID. Generally speaking, hardware RAIDs are faster because a dedicated controller does the work instead of your computer’s processor. Enough about that for now …
In the beginning RAID was quite simple: there was RAID 0 (Stripe) and RAID 1 (Mirror). These days, new RAID configurations are coming out all the time. We have RAID 0, 1, 2, 3, 4, 5, and 6, plus combinations of these such as 1+0 (also written 10) or 5+3. You should absolutely go to Wikipedia and read their description of the various RAID levels, but I’m going to summarize those I feel are most important to photographers here because it’s a lot to take in, especially for non-techie photographers.
RAID 0 – Also referred to as “stripe”. In this configuration you take multiple hard drives (at least 2) and add them together to make one big, faster drive with zero fault tolerance. For example, if I have 4 X 500GB drives, using this configuration I’ll end up with a single 2TB drive. Then, if I copy a 100 Megabyte file to this new drive, the file will be broken up into pieces spread evenly across the 4 drives, so each drive will store 25 Megabytes of it. This allows the drive to hypothetically write 4X faster than a single drive. This is useful for someone who needs to write or read files very quickly. The major downside to this configuration is that if any one of those 4 drives fails, you lose everything, and I mean everything. Like Humpty Dumpty, without all the pieces the files can’t be put back together again.
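For the curious, here is a toy Python sketch of the striping idea. It is only an illustration under my own assumptions, not how a real RAID controller is implemented: the "drives" are plain Python lists, the chunk size is arbitrary, and the function names `stripe` and `reassemble` are made up for this example.

```python
from typing import List, Optional


def stripe(data: bytes, num_drives: int, chunk_size: int = 4) -> List[List[bytes]]:
    """Deal fixed-size chunks of data across the drives in round-robin order."""
    drives: List[List[bytes]] = [[] for _ in range(num_drives)]
    for index, start in enumerate(range(0, len(data), chunk_size)):
        drives[index % num_drives].append(data[start:start + chunk_size])
    return drives


def reassemble(drives: List[Optional[List[bytes]]]) -> bytes:
    """Rebuild the original data; a missing drive (None) makes that impossible."""
    if any(drive is None for drive in drives):
        raise ValueError("a drive is missing, so the stripe set cannot be read")
    num_drives = len(drives)
    total_chunks = sum(len(drive) for drive in drives)
    # Chunk i was written to drive (i % num_drives) at position (i // num_drives).
    return b"".join(drives[i % num_drives][i // num_drives] for i in range(total_chunks))


photo = bytes(range(100))                      # stand-in for the 100 Megabyte file
drives = stripe(photo, num_drives=4, chunk_size=5)
print([sum(len(chunk) for chunk in drive) for drive in drives])  # bytes held per drive
print(reassemble(drives) == photo)
```

Replace any one entry in `drives` with `None` and `reassemble` raises an error, which is the Humpty Dumpty problem in miniature.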
RAID 1 – Also referred to as “mirror”. This configuration is the easiest to understand. Basically, two drives are combined together and the same information goes onto both at the same time. This adds redundancy but almost no speed improvement. Write speed will be the same as if you were using a single drive, but read speed will increase slightly because you can read from both drives at the same time. This is the most common configuration photographers use for image capture drives. With 2 X 500GB drives, you end up with a single 500GB mirrored drive. If I copy a 100 Megabyte file onto the new drive, the same 100 Megabyte file is copied onto both drives. If one drive fails, no worries; all the information is on the other drive in the mirror.
RAID 5 or 6 – Also referred to as “stripe with parity”. This is the most complicated RAID configuration to understand, but one of the most used in servers. RAID 5 needs at least 3 drives to be created. Without getting into the real techie details, all your data is spread out amongst many drives, but if any one drive fails, all the data is still safe. With RAID 6 you need at least 4 drives, and if any two drives fail the data is still safe. The exact detail of how this works is very complicated and if you want to know more about how parity drives work, check out this link. These RAID configurations offer lots of options and a great balance between fault tolerance and increased speed. If you have 4 X 500GB drives in a RAID 5 configuration, your new drive will be 1.5TB in size (you lose one drive’s worth of space to parity). If you have 4 X 500GB drives in a RAID 6 configuration, your drive will be 1TB in size (you lose two drives’ worth of space to parity).
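If you like to see the arithmetic, the Python sketch below does two things. First, it reproduces the capacity numbers from the examples above, assuming all drives are the same size. Second, it shows the basic XOR trick behind single parity, which is the idea RAID 5 builds on. It is a simplified illustration only: real RAID 5 rotates parity across all the drives, and RAID 6 adds a second, different parity calculation that is not shown here. The function names are made up for this example.

```python
from functools import reduce
from typing import List


def usable_capacity_gb(level: int, drive_count: int, drive_size_gb: int) -> int:
    """Usable space for the RAID levels discussed here, assuming equal-size drives."""
    if level == 0:
        return drive_count * drive_size_gb        # no redundancy at all
    if level == 1:
        return drive_size_gb                      # every drive holds the same data
    if level == 5:
        return (drive_count - 1) * drive_size_gb  # one drive's worth of parity
    if level == 6:
        return (drive_count - 2) * drive_size_gb  # two drives' worth of parity
    raise ValueError(f"unsupported RAID level: {level}")


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


for level in (0, 5, 6):
    print(f"RAID {level}: {usable_capacity_gb(level, 4, 500)} GB usable")

data_blocks = [b"RAW1", b"RAW2", b"RAW3"]  # one block per data drive
parity = xor_blocks(data_blocks)           # what the parity space would store

# Pretend the second drive died: XOR the surviving blocks with the parity block.
rebuilt = xor_blocks([data_blocks[0], data_blocks[2], parity])
print(rebuilt == data_blocks[1])
```

The point of the second half is that the parity block lets you recompute any one missing block from the others, which is why a single drive failure does not cost you any data.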
So, what is the best RAID configuration? The one that works for your particular backup needs. In my studio I use a combination of RAID 0, 1, 5, and 6 for different parts of my workflow. On this page you can see a couple of diagrams showing how I have set up my backup system. The stages to the system are:
1. Capture – In the studio I capture from my Phase One IQ180 digital back to my Mac Pro tower. In my tower I have a four-drive RAID 5 drive set up as my main capture drive that I call “Shoot Raid”. I am using the Apple Raid Card in the Mac Pro as my RAID controller. As a startup drive I’m also using an SSD for maximum speed for my applications. Thanks to this setup I have redundancy built into the drive I’m shooting to, so that every capture is protected against failure the second I shoot it. If one of the drives fails in my RAID 5 drive, my data is still safe. It would take multiple simultaneous drive failures to destroy all my data, which is highly unlikely.
2. Hourly Backup – I have an external two-drive RAID 0 drive connected via eSATA to my Mac Pro tower. This drive automatically does an hourly backup of Shoot Raid. In the event my tower goes down at any point during a shoot, I can just plug this drive into another computer and be up and running with minimal data loss. Worst case scenario, I lose images from the last hour. I decided to use a RAID 0 setup here because I wanted maximum transfer speed and don’t really need redundancy here. I want the backups to be quick so they don’t get in the way of my shooting.
3. Nightly Backup – Every evening my computer automatically backs up my Shoot Raid to a network server in my studio running an eight drive RAID 6 configuration. It would take three simultaneous drive failures to make this server fail, which is also incredibly unlikely.
4. Weekly Backup – As an extra layer of protection, every Saturday my second set of servers (they used to be my main servers, but they got old and slow, so they became my backup servers when I upgraded) makes a backup of the main servers, including my Shoot Raid backup. This rack is located on a different floor in my studio building, adding an extra level of protection in the event something catastrophic happens in my studio.
5. Annual Priority Backup – I do a backup of my most important files like my portfolio, stock submissions, and any other really important jobs onto external drives and take them home every year.
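To make the trade-off between these stages easier to see, here is a small Python sketch that writes the backup stages down as data and works out the worst-case window of lost work if I had to restore from the freshest backup still available. The class and function names are made up for illustration, and the annual stage is flagged separately because it only covers selected files, not every shoot.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import FrozenSet, List


@dataclass(frozen=True)
class BackupTier:
    name: str
    destination: str
    interval: timedelta
    covers_all_shoots: bool  # False when the tier only holds selected files


TIERS = [
    BackupTier("Hourly", "external two-drive RAID 0 over eSATA", timedelta(hours=1), True),
    BackupTier("Nightly", "studio server, eight-drive RAID 6", timedelta(days=1), True),
    BackupTier("Weekly", "backup servers on a different floor", timedelta(weeks=1), True),
    BackupTier("Annual", "external drives taken home", timedelta(days=365), False),
]


def worst_case_loss(tiers: List[BackupTier],
                    unavailable: FrozenSet[str] = frozenset()) -> timedelta:
    """Longest stretch of new work that could be lost when restoring from the
    most frequent full-coverage tier that is still available."""
    usable = [t for t in tiers if t.covers_all_shoots and t.name not in unavailable]
    if not usable:
        raise ValueError("no full backup tier left to restore from")
    return min(t.interval for t in usable)


print(worst_case_loss(TIERS))                                  # capture drive lost
print(worst_case_loss(TIERS, frozenset({"Hourly"})))           # hourly drive lost too
print(worst_case_loss(TIERS, frozenset({"Hourly", "Nightly"})))
```

Each extra tier you lose pushes the worst case out to the next interval, which is really the whole argument for having more than one.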
After jobs have been published, I move them manually to the corresponding client folders on my archive servers, where they will live forever, and delete them from Shoot Raid. Every year I try to go through old jobs and delete extra files I don’t need, such as light tests or jobs that are no longer relevant. This keeps the size of the archive more manageable.
I realize my setup is not for everyone and I realize it may be a little extreme. But the basics of what I am doing can apply to a lot of different photographers’ workflows, if only on a small scale. To summarize:
The Basic Rules of Photo Image Backup:
1. Never shoot to a non-redundant drive if you care about your pictures. If using a laptop or iMac, shoot to a fast external mirror (RAID 1) drive so that you have redundancy. Don’t trust that your digital tech or assistant will stay on top of backups. It’s easy to find very affordable RAID drives with every connection imaginable these days: FireWire 800, USB 3.0, eSATA, Thunderbolt, etc. There is no excuse for losing images due to a hard drive crash.
2. Automate your backups. It can be as simple as setting up a drive as a Time Machine backup drive connected to your main computer or as advanced as setting up scheduled backups to a network server. I use Carbon Copy Cloner software for my automatic backups. It’s cheap, it’s reliable, and it works flawlessly for the most part.
3. Find a way to back up your backup. This is where you need to ask yourself how far is too far. Do you back up only your most important files, or your entire archive?
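For anyone who would rather script rule 2 than buy software, below is a minimal Python sketch of a one-way backup that copies new or changed files and verifies each copy with a checksum. To be clear, this is not how Carbon Copy Cloner or Time Machine work internally; it is just an illustration of the idea, and the function names are my own invention. It never deletes anything from the destination.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in chunks so large raw files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, destination: Path) -> List[Path]:
    """Copy new or changed files from source to destination; return what was copied."""
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        dst_file = destination / src_file.relative_to(source)
        if dst_file.exists() and file_digest(dst_file) == file_digest(src_file):
            continue                      # already backed up and identical
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 also preserves timestamps
        if file_digest(dst_file) != file_digest(src_file):
            raise IOError(f"verification failed for {dst_file}")
        copied.append(dst_file)
    return copied
```

You would call something like `back_up(Path("/Volumes/Shoot Raid"), Path("/Volumes/Hourly Backup"))` (hypothetical volume names) from whatever scheduler your system provides, so it runs without you having to remember it.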
I hope this incredibly long techie write-up has been helpful. There is a lot of information here and I hope I helped you understand why I do what I do. If you have any questions about my setup, please email me; I’ll be happy to elaborate.