No Backups, Big Problem
I figured I’d end the year with a big bang of a post.
Whenever I talk about high availability or disaster recovery, I spend a bit of time on backup strategies (among other things). People sometimes give me a weird look like, “You should be talking to me about clustering, mirroring, and everything else, Mr. Clustering MVP. Let’s go!”. The reality is without backups, you have nothing. My blanket piece of advice is always this: if you do one thing right, do your backups well.
I know of customers (some of them quite large) who are still not doing things like:
- Copying backups to another location after they are made
- Testing the procedures to find and acquire the backups (especially if they were multi-tape or disc sets)
- Testing a restore of the backup
- Applying proper organizational techniques
Why are these important? Well, it’s simple. I’m going to use music to tie it back in to SQL Server because the music industry has some interesting parallels. As some of you may know, I’m also a musician. Growing up largely in the digital age, most of what I’ve done has been in the digital domain since 1994, when I purchased my first DAT (digital audio tape) deck, a Denon DTR-80P, as a post-college graduation present to myself. I still have it and it works well. My second DAT deck (a dual-well Tascam) was a piece of junk compared to it – and it was a piece of studio gear!
DAT in terms of a music format was killed for copyright reasons in the consumer market, but flourished in the pro/semi-pro market. It’s a direct descendant of things like the Sony PCM-F1, which allowed you to record PCM digital audio direct to Betamax machines. My first All South Jersey Junior High jazz concert was recorded by one of the conductors with this combo. A part of me wants an F1 for the heck of it, because it’s just so cool.
For those of you who care about such things, DAT is a 2-track (i.e., stereo) medium. ADAT (Alesis) and DA-88 (Tascam) were the affordable digital multi-track formats for pro recording in the 90s for many digital-based music projects, as hard drives were not large or cheap enough yet. There were digital multitrack machines going back to the late 70s/early 80s from Mitsubishi (such as the X80, which was used to record Donald Fagen’s classic The Nightfly, but had an odd sample rate of 50 kHz in early machines), 3M, and the Sony DASH machines.
Laserdiscs came along in the 70s, and were the precursor to the compact disc we all know … and don’t universally love. The CD gave way to DVD, and now Blu-Ray (let’s forget about HD-DVD). In the 30-odd years digital has been a “mainstream” format, we have been promised that digital would last forever with no problems. Right? Wrong.
Rolling Stone just published an interesting article entitled “File Not Found: The Record Industry’s Digital Storage Crisis” which highlights a big part of the problem (but not all of it). In this digital-based music world, where anything physical is now looked upon as “outdated”, there are big issues. I’m not going to get into a sound quality debate nor am I going to lecture on the importance of legally acquiring music, as that’s outside the scope of the goal of this post. I think you can tell where I probably sit on those issues, anyway.
Where’s The Backup?
Obviously the title is a riff on the old Wendy’s commercial here in the USA, but it is appropriate. One of the biggest issues the music industry has faced in the past 10 – 20 years is finding the original tapes (mono, stereo, or multitrack) so things like remastering, archiving, surround sound projects, etc. could happen. These tapes are the collective IP, or data, of each of these companies no matter how big or small they are. No tapes means no revenue and no way to bring things into future media formats.
You may have heard tales where artists, labels, or studios threw out or destroyed tapes years ago just to make room somewhere. Well, a lot of that is true – just read the bit here on the Who song “I Don’t Even Know Myself”. A few years back, Steely Dan didn’t pursue a surround SACD for their seminal album Aja because they couldn’t find the complete set of multitrack masters. We’re not talking about some obscure, non-hit album with cult status; we’re talking one of the most popular and well received albums (arguably) of the past 30+ years by critics and fans alike.
Another problem is that a lot of the music industry is now using a third party like Iron Mountain to store their precious masters and they are not kept onsite. If the tape is labeled wrong, or is just stored without any thought or oversight from the record companies, finding a tape (whether it’s the stereo or a multi-track master) will be like finding a needle in a haystack. There are famous stories about master tapes being found labeled with words like “do not use”, while safety copies were used all these years for subsequent releases.
Something more of you may relate to is your music collection that you may have culled for your portable player such as an iPod. Whether you are using iTunes or just storing the music files on a hard drive somewhere, many people have amassed large collections which amount to terabytes of music. For those who ripped the music from their own discs and didn’t purchase the music from online stores, you have invested a lot of time. Some of those folks even assume their files are safe, so they dump their physical media and just keep the digital file. Do a quick search using your favorite internet search engine and you’ll see that there are many stories recounted in forum posts, blogs, etc., about how someone’s music player or hard drive died, and they lost their entire collection. It’s not uncommon now to hear of people making multiple backups of their files.
Now apply all of this to your own world of SQL Server and underlying Windows systems:
- Make backups at a frequency that matches your stated recovery point objectives (RPOs) and recovery time objectives (RTOs). Can you afford <insert number of hours/days/months> of data loss? If the answer is no and you are making backups less frequently than that, you’ve got a problem. One size does not fit all. Someone may not think your system is mission critical, but if your business will come to a standstill if it’s down, well, do the math on that. Time to make up some new rules, boys and girls. When it comes to system and system state backups, depending on your processes and other factors, sometimes it is easier to just do a bare metal install. These days a base Windows Server 2008/2008 R2 installation is VERY quick. This does not mean you should never back up the systems or their critical files, but do what is right for your environment.
- If you are doing backups to disk for your SQL Server databases using native tools or third party ones from companies like Redgate, Quest, etc., are the resulting backup files being copied elsewhere? If not, you have a single point of failure. It’s great that you made the backup, but what happens if the drive it sits on dies? No more backup. How does that work for your RTOs and RPOs?
- In most companies, the process for copying the file and archiving it is usually the responsibility of another group in IT outside of the DBAs and the Windows folks. Do you know the process for getting that backup? Have you tested it? At the end of the day, it’s your job – not the backup operator’s – to get the database or system back up and running. If you have not done anything, including having at least a working relationship with the backup operators, you might as well have a pair of dice around to roll when it comes time to get an older backup to restore for whatever reason. If it is due to a major outage, you may even want to have a resume handy. I’m not kidding. This is a bigger epidemic in environments that use network-based backup programs and do no local backups at all. Things are really out of a DBA’s hands there. Good luck to you if you’re in that situation, have a 1TB database, and you’ve got a 15 minute RPO.
- Forget the process for getting the backup – do you even know where it is? Do you keep track of where you store them? Are the locations standardized? Are the names of the backups standardized, and do they include things like dates? I mean, you’ll never do any kind of restore if you can’t find the backup files. You have to know what you’ve got to make critical decisions. I talk about things like this in my older 2005 HA book which was not just a clustering book. My new Denali-based book will have a lot of these types of discussions as well …
- Let’s assume you’ve got backups and you know where they are. Do you test the backups? This is arguably my biggest pet peeve with nearly every customer. Making a backup just generates a backup. It does not mean it is good. The only way to test a backup is to do a restore of it. That may be impractical to do every night, and especially if you’ve got thousands of DBs and/or larger ones, nearly impossible, but for your mission critical systems with low RPOs and RTOs, it’s suicide not to test them at least periodically. Plus, you can answer the age old question from your C-level, “So how long will the restore take?” when you’re in that scenario.
- Do you ensure that your backups are not a single point of failure? As I mentioned earlier, some people generate their backups and they live in one place, and at some point, older ones are deleted to make room for the new ones. That’s all well and good, but if these backups are not being copied elsewhere and you lose track of where they are or the drive fails, you’ve got nothing. It’s your job to ensure that not only are you protected at the instance and database level with things like clustering and mirroring, but your database backups are just as redundant to a degree.
- Archiving backups is a big issue for some who are required by regulation to keep them for something like 10 years. That’s a lot of space, whether it be tape, disk, or some optical medium. Along the same lines, have you planned properly for your current backup storage needs? I find it’s often overlooked in system planning.
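A few of the checks in the list above are easy to automate. What follows is only a minimal Python sketch under stated assumptions, not a recommended tool: the naming convention (server, database, backup type, timestamp), the function names, and the idea of comparing SHA-256 hashes after a copy are all illustrative choices of mine, and the backup times are assumed to come from wherever you already track them.

```python
import hashlib
import shutil
from datetime import datetime, timedelta
from pathlib import Path


def backup_file_name(server: str, database: str, kind: str, taken_at: datetime) -> str:
    """Build a standardized, sortable file name that includes the date.

    The pattern and the .bak/.trn extensions are a hypothetical convention.
    """
    extension = "trn" if kind == "LOG" else "bak"
    return f"{server}_{database}_{kind}_{taken_at:%Y%m%d_%H%M%S}.{extension}"


def rpo_violations(backup_times, rpo: timedelta, now: datetime):
    """Return the (start, end) gaps between backups, and up to now, that exceed the RPO."""
    points = sorted(backup_times) + [now]
    return [(a, b) for a, b in zip(points, points[1:]) if b - a > rpo]


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so a large backup does not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, second_location: Path) -> Path:
    """Copy a backup file to a second location and confirm the copy matches the original."""
    second_location.mkdir(parents=True, exist_ok=True)
    target = second_location / source.name
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Copy of {source.name} does not match the original")
    return target
```

Calling rpo_violations with your recorded backup times and a 15 minute RPO returns every window in which you would have lost more than 15 minutes of data. Keep in mind that a matching hash only tells you the copy is identical to the original file. It says nothing about whether the backup is good; only a test restore does that.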
Forensic Science
Rolling Stone’s tagline says, “Vinyl and analog tapes last forever, but hard drives fail and digital formats change”. That’s a bit of a fallacy in the music world. Anyone who has worked with analog tape knows that at some point, it can become unplayable. Analog tapes can shed oxide (which essentially holds its “data”). Analog tapes can become moldy if stored improperly, and may not be able to be cleaned (it’s noxious stuff, too). There is a process called baking which will allow the oxide to re-bond with the tape material so it can be played again, but it may be its last play. The point is to get it so that it can be archived in some other format (digital or analog). That’s the reality of analog. Forever? Not in my world.
Even standard audio cassettes (you remember those, right?) have issues over time: they become brittle, suffer from dropouts, etc. The cassette was a convenient format (much like MP3 has become the new de-facto standard for portable music, much to the chagrin of some), but hardly perfect. It was a good consumer format that had a good life.
Digital audio on a tape-based medium has problems as well. For example, it can (and often does) suffer from digital rot. Basically, it is when the tape deteriorates (for whatever reason) and loses the information; if it is playable at all, you hear what is akin to white noise. This is somewhat like dropouts on an analog tape, except because it is 0s and 1s, the information isn’t coming back. This has happened to me; one of my first recording projects was lost to the sands of time because of this. I only discovered it when I tried to digitize it and burn a CD-R in the late ’90s … and we’re talking about a recording session from maybe 5 or so years before.
Two of the jazz albums I recorded between 1997 and 2001 (which will soon be available for sale again, but only as digital downloads; you can contact me or post a comment if you want more info) were recorded on ADAT. In 2005/6, I wanted to remix them and archive the masters. Well, that was easier said than done. I had the original ADATs, but playing them was another story. Some were problematic, but luckily they were playable at least once. There was a very real chance I was not going to be able to do what I needed to do.
This also brings out another problem: formats, playback equipment, and software programs and versions. A lot of the music (as well as video) industry records on formats popular at the time, but later may no longer be in use. The music industry is lacking in overall standards (although some are prevalent, such as MIDI and SMPTE for synchronizing tapes as well as music to video), so that complicates things.
The first thing you need to worry about is having the right hardware to play back the tape, and ensuring it is in good repair. Older formats may no longer be in use anywhere, so locating a machine, let alone parts, can be a tricky AND expensive endeavor. Even ubiquitous formats such as ADAT – only out of favor for a few years – are getting harder to deal with because everybody dumped their machines and jumped into computers and hard drives.
Then we come to software. Many of the modern programs we use today such as Cubase have their roots in the 80s on computers like the old Atari ST. I was an Amiga man myself, but you get the point. Even Macs, which have been used in the creative world for a long time, have gone through multiple iterations of processors and configurations, so a piece of software designed, say, to work on an old PowerMac won’t necessarily work well on a newer Intel one, if at all. And the newer software may not even be backwards compatible, so files created on an older version, even if salvaged, may not be able to be opened. Add to that the use of things like software-based plugins instead of hardware processors for audio, and that complicates things even more because the plugins may only work with a particular version of their host. This also assumes your tapes/disks/hard drives are still playable, too. Are we having fun yet?
So let’s assume for a moment you’ve solved the format, hardware, and/or software problems, especially for more modern records which have been recorded on computers with things like the infamous ProTools. Do you have extensive session notes about the mixes and which bits of audio are in use? Do you have all of the files needed? Is everything logged and labeled properly? Chances are the answer is a resounding NO since no one thought of that while tracking, mixing, or mastering. Trying to do something with those files later will be a bit like alphabet soup.
This has come to the forefront with games like Rock Band and Guitar Hero where bands are giving these companies their multis to put into the game, but they are either missing master tapes (old analog or digital) or certain audio files (hard drive) are gone and lost to the sands of time. So the bands either do full re-records, or just re-record the missing bits.
A great article on all of this is “Remixing Depeche Mode In Surround” from my favorite magazine, Sound on Sound. Particularly look at the section “Archives & Pre-production”. It touches on nearly every aspect I mention in this blog post.
So what does this mean for you in the non-music, IT world?
- Media dies. Good example: USB keys. They’re convenient, but have a shelf life. Another example: hard drives. Heads can become frozen. Yet another example: tapes of any kind. We’ve all had one unfurl on us. You get the idea. Don’t assume that the medium you are using now will still be playable in 5 years. Test reading from them, especially for milestone backups you need to have around for whatever reason.
- Keep accurate documentation about your backup sets and what is needed to recover your systems or databases completely. I have some scripts which you can download right here on this site from my 2005 book which will generate the RESTORE command for you based on the types of backups you used and what is in msdb. Someone else who had no involvement with the backups may be called on to restore them, so having detailed information is critical. This will also help identify which tapes and disks (or location in a vault) are needed in addition to the files themselves, and where you can locate them. All of this matters in a server down situation where you need to restore from bare metal. If you’re missing stuff, well, you may be out of luck and a resume may be nice to have.
- Like the music industry, there has been no shortage of hardware, backup formats, and software. Do you have the right stuff to restore that 6 year old tape (assuming it’s playable)? Probably not. That tape drive was probably decommissioned and sitting in some storage closet a few years ago, and most likely given away or thrown out at some point. eBay may be your friend, but it won’t always yield what you need. Software may be the hardest part of that equation, as hardware can usually be found to make the restore. You may have a later version of, say, NetBackup, but it may not be backwards compatible with the format of the backup on the tape you’ve got. You may not even have that old version lying around. At that point, chances are you are out of luck. I’m not trying to be harsh, but that’s reality.
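To illustrate why documenting your backup sets matters, here is a rough Python sketch of how a restore sequence can be worked out from backup history. To be clear, this is not the script from my book. The BackupRecord class, the restore_script function, and the file paths are hypothetical, and the history is assumed to have already been pulled from something like msdb. It also simplifies heavily: it orders backups by finish time, whereas a real script would follow the LSN chain and deal with striped, copy-only, and multi-file backups.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BackupRecord:
    kind: str              # 'D' = full, 'I' = differential, 'L' = log
    finished_at: datetime
    path: str              # hypothetical location of the backup file


def restore_script(database: str, history: list[BackupRecord]) -> list[str]:
    """Return RESTORE statements: latest full, latest differential, then later logs."""
    fulls = [b for b in history if b.kind == "D"]
    if not fulls:
        raise ValueError("No full backup in the history; nothing to restore from")
    full = max(fulls, key=lambda b: b.finished_at)

    # Only a differential taken after the chosen full backup is usable with it.
    diffs = [b for b in history if b.kind == "I" and b.finished_at > full.finished_at]
    diff = max(diffs, key=lambda b: b.finished_at) if diffs else None

    # Log backups are applied in order, starting after the last full or differential.
    base_time = diff.finished_at if diff else full.finished_at
    logs = sorted(
        (b for b in history if b.kind == "L" and b.finished_at > base_time),
        key=lambda b: b.finished_at,
    )

    steps = [full] + ([diff] if diff else []) + logs
    script = []
    for backup in steps:
        verb = "LOG" if backup.kind == "L" else "DATABASE"
        script.append(
            f"RESTORE {verb} [{database}] FROM DISK = N'{backup.path}' WITH NORECOVERY;"
        )
    # Bring the database online only after every backup has been applied.
    script.append(f"RESTORE DATABASE [{database}] WITH RECOVERY;")
    return script
```

The useful side effect of generating the sequence ahead of time is that the list of files it names is exactly the list of tapes, disks, or vault locations someone has to be able to find when the server is down.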
So as you can tell, backup and restore is a serious matter, and much more complicated than it appears to be. You need to be vigilant, especially if it’s your butt on the line.