Introduction
Any data that's of value needs some sort of disaster recovery plan.
The trick is to anticipate the type of failure, determine how much "loss
of data" (if any) you are willing to accept, and then develop your
recovery plan accordingly.
A philosophical note: There is no value in performing
backups (Yes, that's what I said!), the value comes from
performing a restore. That might seem a bit corney, but it's a
good thing to keep in mind.
One more philosophical note: A disaster recovery plan that has
never been practiced is almost worthless... practice makes perfect.
Background
What types of disaster are you planning for?
There are a lot of things that can go wrong... and your disaster
recovery plan needs to address several basic types of disasters
- Physical loss: The data center burned down, somebody stole
the servers, etc
- Physical failure: The server or disks have died
- Malicious: A virus or malicious user has erased your
database content
- Software failure: A patch has failed and the server won't boot
- User errors: A user accidently clobbered a gallizion rows
in the table
The reason for pointing all this out is that the recovery plan is
different for each type of disaster. For example, you plan may
have to include installing SQL Server onto a new hardware platform.
How much data loss are you willing to accept?
You manage the amount of risk for data loss by the SQL Server's
"Recovery Model" and the frequency and type of backup.
- None: Requires a 2nd fail over cluster and Full recovery
model
- 5-10 minutes: Use full recovery model and transaction log
backups
- 4 hours: Use bulk logged recovery model and transaction
log backups
- 1 day: Use simple recovery model and database backups
Note: Fail-over clustering and other "hot standby" techniques are
not covered by this article
Database Recovery Models
Each database operates in one of three "Recovery Models"... which are
designed to allow you some flexibility on configuring the amount of disk
space required to keep the database safe from disaster.
- Simple - doesn't use a log file for transactions
- Bulk-logged - records most transactions individually, but others
in "bulk" fashion
- Full (the default) - records every transaction in a log file
When to use Simple Recovery model:
Let's say you have a database where you are only concerned about the
"current" version of the data. This scenario is similar to a
file-system (NTBACKUP) plan... where your major concern is to quickly
get the database back to a usable state... and you're not too concerned
about losing the data that has occurred since the last backup. In this scenario, you'd rarely (if ever) be asked to restore the
database to a particular "point in time"... so the full backups are
sufficient, and you probably won't need transaction logs at all.
This scenario is perfect for a developmental "play" database.
When to use Full/Bulk-logged Recovery model:
The use of the Simple Recovery model in a production database is
extremely rare. The more likely scenario is that you'd be
asked to restore the database to a particular point in time. For example, the Server and the Database itself are running normally,
but somebody accidently deleted a table from inside a database
application. That's different from a "failure recovery"...
because nothing has failed! This Full recovery model requires
transaction logs, so that you could restore the database to the exact
instant before the "user accident" occurred. The
Bulk-logged Recovery model is similar but only supports a restore to the
end of the log backup (not a "point in time" between log backups)
Recovery Model summary:
| Recovery Model |
Requires a large
transaction log file? |
Restores to |
| Simple |
No |
Last database backup |
| Bulk-logged |
Yes* |
Last log
backup |
| Full |
Yes |
Point in time |
* Not quite as much as Full
Backup Scenarios
Let's start with an example backup plan that has the following
components:
- Full database backups daily at midnight, retain for 4 weeks
- Differential database backups every 6 hours, retain for 2 weeks
- Log backups every 15 minutes, retain for 1 week
- External backup (NTBACKUP, etc) of SQL backup files
This plan allows for an archive of the data for 4 weeks (for security
and auditing purposes) and has the ability to perform a "point in time"
restore for practically any time in the last week. The use
of an external backup means the SQL backup files (the *.bak and *.trn
files) are themselves backed up and placed off site.
A note about retention times
Some folks will look at that example and say "Geez, those retention
times are very long, I don't think I've got enough disk space for that".
The problem is that some disasters go unnoticed for a very long time.
For example, if a malicious user damaged your database in a subtle way,
it might take you several days (perhaps several weeks) to discover it.
You need to retain your backups to cover the time it takes to discover
and audit those kinds of problems.
Note : Log files are required for a forensic audit of database
acivity.
Recovery Operations
As I mentioned as a philosophical note, it's the recovery operation
that really matter. So let's look at how you would perform a
recovery using the sample backup plan above.
Consider this scenario: Today is Friday, and you must restore
the database exactly what it looked like at 7:30 PM on Thursday.
The following are the steps required:
- Restore the full backup from Thursday at midnight
- Restore the 6 AM, 12 Noon, and 6 PM differential backups
- Restore the 6:15 PM to 7:30 PM log files
So, that's 1 full, 3 differentials, and 5 log files.... a total of 9
operations
You should always consider the number of operations that are required
for a restore. Remember, during a disaster, there will be chaos...
the phones will be ringing, people will be complaining, and that's not
the time to have to stop and remember "Oh wait, did I restore log file
#22 or was it #23?" or "I forgot to use NORECOVERY and I have to start
all over again". Trust me... simpler is better.
Point in Time Restore
An the name implies, a "Point in Time" restoration allows you to
recovery a database to an exact instance in time... by either by
specifying the time of day or a "named transaction". This
type of recovery requires that the database use the Full recovery model.
SQL Server Maintenance Plans
You'd typically perform this step from inside a Maintenance Plan:
Documentation Links
|