I’ve heard people say many times that they don’t use full or bulk-logged recovery model or backup the transaction log of their database because they do not care about point-in-time (PiT) restores. However, PiT restores are not the only reason to use full-recovery model nor is it the only reason to back up the transaction log if you are using full or bulk-logged recovery model. This question came up again yesterday on the Spiceworks SQL Server forum.
Someone who is not a DBA — his company doesn’t have one — stated that he thought they needed to use full recovery model but did not understand why he needed to perform incremental backups of the transaction log since he is not interested in performing point-in-time restores. Since he had clearly stated he had no knowledge of how backups and transaction logs work, I posted a brief primer on SQL Server backups for him.
SQL Server Backups Primer
In SQL Server there are 3 general categories of backups, full backup, differential backup, and transaction log backup. It’s a much more complex and in-depth topic than what I’ll cover here, but this is a good primer. I’ll explain each:
Full backup: A full backup backs up the entire database (you can do this by just filegroup as well, but we’ll ignore that for now). It backs up all data and settings in the database and includes the minimal amount of the log file that would be required to bring the database to a consistent state.
Differential backup: A differential backup is like a subset of a full backup. There are special tracking pages in a data file that tracks which extents get modified. These pages (called differential map pages) only get reset by taking another full backup. At any given time, a differential backup will back up every extent that has been changed (in any way) since the last full backup. The differential backup will always include certain key extents that contain critical database information (the differential map pages will always show these extents as changed to ensure they get backed up) and enough log to bring the database to a consistent state.
Transaction log backup: A transaction log backup only backs up the active portion of the transaction log. Data files are broken up into 8 KB pages that are then grouped into blocks of 8 pages (64 KB) called extents. Log files have neither of these. Log files are broken down into virtual log files (VLFs). If a single record in a VLF is active, then the entire VLF is considered active and the entire VLF will be backed up. If you have many VLFs that are active, you will have a larger log backup.
When the database is in full recovery model or bulk-logged recovery model, the ONLY way for an active VLF to get marked as inactive (and therefore reusable for additional logging) is to back up that VLF in a log backup. There are misconceptions that a full or differential backup will make this change, but it’s not true. Also, to be marked as inactive, there can’t be any active transactions using the VLF or any VLFs before it. And there can’t be any dependencies that have not processed all records in the VLF (replication, CDC, database mirroring, Availability Groups). One reason you back up the log is to maintain it’s reusability and avoid growing the log file.
Another reason you back up the log file is so that you can replay those transaction if you have a disaster and need to recover the database. You can restore the full backup, any differential backup if exists, and then all log file backups since the last restored backup to bring the database as close to current as possible. It prevents data loss.
With regards to recovery and data loss, there is a term we use called Recovery Point Objective (RPO) which meas the amount of data you are willing to lose in the event of a disaster. If your RPO is 30 minutes, then your log backup frequency needs to be every 30 minutes or less (20, 15, 10, 5, etc). The frequency of the log backup is the potential for data loss if you have to recover from backup.
Log backups also support other functionality such as point-in-time restores and page/file/filegroup level (piecemeal) restores. Point-in-time (PiT) restores are critical for scenarios such as someone accidentally deletes critical data or drops a critical table. You don’t want to restore the whole database over the existing one, so instead you restore the database to a different server or with a different name to the point in time prior to when the delete/drop occurred so you can copy the data back into the live database. A PiT restore requires that the stopping point is in a transaction log backup, not a full or differential.
You would perform a piecemeal restore in a scenario where corruption has occurred and you want to recover without data loss. I can restore a specific page, file, or filegroup from a good backup of the database, but in order for that page/file/filegroup to come back online, it has to be brought current with the rest of the database. This means I am required to restore the transaction log files created since whatever backup I restored the page/file/filegroup from was created. All of them, including taking a tail log backup (backup of the current live database) and restore it. SQL Server is smart enough to only apply transactions to the piece of the database that is offline and ignore the rest.
- To summarize, you back up the transaction log for these reasons:
- To be able to reuse the log file and avoid growth of it
- To avoid data loss
- To support functionality like point-in-time restores and recovery from corruption without data loss