Backup and Recovery
Database backup, restore, and recovery are critical processes for any business in the marketplace.
Database Backup Strategy
The backup strategy depends largely on the amount of time the system is permitted to be offline after any type of system crash.
A well-implemented database backup and recovery strategy is the cornerstone for every deployment, making it possible to restore and recover all or part of a database without data loss.
In the Oracle landscape, there are several options available to support a backup strategy. The basic concepts of four well-known strategies are described in the subsections below.
Cold database backup (offline backup)
In order to do an offline backup, it is necessary to shut down the database because the files must remain unchanged. All database files, redo logs, and control files are backed up to a secure medium.
Cold database backups are not commonly used as a backup strategy for three reasons: point-in-time recovery is not possible from a cold backup alone, the database is unavailable during the backup, and shutting down the Oracle database means that all in-memory information is lost.
Hot database backup
The 'hot backup' strategy supports backing up a running database. It is a three-phase, manual process:
- Put the tablespace in backup mode by issuing the statement: alter tablespace <TABLESPACE_NAME> begin backup;
- Back up the data files of the tablespace by OS copy.
- Take the tablespace out of backup mode by issuing the statement: alter tablespace <TABLESPACE_NAME> end backup;
Note that Oracle continues to write to the database's data files even while a tablespace is in backup mode. While the tablespace is in backup mode, Oracle writes full changed blocks to the redo log files; normally, only deltas (changes) are logged. This makes it possible to reconstruct a block if only half of it was backed up (a split block). As a result, expect increased log activity and archiving during online backups.
When the procedure is complete for all permanent tablespaces, force an archive log switch, create a control file backup, and copy all archive logs and control files to the backup destination. This approach backs up data files, control files, and archive log files. During the restore process, the data files affected by the failure and all the associated archive log files should be restored.
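The procedure above can be sketched as a small Python helper that only generates the ordered commands and does not run anything. This is an illustration under stated assumptions, not part of the Stibo Systems scripts: the function name, the "SQL>" and "$" prefixes, the use of cp for the OS copy, and the control file backup name control.bkp are all hypothetical, and copying the archive logs to the backup destination is left out.

```python
from typing import Dict, List


def hot_backup_steps(tablespaces: Dict[str, List[str]], backup_dir: str) -> List[str]:
    """Return the ordered commands for a manual hot backup.

    tablespaces maps a tablespace name to the data files that belong to it.
    Lines starting with "SQL>" are meant for SQL*Plus, lines starting with "$"
    for the OS shell. Nothing is executed here.
    """
    steps: List[str] = []
    for name, datafiles in tablespaces.items():
        # Phase 1: put the tablespace in backup mode.
        steps.append(f"SQL> alter tablespace {name} begin backup;")
        # Phase 2: copy the data files of the tablespace with an OS copy.
        for path in datafiles:
            steps.append(f"$ cp {path} {backup_dir}/")
        # Phase 3: take the tablespace out of backup mode.
        steps.append(f"SQL> alter tablespace {name} end backup;")
    # After all tablespaces: archive the current log and back up the control file.
    # The target file name is an assumption made for this sketch.
    steps.append("SQL> alter system archive log current;")
    steps.append(
        f"SQL> alter database backup controlfile to '{backup_dir}/control.bkp' reuse;"
    )
    return steps
```

For example, calling hot_backup_steps({"USERS": ["/database/db1/users01.dbf"]}, "/backupspace/hot") yields five commands; the tablespace name and data file path are made up for the example. Keeping each tablespace in backup mode only for the duration of its own copy limits the period of increased redo logging described above.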
The advantage of this backup strategy is that you can still work on the database while it is being backed up, meaning that the 'normal' window for backup is expanded time-wise. Hot database backups are also an optimal choice if SAN snapshots are being used as part of the overall strategy.
Database backup using Oracle Recovery Manager (RMAN)
The Oracle Recovery Manager (RMAN) backup tool backs up, restores, and recovers database files, and supports incremental backups. Incremental backups not only shorten backup times but also improve recovery operations, since RMAN decides which combination of full, incremental, and archive-log backups produces the optimal recovery path.
Recovery Manager maintains information about the backup files in either a recovery catalog or the database control file, thereby freeing the DBA from having to track all the backup copies and archive logs (except for keeping track of tapes). It then uses the stored information about the backups to automate both restore and recovery operations, which greatly reduces the risk of accidentally restoring from the wrong backup.
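To make the idea of a recovery path concrete, the sketch below picks which backups would be restored for a given target time. It is a simplified model and not RMAN's actual algorithm: the Backup class, the integer timestamps, and the assumption that level 1 backups are differential (each one holds the changes since the previous backup) are introduced here for illustration only. Changes made after the last selected backup would still be applied from archive logs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Backup:
    taken_at: int  # hypothetical timestamp, for example hours since a reference point
    level: int     # 0 = full (level 0) backup, 1 = incremental (level 1) backup


def restore_chain(backups: List[Backup], target: int) -> List[Backup]:
    """Return the latest level 0 backup at or before target,
    followed by every later level 1 backup up to target, oldest first."""
    candidates = [b for b in backups if b.taken_at <= target]
    fulls = [b for b in candidates if b.level == 0]
    if not fulls:
        raise ValueError("no level 0 backup exists at or before the target time")
    base = max(fulls, key=lambda b: b.taken_at)
    incrementals = sorted(
        (b for b in candidates if b.level == 1 and b.taken_at > base.taken_at),
        key=lambda b: b.taken_at,
    )
    return [base] + incrementals
```

With a weekly full backup and daily incrementals, a target time in the middle of the week resolves to that week's full backup plus the incrementals taken since, which is the bookkeeping RMAN takes over from the DBA.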
The advantages of using RMAN compared to hot database backups are:
- It is part of Oracle’s core software and it handles the entire backup - in other words, both the reading and writing of data blocks, which makes it more efficient than the manual OS copy methods used for hot database backups.
- It administers the backups from a repository, which is stored in the control file.
- It handles both restore and recovery.
Note: The only reasonable backup strategies for a critical system are online backup methods. Though these methods can run online, they will typically degrade performance. For this reason, backups should ideally be performed during maintenance windows only.
Disk backup
This backup strategy is a complete backup of each individual hard disk. This process could be very time consuming, depending on the number of disks. For this reason, this technique will normally be more suitable as an application backup strategy.
Backing up to disk first and then to tape
It is highly recommended that the database be backed up to a designated disk partition, and afterwards, be written to tape for long-term storage and disaster recovery purposes.
Storing tapes off site
The ideal backup strategy ensures that there are multiple tapes available in the event an initial tape is bad, corrupted, or similar. In addition, tapes should be stored in multiple, off site safety facilities to ensure backup availability at all times.
Test restore frequently
It is good policy to assume that a backup that has not been restored and tested is not a good backup. In other words, it is bad practice to overwrite a tape before having tested that a more recent backup can successfully be restored.
Exclude online Oracle files from general OS backup
Backing up the Oracle database is tricky and should only be done with the provided hot backup scripts or a third-party backup tool that has database plug-ins. If general backup software locks an Oracle data file during a file copy while Oracle is running, the database will shut down fatally. Oracle data files can be backed up safely only when Oracle is shut down.
The only exception to this rule is the hot backup script, which uses special functionality provided by Oracle to safely copy files while the database is running.
Backup responsibilities
It is the responsibility of the system administrator to perform the following backup tasks:
- Ensure that the database backup script is run every night (or other agreed-upon schedule) and that the content of the 'backupspace' partition is backed up to tape every night (or other agreed-upon schedule) after the script has finished.
- Make sure that the generated backup log files do not contain any errors that may indicate that the backup was not successful.
- Be certain that the area designated for the hot backup resides on disks that have enough storage capacity to hold the backup.
- Be sure that the hot backup folder includes all files necessary to perform a complete Oracle database restore.
- Certify that all the files in the hot backup folder are backed up to tape, that they can be restored, and that the Oracle database can be completely restored from said tape.
- Restore the contents of the hot backup folder (in the event of a system disaster) to hardware that is similar in configuration to that of the failed system.
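Two of the checks in the list above, scanning the backup log for errors and confirming free space in the hot backup area, lend themselves to automation. The following is a minimal sketch, assuming the log is plain text in which failures appear with the standard Oracle (ORA-) or RMAN (RMAN-) error prefixes; the log format of any particular backup script is not confirmed here, and the function names are hypothetical.

```python
import re
import shutil
from typing import List

# Oracle and RMAN error messages start with these prefixes followed by a number,
# for example "ORA-19502" or "RMAN-03009".
ERROR_PATTERN = re.compile(r"\b(?:ORA|RMAN)-\d+")


def find_backup_errors(log_text: str) -> List[str]:
    """Return every log line that contains an ORA- or RMAN- error code."""
    return [line for line in log_text.splitlines() if ERROR_PATTERN.search(line)]


def has_enough_space(staging_dir: str, required_bytes: int) -> bool:
    """Return True if the volume holding staging_dir has at least required_bytes free."""
    return shutil.disk_usage(staging_dir).free >= required_bytes
```

An empty result from find_backup_errors does not prove that the backup is usable; it only means that no error codes were logged. The restore test described in the list remains the real verification.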
Stibo Systems Recommendations
Stibo Systems recommends the following storage layout combined with one of the following backup strategies.
Storage layout
Storage area should be divided into four separate areas:
- Operating System (OS)
- Other software (SW), for example STEP and Oracle
- Database files only (DB)
- Online backups (staging area)
No OS / SW files should be placed on volumes used for DB / staging and vice versa. The staging area must be large enough to hold a full backup of the database – including the archive log files and incremental backups.
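The sizing rule for the staging area is simple arithmetic, sketched below. The headroom percentage is an assumption added for this example and is not a Stibo Systems figure; all sizes are in the same unit (for instance GB).

```python
def required_staging_size(
    full_backup: int,
    archive_logs: int,
    incrementals: int,
    headroom_percent: int = 20,
) -> int:
    """Return the staging area size needed to hold a full backup, the archive
    log files, and the incremental backups, plus a safety margin.

    headroom_percent is a hypothetical margin; the result is rounded up.
    """
    base = full_backup + archive_logs + incrementals
    # Integer arithmetic with rounding up avoids floating-point surprises.
    return (base * (100 + headroom_percent) + 99) // 100
```

For example, a 500 GB full backup with 50 GB of archive logs and 50 GB of incremental backups needs 600 GB before any margin and 720 GB with the assumed 20 percent headroom.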
Strategies
- Hot backup is performed using the script mentioned in the RMAN backup script section below. The script uses RMAN to perform a hot backup to disk (staging area). The resulting backup must then be included in the file system backup.
- Backup is performed using third-party backup software (BSW), for example NetVault, Legato, or Veritas. With BSW, backups can be taken in two ways:
  - Using the BSW's built-in RMAN integration, which allows RMAN to perform hot backups directly to a tape device. This saves the extra cost of creating a staging area. This approach has three disadvantages: the license for using RMAN via BSW is usually expensive; the BSW must be present on the target environment if the database is to be cloned / restored on other servers; and backing up directly to tape will most likely increase the restore time.
  - Using an RMAN script to perform hot backups to disk (staging area), and thereafter using the BSW to back up the OS, SW, and staging area volumes to a tape device.
Whether daily / weekly database backups are to be performed as full or incremental backups depends on the size of the database, the backup window, and the time allowed to restore / recover the database in case of disaster. In general, Stibo Systems recommends RMAN for internal database backups, but in practice, the scheduling and type of backups are to be decided based on the Service Level Agreement (SLA), database size, and backup window.
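The trade-off between database size and backup window can be estimated with a back-of-the-envelope calculation. The sketch below is a deliberate simplification: it assumes a constant backup throughput, assumes that an incremental backup takes time proportional to the amount of changed data, and ignores the restore time objective, which must be checked separately against the SLA. All names and parameters are hypothetical.

```python
def fits_window(size_gb: float, throughput_gb_per_hour: float, window_hours: float) -> bool:
    """Return True if backing up size_gb at the given throughput fits the window."""
    return size_gb / throughput_gb_per_hour <= window_hours


def choose_daily_backup(
    db_size_gb: float,
    daily_change_gb: float,
    throughput_gb_per_hour: float,
    window_hours: float,
) -> str:
    """Suggest 'full' if a full backup fits the window, otherwise 'incremental'
    if the daily changes fit, otherwise 'neither'."""
    if fits_window(db_size_gb, throughput_gb_per_hour, window_hours):
        return "full"
    if fits_window(daily_change_gb, throughput_gb_per_hour, window_hours):
        return "incremental"
    return "neither"
```

For instance, a 2000 GB database with 100 GB of daily changes, backed up at 200 GB per hour in a 4-hour window, comes out as "incremental", because a full backup would need about 10 hours.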
Note: For special-purpose backups (for example, if a database copy needs to be sent to Stibo Systems for implementation or support-related activities), the recommended strategy is to use STEP's own export / import utility, STEP DB Toolbox, which is described in the STEP DB Toolboxes Guide and Examples documentation.
Backup strategies are always subject to discussion since they depend on various parameters such as database size, server architecture, storage capacity, and uptime requirements. Regardless of which backup strategy is chosen, it should always be carefully tested before being considered reliable.
Backup Procedures
Focusing on the STEP system, the backup procedure must be designed to back up enough data from the system to restore any type of stored application data hierarchy.
RMAN backup script
Stibo Systems has written a hot backup script called rman_backup.pl as an alternative to a script written by the client / customer. However, upon first use of the Stibo Systems script (including, but not limited to, its configuration), the client / customer understands and accepts the disclaimer below.
When using the Stibo Systems backup script:
- All backup files are placed in a sub directory on the partition labeled 'backupspace' – also known as the staging area (the default path is /backupspace/<ORACLE_SID>).
- An RMAN restore script is generated.
Important: When using the rman_backup.pl script, the client / customer accepts that Stibo Systems cannot be liable for any use whatsoever of the script. Therefore, it is also the sole responsibility of the client / customer to ensure that the result of any backup is fully operational and useful after changes made to the aforementioned script or the STEP software – including, but not limited to patching, updating, configuring, etc.
File systems that must be backed up
To be able to restore the system, the following file systems must be backed up during cold backup:
| File System (default) | Description |
|---|---|
| Local disk | System volume |
| /home /opt | Contains STEP installation and third-party software |
| /workarea | Workarea for STEP |
| /upload | Hotfolders for STEP |
| /backupspace | Oracle RMAN backup |
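Before configuring the file system backup, it can help to confirm that the default paths from the table actually exist on the server. The sketch below is an illustration only: it treats /home and /opt as two separate paths, leaves out the "Local disk" row because it has no path, and assumes the defaults have not been changed in the installation.

```python
import os
from typing import Callable, Dict, Iterable, List

# Default paths taken from the table above; adjust them if the installation differs.
REQUIRED_FILE_SYSTEMS: Dict[str, str] = {
    "/home": "STEP installation and third-party software",
    "/opt": "STEP installation and third-party software",
    "/workarea": "Workarea for STEP",
    "/upload": "Hotfolders for STEP",
    "/backupspace": "Oracle RMAN backup",
}


def missing_paths(
    paths: Iterable[str],
    is_dir: Callable[[str], bool] = os.path.isdir,
) -> List[str]:
    """Return the paths that are not existing directories.

    is_dir can be replaced, which makes the check easy to try out
    without touching the real file system.
    """
    return [path for path in paths if not is_dir(path)]
```

Calling missing_paths(REQUIRED_FILE_SYSTEMS) on the server returns the default paths that are absent, which are the ones to look up in the actual installation before they are added to the backup job.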
The following file systems can optionally be backed up during cold backup:
| File System (default) | Description |
|---|---|
| /database/db1 | Oracle data files containing the database files and control files |
| /database/db2 | Oracle data files containing the database files and control files |
| /database/db3 | Oracle data files containing the database files and control files |
| /database/dbredo1 | Oracle online redo log files |
| /database/dbredo2 | Oracle online redo log files |
Backup frequency
Stibo Systems recommends extending the chosen strategy to include some sort of backup covering the entire system – especially the installed software and its configuration information. The minimum recommended backup schedule is as listed below.
| Data type | How often | Notes |
|---|---|---|
| OS | When changed, minimum once a week | |
| Applications | When changed, minimum once a week | Normally located on the <STEP_HOME> volume (default is /opt). |
| Configuration | When changed, minimum once a week | Normally located on the <STEP_HOME> volume (default is /opt). |
| Database | Daily | Complete including data, configuration, and archive logs. Use hot backup scripts. |
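The minimum schedule in the table can be turned into a simple overdue check. The sketch below models only the time-based part of the schedule (weekly or daily); the "when changed" condition is not covered, and the function and variable names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Maximum allowed age of the latest backup per data type, from the table above.
MAX_BACKUP_AGE: Dict[str, timedelta] = {
    "OS": timedelta(weeks=1),
    "Applications": timedelta(weeks=1),
    "Configuration": timedelta(weeks=1),
    "Database": timedelta(days=1),
}


def overdue_backups(last_backup: Dict[str, datetime], now: datetime) -> List[str]:
    """Return the data types whose latest backup is missing or older than allowed."""
    return [
        data_type
        for data_type, max_age in MAX_BACKUP_AGE.items()
        if data_type not in last_backup or now - last_backup[data_type] > max_age
    ]
```

A data type with no recorded backup is reported as overdue, so a newly added system shows up in the result until its first backup has run.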
Recovery
There is no easy answer to a recovery strategy since it depends on the type of failure. For example, a disk crash, a corrupt file system, and a bad block in the database can each call for a different solution.
In any situation where there is a system failure, Stibo Systems strongly recommends contacting Stibo Systems Technical Support in order to recover as quickly and seamlessly as possible. As an additional resource, the backup script mentioned in the RMAN backup script section generates an RMAN restore script for use when restoring and recovering the database is required.