Backup and Recovery

Database backup, restore, and recovery are critical processes for any business in the marketplace.

Database Backup Strategy

The backup strategy depends largely on the amount of time the system is permitted to be offline after any type of system crash.

A well-implemented database backup and recovery strategy is the cornerstone for every deployment, making it possible to restore and recover all or part of a database without data loss.

In the Oracle landscape, several options are available to support a backup strategy. The basic concepts of four well-known strategies are described in the subsections below.

Cold database backup (offline backup)

In order to do an offline backup, it is necessary to shut down the database because the files must remain unchanged. All database files, redo logs, and control files are backed up to a secure medium.

Cold database backups are rarely used as a backup strategy: point-in-time recovery is not possible, the database is unavailable during the backup, and shutting down the Oracle database discards all in-memory information.

Hot database backup

The 'hot backup' strategy supports backing up a running database. It is a manual, three-phase process that is carried out for each tablespace:

  1. Put the tablespace in backup mode by issuing the statement alter tablespace <TABLESPACE_NAME> begin backup;
  2. Back up the data files of the tablespace with an OS copy.
  3. Take the tablespace out of backup mode by issuing the statement alter tablespace <TABLESPACE_NAME> end backup;

Oracle continues to write to the database's data files while a tablespace is in backup mode. During this time, Oracle writes complete changed blocks to the redo log files, whereas normally only the deltas (changes) are logged. This makes it possible to reconstruct a block if only half of it was copied during the backup (a split block). As a result, expect increased log activity and archiving during online backups.

When the procedure is complete for all permanent tablespaces, perform an archive log switch, create a control file backup, and copy all archive logs and control files to the backup destination. This approach backs up data files, control files, and archive log files. During the restore process, the data files affected by the failure and all the associated archive log files should be restored.
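The manual sequence described above can be sketched as a small Python helper that only builds the ordered list of steps as text. It never connects to a database or copies files. The function name hot_backup_steps, the example tablespace names, and the control file path are hypothetical. The alter system archive log current and alter database backup controlfile to statements are standard Oracle SQL that this document does not itself name, so treat their use here as an assumption about how the log switch and control file backup would be done.

```python
def hot_backup_steps(tablespaces, controlfile_copy):
    """Return the ordered steps of a manual hot backup as text lines.

    Hypothetical helper for illustration only: it builds strings and does
    not talk to Oracle or touch the file system.
    """
    steps = []
    for name in tablespaces:
        steps.append(f"alter tablespace {name} begin backup;")
        # Placeholder for the OS-level copy of this tablespace's data files.
        steps.append(f"-- OS copy of data files for tablespace {name}")
        steps.append(f"alter tablespace {name} end backup;")
    # Once every tablespace is done: archive the current log, then
    # back up the control file (assumed statements, see lead-in).
    steps.append("alter system archive log current;")
    steps.append(f"alter database backup controlfile to '{controlfile_copy}';")
    steps.append("-- OS copy of archive logs and control file backup to the backup destination")
    return steps


if __name__ == "__main__":
    # USERS and STEPDATA are made-up tablespace names.
    for line in hot_backup_steps(["USERS", "STEPDATA"], "/backupspace/control.bkp"):
        print(line)
```

Keeping begin backup and end backup adjacent per tablespace limits the time each tablespace spends in backup mode, which is when the extra redo is generated.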

The advantage of this backup strategy is that the database remains available for work while it is being backed up, which widens the usable backup window. Hot database backups are also an optimal choice if SAN snapshots are being used as part of the overall strategy.

Database backup using Oracle Recovery Manager (RMAN)

The Oracle Recovery Manager (RMAN) backup tool backs up, restores, and recovers database files, and supports incremental backups. Incremental backups not only shorten backup times but also improve recovery operations, since RMAN decides which combination of full, incremental, and archive log backups produces the optimal recovery path.

Recovery Manager maintains information about the backup files in either a recovery catalog or the control file of the target database, thereby freeing the DBA from having to track all the backup copies and archive logs (except for keeping track of tapes). It then uses the stored information about the backups to automate both restore and recovery operations, which greatly reduces the risk of accidentally restoring from the wrong backup.
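As a rough illustration of the incremental backup and automated restore described above, the following Python sketch generates RMAN command text. It is not the content of any Stibo Systems script, and the function names are hypothetical. The RMAN commands themselves (backup incremental level, backup current controlfile, restore database, recover database) are standard RMAN syntax, but the choice of levels and the run block layout are assumptions.

```python
def rman_backup_script(level):
    """Build RMAN command text for an incremental backup.

    Level 0 is the baseline for later incrementals; level 1 backs up
    blocks changed since the previous level 0 or level 1 backup.
    """
    if level not in (0, 1):
        raise ValueError("level must be 0 or 1")
    return "\n".join([
        "run {",
        f"  backup incremental level {level} database plus archivelog;",
        "  backup current controlfile;",
        "}",
    ])


def rman_restore_script():
    """Build RMAN command text for a full restore and recovery.

    RMAN itself selects which backups and archive logs to apply.
    """
    return "\n".join([
        "run {",
        "  restore database;",
        "  recover database;",
        "}",
    ])


if __name__ == "__main__":
    print(rman_backup_script(0))
    print(rman_restore_script())
```

Note that the restore text names no backup files at all; selecting them is left to RMAN's own repository.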

The advantages of using RMAN compared to hot database backups are:

Note: The only reasonable backup strategies for a critical system are online backup methods. Though these methods can run online, they will typically degrade performance. For this reason, backups should ideally be performed during maintenance windows only.

Disk backup

This backup strategy is a complete backup of each individual hard disk. The process can be very time-consuming, depending on the number of disks. For this reason, the technique is normally more suitable as an application backup strategy.

Backing up to disk first and then to tape

It is highly recommended that the database be backed up to a designated disk partition, and afterwards, be written to tape for long-term storage and disaster recovery purposes.

Storing tapes off site

The ideal backup strategy ensures that multiple tapes are available in case a tape is defective or corrupted. In addition, tapes should be stored in multiple off-site facilities to ensure backup availability at all times.

Test restore frequently

It is good policy to assume that a backup that has not been restored and tested is not a good backup. In other words, it is bad practice to overwrite a tape before having tested that a more recent backup can successfully be restored.
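The overwrite rule above can be expressed as a simple check. This is a minimal sketch: the Backup record and the may_overwrite function are hypothetical names, and how a restore test is recorded is an assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Backup:
    taken_on: date
    restore_tested: bool  # True only after a successful test restore


def may_overwrite(tape, all_backups):
    """Allow reuse of a tape only if a more recent backup passed a restore test."""
    return any(
        b.restore_tested and b.taken_on > tape.taken_on
        for b in all_backups
    )


if __name__ == "__main__":
    old = Backup(date(2024, 1, 1), restore_tested=True)
    newer_untested = Backup(date(2024, 1, 8), restore_tested=False)
    print(may_overwrite(old, [old, newer_untested]))
```

In this sketch an untested newer backup does not count, so the older tape stays protected until a restore test succeeds.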

Exclude general OS backup from online Oracle files

Backing up the Oracle database is tricky and should only be done with the provided hot backup scripts or a third-party backup tool that has database plug-ins. If general backup software locks an Oracle data file during a file copy while Oracle is running, Oracle will shut down fatally. With general backup software, Oracle data files can be backed up safely only when Oracle is shut down.

The only exception to this rule is the hot backup script, which uses special functionality provided by Oracle to safely copy files while the database is running.
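One way to apply this rule is to filter the path list handed to the general OS backup job. The sketch below assumes the Oracle data and redo log files live under /database, as in the default file system layout listed in this document. The function names are hypothetical, and real backup software will normally have its own exclude mechanism.

```python
from pathlib import PurePosixPath

# Default Oracle volume root from this document's layout; adjust as needed.
ORACLE_ONLINE_DIRS = ["/database"]


def is_oracle_online_path(path, oracle_dirs=ORACLE_ONLINE_DIRS):
    """Return True if path is one of the Oracle directories or lies below one."""
    p = PurePosixPath(path)
    return any(
        p == PurePosixPath(d) or PurePosixPath(d) in p.parents
        for d in oracle_dirs
    )


def general_backup_paths(candidates, oracle_dirs=ORACLE_ONLINE_DIRS):
    """Keep only the paths that are safe for the general OS backup."""
    return [c for c in candidates if not is_oracle_online_path(c, oracle_dirs)]


if __name__ == "__main__":
    print(general_backup_paths(["/opt", "/database/db1", "/backupspace"]))
```

Comparing path components rather than string prefixes avoids wrongly excluding a sibling directory whose name merely starts with the same characters.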

Backup responsibilities

It is the responsibility of the system administrator to perform the following backup tasks:

Stibo Systems Recommendations

Stibo Systems recommends the following storage layout combined with one of the following backup strategies.

Storage layout

The storage should be divided into four separate areas:

  1. Operating System (OS)
  2. Other software (SW), for example STEP and Oracle
  3. Database files only (DB)
  4. Online backups (staging area)

No OS / SW files should be placed on volumes used for DB / staging and vice versa. The staging area must be large enough to hold a full backup of the database – including the archive log files and incremental backups.
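The sizing requirement for the staging area amounts to a simple sum. The following sketch uses hypothetical names and made-up example figures; actual sizes must be measured on the system in question.

```python
def staging_area_sufficient(staging_gb, full_backup_gb, archive_logs_gb, incrementals_gb):
    """Check that the staging area can hold a full backup plus archive logs
    and all retained incremental backups.

    Returns (fits, required_gb). All sizes are in gigabytes.
    """
    required = full_backup_gb + archive_logs_gb + sum(incrementals_gb)
    return staging_gb >= required, required


if __name__ == "__main__":
    # Example figures are invented for illustration.
    fits, required = staging_area_sufficient(1000, 600, 150, [40, 55, 35])
    print(fits, required)
```

The check covers only the minimum stated above; in practice some headroom for database growth would be added on top.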

Strategies

  1. Hot backup is performed using the script mentioned in the RMAN backup script section below. The script uses RMAN to perform a hot backup to disk (staging area). The resulting backup files must then be included in the file system backup.
  2. Backup is performed using third-party backup software (BSW), for example NetVault, Legato, or Veritas. When doing so, backups can be taken in two ways:

A disadvantage of performing backups this way is the price of the license for using RMAN via BSW, as this license is usually expensive. Another disadvantage of using BSW for backups is that it requires the software to be present on the target environment in case the database is to be cloned / restored on other servers.

A third disadvantage of this approach is that backing up directly to tape will most likely increase the restore time.

Whether daily / weekly database backups are performed as full or incremental backups depends on the size of the database, the backup window, and the time allowed to restore / recover the database in case of disaster. In general, Stibo Systems recommends RMAN for internal database backups, but in practice, the scheduling and type of backups are to be decided based on the Service Level Agreement (SLA), database size, and backup window.
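The trade-off between database size and backup window can be estimated with simple arithmetic. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the throughput figure must be measured on the actual hardware. It does not cover restore time, which has to be checked against the SLA separately.

```python
def backup_hours(size_gb, throughput_gb_per_hour):
    """Estimate how long a backup of size_gb takes at a measured throughput."""
    if throughput_gb_per_hour <= 0:
        raise ValueError("throughput must be positive")
    return size_gb / throughput_gb_per_hour


def fits_window(size_gb, throughput_gb_per_hour, window_hours):
    """Return True if the estimated backup time fits in the backup window."""
    return backup_hours(size_gb, throughput_gb_per_hour) <= window_hours


if __name__ == "__main__":
    # Invented figures: 2000 GB database, 400 GB/hour, 4-hour nightly window.
    print(fits_window(2000, 400, 4))  # full backup
    print(fits_window(150, 400, 4))   # daily changed data (incremental)
```

If a full backup does not fit the daily window but the daily changed volume does, that points toward a less frequent full backup combined with daily incrementals, subject to the restore-time requirement.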

Note: For special-purpose backups (for example, when a database copy needs to be sent to Stibo Systems for implementation or support-related activities), the recommended strategy is to use STEP's own export / import utility, STEP DB Toolbox, which is described in the STEP DB Toolboxes Guide and Examples documentation.

Backup strategies are always subject to discussion since they depend on various parameters such as database size, server architecture, storage capacity, and uptime requirements. Regardless of which backup strategy is chosen, it should always be carefully tested before being considered reliable.

Backup Procedures

Focusing on the STEP system, the backup procedure must be designed to back up enough data from the system to restore any type of stored application data hierarchy.

RMAN backup script

Stibo Systems has written a hot backup script called rman_backup.pl as an alternative to a script written by the client / customer. However, upon first use (including, but not limited to, configuration of the Stibo Systems script), the client / customer understands and accepts the disclaimer below.

When using the Stibo Systems backup script:

Important: When using the rman_backup.pl script, the client / customer accepts that Stibo Systems cannot be liable for any use whatsoever of the script. Therefore, it is also the sole responsibility of the client / customer to ensure that the result of any backup is fully operational and useful after changes made to the aforementioned script or the STEP software – including, but not limited to patching, updating, configuring, etc.

File systems that must be backed up

To be able to restore the system, the following file systems must be backed up during cold backup:

File System (default) | Description
Local disk | System volume
/home |
/opt | Contains STEP installation and third-party software
/workarea | Workarea for STEP
/upload | Hotfolders for STEP
/backupspace | Oracle RMAN backup

The following file systems can optionally be backed up during cold backup:

File System (default) | Description
/database/db1 | Oracle data files containing the database files and control files
/database/db2 | Oracle data files containing the database files and control files
/database/db3 | Oracle data files containing the database files and control files
/database/dbredo1 | Oracle online redo log files
/database/dbredo2 | Oracle online redo log files

Backup frequency

Stibo Systems recommends extending the chosen strategy to include some sort of backup covering the entire system – especially the installed software and its configuration information. The minimum recommended backup schedule is as listed below.

Data type | How often | Notes
OS | When changed, minimum once a week |
Applications | When changed, minimum once a week | Normally located on the <STEP_HOME> volume (default is /opt).
Configuration | When changed, minimum once a week | Normally located on the <STEP_HOME> volume (default is /opt).
Database | Daily | Complete including data, configuration, and archive logs. Use hot backup scripts.
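The minimum schedule above can be monitored with a small check that flags data types whose last backup is older than the recommended interval. This is a sketch with hypothetical names; it treats "minimum once a week" as a seven-day limit and "daily" as a 24-hour limit, and it assumes the last backup times are collected elsewhere.

```python
from datetime import datetime, timedelta

# Maximum allowed age per data type, derived from the minimum schedule.
MAX_AGE = {
    "OS": timedelta(weeks=1),
    "Applications": timedelta(weeks=1),
    "Configuration": timedelta(weeks=1),
    "Database": timedelta(days=1),
}


def overdue(last_backup, now):
    """Return the sorted data types with a missing or too-old backup.

    last_backup maps a data type to the datetime of its latest backup.
    """
    return sorted(
        kind
        for kind, limit in MAX_AGE.items()
        if kind not in last_backup or now - last_backup[kind] > limit
    )


if __name__ == "__main__":
    now = datetime(2024, 3, 10, 12, 0)
    last = {
        "OS": datetime(2024, 3, 5),
        "Applications": datetime(2024, 3, 1),
        "Configuration": datetime(2024, 3, 9),
        "Database": datetime(2024, 3, 9, 13, 0),
    }
    print(overdue(last, now))
```

A data type with no recorded backup is reported as overdue, so a gap in the records cannot pass silently.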

Recovery

There is no single recovery strategy, since the right approach depends on the type of failure. For example, a disk crash, a corrupt file system, and a bad block in the database each call for a different solution.

In any situation where there is a system failure, Stibo Systems strongly recommends contacting Stibo Systems Technical Support in order to recover as quickly and seamlessly as possible. As an additional resource, the backup script mentioned in the RMAN backup script section generates an RMAN restore script for use when restoring and recovering the database is required.