How to do point-in-time restore in multi-archive deployment

{question}

How can I restore a database to specific point-in-time when having more than one Storage Manager (SM) processes in it?
{question}

{answer}

Scenario

In order to illustrate the proposed approach, we'll consider a sample environment having 2 TE and 2 SM processes for a database named prod.

Our example domain is looking like this:

$ nuocmd show domain
...
Databases:
prod [RUNNING]
[SM] sm-1/172.18.0.4:48006 (Default) [start_id = 0] [server_id = host1] [pid = 39] [node_id = 1] MONITORED:RUNNING
[TE] te-1/172.18.0.5:48006 (Default) [start_id = 1] [server_id = host1] [pid = 39] [node_id = 2] MONITORED:RUNNING
[SM] sm-2/172.18.0.6:48006 (Default) [start_id = 2] [server_id = host2] [pid = 38] [node_id = 3] MONITORED:RUNNING
[TE] te-2/172.18.0.7:48006 (Default) [start_id = 3] [server_id = host2] [pid = 39] [node_id = 4] MONITORED:RUNNING

For simplicity, we'll call the SM processes SM1 and SM2. SM1 is serving Archive0 on Host1 and SM2 is serving Archive1 on Host2.

Our archives are looking like this:

$ nuocmd show archives
Archive: [0] <NO VALUE> : /var/opt/nuodb/archive [db = prod] RUNNING
[SM] sm-1/172.18.0.4:48006 (Default) [start_id = 0] [server_id = host1] [pid = 39] [node_id = 1] MONITORED:RUNNING
Archive: [1] <NO VALUE> : /var/opt/nuodb/archive [db = prod] RUNNING
[SM] sm-2/172.18.0.6:48006 (Default) [start_id = 2] [server_id = host2] [pid = 38] [node_id = 3] MONITORED:RUNNING

Table Partitioning and Storage Groups (TPSG) is not used in this scenario.

As per the backup strategy, journal backup is enabled on engines and daily full online backup followed by regular journal online backups are performed on all archives.

 

Prerequisites

The end goal is to restore the database state to a specific transaction id (1412) which was executed during the past day (2019-08-22T13:49:12). According to the backup strategy described above, there should be already backup sets for all archives from the past day.

As a prerequisite, we have two backup sets from the previous day for each archive:

  • /backup/prod/host1/archive0/backup-set-2019-08-22
  • /backup/prod/host2/archive1/backup-set-2019-08-22

There is another article which explains how to enable journal backup - Enabling journal hot copy with NuoAdmin.

 

Preparation

Check available points in time for restore in both backup sets:

$ nuoarchive restore --report-timestamps /backup/prod/host1/archive0/backup-set-2019-08-22
2019-08-22T13:47:22 1156
2019-08-22T13:47:24 1284
2019-08-22T13:49:12 1412
2019-08-22T13:55:19 1755
...
$ nuoarchive restore --report-timestamps /backup/prod/host2/archive1/backup-set-2019-08-22
2019-08-22T13:47:22 1156
2019-08-22T13:47:24 1284
2019-08-22T13:55:19 1755
...

Note that we have the timestamp of interest (2019-08-22T13:49:12) in the backup set taken on Archive0 only.
The reason could be that SM2 process could have been offline/stopped during that time. 

IMPORTANT: When an SM process is offline and started after some time, it performs a SYNC procedure to fetch all modified atoms since it went down. Those atoms will contain the effects of all transactions that completed during the time that it was down.

In the above scenario, restore could be done using backup only from Archive0 backup set as TPSG is not used. This is equivalent to all SMs serving all storage groups (SGs).

 

For more information on above commands - Restoring Data From Backup Sets.

For more information about online backup - Using Online Backup.

 

Restore

For this example, we are going to restore from Archive0 backup set and start a different database named restored.

As the backup set is located on Host1, we start by restoring it into /var/opt/nuodb/production-archives/restored on Host1.

$ nuoarchive restore --restore-snapshot 1412 \
--restore-dir /var/opt/nuodb/production-archives/restored \
/backup/prod/host1/archive0/backup-set-2019-08-22

Above operation should create database archive ready to be served by a new SM process in our new database.

For more information on above commands - NuoDB Archive - Restoring.

 

Restore post steps

Create the restored archive metadata:

$ nuocmd create archive --db-name restored --server-id host1 \
--archive-path /var/opt/nuodb/production-archives/restored --restored

IMPORTANT: In the above command --restored flag indicate that the archive path directory contains valid archive already. 

Create and start a database:

$ nuocmd create database --db-name restored \
--dba-user dba --dba-password dba \
--te-server-ids host1

At this point, the NuoDB administration layer will start one SM process with initialize false option to serve the restored archive.

Something similar should be shown by show domain command:

... 
restored [state = RUNNING]
[SM] sm-11:48006 [start_id = 15] [server_id = host1] [pid = 34] [node_id = 1] [last_ack = 3.02] MONITORED:RUNNING
[TE] te-11:48007 [start_id = 16] [server_id = host1] [pid = 89] [node_id = 2] [last_ack = 0.79] MONITORED:RUNNING
...

If more SM processes are needed, then they could be started with an empty archive directory.

On Host2:

$ nuocmd create archive --db-name restored --server-id host2 \
--archive-path /var/opt/nuodb/production-archives/restored
$ nuocmd start database --db-name restored --incremental

{answer}

Have more questions? Submit a request

Comments