On- and Off-site Data Protection
ODPS creates rotational auto-snapshots of a dataset or zvol according to a retention-interval plan, and can optionally replicate the snapshot deltas asynchronously to local or remote destinations.
Working modes:
- Asynchronous replication of rotational auto-snapshot deltas to local or remote destinations, where the destination is:
1. Another dataset or zvol within the same ZFS pool.
2. A dataset or zvol on a different ZFS pool.
3. A dataset or zvol on a remote node.
- Rotational auto-snapshots on the local server only.
Note: In the local-only mode, the destination node is simply omitted from the create task command.
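A minimal illustrative sketch of the two working modes, written in Python rather than the ODPS CLI; the field names and example values are hypothetical and only show that the destination part of a task is optional:
<syntaxhighlight lang="python">
from dataclasses import dataclass
from typing import Optional

@dataclass
class BackupTask:
    """Hypothetical model of a backup task definition (illustration only)."""
    source_dataset: str                         # dataset or zvol to snapshot
    retention_plan: str                         # e.g. "1hour_every_10min,3day_every_1hour"
    destination_node: Optional[str] = None      # omitted -> local-only mode
    destination_dataset: Optional[str] = None   # must already exist on the destination

    @property
    def replicates(self) -> bool:
        """The task replicates only when destination details are provided."""
        return self.destination_dataset is not None

# Rotational auto-snapshots on the local server only: no destination given.
local_only = BackupTask("pool0/data", "1hour_every_10min")

# Snapshots created locally and replicated to a destination node.
replicated = BackupTask("pool0/data", "1hour_every_10min",
                        destination_node="192.168.0.20",
                        destination_dataset="pool1/backup/data")
</syntaxhighlight>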
Retention plans:
An ODPS retention-interval plan consists of a comma-separated series of retention-period-to-interval associations: "retention_every_interval,retention_every_interval,retention_every_interval,...".
Example: 1hour_every_10min,3day_every_1hour,1month_every_1day
Both intervals and retention periods use standard units of time or multiples of them, written with the full name or a shortcut according to the following list: second|sec|s, minute|min, hour|h, day|d, week|w, month|mon|m, year|
Rotational auto-snapshots on both the source and the destination are created according to their retention plans. It is possible to have different retention plans for the source and the destination pool.
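The plan format above can be illustrated with a short Python sketch. It is not the ODPS parser, only an assumption-based example of how such a plan string breaks down into retention/interval pairs, using the unit shortcuts listed above:
<syntaxhighlight lang="python">
import re

# Unit shortcuts from the retention-plan description, normalized to seconds.
UNITS = {
    "second": 1, "sec": 1, "s": 1,
    "minute": 60, "min": 60,
    "hour": 3600, "h": 3600,
    "day": 86400, "d": 86400,
    "week": 604800, "w": 604800,
    "month": 2592000, "mon": 2592000, "m": 2592000,   # month approximated as 30 days
    "year": 31536000,                                  # year approximated as 365 days
}

def parse_duration(text):
    """Turn e.g. '10min' or '1hour' into a number of seconds."""
    match = re.fullmatch(r"(\d+)([a-z]+)", text.strip().lower())
    if not match or match.group(2) not in UNITS:
        raise ValueError(f"cannot parse duration: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]

def parse_plan(plan):
    """Split '1hour_every_10min,3day_every_1hour,...' into (retention, interval) pairs."""
    pairs = []
    for item in plan.split(","):
        retention, interval = item.split("_every_")
        retention_s, interval_s = parse_duration(retention), parse_duration(interval)
        # The retention period must always be larger than the interval
        # (a plan such as 1min_every_10min is rejected; see the notes below).
        if retention_s <= interval_s:
            raise ValueError(f"retention must be larger than interval in {item!r}")
        pairs.append((retention_s, interval_s))
    return pairs

print(parse_plan("1hour_every_10min,3day_every_1hour,1month_every_1day"))
</syntaxhighlight>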
Options:
- It is possible to attach, list and detach backup nodes (remote nodes used for asynchronous replication).
- It is possible to create a backup task in the following modes:
- Only source node details provided -> snapshots are created locally, no replication.
- Source and destination node details provided -> snapshots are created locally and replicated to the destination. Both source and destination keep rotational auto-snapshots.
- Optionally, mbuffer can be used (with the mbuffer size parameter) as a tool for buffering the replication data stream; a conceptual pipeline is sketched after this list.
- It is possible to get details of all backup tasks.
- It is possible to delete a backup task.
- It is possible to get status of the ODPS service:
- Service status
- Last entries from logs
- It is possible to debug a backup task by running it in dry mode in order to check what is wrong.
- It is possible to restart all tasks so that their configuration is reset.
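Conceptually, replicating a snapshot delta to a remote node is a zfs send stream, optionally buffered by mbuffer, carried over SSH into zfs receive. The following Python sketch only illustrates that pipeline; it is not the ODPS implementation, and the host name, dataset names and buffer size are assumptions:
<syntaxhighlight lang="python">
import subprocess

def replicate_incremental(src_dataset, prev_snap, new_snap,
                          remote_host, dst_dataset, mbuffer_size="1G"):
    """Stream the delta between two snapshots to a remote node over SSH."""
    # Incremental stream: everything between the previous and the new snapshot.
    send = subprocess.Popen(
        ["zfs", "send", "-i", f"{src_dataset}@{prev_snap}",
         f"{src_dataset}@{new_snap}"],
        stdout=subprocess.PIPE,
    )
    # mbuffer smooths the stream so that slow network phases do not stall
    # the local zfs send; -m sets the buffer size, -q keeps it quiet.
    buf = subprocess.Popen(
        ["mbuffer", "-q", "-m", mbuffer_size],
        stdin=send.stdout, stdout=subprocess.PIPE,
    )
    # SSH carries the stream to the remote node and provides the encryption
    # mentioned in the notes below.
    recv = subprocess.Popen(
        ["ssh", remote_host, f"zfs receive {dst_dataset}"],
        stdin=buf.stdout,
    )
    send.stdout.close()
    buf.stdout.close()
    return recv.wait()
</syntaxhighlight>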
Important notes:
- ODPS is managed via CLI. The ODPS CLI provides detailed syntax help.
- Only GenesisZX can be used as a destination node.
- Replication will not be performed as long as the destination dataset or zvol does not exist; the user needs to create the destination dataset/zvol manually.
- Replication will not be performed if the dataset or zvol on the destination is in use, e.g. by an iSCSI Target with an active session. Data from a particular snapshot can be accessed only via a clone created from that snapshot.
- User snapshots created on the destination dataset or zvol will be deleted by the ODPS service during a rotation round.
- User snapshots created on the source dataset or zvol are not deleted by the ODPS service during a rotation round.
- Snapshots on both source and destination that have been cloned are not deleted by the ODPS service during a rotation round.
- When a replication round fails because a snapshot cannot be replicated to the destination, e.g. caused by:
- lack of communication between nodes,
- a busy dataset on the destination (used e.g. by an iSCSI Target),
- existence of the user's own snapshot with a clone on the destination,
- existence of the user's own snapshot on the destination created before the first replication,
then the source snapshots are not rotated. At the next replication round, once the conditions for replication are met, rotation of snapshots on both source and destination is performed.
- When the nodes have different sets of snapshots (no common snapshot between source and destination), the snapshots on the destination are deleted and re-replicated from the source; see the sketch after these notes.
- The ODPS service is activated when at least one backup task exists in the system (at pool import and system start).
- The ODPS service is deactivated when there are no backup tasks in the system (at pool export).
- Replication to a remote destination is encrypted (SSH).
- ODPS replication processes are killed at pool export/destroy.
- An ongoing replication process is not killed when its ODPS task is deleted; after that process finishes, no further replications take place.
- When a backup plan is created like this: 1min_every_10min, the backup task will not start. The retention time must always be larger than the interval, e.g. 10min_every_1min.
- A source snapshot that is still being replicated (replication of an older snapshot is still in progress) blocks rotation of snapshots on the source. New snapshots, however, are still created according to the scheduled plan.
- When many destinations are used, only one replication is performed at a given time.
- It is not recommended to create your own snapshots on destination nodes, as this may break the synchronization process. If access to the data on the destination is needed, use the snapshots created by ODPS.
- The save settings mechanism does not support ODPS tasks (this does not apply to attached nodes). All information connected with ODPS tasks is stored as dataset properties; if the dataset does not exist, the ODPS task is not restored.
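The rotation and re-synchronization rules above can be summarized in a short Python sketch; the data model and function names are hypothetical and are not the ODPS service code:
<syntaxhighlight lang="python">
def snapshots_to_delete(snapshots, side, expired):
    """Decide which snapshots a rotation round may remove.

    snapshots: list of dicts with keys 'name', 'odps' (created by ODPS or
    by the user) and 'cloned' (has a dependent clone);
    side: "source" or "destination";
    expired: names of ODPS snapshots whose retention period has passed.
    """
    doomed = []
    for snap in snapshots:
        if snap["cloned"]:
            continue                         # cloned snapshots are never rotated out
        if not snap["odps"]:
            if side == "destination":
                doomed.append(snap["name"])  # user snapshots on the destination are removed
            continue                         # user snapshots on the source are kept
        if snap["name"] in expired:
            doomed.append(snap["name"])      # expired ODPS snapshots are rotated out
    return doomed

def latest_common_snapshot(source_snaps, dest_snaps):
    """Newest snapshot present on both sides (lists ordered oldest to newest).

    If there is none, the destination snapshots are deleted and a full
    re-replication from the source is required.
    """
    dest_names = {s["name"] for s in dest_snaps}
    common = [s["name"] for s in source_snaps if s["name"] in dest_names]
    return common[-1] if common else None
</syntaxhighlight>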
Important notes (clustered environment):
- It is possible to use ODPS in a clustered environment when both source and destination are clusters (two cluster pairs). Each source node must then attach to each destination node using its physical IP address, and tasks must be created with the Virtual IP (VIP) of the destination cluster.
- In the following configuration: source (cluster) -> destination (single node), a destination node that is inaccessible over the network will not break failover on the source (cluster).
- Ongoing replication processes are killed when an automatic failover or a manual move is performed. This applies only to the moved pools; replication processes connected with other pools remain and continue to function.
Known issues:
- Source and destination should be of the same type (zvol-zvol, dataset-dataset). It is possible to create a task with, for example, a zvol source and a dataset destination, but when started it reports an error.
- Temporary errors in the GUI: ODPS can cause a temporary inconsistency between the internal GUI cache (used by the API in Python) and the ZFS resources. This can lead to GUI errors about missing snapshots. The issue occurs because ODPS refreshes the cache using a hook script: the cache is refreshed before and after any snapshot is removed by ODPS, but a hook script (the only possible way to refresh the cache from external software) leaves a small window in which the cache is inconsistent. If the GUI requests information about snapshots between the snapshot removal and the cache refresh, it receives information about non-existing items, which may lead to errors. The most common place for this error is the volume edit window, where snapshots are checked in order to lock editing of the name if the volume has any snapshots. The issue is only temporary: once the window is closed and reopened, the cache is consistent again and the error does not reappear.