On- and Off-site Data Protection

The OODP feature creates rotational auto-snapshots of a volume (dataset or zvol) according to user-defined retention-interval plans, and asynchronously replicates snapshot deltas to local (on-site) or remote (off-site) destinations.

Optionally, the service can create auto-snapshots only (without replicating to another volume).

The backup volumes are asynchronously created mirrors (copies) of production volumes. At each time interval, a snapshot is created and the delta between subsequent snapshots is sent to the destination backup volume.

Data replication starts at every defined time interval: the new data changes that took place during the interval are replicated to the backup volume. After every interval, the age of the existing snapshots is also checked against their retention time. Snapshots that are older than the user-defined retention plan allows are automatically deleted.
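
To make the rotation mechanics concrete, below is a minimal Python sketch (not the product's implementation) of one snapshot-and-prune round. The dataset name, the odp_ snapshot prefix, and the fixed retention time are assumptions for illustration; zfs snapshot, zfs list, and zfs destroy are standard ZFS commands.

    import subprocess, time

    RETENTION_S = 3 * 86400   # e.g. "keep 3 days" from a 3day_every_1hour rule

    def snapshot_and_prune(dataset="pool/data"):
        now = int(time.time())
        # 1. Create the rotational auto-snapshot for this interval.
        subprocess.run(["zfs", "snapshot", f"{dataset}@odp_{now}"], check=True)
        # 2. Check the age of existing auto-snapshots against the retention time.
        snaps = subprocess.run(
            ["zfs", "list", "-H", "-t", "snapshot", "-o", "name", "-r", dataset],
            capture_output=True, text=True, check=True).stdout.splitlines()
        for snap in snaps:
            if "@odp_" not in snap:
                continue                      # user snapshots are left alone
            age = now - int(snap.split("@odp_")[1])
            if age > RETENTION_S:             # older than the retention plan allows
                subprocess.run(["zfs", "destroy", snap], check=True)

Note how snapshots without the assumed auto-snapshot prefix are left untouched, matching the rotation rules described in the notes below.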

As creating snapshots and replicating small deltas of data does not generate any significant system load, the backup process can work round the clock even during heavy load periods.

For each source volume, a single backup volume or multiple backup volumes can be set up. The retention-interval plan can be defined separately for the source volume and any destination backup volumes.

Only Scale Logic NX2 can be used as a backup destination server.

Storage snapshots are not application memory consistent, but most Windows operating systems and applications (including MS-SQL), as well as Linux virtual machines, running on protected volumes can start from exported and mounted backup volumes without any problem.

If application-consistent snapshots are required, the replication task wizard or CLI provides an option to register the VMware vCenter or vSphere servers that use the protected datastores.

Once vCenter or vSphere servers are registered, all virtual machines running on the protected datastores will receive a VSS (Windows) or freeze (Linux) snapshot request via the VMware API. VMware Tools must be installed on the virtual machines in order to execute API-triggered snapshots.
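
As a rough illustration of what such an API-triggered request looks like on the VMware side, the following pyvmomi sketch asks for a quiesced snapshot (VSS on Windows, filesystem freeze on Linux) of every VM that uses a given datastore. The host, credentials, datastore name, and snapshot name are placeholders; this is not the product's integration code.

    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    ctx = ssl._create_unverified_context()    # lab use only; verify certs in production
    si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                      pwd="secret", sslContext=ctx)
    try:
        content = si.RetrieveContent()
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vim.VirtualMachine], True)
        for vm in view.view:
            if "protected-ds" not in [ds.name for ds in vm.datastore]:
                continue                      # VM is not on the protected datastore
            # quiesce=True triggers VSS (Windows) or a filesystem freeze (Linux);
            # it requires VMware Tools inside the guest.
            vm.CreateSnapshot_Task(name="consistent-snap", description="",
                                   memory=False, quiesce=True)
    finally:
        Disconnect(si)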

If a virtual machine does not require an application-consistent snapshot, it must have the following text entered in its VM Annotations: ##auto-snap-skip##

It is also possible to enter time ranges and days of the week for the auto-snap-skip period.

For example, to skip snapshots on weekdays from Monday to Friday, from 8:45 AM to 12:30 PM and from 1:15 PM to 6:45 PM, the following syntax options are accepted:

##auto-snap-skip##8:45..12:30,13:15..18:45##Mon..Fri##
##auto-snap-skip##08:45..12:30,13:15..18:45##Mon..Fri##
##auto-snap-skip##08:45..12:30,1:15PM..6:45PM##Mon..Fri##
##auto-snap-skip##08:45..12:30,13:15..18:45##Mon,Tue,Wed,Thu,Fri##


It is recommended to put the auto-snap-skip instruction in the first line of the VM Annotations, but it will also work if it is found anywhere in the annotations.
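
To show how the pieces of the syntax fit together, here is a small Python sketch that evaluates an annotation against a point in time. The function names and interpretation details (e.g. inclusive range ends) are assumptions of this sketch, not the product's parser.

    import re
    from datetime import datetime, time

    DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

    def parse_time(text):
        """Accepts '8:45', '08:45' or '1:15PM' and returns a datetime.time."""
        m = re.fullmatch(r"(\d{1,2}):(\d{2})(AM|PM)?", text.strip(), re.IGNORECASE)
        if not m:
            raise ValueError(f"bad time: {text!r}")
        hour, minute, ampm = int(m.group(1)), int(m.group(2)), m.group(3)
        if ampm and ampm.upper() == "PM" and hour != 12:
            hour += 12
        if ampm and ampm.upper() == "AM" and hour == 12:
            hour = 0
        return time(hour, minute)

    def parse_days(text):
        """Accepts 'Mon..Fri' or 'Mon,Tue,Wed' and returns weekday indexes."""
        if ".." in text:
            first, last = (DAYS.index(d) for d in text.split(".."))
            return set(range(first, last + 1))
        return {DAYS.index(d) for d in text.split(",")}

    def should_skip(annotation, now=None):
        """True if the VM's ##auto-snap-skip## instruction matches 'now'."""
        m = re.search(r"##auto-snap-skip##(?:([^#]+)##(?:([^#]+)##)?)?", annotation)
        if not m:
            return False                  # no skip instruction at all
        ranges, days = m.group(1), m.group(2)
        if ranges is None:
            return True                   # bare ##auto-snap-skip##: always skip
        now = now or datetime.now()
        if days and now.weekday() not in parse_days(days):
            return False
        return any(parse_time(a) <= now.time() <= parse_time(b)
                   for a, b in (r.split("..") for r in ranges.split(",")))

For example, should_skip("##auto-snap-skip##08:45..12:30,13:15..18:45##Mon..Fri##", datetime(2020, 9, 21, 9, 0)) returns True (a Monday morning inside the first range).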


OODP Working modes:

Asynchronous replication of snapshot deltas to local or remote destinations, where the destination is:

  1. A volume on a remote server's pool.
  2. A volume on a different pool within the same server.
  3. Another volume within the same pool (not recommended).

Auto-snapshots on the local server only.

    Note: Auto-snapshots alone can NOT be used as full data protection, as the data is not replicated.

If a 3rd-party backup is used in addition, the auto-snapshots can be used for quick access to previous data versions. To create an auto-snapshots-only task, the destination volume is disabled in the task wizard or omitted in the CLI.

Retention plans:

The OODP retention-interval plan consists of a series of retention-period-to-interval associations. It can be defined intuitively in the replication task wizard in the GUI. If the CLI is used, the retention-interval plan uses the following syntax:

"retention_every_interval,retention_every_interval,retention_every_interval,...".

    Example: 1hour_every_10min,3day_every_1hour,1month_every_1day

Both intervals and retention periods use standard units of time or multiples of them, with full names or a shortcut according to the following list: second|sec|s, minute|min, hour|h, day|d, week|w, month|mon|m, year|y
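
Since the plan grammar is regular, it is easy to sketch a parser for it. The Python below is illustrative only: the unit table follows the list above (month and year are approximated as 30 and 365 days, and the year shortcut is assumed to be y), and it also enforces the rule, noted later, that the retention time must be larger than the interval.

    import re

    UNITS = {"second": 1, "sec": 1, "s": 1, "minute": 60, "min": 60,
             "hour": 3600, "h": 3600, "day": 86400, "d": 86400,
             "week": 7 * 86400, "w": 7 * 86400,
             "month": 30 * 86400, "mon": 30 * 86400, "m": 30 * 86400,
             "year": 365 * 86400, "y": 365 * 86400}

    def parse_duration(text):
        """'10min' or '1hour' -> seconds."""
        m = re.fullmatch(r"(\d+)([a-z]+)", text)
        if not m or m.group(2) not in UNITS:
            raise ValueError(f"bad duration: {text!r}")
        return int(m.group(1)) * UNITS[m.group(2)]

    def parse_plan(plan):
        """'1hour_every_10min,...' -> list of (retention_s, interval_s) pairs."""
        pairs = []
        for entry in plan.split(","):
            retention, interval = map(parse_duration, entry.split("_every_"))
            if retention <= interval:     # e.g. 1min_every_10min would never start
                raise ValueError(f"retention must exceed interval: {entry!r}")
            pairs.append((retention, interval))
        return pairs

    print(parse_plan("1hour_every_10min,3day_every_1hour,1month_every_1day"))
    # [(3600, 600), (259200, 3600), (2592000, 86400)]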


Rotational auto-snapshots on both source and destination are created according to the retention plans. It is possible to have different retention plans for the source and destination pools.

The GUI provides a step-by-step wizard to define replication tasks.

If the CLI is used to define replication tasks, the following task options are provided:

  1. It is possible to attach, list, and detach backup nodes (remote nodes used for asynchronous replication).
  2. It is possible to create a backup task in the following modes:
    1. Only source node details provided -> snapshots are created locally, no replication.
    2. Source and destination node details provided -> snapshots are created locally and replicated to the destination. Both source and destination keep rotational auto-snapshots.
    3. Optionally, it is possible to use mbuffer (with the parameter mbuffer size) as a tool for buffering the data streams (see the sketch after this list).
  3. It is possible to get details of all backup tasks.
  4. It is possible to delete a backup task.
  5. It is possible to get the status of the OODP service:
    1. Service status
    2. Last entries from the logs
  6. It is possible to debug a backup task - run it in dry mode in order to check what the issue is.
  7. It is possible to restart all tasks; the configuration of tasks will be reset.
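
The sketch below shows, under assumed task fields (the real CLI's names differ), how those modes translate into a ZFS send pipeline: no destination means snapshots only, an optional mbuffer stage buffers the stream (mbuffer's -m option sets the buffer size), and a remote destination is reached over SSH.

    def build_pipeline(task):
        """Translate an illustrative task description into a shell pipeline."""
        if "dest" not in task:
            return None                   # mode 2.1: local snapshots only, nothing is sent
        cmd = f'zfs send -i {task["prev_snap"]} {task["src_snap"]}'
        if task.get("mbuffer_size"):      # mode 2.3: optional stream buffering
            cmd += f' | mbuffer -m {task["mbuffer_size"]}'
        if task.get("dest_host"):         # remote destination, tunneled over SSH
            cmd += f' | ssh {task["dest_host"]} zfs receive {task["dest"]}'
        else:                             # local destination pool
            cmd += f' | zfs receive {task["dest"]}'
        return cmd

    print(build_pipeline({"src_snap": "pool/data@snap2", "prev_snap": "pool/data@snap1",
                          "dest": "backup/data", "dest_host": "nx2-backup",
                          "mbuffer_size": "1G"}))
    # zfs send -i pool/data@snap1 pool/data@snap2 | mbuffer -m 1G
    #   | ssh nx2-backup zfs receive backup/data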


Important notes:

  • The OODP CLI provides detailed syntax help.
  • Only Scale Logic NX2 can be used as a destination server.
  • Replication will not be performed as long as the destination dataset or zvol does not exist. The user needs to create the destination dataset/zvol manually.
  • Replication will not be performed if the dataset or zvol on the destination node is in use, e.g. by an iSCSI target with an active session. Data from a particular snapshot can be accessed only via a clone created from the snapshot.
  • User snapshots created on the destination dataset or zvol will be deleted by OODP during the rotation round.
  • User snapshots created on the source dataset or zvol are not deleted by OODP during the rotation round.
  • Snapshots on both source and destination that are cloned are not deleted by OODP during the rotation round.
  • A replication round fails when it is not possible to replicate a snapshot to the destination, e.g. due to:
    • Lack of communication between the nodes
    • A busy dataset on the destination server (used e.g. by an iSCSI target)
    • A user's own snapshot with a clone on the destination server
    • A user's manual snapshot on the destination server created before the first replication
    In such a case, the source snapshots are not rotated. During the next replication round, provided that all replication requirements are fulfilled, snapshots on both source and destination are rotated.
  • If the nodes have different sets of snapshots (no common snapshot between source and destination), the snapshots on the destination server are deleted and re-replicated from the source (see the sketch after this list).
  • OODP is activated when at least one backup task exists in the system (at pool import and system start).
  • OODP is deactivated when there are no backup tasks in the system (at pool export).
  • Replication to a remote destination is encrypted (SSH).
  • OODP replication processes are stopped at pool export/destroy.
  • An ongoing replication process is not stopped when the OODP task is deleted. After this process finishes, no more replications take place.
  • When a backup plan is created as follows: 1min_every_10min, the backup task will not start. The retention time must always be larger than the interval, e.g. 10min_every_1min.
  • A source snapshot that is still being replicated (replication of an older snapshot is still in progress) blocks the rotation of snapshots on the source. New snapshots, however, are created according to the scheduled plan.
  • If many destination servers are used, only one replication is performed at a given time.
  • It is not recommended to create manual snapshots on destination servers, as this may break the synchronization process. If access to the data on a destination server is required, use the snapshots created by OODP.
  • The save-settings mechanism does not support OODP tasks (this does not apply to attached nodes). All information regarding OODP tasks is stored as dataset properties. If the dataset does not exist, the OODP task will not be restored.
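
Several of the notes above (the rotation guard on failure and the no-common-snapshot case) amount to a small decision procedure. Here is an illustrative Python sketch of it, with hypothetical names, not the service's actual logic; rotation on both sides happens only after a round that actually replicated.

    def plan_round(source_snaps, dest_snaps):
        """Decide what a replication round sends (snapshot name lists, oldest first)."""
        common = [s for s in source_snaps if s in dest_snaps]
        if not common:
            # No common snapshot: destination snapshots are deleted and the
            # newest source snapshot is re-replicated as a full stream.
            return {"delete_on_dest": list(dest_snaps), "send_full": source_snaps[-1]}
        # Normal case: send only the delta on top of the newest common snapshot.
        return {"send_delta": (common[-1], source_snaps[-1])}

    # Example: the destination has drifted (its snapshots were created manually).
    print(plan_round(["s1", "s2", "s3"], ["x1"]))
    # {'delete_on_dest': ['x1'], 'send_full': 's3'}
    print(plan_round(["s1", "s2", "s3"], ["s1", "s2"]))
    # {'send_delta': ('s2', 's3')}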

Known issues:

  • Source and destination should be of the same type (zvol-zvol, dataset-dataset). It is possible to create a task with, for example, a zvol source and a dataset destination, but when started, it shows an error.
  • Temporary errors in the GUI: OODP can cause a temporary inconsistency between the internal GUI cache (used by the Python API) and the ZFS resources. This can lead to errors displayed in the GUI about missing snapshots. The issue happens because OODP refreshes the cache using a hook script: the cache is refreshed before and after any snapshot is removed by OODP, but a hook script (the only possible way to refresh the cache when using external software) leaves a small window during which the cache is inconsistent. If the GUI requests information about snapshots between the snapshot removal and the cache refresh, the GUI gets information about non-existing items, which may lead to errors. The most common place for this error is the volume edit window, where snapshots are checked in order to lock editing of the name if the volume has any snapshots. The issue is only temporary: once the window is closed and reopened, the cache is consistent again and the error no longer pops up.