
Working with geodatabase replication (ArcInfo and ArcEditor only)
Release 9.3 Last modified April 2, 2009

Geodatabase replication is designed to support many different systems where you need to distribute data. The following is a guide to help you determine how best to use it for your system.

To start, review the Understanding distributed data topic, which describes geodatabase replication as well as other methods for distributing data. The scenarios topic also lists a number of common use cases for which geodatabase replication can be used. If geodatabase replication seems the most appropriate method for your system, your next step is to start creating replicas.

Creating replicas

The following will help you determine the best way to create replicas for your system.

Determine what replicas are needed—In some cases, you may need to create only one or two replicas, while in others, many replicas are needed. For example, many replicas are needed if you are creating replicas for field crews to work with on their field laptops. In cases where you want to keep two enterprise geodatabases synchronized, you may only need one replica. To understand what a replica is and how it works within a geodatabase, read the Replicas and geodatabases topic.

Decide on the type of replication—The replication types topic describes each of the three replication types available. Your system may require you to use one type of replica in one case and another type in another case. For example, you may want to use two-way replication to synchronize with another office and one-way replication to update your map publishing geodatabase.

Choose which set of tools to use to create the replicas—ArcGIS provides several environments in which to work with geodatabase replication. Each environment offers different advantages. The following describes what each environment has to offer:

The Create Replica wizard—The Create Replica wizard is available on the distributed geodatabase toolbar in ArcMap. The wizard has many options and a well-described user interface that is tightly integrated with ArcMap. It is recommended that you use the Create Replica wizard when first experimenting with creating replicas or if you plan to only create a small number of replicas.

The Create Replica geoprocessing tool—The Create Replica geoprocessing tool can also be used to create replicas. The tool has many options but does not offer some of the more advanced options from the Create Replica wizard.

See the Create Replica geoprocessing tool help for more information.

ArcObjects API—An ArcObjects API is also available to support writing code to create replicas in any of several languages. This is useful when you want to customize the create replica experience or need to create replicas with complex options on a regular basis.
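As a rough illustration of the geoprocessing route, the following sketch wraps a call to the Create Replica tool from Python. It assumes a geoprocessor object that you create elsewhere (for example, with arcgisscripting.create(9.3) on a machine with ArcGIS installed). The function name, connection file paths, dataset name, and replica name are hypothetical, and the parameter order and keyword strings should be checked against the Create Replica geoprocessing tool help before use.

```python
def create_two_way_replica(gp, parent_data, child_gdb, replica_name):
    """Create a two-way replica with a geoprocessor object supplied by the caller.

    The keyword strings below are assumptions based on the tool's documented
    syntax; confirm them in the Create Replica tool help for your release.
    """
    return gp.CreateReplica_management(
        parent_data,          # dataset(s) to replicate from the parent geodatabase
        "TWO_WAY_REPLICA",    # replication type
        child_gdb,            # geodatabase that will host the child replica
        replica_name,         # name that identifies the replica
        "FULL",               # access type
        "CHILD_DATA_SENDER",  # which side may send changes first when disconnected
        "USE_DEFAULTS",       # how expanded feature classes and tables are added
        "DO_NOT_REUSE",       # reuse schema (checkout replicas only)
        "GET_RELATED",        # process relationship classes
    )

# Hypothetical usage on a machine with ArcGIS 9.3 installed:
#   import arcgisscripting
#   gp = arcgisscripting.create(9.3)
#   create_two_way_replica(gp, r"C:\conn\parent.sde\Parcels",
#                          r"C:\conn\child.sde", "FieldReplica")
```

Passing the geoprocessor in as an argument keeps the function easy to try out with a stand-in object before running it against real geodatabases.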

Integrate replication into your versioning workflows—Geodatabase replication is built on top of versioning. At replica creation time, a replica version is defined in both the parent and the child replica. This is the version from which you will send and receive changes during synchronization. See the Replica creation and versioning topic for more information.

Define the data to replicate—Geodatabase replication allows you to replicate some or all of the datasets in your ArcSDE geodatabase. It also allows you to define what features or rows to replicate using filters and relationship classes. During creation, filters are always applied first and then relationship classes are used to append additional features and rows. See Preparing for replication for more information.

See Terrains and Network Datasets for more information.
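The order described above (filters first, then relationship classes) can be pictured with a small, purely conceptual Python model. It is not ArcGIS code. The function name and data structures are hypothetical, and it follows relationships for a single hop only, which may be simpler than the rules described in Preparing for replication.

```python
def select_rows_to_replicate(rows, row_filter, related):
    """Return the ids of rows to replicate: filter first, then add related rows.

    rows       -- dict mapping row id to a dict of attributes
    row_filter -- function(attributes) returning True for rows to keep
    related    -- dict mapping row id to a list of related row ids
    """
    # Step 1: apply the filter.
    selected = set(rid for rid, attrs in rows.items() if row_filter(attrs))
    # Step 2: append rows related to the filtered rows (one hop only).
    for rid in list(selected):
        selected.update(related.get(rid, []))
    return selected


# Hypothetical data: two parcels, each with one related owner record.
rows = {1: {"zone": "A"}, 2: {"zone": "B"}, 10: {}, 11: {}}
related = {1: [10], 2: [11]}
in_zone_a = lambda attrs: attrs.get("zone") == "A"
replicated = select_rows_to_replicate(rows, in_zone_a, related)
```

In this example, parcel 1 passes the filter and its related record 10 is appended, while parcel 2 and record 11 are left out.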

Consider replica creation options—Some options have been added to make the replica creation process as efficient as possible. These options are designed to work for specific cases and may or may not be applicable to your workflow. Review the following list to see if you can take advantage of these options:

Reuse schema—With reuse schema, you specify a target geodatabase that already has the schema for the data you're replicating. This saves time since schema creation can be skipped when creating a replica. This option only applies for checkout replicas but should be used whenever possible. See Creating replicas to find out how to apply this option.

Schema only—The schema-only option creates a replica in which only the schema is copied and no rows are replicated. This option applies only to checkout replicas. It is useful, for example, when you are creating a replica for a field crew that plans only to enter new information. Using this option saves you the time of setting each dataset to schema only in the wizard. See Creating replicas to find out how to use this option.

Register existing data—If you are replicating a very large amount of data, you may want to consider using the register existing data option. The option allows you to bypass the data-copying step of replica creation and simply register a new replica. In order to use this option successfully, a specific set of steps must be taken before replica creation. See Creating replicas for a description of how to use this option. Note that this option is not available when using the geoprocessing tools.

Replicate related data—During replica creation, filters are applied first, then relationship classes are processed to determine the data to replicate. You can choose to turn off relationship class processing, which will save time. If you choose to turn off relationship class processing, the relationship classes are still included but are not processed during creation and synchronization. An option is available to turn off all relationship class processing in the advanced sections of the Create Replica wizard and geoprocessing tool. The Create Replica wizard also allows you to turn off processing for specific relationship classes. See Creating replicas for more information.
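If you script replica creation, it can help to check the constraints listed above before calling the tool. The sketch below is one possible way to do that. The helper name is hypothetical, and the keyword strings (REUSE, GET_RELATED, and so on) are assumptions about the Create Replica geoprocessing tool that should be confirmed in its help topic.

```python
def creation_option_keywords(replica_type, reuse_schema=False,
                             get_related=True, register_existing=False):
    """Translate creation options into keyword strings for a scripted call.

    Raises ValueError for combinations that the text above rules out.
    """
    if register_existing:
        # Register existing data is not available through the geoprocessing tools.
        raise ValueError("register existing data is not offered by the geoprocessing tools")
    if reuse_schema and replica_type != "CHECK_OUT":
        # Reuse schema applies to checkout replicas only.
        raise ValueError("reuse schema applies to checkout replicas only")
    return {
        "reuse_schema": "REUSE" if reuse_schema else "DO_NOT_REUSE",
        "get_related_data": "GET_RELATED" if get_related else "DO_NOT_GET_RELATED",
    }
```

For example, creation_option_keywords("CHECK_OUT", reuse_schema=True, get_related=False) returns the two keyword strings for a checkout replica that reuses an existing schema and skips relationship class processing.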

Consider whether to use a connected or disconnected environment—Replicas can be created in both connected and disconnected environments. In a connected environment, creation and synchronization are done while connected on the same network. In a disconnected environment, the network is not used. Instead, creation and synchronization are done by exporting files, such as XML documents, sending them to the target, and importing them there. See Connected and disconnected replication for more information.

Synchronizing replicas

Once a replica is created, you can start synchronizing changes between the replica geodatabases. See About synchronization to learn more. To make your system work effectively, it is important to devise a strategy for synchronizing changes. The following should be considered when determining the best strategy for your system.

Synchronization methods—First determine which is the best synchronization method for your needs. The following lists some options:

Manual synchronization—If you are only working with a small number of replicas and plan to only occasionally synchronize changes, consider using the tools provided by ArcGIS. The distributed geodatabase toolbar and the distributed geodatabase context menu in ArcCatalog provide wizards for performing synchronizations. These wizards are available for geodatabase connections as well as geodata server objects exposed through ArcGIS Server in ArcCatalog. This allows you to synchronize both local connections and remote connections over the Internet. There are also distributed geodatabase geoprocessing tools that provide the same functionality.

Automated synchronization using agents—In a system where there are many replicas and/or frequent synchronizations, you should consider building a replication agent. Replication agents work by automatically connecting to replicated geodatabases and performing synchronizations. In this case, end users do not have to explicitly synchronize their databases as synchronization happens automatically. In a connected environment, the following techniques can be used to build synchronization agents:

Synchronization using geoprocessing tools—With geoprocessing tools, you can easily build models to synchronize replicas using either local geodatabase connections or connections to geodata server objects running on the Internet. These models can be exported to Python scripts and executed through Python. The commands to execute the scripts can be added to scheduling software, such as the Windows scheduler, so that they run on a regular basis. For example, you may want to schedule a synchronization between two enterprise geodatabases once a week at a nonpeak time.

Synchronization using ArcObjects—Synchronization is fully supported through the ArcObjects API. The API allows you to build more sophisticated synchronization agents than those built using geoprocessing tools. For example, you can add functionality to synchronize a field laptop when the operating system detects that the laptop is connected on the network.
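A script for a simple geoprocessing-based agent might look something like the sketch below. It assumes a geoprocessor object created elsewhere (for example, with arcgisscripting.create(9.3)). The function name, connection files, and replica name are hypothetical, and the parameter order and keyword strings should be verified against the Synchronize Changes tool help.

```python
def synchronize_replica(gp, parent_gdb, child_gdb, replica_name):
    """Synchronize one replica in both directions using a supplied geoprocessor.

    The keyword strings are assumptions based on the tool's documented syntax.
    """
    return gp.SynchronizeChanges_management(
        parent_gdb,          # geodatabase hosting the parent replica
        replica_name,        # replica to synchronize
        child_gdb,           # geodatabase hosting the relative replica
        "BOTH_DIRECTIONS",   # send changes both ways
        "IN_FAVOR_OF_GDB1",  # reconcile policy: the first geodatabase wins conflicts
        "BY_OBJECT",         # detect conflicts at the row level
        "DO_NOT_RECONCILE",  # checkout-only option; not used here
    )

# Hypothetical script body for a scheduled task:
#   import arcgisscripting
#   gp = arcgisscripting.create(9.3)
#   synchronize_replica(gp, r"C:\conn\office_a.sde",
#                       r"C:\conn\office_b.sde", "OfficeReplica")
```

Saved as a script, this could be registered as a weekly task in the Windows scheduler so that the synchronization runs at a nonpeak time.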


Synchronization and conflicts—If edits made to a replica's data conflict with edits being synchronized from the relative replica, you will need to determine how to resolve the conflict. A reconcile policy can be applied either to resolve conflicts automatically or to allow manual conflict resolution at a later time. Review Synchronization and versioning to see if this is a concern for your system. One alternative for working with conflicts is to use the ArcObjects API to build a system to process conflicts. In such a system, synchronizations use a manual reconcile policy, and a secondary process runs automatically afterward to resolve any conflicts that have arisen.
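The idea of a secondary process that resolves conflicts by rule can be sketched in plain Python. This is a conceptual model only and does not use the ArcObjects API. The conflict records, their keys (row_id, local, incoming), and the example rule are all hypothetical.

```python
def resolve_conflicts(conflicts, choose):
    """Apply a rule to each conflict left by a manual-policy synchronization.

    conflicts -- list of dicts with the keys 'row_id', 'local', and 'incoming'
    choose    -- function(conflict) returning 'local' or 'incoming'
    Returns a dict mapping row id to the representation that was kept.
    """
    resolved = {}
    for conflict in conflicts:
        side = choose(conflict)
        if side not in ("local", "incoming"):
            raise ValueError("rule must return 'local' or 'incoming'")
        resolved[conflict["row_id"]] = conflict[side]
    return resolved


def prefer_verified_local(conflict):
    """Hypothetical rule: keep the local edit only if it is marked as verified."""
    if conflict["local"].get("verified"):
        return "local"
    return "incoming"
```

Keeping the rule in its own function means the decision logic can be changed without touching the loop that applies it.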

The data being synchronized—For checkout replicas, all data changes made in the child are synchronized. For two-way and one-way replicas, only changes that satisfy the filters and relationship classes are applied. The replica manager can be used to determine the filters and relationship class rules that have been applied to each replicated dataset. You can also create a replica footprint to store this information locally and to visualize each replica's spatial filter.

See Synchronizing topology, Synchronizing related data, and Synchronizing geometric networks for more information.

Data volume—When you synchronize, only changes made since the last synchronization are applied. ArcGIS filters out any changes that have already been sent and acknowledged. Also, once a change has been sent, it is never returned to the original replica. In this way, data volumes are trimmed to just what is needed.
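The following toy model illustrates why only unacknowledged changes need to travel. It is a simplified sketch, not a description of how ArcGIS stores this information. The class and method names are hypothetical, and generations are modeled as simple counters.

```python
class ReplicaSender(object):
    """Toy model of a replica that sends only changes not yet acknowledged."""

    def __init__(self):
        self.changes = []            # list of (generation, change) pairs
        self.current_generation = 1  # generation that new edits belong to
        self.last_acknowledged = 0   # highest generation the other side confirmed

    def edit(self, change):
        """Record an edit in the current generation."""
        self.changes.append((self.current_generation, change))

    def export_changes(self):
        """Return (generation, pending changes) and start a new generation."""
        pending = [c for g, c in self.changes if g > self.last_acknowledged]
        sent_generation = self.current_generation
        self.current_generation += 1
        return sent_generation, pending

    def acknowledge(self, generation):
        """Note that the other side has received everything up to generation."""
        self.last_acknowledged = max(self.last_acknowledged, generation)
```

In this model, a change keeps being included in exports until an acknowledgment arrives for its generation. After that, it is filtered out of every later export.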


The order in which replicas are synchronized—If you are working with several replicas, the order in which they are synchronized may be important. For example, consider the case where you create several two-way replicas from a single ArcSDE geodatabase. One strategy for synchronizing these replicas would be for each child replica to synchronize in both directions with the parent. Here the child sends changes to the parent and then the parent sends changes to the child. Another strategy is for each child replica to first send its changes to the parent. The parent incorporates all the changes and then sends changes back to each child. In the first case, the parent is sending only its changes, while in the second case, it is additionally sending changes incorporated from other replicas. Depending on the requirements of your system, one strategy may be more appropriate than the other.
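To compare the two strategies, it can help to simulate a single round of synchronization. The sketch below is a simplified model in which each geodatabase is just a set of edit labels. The function names and sample edits are hypothetical, and the model ignores conflicts and filters.

```python
def sync_each_in_turn(parent, children):
    """Strategy 1: each child synchronizes in both directions, one after another.

    parent   -- set of edits held by the parent
    children -- list of (name, set of edits) pairs, in synchronization order
    Returns (parent edits, dict of child name -> edits) after one round.
    """
    parent = set(parent)
    result = {}
    for name, edits in children:
        parent |= edits             # the child sends its changes to the parent
        result[name] = set(parent)  # the parent sends its changes to the child
    return parent, result


def sync_collect_then_send(parent, children):
    """Strategy 2: every child sends first, then the parent sends to every child."""
    parent = set(parent)
    for name, edits in children:
        parent |= edits
    return parent, dict((name, set(parent)) for name, edits in children)


# Hypothetical round with two field offices.
children = [("north", set(["n1"])), ("south", set(["s1"]))]
turn_parent, turn_children = sync_each_in_turn(set(["p1"]), children)
batch_parent, batch_children = sync_collect_then_send(set(["p1"]), children)
```

In this simplified model, the parent ends up with every edit under both strategies. Under the first strategy, the north office does not yet have the south office's edit after one round, while under the second strategy every child holds all edits.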

Schema changes—Geodatabase replication is designed to be tolerant of schema changes. This means that synchronizations will continue to work even if schema changes are made to the replicated data. To a certain degree, you can also apply schema changes across replicas. See Working with schema changes for more details.

See Applying schema changes for more information.

Working through errors—Errors can occur during the synchronization process for a number of reasons. In a connected system, a computer network may fail or you may try to synchronize a replica that is in conflict. In a disconnected system, it is possible to lose messages or you may mistakenly try to import the messages in the incorrect order. In all of these cases, the system is designed to stay in a consistent state. Changes are rolled back and inappropriate messages are rejected. The replica log can be used to find any errors that have occurred and determine what to do, if anything, to recover. In most cases, the system will automatically recover from errors if you simply continue synchronizing changes. Replicas also contain generation information that indicates how many change sets have been sent and how many have been received. See Managing replicas for more information.
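The behavior described above (reject inappropriate messages, roll back failed changes, and record what happened) can be pictured with a small conceptual model. It is not ArcGIS code. The class, its attributes, and the message format are hypothetical, and generation numbers are modeled as simple counters.

```python
class ReplicaReceiver(object):
    """Toy model of a replica that stays consistent when imports go wrong."""

    def __init__(self):
        self.rows = {}                     # current data: row id -> value
        self.last_generation_received = 0  # highest generation imported so far
        self.log = []                      # stands in for the replica log

    def import_message(self, generation, changes):
        """Apply a message of (row id, value) pairs; return True on success."""
        if generation <= self.last_generation_received:
            # A stale or out-of-order message is rejected outright.
            self.log.append("rejected generation %d (already at %d)"
                            % (generation, self.last_generation_received))
            return False
        staged = dict(self.rows)  # work on a copy so a failure changes nothing
        try:
            for row_id, value in changes:
                staged[row_id] = value
        except (TypeError, ValueError) as err:
            self.log.append("rolled back generation %d: %s" % (generation, err))
            return False
        self.rows = staged
        self.last_generation_received = generation
        return True
```

Because a failed import leaves the data and the generation counter untouched, simply importing the next valid message lets the model carry on, which mirrors the recovery described above.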
