Understanding distributed data
Last modified March 25, 2008
Data distribution involves creating copies of data and distributing them among two or more geodatabases. It allows two or more offices to work on the same data in separate locations.
Data is distributed to improve availability and performance by alleviating server contention and slow network access to a central server. This can help an organization balance the load on its geodatabases between users who perform edits and users who only read the data.
Distributing data is also required for mobile users or contractors who need to take part of their geodatabase into the field to edit, disconnecting from the network entirely for an indefinite amount of time.
There are several ways to distribute your data across multiple geodatabases:
Copy and Paste
Some organizations have achieved a level of data distribution by saving copies of their geodatabases on CDs and DVDs and sending them to other offices. These offices can then work on the data, make edits, and send a copy of their updated geodatabase back to the main office. There, the edits are compared and reconciled so that the data at the two offices is in sync. This solution may work with careful communication, but there are many opportunities for updates to be lost, and it is difficult to keep the two geodatabases in sync.
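To make the lost-update risk concrete, here is a minimal sketch in Python. It assumes a geodatabase can be pictured as a dictionary of features and that the returned copy simply replaces the main office's copy; the feature names and values are hypothetical and nothing here uses ArcGIS itself.

```python
import copy

# Hypothetical main-office geodatabase: feature id -> attributes.
main_office = {
    1: {"name": "Main St", "lanes": 2},
    2: {"name": "Oak Ave", "lanes": 2},
}

# The branch office receives a full copy (for example, on a DVD).
branch_office = copy.deepcopy(main_office)

# Both offices now edit their own copy independently.
main_office[1]["lanes"] = 4                # edit made at the main office
branch_office[2]["name"] = "Oak Avenue"    # edit made at the branch office

# The branch sends its whole copy back and it overwrites the main copy
# without a careful feature-by-feature comparison.
merged = copy.deepcopy(branch_office)

# The main office's edit to feature 1 is gone: the value is back to 2.
lost_edit = merged[1]["lanes"] != 4
print("main-office edit lost:", lost_edit)
```

Avoiding this outcome by hand requires comparing every feature in both copies, which is the coordination burden described above.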
Geodatabase Replication
Geodatabase replication is a data distribution method provided through ArcGIS. With geodatabase replication, data is distributed across two or more geodatabases by replicating all or part of your dataset. When a dataset is replicated, two replicas are created: one that resides in the original geodatabase, and a related replica that is distributed to a different geodatabase. Any changes made to these replicas in their respective geodatabases can be synchronized so that the data in one replica matches that in the related replica.
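The replica pair and its synchronization can be pictured with a small toy model in Python. This is a conceptual sketch only: the `Replica` class, `create_replica_pair`, and `synchronize` are hypothetical names invented for illustration, not ArcGIS functions, and the sketch assumes the two replicas never edit the same row (no conflict handling).

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy stand-in for one replica (hypothetical, not an ArcGIS class)."""
    name: str
    rows: dict = field(default_factory=dict)
    pending: dict = field(default_factory=dict)  # local edits not yet sent

    def edit(self, key, value):
        # Apply the edit locally and remember it for the next synchronization.
        self.rows[key] = value
        self.pending[key] = value


def create_replica_pair(source_rows, keys):
    """Replicate part of a dataset and return the two related replicas."""
    subset = {k: source_rows[k] for k in keys}
    return Replica("original", dict(subset)), Replica("related", dict(subset))


def synchronize(sender, receiver):
    """Send the sender's pending edits one way; return how many were applied."""
    applied = len(sender.pending)
    receiver.rows.update(sender.pending)
    sender.pending.clear()
    return applied


source = {"parcel-1": "vacant", "parcel-2": "built", "parcel-3": "vacant"}
original, related = create_replica_pair(source, ["parcel-1", "parcel-2"])

# Each replica is edited independently in its own geodatabase.
related.edit("parcel-1", "built")
original.edit("parcel-2", "demolished")

# Synchronize in both directions so the replicas match again.
sent_to_original = synchronize(related, original)
sent_to_related = synchronize(original, related)
print(original.rows == related.rows)
```

Only the pending edits travel between the replicas, not whole copies of the data, which is what distinguishes this approach from the copy-and-paste workflow.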
Geodatabase replication is built on top of the versioning environment and supports the full geodatabase data model, including topologies, networks, terrains, and relationships. In this asynchronous model, the replication is loosely coupled, meaning that each replicated geodatabase can work independently and still synchronize changes with the others. Since it is implemented at the geodatabase level, the DBMSs involved can be different. For example, one replica geodatabase could be built on top of SQL Server and the other on top of Oracle.
Geodatabase replication can be used in connected and disconnected environments. It can also work with local geodatabase connections as well as geodataserver objects, which allow you to access a geodatabase over the Internet.
DBMS Replication
DBMSs also have their own replication mechanisms, which can be used to copy and synchronize geodatabase data.
DBMS replication refers to the built-in replication mechanisms provided by the DBMS in which the geodatabase is stored. DBMS replication is not geodatabase aware: geodatabase constructs such as relationship classes and geometric networks are not known by the DBMS. However, DBMS replication can still be configured to work in a limited way with geodatabase data.
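One way to picture what "not geodatabase aware" can mean in practice is the toy Python sketch below. It assumes a relationship class can be modeled as a foreign-key link between two tables; the table names, columns, and both helper functions are hypothetical, and no real DBMS or ArcGIS behavior is reproduced here.

```python
# Toy tables; names and columns are hypothetical.
parcels = [{"id": 1, "owner_id": 10}, {"id": 2, "owner_id": 11}]
owners = [{"id": 10, "name": "A. Smith"}, {"id": 11, "name": "B. Jones"}]

# Stand-in for a relationship class: parcels.owner_id refers to owners.id.
relationship = ("parcels", "owner_id", "owners", "id")


def replicate_table(rows):
    """Table-level copy: rows only, with no knowledge of related tables."""
    return [dict(r) for r in rows]


def replicate_with_related(rows, related_rows, rel):
    """Relationship-aware copy: also bring the rows referenced through rel."""
    _, foreign_key, _, primary_key = rel
    wanted = {r[foreign_key] for r in rows}
    related = [dict(r) for r in related_rows if r[primary_key] in wanted]
    return [dict(r) for r in rows], related


# Table-level copy of one parcel: nothing brings the owner row along.
target_parcels = replicate_table(parcels[:1])
target_owners = []
owner_ids = {o["id"] for o in target_owners}
dangling = [p for p in target_parcels if p["owner_id"] not in owner_ids]

# Relationship-aware copy of the same parcel also carries its owner.
aware_parcels, aware_owners = replicate_with_related(parcels[:1], owners, relationship)
print(len(dangling), len(aware_owners))
```

In the table-level case, keeping related tables consistent is left to whoever configures the replication.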
DBMS Replication vs. Geodatabase Replication
The following compares geodatabase replication and DBMS replication: