The archive process
Release 9.3
Note: This topic was updated for 9.3.1.
Enabling archiving on a versioned dataset creates the archive class and populates it with the data currently present in the DEFAULT version. The archive class uses the gdb_from_date and gdb_to_date attributes to record the time at which each change was archived.
It is important to understand how ArcGIS represents time when change is recorded. History can be recorded as either valid time or transaction time. Valid time is the actual moment at which a change occurred in the real world and is typically recorded by the user who is applying the change. Transaction time is the time an event was recorded in the database. Transaction times are generated automatically by the system.
ArcGIS uses transaction time, which is based on the current server time, to record changes to the data when edits are saved or posted to the DEFAULT version. Transaction time and the time that the event occurred in the real world are rarely the same. Time usually elapses between an event happening in the real world and the event being recorded in the database. For example, a parcel is sold on May 14, 2006; however, the change is not recorded to the data until June 5, 2006. The transaction time of June 5, 2006, is recorded in the archive class for this change.
When the edit occurs, ArcGIS will archive the transaction to the archive class. The difference between the time of the real-world event and the transaction time may seem insignificant, but it becomes more apparent when queries are performed against the archived information. Backlogs in editing and updating data are not uncommon in production systems, which results in the time difference and lag between valid and transaction time.
The difference between valid and transaction time is also an issue in situations where history is recorded in a multiuser environment with many different users or departments editing the database. The sequence in which changes are performed and logged in the database may not be the same order in which those changes occurred in the real world.
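The distinction can be illustrated with a minimal Python sketch. The first record uses the dates from the parcel sale example above; the parcel numbers, the second record, and all field names are hypothetical and exist only to show how the two orderings can disagree.

```python
from datetime import datetime

# Hypothetical change records. valid_time is when the event happened in the
# real world; transaction_time is when the edit was saved to the database.
changes = [
    # Dates from the parcel sale example: sold May 14, recorded June 5.
    {"parcel": 116,
     "valid_time": datetime(2006, 5, 14),
     "transaction_time": datetime(2006, 6, 5)},
    # Invented second event: happened later, but was recorded sooner.
    {"parcel": 117,
     "valid_time": datetime(2006, 5, 20),
     "transaction_time": datetime(2006, 5, 22)},
]

# Order of events in the real world.
by_valid = [c["parcel"] for c in sorted(changes, key=lambda c: c["valid_time"])]

# Order in which the archive would see them (transaction time).
by_transaction = [
    c["parcel"] for c in sorted(changes, key=lambda c: c["transaction_time"])
]

# Lag in days between the real-world event and its recording.
lag_days = {
    c["parcel"]: (c["transaction_time"] - c["valid_time"]).days for c in changes
}
```

Here the real-world order is 116 then 117, while the recorded order is 117 then 116. Because the archive stores only transaction time, it can reconstruct only the second ordering.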
Upon enabling archiving, all rows representing the DEFAULT version for the given class are copied to the archive class with the same time stamp. The gdb_from_date attribute for all rows is stamped with the date and time of the enable archiving operation. The gdb_to_date attribute for all rows is stamped with 12/31/9999. Any row whose gdb_to_date is 12/31/9999 is the current representation of the object in the DEFAULT version. When edits are saved or posted to the DEFAULT version, the geodatabase automatically archives the changes to the archive class. This means the following:
- Features created in the DEFAULT version are represented in the archive class as rows, with the attribute value for the gdb_from_date attribute set to the time stamp of the archive operation and the gdb_to_date attribute set to 12/31/9999.
- Features updated in the DEFAULT version update the associated row in the archive by setting the attribute value for the gdb_to_date attribute to the time stamp of the archive operation, and insert a new row with the attribute value for the gdb_from_date attribute set to the time stamp of the archive operation and the gdb_to_date attribute set to 12/31/9999.
- Features deleted in the DEFAULT version update the associated row in the archive class by setting the gdb_to_date attribute value equal to the time stamp of the archive operation.
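The three rules above can be sketched as a small in-memory model. This is an illustration of the bookkeeping only, not the geodatabase's implementation; the function names, the object_id and area fields, and the time stamps are assumptions made for the example.

```python
from datetime import datetime

# Sentinel the text writes as 12/31/9999: marks the current representation.
CURRENT = datetime(9999, 12, 31)


def enable_archiving(default_rows, stamp):
    """Copy every DEFAULT row into a new archive with one shared time stamp."""
    return [dict(r, gdb_from_date=stamp, gdb_to_date=CURRENT) for r in default_rows]


def _retire(archive, object_id, stamp):
    """Close the current archive row for object_id by stamping gdb_to_date."""
    for row in archive:
        if row["object_id"] == object_id and row["gdb_to_date"] == CURRENT:
            row["gdb_to_date"] = stamp
            return
    raise KeyError(f"no current archive row for object {object_id}")


def archive_insert(archive, row, stamp):
    """Created feature: add a row that is current from stamp onward."""
    archive.append(dict(row, gdb_from_date=stamp, gdb_to_date=CURRENT))


def archive_update(archive, row, stamp):
    """Updated feature: retire the old row, then add the new representation."""
    _retire(archive, row["object_id"], stamp)
    archive_insert(archive, row, stamp)


def archive_delete(archive, object_id, stamp):
    """Deleted feature: retire the current row; no new row is added."""
    _retire(archive, object_id, stamp)


# Hypothetical history for two parcels.
archive = enable_archiving([{"object_id": 116, "area": 500}], datetime(2005, 7, 1, 9, 0))
archive_insert(archive, {"object_id": 117, "area": 300}, datetime(2005, 7, 9, 14, 33, 43))
archive_update(archive, {"object_id": 117, "area": 350}, datetime(2005, 7, 14, 17, 34, 22))
archive_delete(archive, 116, datetime(2005, 7, 20, 8, 0))

current_rows = [r for r in archive if r["gdb_to_date"] == CURRENT]
```

After these four operations the archive holds three rows, and only the updated row for object 117 still carries the sentinel date. Rows are never removed from the archive; an update or delete only closes the open row.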
Updating the archive table is performed within a single database transaction. If any errors are encountered during the transaction, the entire archive operation is rolled back and the save or post operation is therefore not completed. Once the error has been rectified, perform the save or post operation again.
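The all-or-nothing behavior can be sketched with Python's built-in sqlite3 module standing in for the enterprise database. This is an analogy only: the table name, columns, and text date format are assumptions, and the geodatabase manages its own transaction internally.

```python
import sqlite3

SENTINEL = "9999-12-31 00:00:00"  # stand-in for the 12/31/9999 marker

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parcels_archive ("
    " object_id INTEGER NOT NULL,"
    " area REAL NOT NULL,"
    " gdb_from_date TEXT NOT NULL,"
    " gdb_to_date TEXT NOT NULL)"
)
conn.execute(
    "INSERT INTO parcels_archive VALUES (117, 300.0, '2005-07-09 14:33:43', ?)",
    (SENTINEL,),
)
conn.commit()


def archive_update(conn, object_id, new_area, stamp):
    """Retire the current row and insert its replacement in one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE parcels_archive SET gdb_to_date = ?"
            " WHERE object_id = ? AND gdb_to_date = ?",
            (stamp, object_id, SENTINEL),
        )
        conn.execute(
            "INSERT INTO parcels_archive VALUES (?, ?, ?, ?)",
            (object_id, new_area, stamp, SENTINEL),
        )


# First attempt fails: a NULL area violates the NOT NULL constraint, so the
# UPDATE that already ran is rolled back together with the failed INSERT.
try:
    archive_update(conn, 117, None, "2005-07-14 17:34:22")
    failed = False
except sqlite3.IntegrityError:
    failed = True

rows_after_failure = conn.execute(
    "SELECT gdb_to_date FROM parcels_archive"
).fetchall()

# Once the error is rectified, the same operation is performed again.
archive_update(conn, 117, 350.0, "2005-07-14 17:34:22")
rows_after_retry = conn.execute(
    "SELECT area, gdb_to_date FROM parcels_archive ORDER BY gdb_from_date"
).fetchall()
```

After the failed attempt the archive table is unchanged, with the original row still open. Only the retry leaves a retired row and a new current row.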
For each archive operation, the DEFAULT historical marker is updated with the time stamp of the archive operation. This ensures that when you choose the DEFAULT historical marker while working with a historical version, the current representation of the archive class is equivalent to the versioned class's representation in the transactional DEFAULT version.
Accessing the archive class can actually consume fewer database resources than working with the equivalent versioned class.
Application developers interested in the event that captures the moment of the archive operation should refer to the OnArchiveUpdated event on the IVersionEvents2 interface in the software developer kit.
Queries on historical versions are run against the archive class.
Queries on transactional versions are still run against the base and delta tables.
This feature in a cadastral database shows parcel number 116 and its corresponding row in the archive class. The gdb_from_date shows the time and date of creation, while the gdb_to_date shows 12/31/9999 because the feature has not been modified or deleted since enabling archiving.
When a feature, parcel 117, is inserted and the edits are posted to the DEFAULT version, a row is inserted in the archive class with the gdb_from_date updated with the time stamp of this post operation. The gdb_to_date attribute in the new row shows 12/31/9999 because this feature has yet to be updated or deleted.
When a feature is updated, the gdb_to_date of its existing row is set to the time stamp of the archive operation, and a new row is inserted to show the current representation of the feature. The gdb_from_date in this new row is set to the time of the archive operation, while the gdb_to_date shows 12/31/9999 because the feature has yet to be modified or deleted.
The following diagram shows two parcels, 116 and 117, with their corresponding gdb_from_date and gdb_to_date attributes in the archive class prior to performing the update operation.
If the boundary of parcel 117 is extended and these edits are posted to the DEFAULT version, the gdb_to_date of the existing row for parcel 117 is updated with the time stamp of the archive operation and a new row is created. The gdb_from_date attribute in this new row is set to the time and date of the archive operation.
For example, queries that investigate moments prior to the update (7/14/2005 5:34:22 PM) show parcel 117 as it existed prior to the update. Querying moments before 7/9/2005 2:33:43 PM will not show parcel 117 because it had not been created. Queries for any moment after the update show parcel 117 in its current representation with the extended boundary.
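A moment query of this kind can be sketched in Python as a filter on the two date attributes. The sketch takes 7/9/2005 2:33:43 PM as the creation time and 7/14/2005 5:34:22 PM as the update time, and it assumes a row is visible when gdb_from_date <= moment < gdb_to_date. The half-open interval, the as_of function, and the boundary field are assumptions for illustration, not documented behavior.

```python
from datetime import datetime

CURRENT = datetime(9999, 12, 31)  # stand-in for the 12/31/9999 marker

created = datetime(2005, 7, 9, 14, 33, 43)
updated = datetime(2005, 7, 14, 17, 34, 22)

# Archive rows for parcel 117: the retired original and the current row.
archive = [
    {"parcel": 117, "boundary": "original",
     "gdb_from_date": created, "gdb_to_date": updated},
    {"parcel": 117, "boundary": "extended",
     "gdb_from_date": updated, "gdb_to_date": CURRENT},
]


def as_of(rows, moment):
    """Return the rows that were current at the given moment.

    Assumes the half-open interval gdb_from_date <= moment < gdb_to_date.
    """
    return [r for r in rows if r["gdb_from_date"] <= moment < r["gdb_to_date"]]


before_creation = as_of(archive, datetime(2005, 7, 8, 12, 0))
before_update = as_of(archive, datetime(2005, 7, 10, 12, 0))
after_update = as_of(archive, datetime(2005, 7, 15, 12, 0))
```

A moment before creation returns no rows, a moment between creation and update returns the original boundary, and a moment after the update returns the extended boundary.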
Learn more about querying the archive class.
When a feature is deleted, the gdb_to_date of its current row is updated with the time stamp of the archive operation. The following diagram shows parcels 116 and 117 with their corresponding gdb_from_date and gdb_to_date attributes in the archive class.
If parcel 117 is now deleted and these edits are posted to the DEFAULT version, the gdb_to_date attribute is updated with the time stamp of the archive operation.
Technical note on archiving
The following scenario can create a time gap in the archive class:
An editor is directly editing the DEFAULT version and deletes an object in an edit session.
The editor then saves the edits, which updates the gdb_to_date attribute of the archive class with the time stamp of the deletion of that object.
If the same object is updated in a child version and that version is reconciled with the DEFAULT version, there will be a conflict.
If, during the conflict resolution process, the editor chooses to replace the conflict with the updated representation of the row, the row will be restored in the DEFAULT version when the version is posted. The archive operation inserts a new row into the archive class and sets the gdb_from_date attribute to the time stamp of the archive operation and gdb_to_date to 12/31/9999.
Therefore, when the editor looks at the object’s lineage through time, the dates will contain a gap between the gdb_to_date and gdb_from_date when the object did not exist in the DEFAULT version.
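Such a gap can be detected by comparing consecutive rows in an object's lineage. The following Python sketch assumes the archive rows for one object are available as dictionaries. The find_gaps function and the dates are hypothetical.

```python
from datetime import datetime

CURRENT = datetime(9999, 12, 31)  # stand-in for the 12/31/9999 marker


def find_gaps(rows):
    """Return (gap_start, gap_end) pairs for one object's archive rows.

    A gap exists where a row begins later than the previous row ended,
    meaning the object was absent from DEFAULT during that interval.
    """
    ordered = sorted(rows, key=lambda r: r["gdb_from_date"])
    gaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        if nxt["gdb_from_date"] > prev["gdb_to_date"]:
            gaps.append((prev["gdb_to_date"], nxt["gdb_from_date"]))
    return gaps


# Hypothetical lineage matching the scenario: the object is deleted in
# DEFAULT, then restored later when the child version is posted.
lineage = [
    {"gdb_from_date": datetime(2005, 7, 9, 14, 0),
     "gdb_to_date": datetime(2005, 7, 20, 8, 0)},   # deleted in DEFAULT
    {"gdb_from_date": datetime(2005, 7, 25, 16, 0),
     "gdb_to_date": CURRENT},                        # restored by the post
]

gaps = find_gaps(lineage)
```

For this lineage the function reports one gap, from the deletion on 7/20/2005 to the restoring post on 7/25/2005. A lineage built only from ordinary updates produces no gaps, because each new row begins at the moment the previous one ended.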