Introduction
The geodatabase is a collection of geographic datasets of various types.
This topic covers the fundamentals of the geodatabase. These concepts provide a foundation for learning about and effectively using geodatabases in your GIS work.
What is the geodatabase?
An ArcGIS geodatabase is a collection of geographic datasets of various types held in a common file system folder, a Microsoft Access database, or a multiuser relational database (such as Oracle, Microsoft SQL Server, PostgreSQL, Informix, or IBM DB2).
Fundamental datasets in the geodatabase
A key geodatabase concept is the dataset. It is the primary mechanism used to organize and use geographic information in ArcGIS. The geodatabase contains three primary dataset types:
- Feature classes
- Raster datasets
- Tables
Creating a collection of these dataset types is the first step in designing and building a geodatabase. Users typically start by building a number of these fundamental dataset types. Then they add to or extend their geodatabase with more advanced capabilities (such as by adding topologies, networks, or subtypes) to model GIS behavior, maintain data integrity, and work with an important set of spatial relationships.
Geodatabase storage in tables and files
Geodatabase storage includes both the schema and rule base for each geographic dataset plus simple, tabular storage of the spatial and attribute data. All three primary datasets in the geodatabase (feature classes, attribute tables, and raster datasets) as well as other geodatabase elements are stored using tables. The spatial representations in geographic datasets are stored either as vector features or as rasters. These geometries are stored and managed in attribute columns along with traditional tabular attribute fields.
A feature class is stored as a table in which each row represents one feature. In a polygon feature class table, for example, the Shape column holds the polygon geometry for each feature. The value Polygon in that column indicates that the field contains the coordinates and geometry that define one polygon in each row.
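The table layout described above can be sketched in a few lines. This is an illustration only, not the geodatabase's actual storage format: it uses Python's built-in sqlite3 module, and the table name parcels, the column names, and the choice of well-known text (WKT) for the shape column are all assumptions made for the example.

```python
import sqlite3

# Hypothetical layout: one row per feature, with the geometry held in a
# "shape" column next to ordinary attribute columns. WKT text is used here
# purely for readability; real geodatabases use their own geometry storage.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE parcels (
        objectid INTEGER PRIMARY KEY,  -- unique ID for each feature (row)
        shape    TEXT NOT NULL,        -- polygon geometry for the feature
        owner    TEXT,                 -- traditional attribute field
        zoning   TEXT                  -- traditional attribute field
    )
    """
)
conn.executemany(
    "INSERT INTO parcels VALUES (?, ?, ?, ?)",
    [
        (1, "POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))", "Smith", "R1"),
        (2, "POLYGON ((10 0, 20 0, 20 10, 10 10, 10 0))", "Jones", "C2"),
    ],
)

# Each returned row is one feature: its ID, its geometry, and an attribute.
features = conn.execute(
    "SELECT objectid, shape, owner FROM parcels ORDER BY objectid"
).fetchall()
for objectid, shape, owner in features:
    print(objectid, owner, shape)
```

The point of the sketch is only that geometry and attributes sit side by side in the same row, so ordinary table operations apply to features.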
A key geodatabase strategy is to leverage the RDBMS to scale GIS datasets to extremely large sizes and numbers of users, from small databases for one or a few users up to instances with hundreds of millions of features and thousands of simultaneous users. Tables provide the primary storage mechanism for geographic datasets. SQL is very strong at query and set processing of rows in tables, and the geodatabase strategy is designed to leverage these capabilities.
The geodatabase supports SQL access to feature geometry in the following DBMSs:
- Oracle (using the ArcSDE SQL type or the Oracle Spatial SQL type, if you use Oracle Spatial)
- IBM DB2
- IBM Informix
- Microsoft SQL Server
- PostgreSQL (using the ArcSDE SQL type or the PostGIS SQL type, if you wish to use PostGIS)
The underlying SQL API for ArcSDE is based on the ISO SQL/MM Spatial standard and the OGC Simple Features specification for SQL, which extend SQL with standard vector geometry types.
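To give a feel for what SQL extended with geometry functions looks like, the following sketch registers a Python function under the name ST_Area in an in-memory SQLite database and uses it in a WHERE clause. ST_Area is a function name from the SQL/MM family, but everything else here is an assumption for illustration: the parcels table, the WKT storage, and the shoelace-formula implementation are stand-ins, not the ArcSDE, Oracle Spatial, or PostGIS implementations.

```python
import sqlite3


def wkt_polygon_area(wkt):
    """Area of a single-ring WKT polygon, computed with the shoelace formula."""
    ring = wkt[wkt.index("((") + 2 : wkt.index("))")]
    points = [tuple(float(v) for v in pair.split()) for pair in ring.split(",")]
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under a SQL/MM-style name (stand-in only).
conn.create_function("ST_Area", 1, wkt_polygon_area)

conn.execute("CREATE TABLE parcels (objectid INTEGER PRIMARY KEY, shape TEXT)")
conn.executemany(
    "INSERT INTO parcels VALUES (?, ?)",
    [
        (1, "POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))"),  # 10 x 10 square
        (2, "POLYGON ((0 0, 4 0, 4 3, 0 3, 0 0))"),      # 4 x 3 rectangle
    ],
)

# Geometry takes part in ordinary SQL set processing through the function.
large = [
    row[0]
    for row in conn.execute(
        "SELECT objectid FROM parcels WHERE ST_Area(shape) > 50 ORDER BY objectid"
    )
]
print(large)
```

In a DBMS with a real spatial type, the function would operate on the native geometry column rather than on text, but the query would have the same general shape.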
Advanced geographic data types extend feature classes, rasters, and attribute tables
Various geodatabase elements are used to extend simple tables, features, and rasters to model spatial relationships, add rich behavior, improve data integrity, and extend the geodatabase's capabilities for data management.
The geodatabase schema includes the definitions, integrity rules, and behavior for each of these extended capabilities. These include properties for coordinate systems, coordinate resolution, feature classes, topologies, networks, raster catalogs, relationships, domains, and so forth. This schema information is persisted in a collection of geodatabase meta tables in the DBMS. These tables define the integrity and behavior of the geographic information.
All GIS users will work with three fundamental dataset types regardless of the system they use. They'll have a set of feature classes (much like a folder full of ESRI shapefiles), they'll have a number of attribute tables (such as dBASE files, Microsoft Access tables, Excel spreadsheets, DBMS tables, and so forth), and most of the time, they'll also have a large set of imagery and raster datasets to work with.
Fundamentally, all geodatabases will contain this same kind of content. This collection of datasets can be thought of as the universal starting point for your GIS database design.
As necessary, users will extend their data models to support certain essential capabilities. The geodatabase has a number of additional data elements and dataset types that can be used to extend this fundamental collection of datasets.
See Extending tables, Extending feature classes, and Extending rasters for more information.
ArcSDE geodatabases support versioning and long transactions
In addition to supporting rich data types such as annotation, topology, networks, terrains, and address locators, all of which work on extremely large, high-performance databases, the geodatabase supports a strong transaction framework for managing many data management workflows and operations, such as the following:
- Support multiple simultaneous editors, which many situations require.
- Check out and check in updates.
- Synchronize multiple copies by sharing change-only updates between replicas that can be in any number of DBMS types (such as Oracle and SQL Server) and need not be connected.
- Create, manage, and use historical archives (for example, analyze and overlay the state of the parcel database on May 1, 2006).
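The historical archive example can be sketched as an "as of" query. The sketch below is a simplified illustration under stated assumptions, not a description of how ArcSDE archiving is implemented: the table parcels_history, the columns valid_from and valid_to, and the far-future date that marks a current row are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical archive table: each row records one state of a parcel and the
# period during which that state was current (ISO dates compare as text).
conn.execute(
    """
    CREATE TABLE parcels_history (
        parcel_id  INTEGER,
        owner      TEXT,
        valid_from TEXT,
        valid_to   TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO parcels_history VALUES (?, ?, ?, ?)",
    [
        (101, "Smith", "2005-01-10", "2006-03-15"),
        (101, "Jones", "2006-03-15", "9999-12-31"),   # current row
        (102, "Lee", "2004-07-01", "2006-08-20"),
        (102, "Garcia", "2006-08-20", "9999-12-31"),  # current row
        (103, "Chen", "2006-06-01", "9999-12-31"),    # created after May 1, 2006
    ],
)


def state_as_of(connection, moment):
    """Return the rows that were current at the given ISO date."""
    return connection.execute(
        "SELECT parcel_id, owner FROM parcels_history "
        "WHERE valid_from <= ? AND valid_to > ? ORDER BY parcel_id",
        (moment, moment),
    ).fetchall()


snapshot = state_as_of(conn, "2006-05-01")
print(snapshot)
```

Because every past state is kept as its own row, the state of the database at any moment can be recovered with a single filter on the two date columns.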
See Versioning for more information.