Example raster case studies including the storage decisions
Last modified September 22, 2008
The following case studies describe real uses of raster data. Each one covers the raster datasets or raster catalogs involved and the storage decisions that were made.
The Texas Natural Resources Information System (TNRIS) is a state government division of the Texas Water Development Board. It acts as a clearinghouse for the state’s maps, aerial photography, and digital natural resources data. TNRIS has a significant database of raster data, including both digital elevation and color orthophotography. TNRIS needs a way to manage and view these datasets.
The orthophoto dataset contains approximately 17,000 color infrared orthophotos at 1-meter resolution. The original size of this imagery is roughly 2.7 terabytes. TNRIS chose to build a raster catalog of one-quarter degree mosaics (four mosaics per 1-degree cell) for two main reasons: irregular update cycles and the need to preserve the original extent of the United States Geological Survey (USGS) 1:24,000 topographic map series.
Orthophoto updates are received by TNRIS at irregular intervals from many organizations, including a number of local governments, counties, and collaborative partnerships. A major partnership was established between TNRIS and USGS for data collection to support the National Map. Each organization provides updates according to its own internal data collection timelines. There is no regular, statewide update schedule, but rather a continuous series of updates for various extents.
Given the variable data update cycles, building a single, large mosaic would require significant update time and management. Also, the basemap requirements are based on the existing 1:24,000-scale USGS map sheets. TNRIS specifically chose to receive raster updates in one-quarter degree mosaics (four mosaics per 1-degree cell) that align with the USGS 1:24,000-scale maps.
The TNRIS orthophoto design met the following requirements:
An example of an enterprise design to manage elevation in a raster dataset is the National Elevation Dataset (NED) produced by the USGS.
The NED provides elevation data covering the continental United States, Alaska, Hawaii, and the island territories in a seamless format with a consistent projection, resolution, elevation units, and horizontal and vertical datums. This dataset was mainly built from digital elevation models at a scale of 1:24,000 over the conterminous United States and islands and 1:63,360 for Alaska. A series of quadrangle-based DEMs were mosaicked into a single, continuous national raster dataset.
The NED was originally produced at a resolution of one arc-second (approximately 30-meter pixels) for the conterminous United States and two arc-seconds (approximately 60-meter pixels) for Alaska. In addition to the standard 30-meter data, the NED is continuously updated with 10-meter and 3-meter sources, gradually migrating the nationwide DEM to these finer resolutions.
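The approximate pixel sizes quoted for the NED follow from the ground length of an arc-second. The sketch below is illustrative only: it assumes a spherical earth with a mean radius of 6,371 km, and the function name is hypothetical rather than part of any ArcGIS API.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumption: mean spherical earth radius


def arc_seconds_to_meters(arc_seconds, latitude_deg=0.0):
    """Return (north_south_m, east_west_m) ground length of an angular step."""
    step_radians = math.radians(arc_seconds / 3600.0)
    north_south = EARTH_RADIUS_M * step_radians
    # East-west spacing shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


ns1, ew1 = arc_seconds_to_meters(1.0)
ns2, ew2 = arc_seconds_to_meters(2.0)
print(f"1 arc-second: about {ns1:.1f} m north-south")
print(f"2 arc-seconds: about {ns2:.1f} m north-south")
```

Because east-west spacing narrows toward the poles, the 30-meter and 60-meter figures are nominal values rather than exact cell dimensions.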
As of December 2003, approximately 43 percent of the conterminous United States was available in 10-meter resolution, and a coverage over Puget Sound, Washington, was at 3-meter resolution.
Managing the NED as a single raster dataset is a key requirement. The issues that had to be addressed included overlap, compression, and resampling.
Overlap
Because DEM data does not overlap, assembling a continuous raster dataset is much easier.
Compression
A lossless compression, such as LZ77, is required. A lossless technique produces a larger dataset than a lossy one, but the raster values are preserved exactly, which is essential for analysis.
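To illustrate what lossless means for elevation values, the sketch below round-trips a block of samples through Python's zlib module. This is a stand-in, not the geodatabase's own LZ77 implementation: zlib uses DEFLATE, which combines LZ77 with Huffman coding. The elevation values are hypothetical.

```python
import array
import zlib

# Hypothetical 16-bit elevation samples in meters (not taken from the NED).
elevations = array.array("h", [1200, 1201, 1203, 1203, 1204, 1210, 1210, 1210] * 512)
raw = elevations.tobytes()

# Compress, then decompress back into a new array of the same type.
compressed = zlib.compress(raw, 9)
restored = array.array("h")
restored.frombytes(zlib.decompress(compressed))

print(len(raw), "bytes raw,", len(compressed), "bytes compressed")
print("values identical after round trip:", restored == elevations)
```

Every restored value matches the input, so any analysis run on the decompressed data sees the original elevations.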
Resampling type for building pyramids
DEM data is continuous; therefore, cubic convolution should be chosen to build the pyramids because it displays with a smooth, crisp appearance. In certain cases, bilinear interpolation may reveal a smoother appearance, which may be regarded as favorable. However, keep in mind that the primary purpose of DEM data is analysis.
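As a rough illustration of how the resampling choice changes a pyramid level, the sketch below halves the resolution of a tiny grid in two ways. It is a simplified assumption of how a reduced level might be computed, not the ArcGIS implementation, and cubic convolution is omitted for brevity. Sampling at the center of each 2x2 block makes the four bilinear weights equal to 1/4, so bilinear reduces to a block average here.

```python
def downsample_nearest(grid):
    """Keep the upper-left cell of each 2x2 block (no new values are created)."""
    return [row[::2] for row in grid[::2]]


def downsample_bilinear(grid):
    """Sample at the center of each 2x2 block, where all four weights are 1/4."""
    out = []
    for r in range(0, len(grid) - 1, 2):
        out.append([
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
            for c in range(0, len(grid[r]) - 1, 2)
        ])
    return out


# Hypothetical 4x4 block of elevation values.
dem = [
    [100, 104, 110, 118],
    [102, 106, 112, 120],
    [108, 112, 118, 126],
    [116, 120, 126, 134],
]

print(downsample_nearest(dem))   # values drawn from the original cells
print(downsample_bilinear(dem))  # smoothed values not present in the input
```

Nearest neighbor keeps only values that exist in the source, while the interpolating method produces new, smoothed values, which is why it tends to look better for continuous data.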
The National Geographic Society scanned all the 1:24,000-scale USGS topographic maps for the entire United States. This produced an enormous number of data files, which are distributed as the National Geographic TOPO! series.
The project goal for this case study was to mosaic these maps together for each state and deliver them as a raster catalog of mosaicked images to be used as backgrounds for mapping. Because of the size of the mosaics, a raster catalog of mosaicked maps for each state was used to organize all the individual states into a single countrywide database using ArcSDE.
Scanned maps are best used at the scale range for which they were originally created. For example, viewing a 1:24,000 map at a scale of 1:100,000 will not allow you to clearly see all the details on the map or read all of its labels. Conversely, viewing the same map at a scale of 1:5,000 makes it appear blocky, because zooming in to a larger scale does not reveal any more detail than the scan contains.
Maps should be scanned at an appropriate resolution. Scanning a map at too high a resolution (for example, 1,000 dpi) may not introduce any more information or resolution but will result in a much larger dataset. On the other hand, if you scan the map at too low a resolution (such as 72 dpi), then you may not capture all the information contained on the map clearly and text may not be legible at the target scale.
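One way to reason about scan resolution is to convert dots per inch into the ground distance covered by a single pixel at the map's scale. The sketch below does this arithmetic; the function name is hypothetical, and the 250 dpi value is an arbitrary middle setting chosen only for comparison.

```python
INCH_IN_METERS = 0.0254


def ground_pixel_size_m(scale_denominator, dpi):
    """Ground distance covered by one scanned pixel, in meters."""
    return scale_denominator * INCH_IN_METERS / dpi


# 72 and 1,000 dpi come from the text; 250 dpi is an assumed middle value.
for dpi in (72, 250, 1000):
    size = ground_pixel_size_m(24_000, dpi)
    print(f"{dpi:>5} dpi at 1:24,000 -> {size:.2f} m per pixel")
```

Comparing the resulting pixel size with the smallest feature or text stroke that must remain legible gives a starting point for choosing a resolution, which can then be confirmed by prototyping.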
The original scanned map raster datasets were in a TPS format, and each raster dataset had its own color map. Not every state and map sheet used the same color map. Because a mosaicked raster dataset can have only one color map associated with it, no single color map would be appropriate for the complete raster dataset. To solve this problem, the raster datasets were converted into three-band TIFF files using an RGB color scheme. This change in color scheme increased the file sizes. For example, a prototype built from 1,000 scanned maps grew from approximately 60 GB to 100 GB after conversion to RGB.
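The color map problem can be sketched with two small indexed rasters whose palettes disagree. The palettes, pixel values, and function name below are hypothetical; the point is only that expanding each index to explicit red, green, and blue bands makes pixel values comparable across sheets.

```python
# Hypothetical color maps: index 1 means a different color on each sheet.
sheet_a_map = {0: (255, 255, 255), 1: (0, 0, 0)}
sheet_b_map = {0: (255, 255, 255), 1: (120, 72, 0)}

sheet_a = [0, 1, 1, 0]  # indexed pixels, one value per pixel
sheet_b = [1, 0, 0, 1]


def expand_to_rgb(indexed_pixels, color_map):
    """Look up each palette index and return separate red, green, and blue bands."""
    triples = [color_map[i] for i in indexed_pixels]
    red = [t[0] for t in triples]
    green = [t[1] for t in triples]
    blue = [t[2] for t in triples]
    return red, green, blue


bands_a = expand_to_rgb(sheet_a, sheet_a_map)
bands_b = expand_to_rgb(sheet_b, sheet_b_map)

# Band-wise concatenation stands in for mosaicking the two sheets side by side.
mosaic = tuple(a + b for a, b in zip(bands_a, bands_b))
print(mosaic)
```

Each pixel now carries three values instead of one index, which is the source of the size increase before any compression is applied.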
The National Geographic TOPO! datasets were also reprojected to a geographic coordinate system (latitude–longitude) using the NAD83 datum. Because the original USGS topographic maps were projected to Universal Transverse Mercator (UTM), text symbology did not appear straight, and a seamless image could not be built across UTM zones. Using a geographic coordinate system overcomes both problems.
JPEG compression was used to store the raster datasets with a quality setting of 50. It was determined through prototyping that this still provided an adequate level of detail at the intended map scale while reducing the overall raster dataset file size.
Pyramids were built to speed up raster display. Normally, for continuous data such as this RGB representation, cubic convolution would be the best resampling option. In this case, however, bilinear interpolation produced the highest-quality display. This is one example of why you should prototype a small portion of your raster dataset before making any permanent decisions.
Details of the raster data in the geodatabase:
A raster catalog can be used to manage a time series of raster datasets. This generally refers to raster data collected as observations at different times at a single location. Examples include collecting satellite imagery over a short period of time to monitor natural disasters, such as flooding or fires, or over longer periods of time to show the patterns of urban sprawl or the changes in a forested area due to cutting and regrowth.
This case study is based on a dataset of satellite imagery captured using Geostationary Operational Environmental Satellites (GOES) collected over 17 days to capture the movements of Hurricane Mitch in 1998.
The National Oceanic and Atmospheric Administration (NOAA), part of the U.S. Department of Commerce, develops and manages the GOES satellites. GOES satellites were originally designed to monitor the earth’s atmosphere and surface over a large region. The first GOES satellite was launched in 1975, and every few years, additional GOES satellites have been launched to replace aging ones and to provide additional information and coverage. Because GOES satellites are geostationary, each continually collects data over the same area of the earth. This makes it possible to track continually changing phenomena such as weather systems. GOES can be used to monitor potential severe weather conditions, such as hurricanes and thunderstorms; estimate rainfall or snowfall; map the movement of sea ice; detect forest fires; and monitor volcano plumes.
The dataset used in the case study was downloaded from NOAA’s National Climatic Data Center (http://www.ncdc.noaa.gov/oa/climate/climateinventories.html). It included 33 images, collected at 12:00 A.M. and P.M. coordinated universal time (UTC), from October 14, 1998, until October 31, 1998.
Loading this data into a raster catalog to use in ArcGIS was straightforward. By loading each dataset into a raster catalog, the data could be managed together. New datasets can be appended at any time, and the dataset can be viewed as an animation to show the movement of the hurricane over time. A raster catalog can contain data with various extents, spatial resolutions, and data types; therefore, different raster datasets can be added to enhance the present raster dataset.
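The idea of managing a time series as a catalog can be sketched with a small in-memory structure that accepts new items at any time and returns them in acquisition order for animation. The class, file names, and starting timestamp below are hypothetical and do not represent the ArcGIS raster catalog API.

```python
from datetime import datetime, timedelta, timezone


class RasterCatalog:
    """Hypothetical in-memory stand-in for a raster catalog table."""

    def __init__(self):
        self._items = []  # list of (acquired_utc, raster_name)

    def append(self, acquired_utc, name):
        """Add a raster; items may arrive in any order."""
        self._items.append((acquired_utc, name))

    def frames(self):
        """Return raster names ordered by acquisition time, as for an animation."""
        return [name for _, name in sorted(self._items)]


catalog = RasterCatalog()
start = datetime(1998, 10, 14, 12, 0, tzinfo=timezone.utc)  # assumed first timestamp

# Append out of order on purpose to show that load order does not matter.
for step in (2, 0, 1):
    acquired = start + timedelta(hours=12 * step)
    catalog.append(acquired, f"goes_{acquired:%Y%m%d_%H%M}.tif")

print(catalog.frames())
```

Keeping the acquisition time as an attribute of each item, rather than relying on load order, is what allows later images to be appended without disturbing the sequence.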
LZ77 compression was used to store the raster datasets in the raster catalog because it is lossless. Although this data is not intended for statistical analysis, storage size was not a primary concern and high image quality was important for display, so a lossless method was still preferred. Nearest neighbor was chosen for pyramid resampling because it provided the best display.
Because the raster datasets are small (under 2 GB combined), they could have been loaded into a personal geodatabase. Loading data into an ArcSDE geodatabase can take longer than loading it into a personal geodatabase but results in much faster display.
Displaying the raster catalog as a movie is a simple process. A setting in ArcMap on the raster catalog’s Layer Properties dialog box lets you specify to automatically display each raster dataset in the raster catalog in a time sequence.