Key concepts for geoprocessing services
Release 9.3
Note: This topic was updated for 9.3.1.
A geoprocessing service contains geoprocessing tasks accessible by Web-enabled clients. Tasks are created by publishing geoprocessing model and script tools.
There are two ways of creating a geoprocessing service in ArcGIS Desktop:
- Publish a geoprocessing toolbox. Each tool in the toolbox becomes a task in the geoprocessing service.
- Publish an ArcMap document containing geoprocessing tool layers. Each tool layer becomes a task in the geoprocessing service.
Geoprocessing services and their tasks are accessed across the public Internet and private intranets and can be used in ArcGIS Desktop, ArcGIS Explorer, and Web applications such as a Web site built using ArcGIS Server Manager. In ArcGIS Desktop, geoprocessing services can be added to the ArcToolbox window as a toolbox, and the tasks become tools within the toolbox. Click here to view illustrations of geoprocessing services in these three clients.
The remainder of this topic is about the key design concepts, rules, and guidelines concerning geoprocessing services and tasks.
Crafting a geoprocessing service
Web clients are lightweight applications: they only know how to send packets of simple data to a server, such as text, numbers, and uncomplicated geographic features. A geoprocessing service takes this simple data and turns it into something extraordinary: the probable evacuation area for a hazardous chemical spill, the predicted track and strength of a gathering hurricane, a map of land cover within a user-defined watershed, a parcel map with details of ownership dating back 100 years, or a permit for a parade route through downtown. The possibilities are infinite.
Geoprocessing services turn simple input data into useful geographic information. Geoprocessing provides the base of rich and powerful tools from which you craft a service that responds to simple input data. If you currently use geoprocessing, you already understand how you can create tools using models and scripts. What you may need to learn is how to craft your tool to work with the simplest possible input data to reach the largest possible audience of clients.
For example, suppose you have a model that computes an upstream watershed from a set of points, then clips input polygons within the computed watershed. The input parameters for this model are the following:
- A raster digital elevation model (DEM)
- A point feature class defining the pour points of the watersheds
- A polygon feature class that will be clipped to the computed watersheds
This model is appropriate for ArcGIS Desktop but not for a geoprocessing service. It would need to be modified as follows:
- Instead of operating in a study area defined by the user, it would work in a specific study area, such as the southern Sierra Nevada mountain range. Since the model has a known study area and operates on a specific DEM (the southern Sierra Nevada range), the DEM would no longer be an input parameter.
- Clients of a geoprocessing service cannot upload feature classes. Instead, the user would digitize points in the client application that represent watershed pour points. The service would snap these points to the DEM using a precomputed snap distance appropriate for the resolution of the DEM.
- Instead of extracting polygons from an input feature class, the model would operate on a known set of data such as land cover polygons. The output would be land cover polygons within the computed watershed. The polygon feature class would no longer be an input parameter.
The modified model takes simple inputs and answers a specific spatial query: "For the southern Sierra Nevada, what is the land cover within the watersheds defined by these points?" The model would be published as a geoprocessing service and used in a Web site meant for land managers in the southern Sierra Nevada mountain range.
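The reshaping described above can be sketched in plain Python. This is a minimal illustration, not ArcGIS code: the study-area extent, the snap distance, and the toy flow-accumulation grid are all hypothetical values, and snapping to the highest-accumulation cell within the snap distance is an assumption about how the snapping step might work. The point is that the client supplies only points, while everything else is fixed by the service author.

```python
from math import hypot

# Hypothetical values fixed by the service author, not supplied by the client.
STUDY_AREA = (0.0, 0.0, 100.0, 100.0)  # xmin, ymin, xmax, ymax
SNAP_DISTANCE = 15.0                   # precomputed for the DEM resolution

# Toy flow-accumulation values keyed by cell-center coordinates (assumed data).
FLOW_ACCUMULATION = {
    (10.0, 10.0): 5,
    (20.0, 10.0): 120,
    (20.0, 20.0): 40,
    (80.0, 80.0): 300,
}


def in_study_area(point, extent=STUDY_AREA):
    """Return True if the point falls inside the fixed study area."""
    x, y = point
    xmin, ymin, xmax, ymax = extent
    return xmin <= x <= xmax and ymin <= y <= ymax


def snap_pour_point(point, cells=FLOW_ACCUMULATION, distance=SNAP_DISTANCE):
    """Return the highest-accumulation cell center within `distance`, or None."""
    x, y = point
    candidates = [
        (value, center)
        for center, value in cells.items()
        if hypot(center[0] - x, center[1] - y) <= distance
    ]
    if not candidates:
        return None
    return max(candidates)[1]


def prepare_pour_points(points):
    """Drop points outside the study area and snap the remaining ones."""
    snapped = []
    for point in points:
        if not in_study_area(point):
            continue
        target = snap_pour_point(point)
        if target is not None:
            snapped.append(target)
    return snapped
```

Calling `prepare_pour_points` with a list of digitized coordinate pairs returns only the snapped points that fall inside the study area; the DEM and the land cover data never appear as parameters.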
This is not to say that you cannot build generic services using ArcGIS Server. The example buffer points service is a wholly generic service that buffers any set of point features. ArcGIS Server is flexible, and with the use of some advanced techniques or custom programming, you can build generic services that process large datasets submitted by users. But the majority of services are focused on specific geographic areas, answer specific spatial queries, and work on lightweight clients. The design of ArcGIS Server geoprocessing services is driven by the need to build and serve these types of focused services.
Geoprocessing service configurations
Geoprocessing services can be created by publishing two different ArcGIS Desktop resources: a geoprocessing toolbox or an ArcMap document (.mxd) containing tool layers.
- When you publish a toolbox, all tools within the toolbox become geoprocessing tasks within the geoprocessing service.
- When you publish an ArcMap document, all tool layers within the map document become geoprocessing tasks within the geoprocessing service. (Tool layers are created by dragging and dropping tools into the ArcMap table of contents.)
- When publishing an ArcMap document containing tool layers, you can also specify that you want the ArcMap document to become a map service that will be used to draw the output of tasks. A map service that draws task outputs is called a result map service.
These three configurations are illustrated below.
Geoprocessing service from a toolbox
When you publish a toolbox, all tools within the toolbox become geoprocessing tasks. Data output by tasks is transported back to the client.
Geoprocessing services with a source map document
If you have used geoprocessing tools in an ArcMap session, you know that tools can use layers found in the ArcMap table of contents, as illustrated below. For example, the Point Distance tool can accept either a point feature class on disk or an ArcMap layer that references a point feature class on disk.
In the same way, your geoprocessing task can use layers found in its source map document. The source map document, in this case, acts as a container of layers. You can make layers in the source map document input parameters to your task, as illustrated above, where the Data to extract variable is an input parameter, allowing the user to choose layers in the source map document.
NOTE: A geoprocessing task can only access layers found in its source map document. It cannot access layers found in other map services or in the client application.
There are performance benefits to using layers from a source map document in your model or script processes. The illustration below shows a model that uses a network dataset, StreetsNetwork, to construct a route analysis layer. The StreetsNetwork variable can either reference a layer (which it does in this case) or a dataset on disk. Opening a network dataset is expensive relative to other kinds of datasets because network datasets contain several advanced data structures and tables that must be read and cached. Using the layer instead of the dataset gives a performance advantage because ArcMap opens the dataset once, caches basic properties of the dataset, and keeps the dataset open. When the model executes, the dataset does not have to be reopened since the source map document already has it open. Conversely, if the StreetsNetwork variable directly referenced the dataset on disk, the dataset would be opened each time the model executes, which degrades performance.
For network analysis, you should always add the network dataset as a layer in the source map document and use that layer in model variables. For other kinds of datasets, such as features and rasters, the performance advantage of using layers in the source map document is slight.
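The difference between the two approaches can be illustrated with a plain-Python analogy. Nothing here is an ArcGIS API: `open_network_dataset` is a hypothetical stand-in for the expensive open, and the cached `get_layer` function plays the role of the source map document, which keeps the dataset open between executions.

```python
from functools import lru_cache

# Counts how many times the (simulated) expensive open happens.
stats = {"opens": 0}


def open_network_dataset(path):
    """Hypothetical stand-in for opening a network dataset from disk."""
    stats["opens"] += 1
    return {"path": path, "loaded": True}


@lru_cache(maxsize=None)
def get_layer(path):
    """Analog of a layer in the source map document: opened once, then reused."""
    return open_network_dataset(path)


def solve_route_with_layer(path):
    """Each execution reuses the dataset that is already open."""
    return get_layer(path)["path"]


def solve_route_from_disk(path):
    """Each execution reopens the dataset from disk."""
    return open_network_dataset(path)["path"]
```

Running `solve_route_with_layer` repeatedly triggers a single open, while every call to `solve_route_from_disk` pays the opening cost again.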
Geoprocessing services with a result map service
Geoprocessing services can have a result map service to create a digital map image of task results. Digital maps contain visual representations of geographic datasets that communicate vast quantities of information to your user. Digital maps are transported across the Web as images (such as a .jpeg). Byte for byte, a map image contains far more human-interpretable information than raw features in a feature class. Map images are also manageable: they are easily compressed, they can be tiled into manageable chunks, and there are established methods for transporting and viewing them across the Web.
Map images are created by an ArcGIS Server map service and are the result of publishing an ArcMap document (.mxd). Because of the characteristics of a map image, you may want to create one for the results of your geoprocessing task and then transport the image across the Web rather than transport the resulting dataset or datasets. Geoprocessing services can have a result map service used by ArcGIS Server to create map images of your output data. A result map service contains a tool layer for each tool.
Result map services are used when:
- The result of your task is a (potentially) large dataset.
- The data type of your output is unsupported by the client, such as rasters in ArcGIS Explorer. In this case, you use the result map service to display the output.
- You want to protect the result of your task by allowing it only to be viewed as a map and not downloaded as a dataset.
- You have complex cartography that needs to be drawn by the result map service and not by the client.
When you use a result map service, it's important to realize that there are two services—the geoprocessing service and the result map service. These two services execute independently of each other. When the task executes, ArcGIS Server executes the geoprocessing task first and then executes the result map service to draw the output of the geoprocessing service. Because of this execution order, the result map service needs datasets on disk produced by the geoprocessing service. This means that the output of the tasks in the geoprocessing service must be datasets on disk, not layers or in_memory datasets.
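Because the result map service reads what the geoprocessing task wrote, it can help to check output paths before publishing. The following is a minimal sketch, assuming output locations are available as path strings; the function names are hypothetical, and the check only looks for the in_memory workspace at the start of a path.

```python
def is_on_disk_output(path):
    """Return False when the path points into the in_memory workspace."""
    first_part = path.replace("\\", "/").split("/")[0].lower()
    return first_part != "in_memory"


def find_unusable_outputs(output_paths):
    """List the outputs a result map service would be unable to draw."""
    return [path for path in output_paths if not is_on_disk_output(path)]
```

For example, passing `["in_memory/outWatershed", "%scratchworkspace%/scratch.gdb/outWatershed"]` to `find_unusable_outputs` flags only the first path.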
Basemaps for your geoprocessing service
For most geoprocessing tasks, your user will need some sort of basemap to use as a geographic reference. It may be a basemap of roads, populated places, and points of interest. Or it may be a basemap of an electric utility infrastructure: generating stations, powerlines, and substations—features that constitute a map of a world well-known to an electrical engineer.
Your geoprocessing service probably relies on a particular basemap to guide your user when entering locations. For example, if your user inputs a point that must fall within a parcel boundary in the city of Portland, you need a basemap showing these parcel boundaries. Furthermore, your service may only work within a certain study area (such as Portland), as opposed to a service that works globally. The study area can be thought of as the geoprocessing extent since the service only has knowledge of data within the study area.
A common mistake is to use the result map service to display a basemap for your geoprocessing service. For example, if your user identifies a parcel by the point-and-click method, and your task draws the identified parcel symbolized by some attribute of the parcel, your initial thought might be to use the result map service to display the input parcel data (the basemap) as well as the identified parcel. There are two reasons why you should not use the result map service as a basemap:
- When a result map service is added to the application, all layers in the map service are available for display. These layers include geoprocessing tool layers used to draw output, layers that may contain sensitive data, and layers that are used by your geoprocessing service but make no sense to the user (like tool layers).
- Basemaps are multiscale and multiresolution. As you zoom in and out, the basemap changes, showing details at large scales and aggregating details into generalizations at small scales (for example, rivers change from line features at small scale to polygon features at large scales). Constructing a multiscale and multiresolution basemap that draws quickly and effortlessly is not something your result map service needs to be dealing with—its job is to draw outputs. You need to keep the design and implementation of basemap map services separate from the design and implementation of result map services.
Returning to the point-and-click-a-parcel application, you should have a map service to display the parcel data (the basemap) and use the result map service to return the parcel for display, perhaps color-coded by some attribute. Both map services use the same parcel dataset (there are no issues with this), and you divide the work of displaying a reference basemap from the work of displaying results.
Another consideration in designing map services is the characteristics of the client. For a Web application, you have full control over what map and geoprocessing tasks will be available in the application, and result map services need not appear as a map layer in the table of contents of the Web application. ArcMap and ArcGIS Explorer clients are a bit more problematic since, in practice, users can browse to any map or geoprocessing service and end up with mismatches between the basemap extent and the geoprocessing extent. When publishing services, there is no option where you can specify, "When adding this geoprocessing service, also add these other map services." You can, however, distribute ArcMap documents (.mxd) or ArcGIS Explorer documents (.nmf) that contain the correct services. You can also provide built-in task documentation for your geoprocessing services and tasks detailing what map services are needed. Task documentation is accessible by all clients.
Data types and capabilities of the client
ArcGIS Explorer is a lightweight client application, meaning it has a small installation, unlike ArcGIS Desktop. Web applications are Web sites accessed with an Internet browser. Browsers are very lightweight (or thin) clients. Because of the lightweight nature of these clients, the full range of input and output data types you find on ArcGIS Desktop are not available to these clients (otherwise, the client would be heavyweight, like ArcGIS Desktop). For example, lightweight clients do not support rasters as input to a geoprocessing task.
NOTE: Since processes within your published model or script execute on the server where all data types are available, you can use any data type for model or script processes, as illustrated below. Only the input and output parameter data types are limited. Any type of data that can be accessed by the server can be used by your model or script processes.
The following table summarizes key input parameter data types for the three clients.
| Input parameter data type | Supported on ArcGIS Desktop clients? | Supported on ArcGIS Explorer client? | Supported on Web application clients? |
| --- | --- | --- | --- |
| Feature Set | Yes | Yes | Yes |
| Record Set | Yes | Yes | Yes |
| Feature Class | No (but Feature Class input is supported indirectly with the Feature Set data type) | No | No |
| Table | No (but Table input is supported indirectly with the Record Set data type) | No | No |
| Raster | Yes | No | No |
| Standard types (such as Long, Double, Boolean, Date, String) and Linear Unit (for example, "1,000 meters") | Yes | Yes | Yes |
| File (such as a .zip or .xml file) | Yes | Yes | Yes |
| Layer (any type of layer; for example, Feature Layer, Raster Layer, Network Analyst Layer) | Only those layers found in a result map service or source map document. | Only those layers found in a result map service or source map document. | Only those layers found in a result map service or source map document. |
Any data type not listed above is either converted to a string data type or not allowed. The topic Input and output data types goes into greater detail about data types for geoprocessing services.
Very likely, your existing models and scripts take feature classes and tables as input, since these are the most common geographic dataset types. This means your existing models and scripts will have to be modified before publishing them as geoprocessing tasks. If your model or script takes a feature class input, you can modify it to take a feature set instead. If your model takes a table as input, you can modify your model to take a record set instead.
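When reviewing an existing tool before publishing, the input parameter table above can be treated as a lookup. The sketch below transcribes part of that table into a dictionary (the Layer and standard-type rows are omitted for brevity) and reports which parameter types need attention. The client keys and function name are hypothetical, chosen for illustration.

```python
# Support matrix transcribed from the input parameter table in this topic.
INPUT_SUPPORT = {
    "Feature Set": {"desktop": True, "explorer": True, "web": True},
    "Record Set": {"desktop": True, "explorer": True, "web": True},
    "Feature Class": {"desktop": False, "explorer": False, "web": False},
    "Table": {"desktop": False, "explorer": False, "web": False},
    "Raster": {"desktop": True, "explorer": False, "web": False},
    "File": {"desktop": True, "explorer": True, "web": True},
}

# Replacements suggested in the text for the most common unsupported inputs.
REPLACEMENTS = {"Feature Class": "Feature Set", "Table": "Record Set"}


def review_parameters(param_types, clients=("desktop", "explorer", "web")):
    """Map each problematic input type to a suggested replacement (or None)."""
    problems = {}
    for data_type in param_types:
        support = INPUT_SUPPORT.get(data_type)
        if support is None or not all(support[c] for c in clients):
            problems[data_type] = REPLACEMENTS.get(data_type)
    return problems
```

A result of `None` for a type means the table offers no direct substitute, so the parameter may need to be redesigned or restricted to the clients that support it.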
The following table summarizes the key output parameter data types for the three clients.
| Output parameter data type | Supported on ArcGIS Desktop clients? | Supported on ArcGIS Explorer client? | Supported on Web application clients? |
| --- | --- | --- | --- |
| Feature Class | Yes | Yes | Yes |
| Raster | Yes | No. Can only be displayed in the map through the use of a result map service. | No. Can only be displayed in the map through the use of a result map service. |
| Table | Yes | No. Services that have a table data type as an output parameter will not be shown in the list of available tasks. | Yes |
| Standard types (such as Long, Double, Boolean, Date, String) and Linear Unit (for example, "1,000 meters") | Yes. Viewed in the service result found in the Results tab of the ArcToolbox window. | Yes. Viewed in the Task Result. | Yes |
| File | Yes | Yes | Yes |
Concurrent use of data - %scratchworkspace%
A geoprocessing task can be used concurrently (at the same time) by several users. For data that is read by your model or script, there are no concurrency issues—concurrent users can all read the same data. However, data created or updated by your service requires that you understand issues regarding concurrent writers.
Creating new data: Geoprocessing tasks typically create intermediate data and output data. ArcGIS Server provides a mechanism that ensures that there are no concurrency issues with intermediate and output data. When a task is executed, ArcGIS Server creates a unique job folder in a jobs directory. This job folder contains a folder named scratch, which in turn contains a file geodatabase named scratch, as illustrated below. After creating this folder structure, ArcGIS Server sets the geoprocessing scratch workspace environment variable to the scratch folder (not the scratch geodatabase, but the folder). Your model and scripts can easily discover and use this scratch workspace by placing percent signs around the environment variable name ("%scratchworkspace%"). You must write your intermediate data and output data to either the scratch folder or the scratch geodatabase.
There are two methods you can use to ensure that data is written to the scratch folder or geodatabase:
- In ModelBuilder, right-click any intermediate data variable and choose Managed.
NOTE: Do not set output variables to Managed, only intermediate variables.
- Use variable substitution ("%scratchworkspace%") for pathnames such as these examples:
%scratchworkspace%/templines.shp
%scratchworkspace%/scratch.gdb/outWatershed
Learn more about the scratch workspace environment
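The per-job folder mechanism and the variable substitution can be mimicked in plain Python to show why concurrent jobs do not collide. This is an illustrative sketch only: the function names are hypothetical, the job identifier format is an assumption, and the scratch.gdb directory created here is an empty placeholder rather than a real file geodatabase.

```python
import os
import tempfile
import uuid


def create_job_folder(jobs_dir):
    """Create <jobs_dir>/<job id>/scratch/scratch.gdb and return the scratch folder."""
    job_id = uuid.uuid4().hex  # assumed identifier format, unique per job
    scratch_folder = os.path.join(jobs_dir, job_id, "scratch")
    # Placeholder directory standing in for the scratch file geodatabase.
    os.makedirs(os.path.join(scratch_folder, "scratch.gdb"))
    return scratch_folder


def expand(path, scratch_folder):
    """Substitute the job's scratch folder for %scratchworkspace% in a path."""
    return path.replace("%scratchworkspace%", scratch_folder)


if __name__ == "__main__":
    jobs_dir = tempfile.mkdtemp()
    for _ in range(2):
        scratch = create_job_folder(jobs_dir)
        # The same template path resolves to a different location for each job.
        print(expand("%scratchworkspace%/scratch.gdb/outWatershed", scratch))
```

Because each job resolves `%scratchworkspace%` to its own folder, two users running the same task at the same time write to different locations.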
You can write intermediate data to memory rather than disk. Writing data to memory is faster than writing to disk. If you write intermediate data to memory, it is not necessary to make the model variable intermediate or managed because the data will be deleted once the task finishes executing.
Learn more about writing data to memory
Updating existing data: Any task that updates existing data is of particular concern. For example, a task within a service may update an existing table or feature class, adding new rows or features (using the Append tool, for example) or updating existing attributes (using the Calculate Field tool, for example). If the geoprocessing service is configured to allow multiple concurrent instances (users), and multiple instances are running, you may have a collision of multiple instances trying to update the same dataset. In these cases, you must limit the number of instances to one when configuring the service. This way, ArcGIS Server will queue the requests and only one instance at a time will access the data. Another problem is locking: if you are updating a dataset and the dataset is also a layer in a map service, the map service will put a lock on the data and the update will fail. Do not create map services that display data that is updated by another service.
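The effect of limiting a service to one instance can be sketched with a single-worker thread pool. This is an analogy in plain Python, not ArcGIS Server configuration: `parcel_table` is a hypothetical stand-in for the dataset being updated, and `max_instances` mirrors the instance limit described above.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for an existing table that a task appends rows to.
parcel_table = []


def append_rows(rows):
    """Append rows and report the row range this request occupied."""
    start = len(parcel_table)
    parcel_table.extend(rows)
    return start, len(parcel_table)


def run_requests(requests, max_instances=1):
    """Queue requests; with one instance they run strictly one at a time."""
    with ThreadPoolExecutor(max_workers=max_instances) as pool:
        return list(pool.map(append_rows, requests))
```

With `max_instances=1`, each request sees the table exactly as the previous request left it, so the reported row ranges never overlap.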
Symbology
You have two choices for drawing a task's output data:
- The client draws the data.
- The result map service draws the data.
When the client draws the output data, two pieces of information are sent to the client: the data and a layer drawing description. The layer drawing description contains the information you specify in the Symbology tab of a layer's Properties dialog box. This information includes how to group data (the layer symbology) and what symbols to use (symbol types). Only certain layer symbologies and symbol types are supported by clients.
When a result map service draws the data, the layer symbology and symbol types found in the corresponding tool layer are used. When using a result map service, you can use any layer symbology and symbol type since ArcMap (running on the server) will be drawing the data and transporting an image of the completed map back to the client. The capabilities of the client do not affect how ArcMap draws the result.
Learn more about defining output symbology for geoprocessing tasks
Validation
If you have used geoprocessing tools in ArcGIS Desktop, you have probably seen how geoprocessing tools validate your inputs. One example of validation is a list of fields changing when an input table changes, as illustrated below.
Other examples of validation include the appearance of warnings and error messages, changing of default values based on input data, and enabling and disabling of parameters.
NOTE: No validation occurs with geoprocessing tasks.
For example, suppose you create a model that has an input layer parameter and an input field parameter. The list of fields shown in the field parameter is dependent on the value of the layer parameter. In ArcGIS Desktop, if you change the layer, the list of fields changes. This is not the case with a task.
- The list of fields will contain the list at the time you published.
- If the user of your task picks another layer, the list of fields will not refresh.
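The frozen field list can be modeled in a few lines of Python. The class below is a hypothetical illustration, not an ArcGIS object: it captures the field list once at publish time and deliberately does not refresh it when the layer changes, which is the behavior described above.

```python
class PublishedTask:
    """Toy model of a task whose field list is fixed at publish time."""

    def __init__(self, layers, default_layer):
        self.layers = layers
        self.layer = default_layer
        # Snapshot taken once, when the task is published.
        self.field_choices = list(layers[default_layer])

    def choose_layer(self, name):
        """Switch layers; the field list is intentionally not refreshed."""
        self.layer = name

    def stale_fields(self):
        """Fields still offered to the user that the chosen layer lacks."""
        actual = set(self.layers[self.layer])
        return [f for f in self.field_choices if f not in actual]
```

A non-empty result from `stale_fields` shows the kind of mismatch a user can run into, which is one reason to keep field-dependent parameters tied to a single, known layer.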
Security
Your GIS server represents an investment of effort and resources that you want to protect. ArcGIS Server contains security mechanisms that can prevent unauthorized users from accessing your services and applications. You can also use ArcGIS Server to configure tiers of access for different groups within your organization. Securing your system is beyond the scope of this topic.
Click here for more information on securing your system, local connections, internet connections, and Web applications.
There are two properties of a geoprocessing service that provide security.
- You can limit the number of features or records that will be returned by the service. To prevent your user from downloading any features or records, you can set this limit to zero.
- When a geoprocessing service executes, it writes messages that can be viewed by your user. Some of these messages contain pathnames to data, and you may not want your user to view these pathnames. You can turn off messaging for a geoprocessing service.
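The two properties above can be pictured as filters applied to a task's response. The sketch below is a plain-Python illustration with hypothetical names; in particular, truncating the feature list when the limit is exceeded is an assumption made for the example, and the behavior of an actual service may differ.

```python
def build_response(features, messages, max_records, show_messages):
    """Apply a record limit and a messaging switch to a task's results."""
    return {
        # A limit of zero returns no features at all.
        "features": features[:max_records],
        "exceeded_limit": len(features) > max_records,
        # With messaging off, pathnames in messages never reach the user.
        "messages": list(messages) if show_messages else [],
    }
```

Setting `max_records` to zero and `show_messages` to `False` yields a response that carries neither data nor pathnames, which pairs naturally with a result map service that only draws the output.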