Tuning and configuring services

ArcGIS Server makes it easy to publish services right away because it sets many of the default service properties for you; however, if hundreds or thousands of users will be accessing your services, or if users will be performing stateful operations such as editing on your services, you'll want to change the default service property values to best accommodate your deployment. This topic provides an overview of some of the properties and techniques that you use to best configure your services.

Pooling

You can modify a service's properties to make it either pooled or non-pooled. Pooled services can be shared between multiple application sessions. Therefore, pooled services should only be used with stateless operations. In contrast, non-pooled services are dedicated to one application session and are used when the application requires stateful operations, such as editing. In general, create non-pooled services only for applications that edit data and connect through an ArcGIS Server Local connection.

Both pooled and non-pooled configurations require you to specify a minimum and maximum number of instances when you add the service. When you start the service configuration, the GIS server pre-creates and initializes the minimum number of instances. When an application asks the server object manager (SOM) for an instance of that service, it gets a reference to one of the pre-created instances. If all of the pre-created instances are in use, the server creates a new instance, and does so for each subsequent request until either the maximum number of instances allowed for the configuration or the capacity of all container machines has been reached, whichever comes first.
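The allocation rule described above can be sketched as a small Python model. This is only an illustration of the minimum/maximum logic, not an ArcGIS Server API; the class name ServiceConfig, its fields, and the machine_capacity_left parameter are assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class ServiceConfig:
    """Toy model of one service configuration (not an ArcGIS Server API)."""
    min_instances: int
    max_instances: int
    running: int = 0
    in_use: int = 0

    def start(self) -> None:
        # Starting the configuration pre-creates the minimum number of instances.
        self.running = self.min_instances

    def request(self, machine_capacity_left: int) -> bool:
        """Return True if an instance can be handed out right now."""
        if self.in_use < self.running:
            self.in_use += 1      # hand out an idle, already running instance
            return True
        if self.running < self.max_instances and machine_capacity_left > 0:
            self.running += 1     # create a new instance on demand
            self.in_use += 1
            return True
        return False              # the request would have to wait in the queue
```

With a minimum of 2 and a maximum of 3, the first two requests use pre-created instances, the third creates a new one, and the fourth has to wait.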

Pooled services

An application that uses a pooled service instance only uses it for the amount of time it takes to complete one request (for example, draw a map or geocode an address). After the request is completed, the application releases its reference to the service and returns it directly to the pool. Users of such an application may be working with a number of different instances of a service in the pool as they interact with the application. This fact is transparent to the users, since the state of all the instances in the pool is the same.

For example, a stateless application that wants to draw a certain extent of a map will get a reference to an instance of a map service from the pool, execute a method on the map service to draw the map, then release it back to the pool. The next time the application needs to draw the map, this is repeated. Each draw of the map may use a different instance of the pooled service; therefore, each pooled service must be the same (have the same set of layers, the same renderer for each layer, and so on). If a user changes the state of a pooled service by, for example, adding a layer or changing a layer's renderer, he or she will see inconsistent results while panning and zooming around the map. This is because the instance whose state was changed was returned to the pool, and the user is not guaranteed to receive that particular instance from the pool every time he or she requests a service. It's the developer's responsibility to make sure that the application does not change the state of the instance and that the instance is returned to the pool in a timely manner.
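The get, use, and release cycle described above can be sketched in Python with a context manager that always returns the instance to the pool. This is a minimal illustration, not ArcGIS Server code; InstancePool, borrow, and draw_map are hypothetical names, and the instances are plain dictionaries standing in for map services.

```python
import queue
from contextlib import contextmanager


class InstancePool:
    """Toy pool of interchangeable service instances (illustrative only)."""

    def __init__(self, instances):
        self._free = queue.Queue()
        for inst in instances:
            self._free.put(inst)

    @contextmanager
    def borrow(self, max_wait_seconds=1.0):
        # Block until an instance is free; queue.Empty is raised on time-out.
        inst = self._free.get(timeout=max_wait_seconds)
        try:
            yield inst
        finally:
            # Always return the instance, even if the request failed.
            self._free.put(inst)

    def free_count(self):
        return self._free.qsize()


def draw_map(instance, extent):
    # Stand-in for a stateless request: it reads from the instance
    # but never modifies it.
    return f"{instance['name']} drew {extent}"


pool = InstancePool([{"name": "map-1"}, {"name": "map-2"}])
with pool.borrow() as inst:
    image = draw_map(inst, (0, 0, 10, 10))
```

Because the instance is held only inside the with block and is never modified, any instance in the pool can serve the next request.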

Pooling services allows the GIS server to support more users with fewer resources allocated to a particular service. Because applications can share a pool of services, the number of concurrent users on the system can be greater than that which would be possible if each user held a reference to a dedicated service.

Non-pooled services

An application that makes use of a non-pooled service typically holds its reference to the service for the duration of the application's session. When the application releases the instance, it is destroyed and the GIS server creates a new one to maintain the number of available instances. For this reason, the user of a non-pooled service can make changes to the service's underlying data.

With non-pooled services, the number of users on the system can have no more than a 1:1 correlation with the number of running service instances. Therefore, the number of concurrent users the GIS server can support is equal to the number of non-pooled services that it can support effectively at any one time.

Recycling

Service recycling allows services that have become unusable to be destroyed and replaced with fresh services; recycling also reclaims resources taken up by stale services.

Pooled services are typically shared between multiple applications and users of those applications. Through reuse, a number of things can happen to a service to make it unavailable for use by applications. For example, an application may incorrectly modify a service's state, or an application may incorrectly hold a reference to a service, making it unavailable to other applications or sessions. In some cases, services may become corrupted and unusable. Recycling allows you to keep the pool of services fresh and cycle out stale or unusable services. Note that recycling does not apply to non-pooled services because non-pooled services are created explicitly for use by a particular client and destroyed after use.

During recycling, the server destroys, then re-creates each instance in a pooled service configuration. Recycling occurs as a background process on the server. Although you will not see anything on your screen notifying you that recycling is occurring, you can see the events associated with recycling in the log files.

Recycling destroys and re-creates all running instances of a service, regardless of whether those instances are above the minimum specified. To periodically return the number of running instances to the minimum specified, the service must be stopped and restarted. A good way to automate this process is to create a Python, shell, or Windows batch script that runs a custom command-line executable built with the ArcGIS Server API. This custom executable would take the server name, service name, service type, and whether the service should be started or stopped as command-line arguments. It would be implemented using IServerObjectAdmin.StartConfiguration and IServerObjectAdmin.StopConfiguration.
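A Python wrapper for such an executable might look like the following sketch. The executable name ServiceControl.exe and its argument order are assumptions made for illustration; no such tool ships with ArcGIS Server, so it would have to be written separately against IServerObjectAdmin. The server and service names in the usage note are likewise hypothetical.

```python
import subprocess

# Hypothetical name of the custom executable described above; it is not
# shipped with ArcGIS Server and must be built separately.
EXECUTABLE = "ServiceControl.exe"


def build_command(server, service, service_type, action):
    """Return the argument list for one start or stop call."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    return [EXECUTABLE, server, service, service_type, action]


def restart_service(server, service, service_type, run=subprocess.run):
    """Stop, then start, a service so it returns to its minimum instances."""
    for action in ("stop", "start"):
        # check=True raises CalledProcessError if the executable fails.
        run(build_command(server, service, service_type, action), check=True)
```

A scheduled task could then call restart_service("gisserver", "Parcels", "MapServer") during a quiet period. The run parameter exists only so that the function can be exercised without the real executable.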

The time between recycling events is called the recycling interval. The default recycling interval is 24 hours, which you can change in the Service Properties dialog. You can also select the time that the configuration will initially be recycled. From that time forward, recycling will occur each time the recycling interval is reached.

Services are recycled one instance at a time to ensure that instances remain available and to spread out the performance hits caused by creating a new instance of each service. Recycling occurs in random order; however, instances of services in use by clients are not recycled until released. In this way, recycling occurs without interrupting the user of a service.
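The one-at-a-time behavior can be illustrated with a short Python sketch. It is a simplified model, not the server's implementation: the function name recycle and its parameters are assumptions, and for the sake of a predictable example it walks the instances in list order rather than in random order.

```python
def recycle(instances, in_use, create):
    """Recycle idle instances one at a time; return those deferred.

    instances: list of instance ids (modified in place)
    in_use:    set of ids currently held by clients
    create:    callable returning a fresh instance id
    """
    deferred = []
    for position, inst in enumerate(list(instances)):
        if inst in in_use:
            deferred.append(inst)   # wait until the client releases it
            continue
        # Destroy and replace a single instance before moving on, so the
        # rest of the pool stays available during recycling.
        instances[position] = create()
    return deferred
```

Instances returned in the deferred list would be recycled later, once their clients release them.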

If not enough instances are available during recycling, a request is queued until an instance becomes available. If the MaximumWaitTime is reached while the request is queued, the request times out and the log files record the same message as for any other wait time-out.

If you change the underlying data of a service, this change will automatically be reflected after recycling. For example, if you have a service of type MapServer running and you change its associated map document, you will be able to see the change after recycling occurs. (To see the changes immediately, you can manually stop and start the service.)

Isolation

When you create a service, you specify the minimum and maximum number of instances you want to make available. These instances run on the container machines within processes. The isolation level determines whether these instances run in separate processes or share processes.

With high isolation, each instance runs in its own process. If something causes the process to fail, it will only affect the single instance running in it.

In contrast, low isolation allows up to four instances of a service configuration to share a single process, thus allowing the execution of four concurrent, independent requests. This is often referred to as multi-threading.

When more than four instances of a particular service are created, the server starts an additional process for the next four instances, and so on. As instances are created and destroyed, they will vacate and fill spaces in those running processes.
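The relationship between instance count and process count can be worked out with a few lines of Python. This is plain arithmetic based on the limit of four instances per process stated above; the function name process_count is introduced only for the example.

```python
import math

INSTANCES_PER_PROCESS = 4  # low-isolation limit stated in the text


def process_count(instances, isolation):
    """Number of container processes needed for a given instance count."""
    if isolation == "high":
        return instances    # one process per instance
    if isolation == "low":
        return math.ceil(instances / INSTANCES_PER_PROCESS)
    raise ValueError("isolation must be 'high' or 'low'")
```

For example, nine instances need nine processes with high isolation but only three with low isolation.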

The advantage of low isolation is that it increases the number of concurrent instances supported by a single process. Using low isolation can significantly reduce memory consumption on your server. However, this improvement comes with some risk. If a process shuts down or crashes, all instances sharing the process are destroyed. Low isolation also reduces the effectiveness of pool shrinking because all instances in a process must go out of use before pool shrinking can remove the process.

Note that non-pooled services always run in their own process; thus, isolation level does not apply.

Creation time, wait time, and usage time

When services are created in the GIS server, either as a result of the server starting or in response to a request for a service by a client, the time it takes to initialize the service is referred to as its creation time. The GIS server maintains a maximum creation time-out that dictates the amount of time a service has to start before the GIS server will assume its startup is hanging and cancel the creation of the service.

When the maximum number of instances of a pooled or non-pooled service is in use, a client requesting a service will be queued until another client releases one of the services. The amount of time it takes between a client requesting a service and getting a service is called the wait time. A service can be configured to have a maximum wait time. If a client's wait time exceeds the maximum wait time for a service, then the request will time out.

Once a client gets a reference to a service, it uses the service for a certain period of time before releasing it. The amount of time between when a client gets a reference to a service and when it releases it is called the usage time. To ensure that clients don't hold references to services for too long (that is, they don't correctly release services), each service can be configured with a maximum usage time. If a client holds on to a service for longer than the maximum usage time, then the service is automatically released and the client will lose its reference to the service.

Maximum usage time also protects services from being used to do larger volumes of work than the administrator intended. For example, a service that is used by an application to perform geodatabase checkouts may have a maximum usage time of 10 minutes. In contrast, a service that is used by applications that only draw maps may have a maximum usage time of one minute.
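The definitions of wait time and usage time can be made concrete with a small Python sketch. It only models the arithmetic; TimeLimits and check_request are hypothetical names, and the timestamps are simple numbers of seconds rather than values read from the server.

```python
from dataclasses import dataclass


@dataclass
class TimeLimits:
    max_wait_seconds: float
    max_usage_seconds: float


def check_request(limits, requested_at, acquired_at, released_at):
    """Classify one request from its three timestamps (in seconds)."""
    wait_time = acquired_at - requested_at
    usage_time = released_at - acquired_at
    if wait_time > limits.max_wait_seconds:
        return "wait time-out"     # client never received an instance
    if usage_time > limits.max_usage_seconds:
        return "usage time-out"    # server reclaims the instance
    return "ok"


# Limits in the spirit of the examples above: a checkout service
# (10 minutes of usage) and a map-drawing service (1 minute of usage).
# The 60-second maximum wait time is an assumed value.
checkout_limits = TimeLimits(max_wait_seconds=60, max_usage_seconds=600)
drawing_limits = TimeLimits(max_wait_seconds=60, max_usage_seconds=60)
```

A request that holds an instance for 90 seconds would be acceptable under checkout_limits but would exceed the maximum usage time under drawing_limits.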

The GIS server maintains statistics both in memory and in its log files about wait time, usage time, and other events that occur within the server. The server administrator can use these statistics to determine if, for example, the wait time for a service is high, which may indicate a need to increase the maximum number of instances for that service.

Limiting the load on the server with the Capacity property

Capacity limits the number of service instances that can run on a container machine (SOC). By default, capacity is disabled. However, by setting appropriate capacity values for all of your container machines, you can prevent your container machines from running more instances than you feel they are capable of handling. For information on determining what load your container machine can handle, see Anticipating and accommodating users.

You may set different capacity values for each SOC machine. For example, if one of your SOC machines is considerably more powerful than the others, you might want to set a high capacity value on that machine and a lower value on the other machines.

Once the number of running service instances on a SOC reaches the capacity, the server will not create any new instances on that machine. If all SOC machines have reached capacity, then pool shrinking takes effect.

The server always runs two background processes (logging and directory manager mechanisms) that always count toward capacity. Usually these are allocated to the first two SOC machines listed in the Server.dat file; however, circumstances may sometimes cause them to run on other SOC machines. When setting your capacity values, you should allow for the possibility of these background processes running on any SOC machine.
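When planning capacity values, it can help to compute how many service instances a machine could still start in the worst case, where both background processes land on it. The Python sketch below is simple arithmetic under that assumption; usable_slots and its parameters are names introduced for the example.

```python
BACKGROUND_PROCESSES = 2  # logging and directory manager, per the text


def usable_slots(capacity, running, background_here=BACKGROUND_PROCESSES):
    """Service instances a SOC machine can still start (never negative)."""
    return max(0, capacity - background_here - running)
```

With a capacity of 20 and 10 instances running, the machine has 8 slots left if both background processes run on it, and 10 if neither does.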

How the server adjusts to demand: Pool shrinking

What happens when a GIS server reaches capacity on all its container machines? ArcGIS Server 9.2 introduces the concept of pool shrinking, which removes service instances of less popular configurations and replaces them with instances of the more popular configurations.

Pool shrinking takes effect when the number of running service instances has reached capacity on each SOC machine. When this has happened and the SOM receives a request for a service, the server creates the requested instance and destroys one instance of the least-recently-used service configuration. The following points explain in more detail the capabilities and limitations of pool shrinking:

Pool shrinking will not destroy an instance of a service that is in use by an application. If all SOCs have reached capacity and all running server object instances are in use, any new requests for instances must wait in a queue.
Pool shrinking respects the minimum number of instances that you set for your service configuration; it will never cause a configuration to go below its minimum number of running instances.
Pool shrinking works the same for both pooled and non-pooled services.
Pool shrinking takes effect after you set capacity values for each SOC machine in your GIS server.
Pool shrinking is less effective with low-isolation service configurations. Low-isolation services can share up to four instances within one SOC process. Since pool shrinking creates and destroys processes, it affects all four instances in the low-isolation process. If even one of these instances is in use, pool shrinking cannot take effect.
Pool shrinking is a helpful mechanism when all service instances are in use; however, a server system that is experiencing this heavy a processing load could probably benefit from additional hardware if resources are available. Careful coding practices and tuning of service properties may also reduce loads on the server.
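The selection rules listed above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the server's actual algorithm: it assumes that all SOC machines are already at capacity, ignores the low-isolation case, and uses a hypothetical dictionary layout and function name.

```python
def shrink_pool(configs, requested):
    """Pick the configuration that gives up an instance, or None.

    configs maps a configuration name to a dict with keys
    'running', 'in_use', 'minimum', and 'last_used' (a timestamp).
    """
    candidates = [
        name for name, c in configs.items()
        if name != requested
        and c["running"] > c["minimum"]    # never go below the minimum
        and c["running"] > c["in_use"]     # needs at least one idle instance
    ]
    if not candidates:
        return None                        # request must wait in the queue
    # The least recently used configuration loses one instance.
    victim = min(candidates, key=lambda name: configs[name]["last_used"])
    configs[victim]["running"] -= 1
    configs[requested]["running"] += 1
    return victim
```

A return value of None corresponds to the case in which every candidate instance is either in use or protected by its configuration's minimum, so the new request waits in the queue.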