Adjust the lag size of the semivariogram model

Geostatistical Analyst

Segment 8 of 18


This is the second of six segments that show you how to improve the ozone prediction surface.

In the semivariogram/covariance cloud in Exercise 2, you explored the overall spatial autocorrelation of the measured points. To do so, you examined the semivariogram, which showed the squared difference between the values of each pair of points, plotted against the distance separating them. The goal of semivariogram/covariance modeling is to determine the best-fitting model that passes through the points in the semivariogram (the yellow line in the diagram).

The semivariogram is a function that relates semivariance (or dissimilarity) of data points to the distance that separates them. Its graphic representation can be used to provide a picture of the spatial correlation of data points with their neighbors.

The Semivariogram/Covariance Modeling dialog box allows you to model the spatial relationship in the dataset. By default, optimal parameters for a spherical semivariogram model are calculated. Geostatistical Analyst first determines a good lag size for grouping semivariogram values. The lag size is the size of a distance class into which pairs of locations are grouped to reduce the large number of possible combinations. This grouping is called binning. As a result of the binning, there are fewer points in this semivariogram than in the semivariogram cloud shown in Exercise 2. A good lag distance can also help reveal spatial correlations. The dialog box displays the semivariogram values as a surface and as a scatterplot related to distance. It then fits a spherical semivariogram model (the best fit for all directions) and reports its associated parameter values, which are typically the nugget, range, and partial sill.
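
The binning described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the algorithm Geostatistical Analyst uses internally: the function name, its arguments, and the choice of equal-width distance classes are assumptions made for this example.

```python
import numpy as np

def empirical_semivariogram(coords, values, lag_size, n_lags):
    """Average half squared differences of point pairs within distance classes.

    Hypothetical helper for illustration; bins are [k*lag_size, (k+1)*lag_size).
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)

    # Visit every pair of points exactly once.
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    semivar = 0.5 * (values[i] - values[j]) ** 2

    # Assign each pair to a distance class (bin) of width lag_size.
    bin_index = np.floor(dist / lag_size).astype(int)

    centers = (np.arange(n_lags) + 0.5) * lag_size
    gamma = np.full(n_lags, np.nan)      # NaN marks an empty bin
    counts = np.zeros(n_lags, dtype=int)
    for b in range(n_lags):
        in_bin = bin_index == b          # pairs beyond the last lag are ignored
        counts[b] = in_bin.sum()
        if counts[b] > 0:
            gamma[b] = semivar[in_bin].mean()
    return centers, gamma, counts
```

Each bin contributes a single averaged point, which is why the binned semivariogram has far fewer points than the cloud of all pairs.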

Try to fit the semivariogram well at small lags (short distances). You can use different bin sizes and refit the default spherical model by changing the lag size and the number of lags.

As a guide, Isaaks and Srivastava* suggest that if the samples are located on a pseudoregular grid, the grid spacing is usually a good lag size. If the sampling is random (as in this case), the average distance between neighboring samples can be used as an initial lag size.

In ArcGIS, the Average Nearest Neighbor tool (located in the Spatial Statistics Tools toolbox under Analyzing Patterns) can be used to obtain the average distance between neighboring points. The average distance between neighboring ozone samples is 18,267.9 meters. For this exercise, a lag size of 20,000 meters will be used. With 10 lags of 20,000 meters each, you can fit semivariograms up to distances of 200 kilometers (125 miles). This will allow you to capture the spatial autocorrelation in ozone concentrations quite well, especially at short distances (which are the most important for interpolation).
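
Outside ArcGIS, the same starting value can be approximated directly from the sample coordinates. The sketch below is a minimal illustration, assuming projected coordinates in meters. The function names are hypothetical, and the code computes only the mean nearest-neighbor distance, not the other statistics the Average Nearest Neighbor tool reports.

```python
import numpy as np

def average_nearest_neighbor_distance(coords):
    """Mean distance from each point to its closest other point."""
    coords = np.asarray(coords, dtype=float)
    # Full pairwise distance matrix; fine for a few hundred sample points.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)   # a point is not its own neighbor
    return dist.min(axis=1).mean()

def max_modeled_distance(lag_size, n_lags):
    """Largest separation distance covered by the binned semivariogram."""
    return lag_size * n_lags

# Values from this exercise: 10 lags of 20,000 meters cover 200,000 meters.
print(max_modeled_distance(20000, 10))
```

Rounding the average neighbor distance to a convenient value, as this exercise does when it uses 20,000 meters, is a common choice for an initial lag size.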

To change the lags, click the Lag text box and type a new Lag size value of 20000. Click in the Number of lags text box and type "10".

Reducing the lag size means that you are effectively zooming in to model the details of the local variation between neighboring sample points. You will notice that with a smaller lag size, the fitted semivariogram (the blue line) rises sharply, then levels off. The range is the distance where it levels off. This flattening out of the semivariogram indicates that there is little autocorrelation beyond the range.
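
The rise-then-level-off shape follows from the standard form of the spherical model, sketched here in Python. The function name and the parameter values in the usage note are illustrative assumptions, not the values fitted to the ozone data.

```python
import numpy as np

def spherical_semivariogram(h, nugget, partial_sill, range_):
    """Standard spherical model: rises until h reaches the range, then stays flat."""
    h = np.asarray(h, dtype=float)
    # Beyond the range the ratio is capped at 1, so the curve stays at the sill.
    ratio = np.minimum(h / range_, 1.0)
    gamma = nugget + partial_sill * (1.5 * ratio - 0.5 * ratio ** 3)
    # By convention the semivariance at zero distance is exactly 0.
    return np.where(h > 0, gamma, 0.0)
```

For example, with a hypothetical nugget of 1, partial sill of 4, and range of 10, the model reaches nugget + partial sill = 5 at a distance of 10 and returns the same value at every larger distance.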

* Isaaks, E. H., and R. M. Srivastava, 1989, An Introduction to Applied Geostatistics. New York: Oxford University Press, 146.