Multi-Distance Spatial Cluster Analysis (Ripley's k-function) (Spatial Statistics)
Release 9.2 Last modified January 9, 2009

The Multi-Distance Spatial Cluster Analysis (Ripley's K-function) tool determines whether the features in a feature class exhibit clustering or dispersion over a range of distances. The tool writes its results to a table and can optionally display them as a pop-up graph.

Learn more about how Multi-Distance Spatial Cluster Analysis works.

Usage tips

The output of the tool is a table with two fields, "ExpectedK" and "ObservedK", containing the expected and observed k values, respectively. If a confidence envelope option is specified, two additional fields, "LowConfEnv" and "HiConfEnv", contain the confidence envelope values for each iteration of the tool.
When the "Display Output Graphically" option is chosen, a graph showing the expected and observed outputs for each iteration is generated and displayed. The expected results will be represented by a blue line while the observed results will be a red line. Deviation of the observed line above the expected line indicates that the dataset is exhibiting clustering at that distance. Deviation of the observed line below the expected line indicates that the dataset is exhibiting dispersion at that distance.
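The same reading of the graph can be applied to the output table directly. The following is a minimal sketch, assuming each table row has been loaded into a dictionary keyed by the field names listed above; the function name `classify_band` and the sample values are invented for illustration, and flagging values that fall outside the envelope is an assumption rather than documented tool behavior.

```python
def classify_band(row):
    """Label one output row (one iteration) using the rule described above."""
    expected, observed = row["ExpectedK"], row["ObservedK"]
    if observed > expected:
        pattern = "clustered"
    elif observed < expected:
        pattern = "dispersed"
    else:
        pattern = "random"
    # The envelope fields exist only when a permutation option was chosen.
    low, high = row.get("LowConfEnv"), row.get("HiConfEnv")
    outside = None
    if low is not None and high is not None:
        # Assumption: an observed value outside the envelope is the case of interest.
        outside = observed < low or observed > high
    return pattern, outside


# Hypothetical rows, invented for illustration only.
rows = [
    {"ExpectedK": 100.0, "ObservedK": 140.0, "LowConfEnv": 90.0, "HiConfEnv": 115.0},
    {"ExpectedK": 200.0, "ObservedK": 190.0, "LowConfEnv": 180.0, "HiConfEnv": 220.0},
    {"ExpectedK": 300.0, "ObservedK": 250.0},
]
labels = [classify_band(r) for r in rows]
```

The second element of each result is `None` when the table has no envelope fields, so callers can tell "inside the envelope" apart from "no envelope computed".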
The Weight Field is most appropriately used when it represents number of incidents or counts.
When no weight field is specified, the confidence envelope is constructed by distributing points randomly in the study area and calculating k for that distribution. Each random distribution of the points is called a "permutation". If "99 permutations" is selected, the tool will randomly distribute the set of points 99 times for each iteration. After distributing the points 99 times, the tool selects the k values that deviated above and below the expected value by the greatest amount, and these values become the confidence interval.
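The permutation procedure for the unweighted case can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: the study area is taken to be a rectangle, no edge correction is applied, the statistic is the L(d) transformation normalized by πN(N-1), and the envelope is simply the minimum and maximum value seen across permutations. The names `l_function` and `confidence_envelope` are hypothetical.

```python
import math
import random


def l_function(points, d, area):
    """L(d) with no edge correction: count ordered pairs (i, j), i != j, within d."""
    n = len(points)
    pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= d:
                pairs += 1
    return math.sqrt(area * pairs / (math.pi * n * (n - 1)))


def confidence_envelope(n_points, bounds, distances, permutations=9, seed=0):
    """Return (low, high) lists holding one envelope value per distance."""
    xmin, ymin, xmax, ymax = bounds
    area = (xmax - xmin) * (ymax - ymin)
    rng = random.Random(seed)
    low = [float("inf")] * len(distances)
    high = [float("-inf")] * len(distances)
    for _ in range(permutations):
        # One permutation: place n_points uniformly at random in the rectangle.
        pts = [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
               for _ in range(n_points)]
        for b, d in enumerate(distances):
            value = l_function(pts, d, area)
            low[b] = min(low[b], value)
            high[b] = max(high[b], value)
    return low, high


low, high = confidence_envelope(20, (0.0, 0.0, 10.0, 10.0), [1.0, 2.0, 3.0],
                                permutations=9, seed=1)
```

Fixing the seed makes the envelope reproducible between runs; more permutations generally widen the envelope because more extreme arrangements get sampled.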
When a weight field is specified, only the weight values are randomly redistributed to compute confidence envelopes; the point locations remain fixed. In essence, when a weight field is specified, locations remain fixed and we evaluate the clustering of feature values in space. On the other hand, when no weight field is specified we are analyzing clustering/dispersion of feature locations.
When no study area is specified, the tool uses a minimum enclosing rectangle as the study area polygon.
The k-function statistic is very sensitive to the size of the study area. Identical arrangements of points can exhibit clustering or dispersion depending on the size of the study area. Therefore, it is imperative that the study area boundaries are carefully considered. If no study area feature class is provided, the minimum bounding rectangle of the input features is used. The picture below is a classic example of how identical feature distributions can be dispersed or clustered depending on the area specified.
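As a rough numeric illustration of this sensitivity, the sketch below evaluates an L(d)-style statistic for the same four points under two different study areas. It assumes no edge correction and a pair count normalized by πN(N-1); the function name and the point coordinates are invented for illustration.

```python
import math


def l_function(points, d, area):
    """L(d) with no edge correction: count ordered pairs (i, j), i != j, within d."""
    n = len(points)
    pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= d:
                pairs += 1
    return math.sqrt(area * pairs / (math.pi * n * (n - 1)))


# Four points on the corners of a unit square (hypothetical data).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
d = 1.0

# Study area = minimum enclosing rectangle of the points (area 1).
l_small_area = l_function(points, d, area=1.0)

# Study area = a much larger 10 x 10 region around the same points (area 100).
l_large_area = l_function(points, d, area=100.0)

# Observed below the expected value d reads as dispersion; above reads as clustering.
dispersed_in_small_area = l_small_area < d
clustered_in_large_area = l_large_area > d
```

The point arrangement is identical in both calls; only the area term changes, which is enough to move the result from one side of the expected value to the other.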
A study area feature class should only be given if "User-provided Study Area Feature Class" is chosen for the Study Area Method parameter.
If a study area feature class is specified, it should have exactly one single-part feature.
If neither Beginning Distance nor Distance Increment is specified, default values are calculated for you based on the extent of the input feature class.
Points in the input feature class that fall outside the user-specified study area are only considered when the "None" edge correction option is selected. Other edge correction techniques compensate for edge issues with simulated points, by reducing the study area, or by weighting edge neighbors higher than non-edge neighbors.
The Simulate Outer Boundary Values edge correction method mirrors points across the study area boundary to correct for underestimates near edges. Points that are within a distance equal to the maximum distance band of the edge of the study area are mirrored. The mirrored points are used so that edge points will have more accurate neighbor estimates. The diagram below illustrates what points will be used in the calculation and which will be used only for edge correction.
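The mirroring idea can be sketched for a rectangular study area as follows. This is a simplified illustration, not the tool's algorithm: it reflects each point only across the edges it lies near, does not add the diagonal reflections that a corner point might also need, and skips points lying exactly on an edge. The function name `mirror_edge_points` is hypothetical.

```python
def mirror_edge_points(points, bounds, max_distance):
    """Reflect points lying within max_distance of a rectangle edge across that edge."""
    xmin, ymin, xmax, ymax = bounds
    mirrored = []
    for x, y in points:
        if 0 < x - xmin <= max_distance:   # near the left edge
            mirrored.append((2 * xmin - x, y))
        if 0 < xmax - x <= max_distance:   # near the right edge
            mirrored.append((2 * xmax - x, y))
        if 0 < y - ymin <= max_distance:   # near the bottom edge
            mirrored.append((x, 2 * ymin - y))
        if 0 < ymax - y <= max_distance:   # near the top edge
            mirrored.append((x, 2 * ymax - y))
    return mirrored


# Hypothetical data: a 10 x 10 study area and a maximum distance band of 2.
simulated = mirror_edge_points([(1, 5), (5, 5), (9, 9)], (0, 0, 10, 10), 2)
```

The returned points would be used only as extra neighbors for the real points near the boundary; they are not themselves analyzed.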
The Reduce Analysis Area edge correction technique shrinks the size of the analysis area by a distance equal to the largest distance band to be used in the analysis. After shrinking the study area, points found outside of the new study area will be considered only when neighbor counts are being assessed for points still inside the study area. They will not be used in any other way during the k-function calculation. The diagram below illustrates what points will be used in the calculation and which will be used only for edge correction.
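For a rectangular study area, the split between points that are analyzed and points that only serve as neighbors can be sketched as below. The rectangular bounds, the function name `reduce_analysis_area`, and the sample coordinates are assumptions made for illustration.

```python
def reduce_analysis_area(points, bounds, max_distance):
    """Shrink a rectangle by max_distance and split points into two groups."""
    xmin, ymin, xmax, ymax = bounds
    inner = (xmin + max_distance, ymin + max_distance,
             xmax - max_distance, ymax - max_distance)
    if inner[0] >= inner[2] or inner[1] >= inner[3]:
        raise ValueError("max_distance is too large for this study area")
    analyzed, neighbors_only = [], []
    for x, y in points:
        if inner[0] <= x <= inner[2] and inner[1] <= y <= inner[3]:
            analyzed.append((x, y))        # still inside the reduced area
        else:
            neighbors_only.append((x, y))  # counted only as a neighbor
    return inner, analyzed, neighbors_only


# Hypothetical data: a 10 x 10 study area and a largest distance band of 2.
inner, analyzed, neighbors_only = reduce_analysis_area(
    [(1, 1), (5, 5), (8, 3), (9, 9)], (0, 0, 10, 10), 2)
```

Because the reduced area shrinks on every side by the largest distance band, a large band relative to the study area can leave few points to analyze, which is why the sketch raises an error when the rectangle collapses.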
Ripley's Edge Correction Formula checks each point's distance from the edge of the study area and its distance to each of its neighbors. All neighbors that are further away from the point in question than the edge of the study area are given extra weight. This edge correction method is only appropriate for square or rectangular shaped study areas.
Mathematically, the Multi-Distance Spatial Cluster Analysis tool uses a common transformation of Ripley's k-function where the expected result with a random set of points is equal to the input distance. The transformation L(d) is shown below.

where A is area, N is the number of points, d is the distance and k(i, j) is the weight, which (if there is no edge correction) is 1 when the distance between i and j is less than or equal to d and 0 when the distance between i and j is greater than d. When edge correction is applied, the weight of k(i,j) is modified slightly.
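Based on the symbol definitions above, the transformation can be written as follows. This is a reconstruction from the definitions in the text, assuming the double sum runs over all ordered pairs of distinct points and is normalized by πN(N-1).

```latex
L(d) = \sqrt{\frac{A \sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} k(i,j)}{\pi N (N - 1)}}
```

Under this normalization, a random set of points yields roughly N(N-1)πd²/A ordered pairs within distance d, so L(d) is approximately d, which agrees with the statement that the expected result equals the input distance.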
The units of the "Beginning Distance" and "Distance Increment" are the units of the input Feature Class' coordinate system.
For line and polygon features, feature centroids are used in computations.
The "Display Output Graphically" parameter works only on the Windows operating system. When set to true, it displays the results of the tool graphically.
The environment settings do not have an effect on this tool.

Command line syntax
An overview of the Command Line window
MultiDistanceSpatialClustering_stats <Input_Feature_Class> <Output_Table> <Number_of_Distance_Bands> {0 Permutations - no confidence envelope | 9 Permutations | 99 Permutations | 999 Permutations} {Display_Results_Graphically} {Weight_Field} {Beginning_Distance} {Distance_Increment} {None | Simulate Outer Boundary Values | Reduce Analysis Area | Ripley's Edge Correction Formula} {Minimum Enclosing Rectangle | User provided Study Area Feature Class} {Study_Area_Feature_Class}

Parameter	Explanation	Data Type
<Input_Feature_Class>	The feature class upon which the analysis will be performed.	Feature Class
<Output_Table>	The table to which the results of the analysis will be written.	Table
<Number_of_Distance_Bands>	The number of times to increment the neighborhood size and analyze the dataset for clustering. The starting point and size of the increment are specified in the Beginning Distance and Distance Increment parameters respectively.	Long
{0 Permutations - no confidence envelope | 9 Permutations | 99 Permutations | 999 Permutations}	The confidence envelope is calculated by randomly placing points in the study area. The number of points randomly placed is equal to the number of points in the feature class. Each set of random placements is called a "permutation" and the confidence envelope is created from these permutations. This parameter allows you to select how many permutations you want to use to create the confidence envelope. 0 Permutations: no confidence envelope — Confidence envelopes are not created. 9 Permutations — the tool randomly places nine sets of points. 99 Permutations — the tool randomly places 99 sets of points. 999 Permutations — the tool randomly places 999 sets of points.	String
{Display_Results_Graphically}	Specifies whether the tool will display the results of the Multi-Distance Spatial Cluster Analysis tool graphically. True — The output will be displayed graphically. False — The output will not be displayed graphically.	Boolean
{Weight_Field}	A numeric field with weights that give certain features more influence than others.	Field
{Beginning_Distance}	The distance at which to start the cluster analysis and the distance from which to increment. The value entered for this parameter should be in the units of the Input Feature Class' coordinate system.	Double
{Distance_Increment}	The distance by which to increment during each iteration. The distance used in the analysis starts at the Beginning Distance and increments by the amount specified in the Distance Increment. The value entered for this parameter should be in the units of the Input Feature Class' coordinate system.	Long
{None | Simulate Outer Boundary Values | Reduce Analysis Area | Ripley's Edge Correction Formula}	Method to use to correct for underestimates in the number of neighbors for features near the edges of the study area. None — No points are placed outside the study area to reduce underestimation. However, if the input feature class already has points that fall outside of the study area, these will be used in neighborhood counts (but not for the k-function calculation). Simulate Outer Boundary Values — This method simulates points outside the study area so that the number of neighbors near the edges is not underestimated. The simulated points are the "mirrors" of points within the study area across the study area boundary. Reduce Analysis Area — This method shrinks the study area such that some points are found outside of the study area. Points found outside the study area are used to calculate neighbor counts but not used in the cluster analysis itself. Ripley's Edge Correction Formula — For all the points (j) in the neighborhood of point i, this method checks to see if the edge of the study area is closer to i or if j is closer to i. If j is closer, extra weight is given to the point j. This edge correction method is only appropriate for square or rectangular shaped study areas.	String
{Minimum Enclosing Rectangle | User provided Study Area Feature Class}	Specifies the region to use for the study area. Selection of this area is critical as area is part of the equation used by the tool. Minimum Enclosing Rectangle — Indicates that the smallest possible rectangle enclosing all of the points will be used. User provided Study Area Feature Class — Indicates that a feature class defining the study area will be provided in the Study Area Feature Class parameter.	String
{Study_Area_Feature_Class}	Feature class that delineates the area over which the input feature class should be analyzed. Specify this only if User provided Study Area Feature Class is chosen for the Study Area Method parameter.	Feature Class

Data types for geoprocessing tool parameters

Command line example

MultiDistanceSpatialClusterAnalysis

Scripting syntax
About getting started with writing geoprocessing scripts
MultiDistanceSpatialClustering_stats (Input_Feature_Class, Output_Table, Number_of_Distance_Bands, Compute_Confidence_Envelope, Display_Results_Graphically, Weight_Field, Beginning_Distance, Distance_Increment, Boundary_Correction_Method, Study_Area_Method, Study_Area_Feature_Class)

Parameter	Explanation	Data Type
Input_Feature_Class (Required)	The feature class upon which the analysis will be performed.	Feature Class
Output_Table (Required)	The table to which the results of the analysis will be written.	Table
Number_of_Distance_Bands (Required)	The number of times to increment the neighborhood size and analyze the dataset for clustering. The starting point and size of the increment are specified in the Beginning Distance and Distance Increment parameters respectively.	Long
Compute_Confidence_Envelope (Optional)	The confidence envelope is calculated by randomly placing points in the study area. The number of points randomly placed is equal to the number of points in the feature class. Each set of random placements is called a "permutation" and the confidence envelope is created from these permutations. This parameter allows you to select how many permutations you want to use to create the confidence envelope. 0 Permutations: no confidence envelope — Confidence envelopes are not created. 9 Permutations — the tool randomly places nine sets of points. 99 Permutations — the tool randomly places 99 sets of points. 999 Permutations — the tool randomly places 999 sets of points.	String
Display_Results_Graphically (Optional)	Specifies whether the tool will display the results of the Multi-Distance Spatial Cluster Analysis tool graphically. True — The output will be displayed graphically. False — The output will not be displayed graphically.	Boolean
Weight_Field (Optional)	A numeric field with weights that give certain features more influence than others.	Field
Beginning_Distance (Optional)	The distance at which to start the cluster analysis and the distance from which to increment. The value entered for this parameter should be in the units of the Input Feature Class' coordinate system.	Double
Distance_Increment (Optional)	The distance by which to increment during each iteration. The distance used in the analysis starts at the Beginning Distance and increments by the amount specified in the Distance Increment. The value entered for this parameter should be in the units of the Input Feature Class' coordinate system.	Long
Boundary_Correction_Method (Optional)	Method to use to correct for underestimates in the number of neighbors for features near the edges of the study area. None — No points are placed outside the study area to reduce underestimation. However, if the input feature class already has points that fall outside of the study area, these will be used in neighborhood counts (but not for the k-function calculation). Simulate Outer Boundary Values — This method simulates points outside the study area so that the number of neighbors near the edges is not underestimated. The simulated points are the "mirrors" of points within the study area across the study area boundary. Reduce Analysis Area — This method shrinks the study area such that some points are found outside of the study area. Points found outside the study area are used to calculate neighbor counts but not used in the cluster analysis itself. Ripley's Edge Correction Formula — For all the points (j) in the neighborhood of point i, this method checks to see if the edge of the study area is closer to i or if j is closer to i. If j is closer, extra weight is given to the point j. This edge correction method is only appropriate for square or rectangular shaped study areas.	String
Study_Area_Method (Optional)	Specifies the region to use for the study area. Selection of this area is critical as area is part of the equation used by the tool. Minimum Enclosing Rectangle — Indicates that the smallest possible rectangle enclosing all of the points will be used. User provided Study Area Feature Class — Indicates that a feature class defining the study area will be provided in the Study Area Feature Class parameter.	String
Study_Area_Feature_Class (Optional)	Feature class that delineates the area over which the input feature class should be analyzed. Specify this only if User provided Study Area Feature Class is chosen for the Study Area Method parameter.	Feature Class

Data types for geoprocessing tool parameters

Script example
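The following is a minimal sketch of how the tool might be called from a Python script, not an official example. It assumes the `arcgisscripting` module and its `create()` function are available, that option values are passed as the strings shown in the syntax above, and that "#" can be passed to accept a parameter's default. The wrapper name `run_k_function` and all file paths are hypothetical.

```python
def run_k_function(gp, in_features, out_table, bands=10,
                   permutations="99 Permutations",
                   edge_correction="None",
                   study_area_method="Minimum Enclosing Rectangle"):
    """Call the tool on a geoprocessor object, leaving other options at defaults."""
    return gp.MultiDistanceSpatialClustering_stats(
        in_features,        # Input_Feature_Class
        out_table,          # Output_Table
        bands,              # Number_of_Distance_Bands
        permutations,       # Compute_Confidence_Envelope
        "#",                # Display_Results_Graphically (assumed default)
        "#",                # Weight_Field (none)
        "#",                # Beginning_Distance (calculated default)
        "#",                # Distance_Increment (calculated default)
        edge_correction,    # Boundary_Correction_Method
        study_area_method,  # Study_Area_Method
        "#",                # Study_Area_Feature_Class (not used here)
    )


if __name__ == "__main__":
    try:
        import arcgisscripting  # only present on a machine with ArcGIS installed
    except ImportError:
        arcgisscripting = None
    if arcgisscripting is not None:
        gp = arcgisscripting.create()
        # Hypothetical paths, shown for illustration only.
        run_k_function(gp, "C:/data/incidents.shp", "C:/data/k_results.dbf")
```

Passing the geoprocessor object into the wrapper keeps the argument order in one place and lets the call be exercised without ArcGIS by substituting a stand-in object.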
