
Hot Spot Analysis (Getis-Ord Gi*) (Spatial Statistics)
Release 9.3 Last modified January 28, 2009

Calculates the Getis-Ord Gi* statistic for hot spot analysis.

Learn more about how Hot Spot Analysis: Getis-Ord Gi* works

Illustration

Hot Spot Analysis Illustration

Usage tips

This tool honors the output coordinate system environment setting. Feature geometry is projected to the output coordinate system prior to analysis, so values entered for the Distance Band/Threshold Distance parameter should be expressed in the units of the output coordinate system. All mathematical computations are based on the output coordinate system spatial reference.
Calculations based on either Euclidean or Manhattan distance require projected data to accurately measure distances.
If you will be running several analyses on a single dataset (e.g., analyzing several different fields) or if you have a dataset with more than 3000 features, it is recommended that you construct the spatial weights matrix file prior to analysis.
Given a set of weighted features, the Getis-Ord Gi* statistic identifies spatial clusters of high values (hot spots) and spatial clusters of low values (cold spots).
As derived output, this tool returns the names of the Z score and p-value fields.
The output from the Hot Spot Analysis tool is a Z score and p-value for each feature. These values represent the statistical significance of the spatial clustering of values, given the conceptualization of spatial relationships and the scale of analysis (distance parameter).
A high Z score and small p-value (probability) for a feature indicate a spatial clustering of high values. A low negative Z score and small p-value indicate a spatial clustering of low values. The higher (or lower) the Z score, the more intense the clustering. A Z score near zero indicates no apparent spatial clustering.
The Z score is based on the Randomization Null Hypothesis computation. For more information on Z scores and p-values, see What is a Z score? What is a p-value?
The input field should contain a variety of non-negative values. The math for this statistic requires some variation in the variable being analyzed; it cannot be solved if all input values are 1, for example. If you have incident data and want to analyze incident intensity, consider aggregating your incident data or using the Integrate tool with the Collect Events tool prior to analysis.
When using shapefiles, keep in mind that they cannot store null values. Tools or other procedures that create shapefiles from non-shapefile inputs may store or interpret null values as zero, which can lead to unexpected results.
The Conceptualization of Spatial Relationships used for analysis should be based on your understanding of spatial interaction among the features being analyzed. For this tool the fixed distance or contiguity spatial conceptualization methods are generally more appropriate than the inverse distance conceptualization methods.
For the Fixed Distance option, the distance band used for analysis should be based on your understanding of spatial interaction among the features being analyzed. Alternatively, features may be evaluated for a range of distance values or at the specific distance where spatial autocorrelation is maximized.
Use a Conceptualization of Spatial Relationships and/or Distance Band value that ensures every feature has at least one neighbor. Especially if the input data is skewed (does not form a bell curve when you plot the values as a histogram), make sure that the number of neighbors is neither too small (most features have only one or two neighbors) nor too large (several features include all other features as neighbors), because either extreme makes the resultant Z scores less reliable. The Z scores are reliable (even with skewed data) as long as each feature is associated with several neighbors (approximately 8, as a rule of thumb). This tool can be applied to skewed data because the statistic is asymptotically normal.
For Inverse Distance conceptualization options: when zero is entered for the "Distance Band or Threshold Distance" parameter all features are considered neighbors of all other features; when this parameter is left blank, a default threshold distance will be applied.
When the spatial conceptualization is an Inverse Distance method (Inverse Distance, Inverse Distance Squared, or Zone of Indifference), any two points that are coincident are given a weight of one to avoid division by zero. This ensures that features are not excluded from analysis.
With inverse distance conceptualizations, weights for distances less than 1 become unstable. Features separated by less than 1 unit of distance (common with data in a Geographic Coordinate System) are given a weight of 1.
Analysis of features in a Geographic Coordinate System (unprojected data) is not recommended with the inverse distance spatial conceptualization methods.
This tool computes the Gi* statistic, in which each feature is its own neighbor; however, if you specify a Self Potential field in which all values are zero, the tool computes the Gi statistic (local calculations for a feature exclude the feature's own value).
In ArcGIS version 9.2, the "Global" standardization option was removed. Global standardization returns the same results as no standardization. Models built with previous versions of ArcGIS that use the Global standardization option may need to be rebuilt.
See the Modeling Spatial Relationships help page for further explanation of this tool's parameters.
Current map layers may be used to define the input feature class. When using layers, only the currently selected features are included in the analysis.
Learn more about working with layers and table views
When this tool runs in ArcMap, the output feature class is automatically added to the Table of Contents (TOC) with default rendering applied to the Z Score field. The hot to cold rendering applied is defined by a layer file in <ArcGIS>/ArcToolbox/Templates/Layers. You can reapply the default rendering, if needed, by importing the template layer symbology.
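
The usage tips above describe how the Z score and p-value are read for each feature. The following is a minimal sketch of that calculation, assuming the commonly published form of the Gi* statistic, a Fixed Distance Band with binary weights, Euclidean distance, and a two-tailed p-value taken from the standard normal distribution. The function name gi_star and its arguments are hypothetical and are not part of the geoprocessing tool; the tool's internal implementation may differ.

```python
import math


def gi_star(values, coords, distance_band):
    """Return a list of (z_score, p_value) tuples, one per feature.

    values        -- numeric value for each feature
    coords        -- planar (x, y) tuple for each feature
    distance_band -- features within this distance are neighbors (weight 1)
    """
    n = len(values)
    mean = sum(values) / float(n)
    # Global variance of the analysis field (population form)
    variance = sum(v * v for v in values) / float(n) - mean * mean
    if variance <= 1e-12:
        raise ValueError("Input values have no variation; Gi* is undefined.")
    s = math.sqrt(variance)

    results = []
    for i in range(n):
        xi, yi = coords[i]
        sum_w = 0.0    # sum of weights for feature i
        sum_w2 = 0.0   # sum of squared weights for feature i
        sum_wx = 0.0   # weighted sum of neighboring values
        for j in range(n):
            xj, yj = coords[j]
            d = math.hypot(xi - xj, yi - yj)
            # Binary weight; d == 0 when j == i, so each feature
            # counts as its own neighbor (the "star" in Gi*)
            w = 1.0 if d <= distance_band else 0.0
            sum_w += w
            sum_w2 += w * w
            sum_wx += w * values[j]

        denom = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1.0))
        if denom == 0.0:
            # Every feature is a neighbor of feature i; the score is undefined
            results.append((float("nan"), float("nan")))
            continue

        z = (sum_wx - mean * sum_w) / denom
        # Two-tailed p-value under the standard normal distribution
        p = math.erfc(abs(z) / math.sqrt(2.0))
        results.append((z, p))
    return results
```

In this sketch, a large positive Z score with a small p-value marks a feature surrounded by high values (a hot spot), and a large negative Z score with a small p-value marks a cold spot. The ValueError mirrors the tip that the statistic cannot be solved when all input values are identical.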

Command line syntax
An overview of the Command Line window
HotSpots_stats <Input_Feature_Class> <Input_Field> <Output_Feature_Class> <Fixed Distance Band | Inverse Distance | Inverse Distance Squared | Zone of Indifference | Polygon Contiguity (First Order) | Get Spatial Weights From File> <Euclidean Distance | Manhattan Distance> <None | Row | Global> <Distance_Band_or_Threshold_Distance> {Self_Potential_Field} {Weights_Matrix_File}

Parameter	Explanation	Data Type
<Input_Feature_Class>	The feature class for which hot spot analysis will be performed.	Feature Layer
<Input_Field>	The numeric count field (number of victims, crimes, jobs, and so on) to be evaluated.	Field
<Output_Feature_Class>	The output feature class that receives the Z score and p-value result fields.	Feature Class
<Fixed Distance Band | Inverse Distance | Inverse Distance Squared | Zone of Indifference | Polygon Contiguity (First Order) | Get Spatial Weights From File>	Specifies how spatial relationships between features are conceptualized. Inverse Distance: The impact of one feature on another feature decreases with distance. Inverse Distance Squared: Same as Inverse Distance, but the impact decreases more sharply over distance. Fixed Distance Band: Everything within a specified critical distance is included in the analysis; everything outside the critical distance is excluded. Zone of Indifference: A combination of Inverse Distance and Fixed Distance Band. Anything up to a critical distance has an impact on your analysis. Once that critical distance is exceeded, the level of impact quickly drops off. Polygon Contiguity (First Order): The neighbors of each feature are only those with which the feature shares a boundary. All other features have no influence on computations. Requires an ArcInfo license. Get Spatial Weights From File: Spatial relationships are defined in a spatial weights file. The pathname to the spatial weights file is specified in the Weights Matrix File parameter.	String
<Euclidean Distance | Manhattan Distance>	Specifies how distances are calculated when measuring concentrations. Euclidean (as the crow flies): The straight-line distance between two points. Manhattan (city block): The distance between two points measured along axes at right angles. Calculated by summing the (absolute) differences between point coordinates.	String
<None | Row | Global>	The standardization of spatial weights provides more accurate results. None: No standardization of spatial weights is applied. This is the default. Row: Spatial weights are standardized by row. Each weight is divided by its row sum.	String
<Distance_Band_or_Threshold_Distance>	Specifies a cutoff distance for Inverse Distance and Fixed Distance options. Features outside the specified cutoff for a target feature are ignored in analyses for that feature. However, for Zone of Indifference, the influence of features outside the given distance is reduced with distance while those inside the distance threshold are equally considered. The value entered should be in the units of the Output Coordinate System. For the Inverse Distance conceptualizations of spatial relationships: A value of zero for this parameter indicates that no threshold distance is applied; when this parameter is left blank, a default threshold value will be computed and applied. This parameter has no effect when "Polygon Contiguity" or "Get Spatial Weights From File" spatial conceptualizations are selected.	Double
{Self_Potential_Field}	The field representing self-potential: The distance or weight between a feature and itself.	Field
{Weights_Matrix_File}	The pathname to a file containing spatial weights that define spatial relationships between features.	File
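
The distance method, inverse distance, and row standardization options in the table above can be illustrated with a few small helper functions. This is a sketch only: the function names are hypothetical, coordinates are assumed to be planar (x, y) tuples, and the rule that distances below 1 receive a weight of 1 follows the usage tips rather than any inspection of the tool's code.

```python
import math


def euclidean_distance(p, q):
    """Straight-line ("as the crow flies") distance between two points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def manhattan_distance(p, q):
    """City-block distance: sum of absolute coordinate differences."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def inverse_distance_weight(d, power=1):
    """Weight that decreases with distance.

    power=1 corresponds to Inverse Distance and power=2 to Inverse
    Distance Squared. Distances below 1 (including coincident points,
    d == 0) are given a weight of 1 to avoid unstable weights and
    division by zero.
    """
    if d < 1.0:
        return 1.0
    return 1.0 / d ** power


def row_standardize(weights_row):
    """Divide each weight in a row by the row sum."""
    total = float(sum(weights_row))
    if total == 0.0:
        # A feature with no neighbors: leave the row unchanged
        return list(weights_row)
    return [w / total for w in weights_row]
```

For example, for the points (0, 0) and (3, 4) the Euclidean distance is 5 and the Manhattan distance is 7, so the same pair of features receives a smaller inverse distance weight under the Manhattan option.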

Data types for geoprocessing tool parameters

Command line example

workspace e:\project93\data
HotSpots_stats tracts.shp AGE_65_UP tract65.shp 'Inverse Distance' 'Euclidean Distance' None # # #

Scripting syntax
About getting started with writing geoprocessing scripts
HotSpots_stats (Input_Feature_Class, Input_Field, Output_Feature_Class, Conceptualization_of_Spatial_Relationships, Distance_Method, Standardization, Distance_Band_or_Threshold_Distance, Self_Potential_Field, Weights_Matrix_File)

Parameter	Explanation	Data Type
Input_Feature_Class (Required)	The feature class for which hot spot analysis will be performed.	Feature Layer
Input_Field (Required)	The numeric count field (number of victims, crimes, jobs, and so on) to be evaluated.	Field
Output_Feature_Class (Required)	The output feature class that receives the Z score and p-value result fields.	Feature Class
Conceptualization_of_Spatial_Relationships (Required)	Specifies how spatial relationships between features are conceptualized. Inverse Distance: The impact of one feature on another feature decreases with distance. Inverse Distance Squared: Same as Inverse Distance, but the impact decreases more sharply over distance. Fixed Distance Band: Everything within a specified critical distance is included in the analysis; everything outside the critical distance is excluded. Zone of Indifference: A combination of Inverse Distance and Fixed Distance Band. Anything up to a critical distance has an impact on your analysis. Once that critical distance is exceeded, the level of impact quickly drops off. Polygon Contiguity (First Order): The neighbors of each feature are only those with which the feature shares a boundary. All other features have no influence on computations. Requires an ArcInfo license. Get Spatial Weights From File: Spatial relationships are defined in a spatial weights file. The pathname to the spatial weights file is specified in the Weights Matrix File parameter.	String
Distance_Method (Required)	Specifies how distances are calculated when measuring concentrations. Euclidean (as the crow flies): The straight-line distance between two points. Manhattan (city block): The distance between two points measured along axes at right angles. Calculated by summing the (absolute) differences between point coordinates.	String
Standardization (Required)	The standardization of spatial weights provides more accurate results. None: No standardization of spatial weights is applied. This is the default. Row: Spatial weights are standardized by row. Each weight is divided by its row sum.	String
Distance_Band_or_Threshold_Distance (Required)	Specifies a cutoff distance for Inverse Distance and Fixed Distance options. Features outside the specified cutoff for a target feature are ignored in analyses for that feature. However, for Zone of Indifference, the influence of features outside the given distance is reduced with distance while those inside the distance threshold are equally considered. The value entered should be in the units of the Output Coordinate System. For the Inverse Distance conceptualizations of spatial relationships: A value of zero for this parameter indicates that no threshold distance is applied; when this parameter is left blank, a default threshold value will be computed and applied. This parameter has no effect when "Polygon Contiguity" or "Get Spatial Weights From File" spatial conceptualizations are selected.	Double
Self_Potential_Field (Optional)	The field representing self-potential: The distance or weight between a feature and itself.	Field
Weights_Matrix_File (Optional)	The pathname to a file containing spatial weights that define spatial relationships between features.	File
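
Before settling on a value for Distance_Band_or_Threshold_Distance, it can help to check how many neighbors each feature would have at a candidate distance. The sketch below is one way to do that outside the tool, assuming planar (x, y) coordinates and Euclidean distance. The function names neighbor_counts and summarize_neighbors are hypothetical and are not geoprocessing functions.

```python
import math


def neighbor_counts(coords, distance_band):
    """Count, for each feature, the other features within distance_band."""
    counts = []
    for i, (xi, yi) in enumerate(coords):
        count = 0
        for j, (xj, yj) in enumerate(coords):
            if i != j and math.hypot(xi - xj, yi - yj) <= distance_band:
                count += 1
        counts.append(count)
    return counts


def summarize_neighbors(counts):
    """Summarize neighbor counts to help judge a candidate distance band."""
    n = len(counts)
    return {
        "min": min(counts),
        "max": max(counts),
        "mean": sum(counts) / float(n),
        # Features with no neighbors at all
        "isolated": sum(1 for c in counts if c == 0),
        # Features that include every other feature as a neighbor
        "all_others": sum(1 for c in counts if c == n - 1),
    }
```

A candidate distance that leaves any feature isolated, or that makes several features neighbors of all other features, is a sign that the distance band should be adjusted. A mean of roughly 8 neighbors matches the rule of thumb given in the usage tips.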

Data types for geoprocessing tool parameters

Script example

# Analyze the spatial distribution of 911 calls in a metropolitan area
# using the Hot-Spot Analysis Tool (Local Gi*)
# Import system modules
import arcgisscripting

# Create the Geoprocessor object
gp = arcgisscripting.create(9.3)
gp.OverwriteOutput = 1

# Local variables...
workspace = r"C:\Data\911Call"

try:
    # Set the current workspace (to avoid having to specify the full path
    # to the feature classes each time)
    gp.workspace = workspace

    # Copy the input feature class and integrate the points to snap
    # together at 500 feet
    # Process: Copy Features and Integrate
    cf = gp.CopyFeatures("911Calls.shp", "911Copied.shp",
                         "#", 0, 0, 0)

    integrate = gp.Integrate("911Copied.shp #", "500 Feet")

    # Use Collect Events to count the number of calls at each location
    # Process: Collect Events
    ce = gp.CollectEvents("911Copied.shp", "911Count.shp", "Count", "#")

    # Add a unique ID field to the count feature class
    # Process: Add Field and Calculate Field
    af = gp.AddField("911Count.shp", "MyID", "LONG", "#", "#", "#", "#",
                     "NON_NULLABLE", "NON_REQUIRED", "#",
                     "911Count.shp")

    cf = gp.CalculateField("911Count.shp", "MyID", "[FID]", "VB")

    # Create Spatial Weights Matrix for Calculations
    # Process: Generate Spatial Weights Matrix... 
    swm = gp.GenerateSpatialWeightsMatrix("911Count.shp", "MyID",
                        "euclidean6Neighs.swm",
                        "K_NEAREST_NEIGHBORS",
                        "#", "#", "#", 6,
                        "NO_STANDARDIZATION") 

    # Hot Spot Analysis of 911 Calls
    # Process: Hot Spot Analysis (Getis-Ord Gi*)
    hs = gp.HotSpots("911Count.shp", "ICOUNT", "911HotSpots.shp", 
                     "Get Spatial Weights From File",
                     "Euclidean Distance", "None",
                     "#", "#", "euclidean6Neighs.swm")

except:
    # If an error occurred when running the tool, print out the error message.
    print gp.GetMessages()
