Hot Spot Analysis
Crime Hotspot Analysis using CrimeStat

Grant Buhay


The concept of Geographical Information Systems and profiling is not new to the area of Community Policing. Mechanical methods of point pattern analysis and data separation have been benefiting society ever since Dr. Snow's landmark discovery of the tainted well (Waters, 1995). The analysis of spatial and temporal data has been critical to the success of many criminal investigations. Buzzwords such as "geographic, criminal and psychological profiling" have rejuvenated an interest in analytical geography, and advances in computer technology have given geographers a medium in which to express the capabilities of their science (Waters, 1999; Rossmo, K., http://www.ecricanada.com/AboutECRI.htm). "Criminal Geographic Targeting is an example of the practical application of academic research," concludes Rossmo, "and is finally putting geographers on the map" (Rossmo, 1997).

The purpose of this paper is to explore CrimeStat, a spatial statistics program for the analysis of crime incident locations, which has been developed by Ned Levine, Ph.D., and Associates under a grant from the National Institute of Justice. In particular, the "hotspot analysis" module, which includes K-means clustering, nearest neighbor hierarchical spatial clustering and Local Moran statistics, will be examined for functionality, ease of navigation and interpretability of results. The K-means analysis will be employed to determine whether the distribution of data points displays a clustered pattern or one of complete spatial randomness. One would expect the "hotspot" aspect of the module to highlight areas of high concentration that may not be apparent on maps that simply plot crime locations. It is on this area of "hotspot" identification and analysis that this paper will focus.

Literature Review:

A hot spot has been defined as a condition indicating some form of clustering in a spatial distribution. However, not all clusters are hot spots, because the environments that help generate crime, the places where people live, also tend to be in clusters. So any definition of hot spots has to be qualified. Sherman (1995) defined hot spots "as small places in which the occurrence of crime is so frequent that it is highly predictable, at least over a 1-year period." According to Sherman, crime is approximately six times more concentrated among locations than it is among individuals. Therefore the locational aspect is extremely important.

However, there seems to be confusion surrounding the hot spot issue, especially when it comes to defining the difference between spaces and places. Block and Block (1995) pointed out that a place could be a point, such as an apartment building, or an area, such as a census tract. However, buildings are generally considered places, and census tracts spaces. Concentrations of criminal activity may be easily identified on a relatively simple point map of crime locations; however, this becomes problematic when multiple crimes that occur at a single address are displayed by a single point on a pin map (Sadler, 1998). So there seems to be some academic debate as to an explicit definition of a hot spot, except in programs such as CrimeStat whose procedures self-define hot spots. Hot spots are specific to their local conditions. In Baltimore County, Maryland, for example, hot spots are identified according to three criteria: frequency, geography, and time. At least two crimes of the same type must be present. The area is small, and the timeframe is a 1- to 2-week period. Hot spots are generally monitored by crime analysts until they become inactive (Canter, 1997). Although the definition may be elusive, common sense and objective analysis should clearly define hotspots.

Data Acquisition and Manipulation:

The analysis will be performed on crime data originally retrieved from the Tetrad Computer Applications Inc. Crime Analysis internet site (http://www.tetrad.com/new/crime.html#Profile) and published in the Vancouver Sun on September 16, 1995. The data was in a graphical map format that displayed the locations and modus operandi for unsolved murders in Vancouver spanning twenty-four years, from 1970 to 1994 (Figure 1).

Figure 1. Unsolved Murders 1970-1994

The defined study area is immediately south of Stanley Park in Vancouver, and the original data consists of 135 crime locations. Each point displayed on the map has associated with it the particulars of the victim, the modus operandi and the actual street address of the crime (Table 1).

Patricia   1160 Haro St.   19811125   Strangled   F   33

Table 1. Victim and Crime Particulars

The task of geo-referencing the street network map and the crime locations appears, at first glance, deceptively simple. That is, until you try to locate a georeferenced city map of Vancouver, or a portion of one, in a digital format without having pre-approved financing from a major lending institution. Therefore, it was necessary to take advantage of a corporate demo subject-area database and graphical user interface, currently referred to as VIP MapGuide, compliments of Canadian Pacific Railway (Figure 2).

This intranet site (http://mgpc/ctnvip/home.html) is in development and will be used to locate and link customer shipments to the rail network and its attributes. This application allowed me to locate and assign the lat/long co-ordinates associated with each data point and to record them in an ASCII text file to be imported into CrimeStat and Idrisi32 for visual analysis. This was a laborious and time-consuming task, done for each data point. Where very tight clusters of data points were encountered, it was difficult to distinguish individual points. As a result, the data set was reduced to 103 locations, marginally affecting high-concentration areas.

Figure 2. Common Track Network VIP Mapguide


The nearest neighbor hierarchical spatial clustering routine groups points together on the basis of spatial proximity. The user defines a significance level associated with a threshold distance, a minimum number of points that are required for each cluster, and an output size for displaying the clusters with ellipses. Clustering is hierarchical in that the first-order clusters are treated as separate points to be clustered into second-order clusters, the second-order clusters are treated as separate points to be clustered into third-order clusters, and so on. Higher-order clusters will be identified only if the distance between their centers is smaller than the new threshold distance. The results can be saved to a text file, output as a '.dbf' file, or output as ellipses to ArcView '.shp', MapInfo '.mif' or Atlas*GIS '.bna' files. The cluster output size can be adjusted to set the number of standard deviations spanned by the ellipse, from one standard deviation (the default value) to five standard deviations. The number of clusters can be controlled by setting the minimum number of points required for each cluster. The default is 10. If too few points are required, there will be many very small clusters. Increasing the number of required points reduces the number of clusters.
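The grouping idea described above can be illustrated with a short Python sketch. It is a simplification and not CrimeStat's actual routine: points are linked whenever their separation falls below a threshold distance supplied directly by the caller (CrimeStat derives it from the significance level), and only groups with enough members are kept. The function names, the sample coordinates and the thresholds are all made up for illustration.

```python
import math
from itertools import combinations


def first_order_clusters(points, threshold, min_points=10):
    """Link points closer than `threshold` and keep groups of at least
    `min_points` members. A simplified stand-in, not CrimeStat's code."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the representative of i's group.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the groups of every pair of points closer than the threshold.
    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) < threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return [g for g in groups.values() if len(g) >= min_points]


def cluster_centers(clusters):
    """Mean centre of each cluster, used as the 'points' of the next order."""
    return [
        (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
        for c in clusters
    ]


# Hypothetical coordinates: two tight groups and one isolated point.
pts = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11), (50, 50)]
first = first_order_clusters(pts, threshold=1.5, min_points=3)
# Second order: the first-order centres are clustered with a new threshold.
second = first_order_clusters(cluster_centers(first), threshold=20.0, min_points=2)
```

Raising `min_points` in this sketch discards the smaller groups, which mirrors the effect of the minimum-points setting described above.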

The K-means clustering routine is a procedure for partitioning all the points into K groups, where K is a number assigned by the user. The default K is 5. The routine finds K seed locations for which the distances between points within clusters are small (minimum within) but the distances between seed locations are large (maximum between). If K is small, the clusters will typically cover larger areas. Conversely, if K is large, the clusters will typically cover smaller areas. The results can also be saved to a text file, output as a '.dbf' file, or output as ellipses to ArcView '.shp', MapInfo '.mif' or Atlas*GIS '.bna' files.
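To make the partitioning idea concrete, the following Python sketch implements the textbook K-means iteration (Lloyd's algorithm). It is an assumption that this resembles what CrimeStat does internally; in particular, the random choice of initial seeds, the function name `k_means` and the sample coordinates are illustrative only and are not taken from the program.

```python
import math
import random


def k_means(points, k=5, iterations=100, seed=0):
    """Partition `points` into k groups with the textbook K-means loop.
    Requires k <= len(points). Returns (seed locations, clusters)."""
    rng = random.Random(seed)
    seeds = rng.sample(points, k)  # illustrative initialisation only

    def assign(current_seeds):
        # Put every point in the cluster of its nearest seed location.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, current_seeds[i]))
            clusters[nearest].append(p)
        return clusters

    for _ in range(iterations):
        clusters = assign(seeds)
        # Move each seed to the mean centre of its cluster (keep it if empty).
        new_seeds = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c else seeds[i]
            for i, c in enumerate(clusters)
        ]
        if new_seeds == seeds:  # no seed moved, so the partition is stable
            break
        seeds = new_seeds
    return seeds, assign(seeds)


# Hypothetical coordinates forming two well-separated groups.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 9), (9, 10)]
seeds, clusters = k_means(pts, k=2)
```

With a small K the few seed locations must each absorb many points, so the resulting clusters cover larger areas, as noted above.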

Method of Analysis Nnh:

The first output to be examined is that of the nearest neighbor hierarchical spatial clustering (Nnh). In doing so, it is necessary to restate our research objectives by adhering to the following six steps.

Step 1. Hypothesis Statement

The first step is to establish a null hypothesis statement regarding complete spatial randomness (CSR). In this case we can state: Ho: there is no statistically significant difference between the observed and expected values, therefore the distribution of points constitutes a random pattern. Ha: there is a statistically significant difference between the observed and expected values, therefore the distribution of points constitutes a clustered pattern.

Step 2. Choice of Test:

The statistical test chosen in this case appears to be the t-test (T-value), although traditional nearest neighbor analysis (NNA) employs a z-statistic (standard normal deviate).

Step 3. Sample Size and Significance Level:

The sample size for all tests in this analysis is 103 data points, and the significance level is 0.05. The threshold distance is adjusted by the significance level. Distances smaller than the threshold are candidates for clustering. The larger the alpha level chosen, the larger the areas and ellipses the clusters will cover; the smaller the alpha level, the smaller the areas and ellipses. However, the higher the alpha level chosen, the greater the likelihood that clusters could be chance groupings.

Step 4. Sampling Distribution:

Since we are employing a t-test to determine statistical significance, we can assume we are dealing with a t distribution.

Step 5. Region of Rejection:

Based on our sample size of 103, an alpha value of 0.05, n-1 degrees of freedom and a two-tailed test, we know from the t-tables that the region of rejection lies beyond approximately +/- 1.96.

Step 6. Decision Rule:

Based on our hypothesis statement and the region of rejection, we must then either reject or fail to reject the null hypothesis. If the absolute value of our calculated t-value is greater than our critical t-value, we must reject the null hypothesis Ho and accept Ha. Since our calculated T-value of 1.671 is less than our critical T-value of 1.96, we cannot reject Ho and therefore state: there is no statistically significant difference between the observed and expected values; therefore the distribution of points constitutes a random pattern.
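The six steps can be sketched in Python using the traditional nearest neighbor statistic, in which the observed mean nearest neighbor distance is compared with the value expected under CSR, 0.5 * sqrt(A/N), with standard error 0.26136 / sqrt(N^2/A). Whether CrimeStat computes its T-value in exactly this way is an assumption. The coordinates and study area in the example are made up, since the paper's point file is not reproduced here, and the critical value comes from the standard normal distribution, which is where the 1.96 figure arises.

```python
import math
from statistics import NormalDist


def nearest_neighbor_z(points, area):
    """Traditional nearest neighbor statistic (standard normal deviate).
    Negative values indicate clustering, positive values dispersion."""
    n = len(points)
    # Observed mean distance from each point to its nearest neighbor.
    observed = sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    expected = 0.5 * math.sqrt(area / n)       # mean distance under CSR
    std_error = 0.26136 / math.sqrt(n * n / area)
    return (observed - expected) / std_error


def is_significant(statistic, alpha=0.05):
    """Two-tailed decision rule: reject Ho when |statistic| exceeds
    the critical value for the chosen alpha."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(statistic) > critical


# Hypothetical data: four points packed into one corner of a large area.
square = [(0, 0), (0, 1), (1, 0), (1, 1)]
z = nearest_neighbor_z(square, area=10000.0)
reject_example = is_significant(z)        # tightly clustered example
reject_paper = is_significant(1.671)      # the T-value reported above
```

Applied to the value reported above, `is_significant(1.671)` returns `False`, which matches the decision not to reject Ho at the 0.05 level.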

Results Nnh:

The results...
You should cite this paper as follows:

MLA Style
Hot Spot Analysis. EssayMania.com. Retrieved on 12 Oct, 2010 from