Contributed by David Richard Steinmetz. He attended the NYC Data Science Academy 12-week full-time Data Science Bootcamp from July 5th to September 22nd, 2016. This post is based on his first class project, the Exploratory Data Analysis Visualization Project, due in the second week of the program. You can find the original article here.
Do Higher Population Densities Increase Crime?
Crime, particularly violent crime, is always prevalent in the public consciousness. At the same time, the UN reported in 2014 that population densities and the prevalence of urban areas continue to increase, with more than half the world’s population living in urban areas fo….
The relationship between crime rates and population density is unclear from an intuitive standpoint. It seems likely that crime rates increase as population densities increase. You don’t shoot your neighbor in the country, right? But when you are traveling alone at night, a higher population density makes it more likely that other people are nearby, which lowers your chances of being mugged. There also tends to be a stronger tax base, allowing for more police who simultaneously have less area to patrol. So which is it: do crime rates, which measure the number of incidents per 100,000 people, go up or down with increasing population density?
It turns out that the answer to this question is rather complex. Over the years population density has increased throughout the state, while crime rate has consistently gone down. Nevertheless, there continues to be a correlation between density and crime. How could this be? That is the question we will answer in this blog post by looking at publicly available data from New York State.
The population density in NYC counties is an order of magnitude higher than the rest of New York’s counties
Since we are interested in how population density relates to crime rates, we first look at the population density in New York State and how it has changed over time. For geo-spatial metrics it is important to consider population density and not simply absolute population. The population density in New York City is nearly an order of magnitude higher than in the rest of New York’s counties, which makes graphical comparisons more difficult.
The NYC counties and the New York counties outside NYC will be investigated separately in light of the strong disparity of population densities and differing availability of data for NYC counties. The data was obtained through Data.gov and is published by the State of New York and maintained by OpenData NY. Click here for the dataset. For each county over the years of 1990-2015 it includes population and both the absolute number of crimes and crime rate (incidents per 100,000 people) for four types of crime metrics: index, property, violent, and firearm. These metrics are collected by the FBI through the National Uniform Crime Reporting Program (UCR).
As reported by Data.gov: “The UCR reporting system collects information on seven crimes classified as Index offenses which are most commonly used to gauge overall crime volume. The Index Rate includes the violent crimes of murder/non-negligent manslaughter, forcible rape, robbery, and aggravated assault; and the property crimes of burglary, larceny, and motor vehicle theft.”
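To make the metrics concrete, here is a minimal sketch of how an index count, a rate per 100,000 people, and a population density relate to one another. The two counties and all of their numbers are invented for illustration, and the column names `Population`, `Area.sq.mi`, `Violent`, and `Property` are assumptions rather than the dataset's actual headers.

```r
# Hypothetical county records; every value here is made up for illustration.
counties <- data.frame(
  County     = c("A", "B"),
  Population = c(50000, 200000),
  Area.sq.mi = c(500, 100),
  Violent    = c(100, 900),    # count of violent index offenses
  Property   = c(900, 5100)    # count of property index offenses
)

# Index offenses are the violent offenses plus the property offenses.
counties$Index <- counties$Violent + counties$Property

# A rate expresses a count as incidents per 100,000 residents.
per_100k <- function(count, population) count / population * 1e5

counties$Index.Rate    <- per_100k(counties$Index, counties$Population)
counties$Density.sq.mi <- counties$Population / counties$Area.sq.mi

print(counties[, c("County", "Density.sq.mi", "Index.Rate")])
```

Working with rates and densities instead of raw counts is what lets a small rural county be compared with a large urban one.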
Information for the NYC counties of the Bronx, Kings, Manhattan, Queens and Richmond was provided only for 1990-2001, supporting the decision to investigate the NYC counties separately from the rest of New York counties.
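library(choroplethr)

data(df_pop_county)
# Rows 1829:1890 are taken to be the 62 New York counties
ny_choro <- df_pop_county[1829:1890,]
# Replace the population values with the year-2000 densities.
# This assumes `crime` lists the counties in the same order as ny_choro.
ny_choro$value <- subset(crime, Year == 2000)$Density.sq.mi
county_choropleth(ny_choro,
                  title = "New York Population Density by County",
                  legend = "Density",
                  num_colors = 1,
                  state_zoom = c("new york"))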
The mean population density has risen nearly 8% in New York counties outside NYC as well as in NYC since 1990
Keep in mind the axes on the graphs above are different. Population density clearly increases over the timespan provided by the available data for both New York counties outside NYC as well as in NYC counties. The upward trend of population in New York State outside NYC, seen on the left, has continued steadily over the last 25 years with the exception of two spikes: one before the millennium and subsequent decrease in the following year, and one in 2010 with a subsequent increase. In NYC, the population density was flat with a sharp upward spike beginning in 1999, the same year a similar trend was seen in New York State outside NYC.
library(dplyr)
library(ggplot2)

# Mean population density per year across the counties outside NYC
stden <- crime_nys %>%
  select(Year, Density.sq.mi) %>%
  group_by(Year) %>%
  summarise(Density.sq.mi = mean(Density.sq.mi))

p9 <- ggplot(stden, aes(x = Year, y = Density.sq.mi)) +
  geom_line() +
  ggtitle('Population Density in New York State (minus NYC)') +
  xlab('Year') +
  ylab('Population Density (ppl/sq.mi)') +
  theme_minimal()
p9
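A percentage rise like the one in the heading can be read off a yearly summary table by comparing the first and last years. The sketch below uses a small stand-in table with invented densities; `stden_demo` and `pct_change` are hypothetical names and not part of the original analysis.

```r
# Hypothetical yearly mean densities (ppl/sq.mi); the values are made up.
stden_demo <- data.frame(
  Year          = c(1990, 2000, 2015),
  Density.sq.mi = c(200, 208, 216)
)

# Percent change of a column between the earliest and the latest year.
pct_change <- function(df, value_col) {
  df    <- df[order(df$Year), ]
  first <- df[[value_col]][1]
  last  <- df[[value_col]][nrow(df)]
  (last - first) / first * 100
}

pct_change(stden_demo, "Density.sq.mi")
```

Applied to a table shaped like `stden`, the same function would give the change over the full span of available years.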
Four measures of crime rate have all decreased in the past 25 years in New York State
The Index Crime Rate for NY counties outside NYC has dropped by more than one third in the last 25 years. Because the property crime rate has often been nearly an order of magnitude higher than the violent crime rate, the change in the index rate is largely shaped by the change in the rate of property crime, shown in green. The change in the violent crime rate, shown in blue, and the firearm crime rate, which is a subset of the violent crime rate and is shown in purple, are difficult to distinguish in this graph, but both also decrease over the 25-year time span. If you expected crime rates to increase with increasing population density, the trends in the last two sections begin to cast doubt on your assumption.
library(reshape2)  # provides melt()

# Mean of each crime rate per year across the counties outside NYC
stavg <- crime_nys %>%
  select(Year, Index.Rate, Property.Rate, Violent.Rate,
         Firearm.Rate) %>%
  group_by(Year) %>%
  summarise(Index.Rate = mean(Index.Rate),
            Property.Rate = mean(Property.Rate),
            Violent.Rate = mean(Violent.Rate),
            Firearm.Rate = mean(Firearm.Rate))

# Reshape to long format so that each crime type becomes its own line
stavg_melt <- melt(stavg, id = 'Year', value.name = 'Value')
names(stavg_melt)[2] <- 'Crime.Type'

p3 <- ggplot(stavg_melt, aes(x = Year, y = Value)) +
  geom_line(aes(color = Crime.Type), size = 1) +
  ggtitle('Trend of Crime Rates in New York State (minus NYC)') +
  xlab('Year') +
  ylab('Crime Rate (Incidents per 100,000 ppl)') +
  theme_minimal()
p3
Four measures of crime rate also decreased in New York City between 1990 and 2001
It turns out that New York City also benefited from decreasing crime rates over the 11-year period for which data was reported in this dataset. The index rate fell by more than half over the 1990s alone, indicating that factors other than population density, which increased over the same period, have a strong impact on crime rates. As a percentage, the crime rates in NYC decreased more than those in NY counties outside NYC over the years 1990-2001.
# Mean of each crime rate per year across the NYC counties
ctysum <- crime_nyc %>%
  select(Year, Index.Rate, Property.Rate, Violent.Rate,
         Firearm.Rate) %>%
  group_by(Year) %>%
  summarise(Index.Rate = mean(Index.Rate),
            Property.Rate = mean(Property.Rate),
            Violent.Rate = mean(Violent.Rate),
            Firearm.Rate = mean(Firearm.Rate))

# Reshape to long format so that each crime type becomes its own line
ctysum_melt <- melt(ctysum, id = 'Year', value.name = 'Value')
names(ctysum_melt)[2] <- 'Crime.Type'

p3 <- ggplot(ctysum_melt, aes(x = Year, y = Value)) +
  geom_line(aes(color = Crime.Type), size = 1) +
  ggtitle('Trend of Crime Rates in New York City') +
  xlab('Year') +
  ylab('Crime Rate (Incidents per 100,000 ppl)') +
  theme_minimal()
p3
Crime rates rose with population density outside NYC where density remained under 500 ppl/sq.mi
In NY counties outside NYC, the index crime rate shows an increasing trend with increasing population density up to 500 ppl/sq.mi. The trend can be hard to see, so a least squares regression line was added solely as a visual aid. Each point on this scatterplot represents one county in one year, and all years from 1990 to 2015 are shown. The decreasing crime rates during that period account for a large part of the variance at a given population density. For these less densely populated areas the trend for each individual year is essentially the same.
p11 <- ggplot(data=u500nys, aes(x=Density.sq.mi, y=Index.Rate)) +
geom_point() +
geom_smooth(method = 'lm') +
ggtitle('NY counties outside NYC with population density < 500 ppl/sq.mi') +
xlab('Population Density (ppl/sq.mi)') +
ylab('Index Crime Rate (Incidents per 100,000 ppl)') +
theme_minimal()
p11
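The line drawn by `geom_smooth(method = 'lm')` is an ordinary least squares fit, so its slope can also be inspected directly with `lm()`. The sketch below does this on a handful of invented county-year points; the data frame `demo` and its values are assumptions for illustration only and say nothing about the real slope.

```r
# Hypothetical county-year observations; all values are made up.
demo <- data.frame(
  Density.sq.mi = c(50, 100, 200, 300, 400, 700, 900),
  Index.Rate    = c(1500, 1800, 2100, 2600, 2900, 2500, 2200)
)

# Keep only the observations below 500 ppl/sq.mi, as in the plot.
under500 <- subset(demo, Density.sq.mi < 500)

# Ordinary least squares fit of index rate on density.
fit   <- lm(Index.Rate ~ Density.sq.mi, data = under500)
slope <- unname(coef(fit)["Density.sq.mi"])

# A positive slope means the rate rises with density in this subset.
slope > 0
```

Looking at the coefficient instead of the picture makes it easier to compare the trend across subsets or individual years.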
The index crime rate rises with population density in NYC
As in the New York counties outside NYC with a population density below 500 people per square mile, the counties in NYC show an increasing crime rate with increasing population density. Since the overall index crime rate for both sets of counties decreased over time, but the index rate still increases with increasing population density, two possible explanations are: (1) the crime rate decreased uniformly over all counties, or (2) the crime rate decreased more in counties with lower population density, maintaining the upward correlation.
p11 <- ggplot(data=oe500nyc, aes(x=Density.sq.mi, y=Index.Rate)) +
  geom_point(aes(colour=County)) +
  geom_smooth(method = 'lm', se=FALSE) +
  ggtitle('NYC counties') +
  xlab('Population Density (ppl/sq.mi)') +
  ylab('Index Crime Rate (Incidents per 100,000 ppl)') +
  theme_minimal()
p11
Crime rates did not increase with population density above 500 ppl/sq.mi outside NYC
The main insight is that the index crime rate decreased with increasing population density above population densities of 500 people per square mile in counties outside NYC. The counties in the above scatterplot with the highest population densities are Nassau, Westchester, and Rockland counties, all of which are directly adjacent to NYC counties. The decrease is unusual, considering the upward trend seen amongst counties with population densities below 500 people per square mile and the NYC counties. In the context of the data, this unusual behavior should be investigated further, such as the relative change of population densities and crime rates in these counties compared to the others.
p11 <- ggplot(data=oe500nys, aes(x=Density.sq.mi, y=Index.Rate)) +
  geom_point(aes(colour=County)) +
  geom_smooth(method = 'lm', se=FALSE) +
  ggtitle('NY counties outside NYC with population density >= 500 ppl/sq.mi') +
  xlab('Population Density (ppl/sq.mi)') +
  ylab('Index Crime Rate (Incidents per 100,000 ppl)') +
  theme_minimal()
p11
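One way to start the further investigation suggested above is to compute, for each county, the relative change in density and in index rate between its first and last reported year. This is only a sketch on invented data: the counties "X" and "Y", their values, and the helper `rel_change` are hypothetical.

```r
# Hypothetical first-year and last-year records for two counties; values are made up.
demo <- data.frame(
  County        = rep(c("X", "Y"), each = 2),
  Year          = rep(c(1990, 2015), times = 2),
  Density.sq.mi = c(100, 110, 2000, 2100),
  Index.Rate    = c(3000, 2400, 4000, 2000)
)

# Percent change from the first to the last element of a vector.
rel_change <- function(x) (x[length(x)] - x[1]) / x[1] * 100

# Sort by county and year, then summarise each county separately.
demo <- demo[order(demo$County, demo$Year), ]
changes <- do.call(rbind, lapply(split(demo, demo$County), function(d) {
  data.frame(
    County         = d$County[1],
    Density.Change = rel_change(d$Density.sq.mi),
    Rate.Change    = rel_change(d$Index.Rate)
  )
}))
changes
```

Plotting `Rate.Change` against `Density.Change` for the real counties would show whether the denser suburban counties behaved differently from the rest.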
Takeaway
- Population density in NY state has increased over the last 25 years
- Crime rates in NY state have decreased over the last 25 years
- In NYC counties, crime rates increase with population density
- In counties outside NYC, crime rates appear to increase with population density, but only up to 500 ppl/sq.mi, above which they appear to decrease
- The first three points can all be true if the crime rates decreased uniformly for all counties or decreased more in less densely populated counties
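The last point can be checked with a toy example: if every county's rate falls, either by the same proportion or by more in the less dense counties, the rate can still rise with density at any given moment. The numbers below are invented, and "uniformly" is interpreted here as "by the same percentage", which is an assumption.

```r
# Hypothetical densities and starting rates for four counties; values are made up.
density  <- c(100, 200, 300, 400)
rate_old <- c(2000, 2400, 2800, 3200)   # the rate rises with density

# Scenario 1: every county's rate falls by the same 30%.
rate_uniform <- rate_old * 0.7

# Scenario 2: less dense counties see the larger decrease.
rate_skewed <- rate_old * c(0.5, 0.6, 0.7, 0.8)

# Least squares slope of rate on density.
slope <- function(y) unname(coef(lm(y ~ density))[2])

slopes <- c(before  = slope(rate_old),
            uniform = slope(rate_uniform),
            skewed  = slope(rate_skewed))
slopes
```

In both scenarios every county ends up with a lower rate than before, yet all three slopes stay positive, so a falling overall crime rate and a positive density-crime relationship do not contradict each other.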
The Road Ahead
Household income, racial diversity, age, and education are all variables that could intuitively affect crime rates and for which data is available. Expanding the investigation to account for those variables would shed more light on the factors affecting crime rates and help guide policy decisions.