metadata language: English

Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program Data: Arson 1979-2019

Resource Type
  • Kaplan, Jacob (University of Pennsylvania)
Publication Date
Free Keywords
arson; Uniform Crime Reports; crime; arrests; crime rates; law enforcement; fire
  • Abstract

    For any questions about this data, please email me. If you use this data, please cite it.

    Version 8 release notes:
    • Adds 2019 data.
    • Note that the number_of_months_missing variable changes sharply starting in 2018. This is probably due to changes in UCR reporting of the column_2_type variable, which is used to generate the months-missing count (the code I used does not change). Pre-2018 and 2018+ years may therefore not be comparable for this variable.
    Version 7 release notes:
    • Adds a last_month_reported column which says which month was reported last. This is actually how the FBI defines number_of_months_reported so is a more accurate representation of that. Removes the number_of_months_reported variable as the name is misleading. You should use the last_month_reported or the number_of_months_missing (see below) variable instead.
    • Adds a number_of_months_missing variable to the annual data, which is the number of months in which the agency's card_2_type variable is "missing" (i.e. the agency did not report that month) or NA. Please note that this variable is not perfect: sometimes an agency does not report data but the variable does not flag that month as missing, so it will not be perfectly accurate.
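The two version 7 variables measure different things, which a short sketch can make concrete. The Python below is only an illustration of the definitions given above, not the R code used to build the data: the label "missing", the use of None for NA, and the helper names are all assumptions, and the last_month_reported helper is a stand-in for the idea rather than a description of how that column was derived.

```python
from typing import Optional, Sequence

# Assumed label for a month the agency did not report; the real coding may differ.
MISSING_LABEL = "missing"


def is_missing(value: Optional[str]) -> bool:
    """True when a month's card_2_type is NA (None here) or the missing label."""
    return value is None or value.strip().lower() == MISSING_LABEL


def number_of_months_missing(card_2_type: Sequence[Optional[str]]) -> int:
    """Count the months flagged as missing or NA in one agency-year."""
    return sum(1 for value in card_2_type if is_missing(value))


def last_month_reported(card_2_type: Sequence[Optional[str]]) -> Optional[int]:
    """Return the 1-based number of the last month with a report, or None."""
    reported = [
        month
        for month, value in enumerate(card_2_type, start=1)
        if not is_missing(value)
    ]
    return reported[-1] if reported else None


# Hypothetical agency-year that reported only in December.
december_only = [None] * 11 + ["reported"]
print(number_of_months_missing(december_only))  # 11
print(last_month_reported(december_only))       # 12
```

In this made-up case the last month reported is 12 even though 11 months are missing, which is why the two variables should not be used interchangeably.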
    Version 6 release notes:
    • Adds 2018 data
    Version 5 release notes:
    • Adds data in the following formats: SPSS and Excel.
    • Changes project name to avoid confusing this data for the ones done by NACJD.
    Version 4 release notes:
    • Adds 1979-2000, 2006, and 2017 data
    • Adds agencies that reported 0 months.
    • Adds monthly data.
    • All data now from FBI, not NACJD. See here for the R code I used to read in the files and clean data, and the setup files made to read them in.
    • Changes some column names so all columns are <=32 characters to be usable in Stata.
    Version 3 release notes:
    • Add data for 2016.
    • Order rows by year (descending) and ORI.
    • Removed data from Chattahoochee Hills (ORI = "GA06059") from 2016 data. In 2016, that agency reported about 28 times as many vehicle arsons as its population (total mobile arsons = 77762, population = 2754).
    Version 2 release notes:
    • Fix bug where Philadelphia Police Department had incorrect FIPS county code.
    This Arson data set is an FBI data set that is part of the annual Uniform Crime Reporting (UCR) Program data. It contains information about arsons reported in the United States: the number of arsons reported, the number found to have actually occurred, the number found not to have occurred ("unfounded"), the number cleared by the arrest of at least one arsonist, the number cleared by arrest where all offenders are under the age of 18, and the cost of the arson. These figures are given for a number of different arson location categories, such as community building, residence, vehicle, and industrial/manufacturing structure.

    The yearly data sets here combine data from the years 1979-2019 into a single file for each group of crimes. Each monthly file covers only a single year, as my laptop can't handle combining all the years together. These files are quite large and may take some time to load. I also added state, county, and place FIPS codes from the LEAIC (crosswalk).
    All the data is from the FBI and was read into R using the package asciiSetupReader. All work to clean the data and save it in various file formats was also done in R. For the R code used to clean this data, see here.
    A small number of agencies had some months with clearly incorrect data. I changed the incorrect columns to NA and left the other columns unchanged for that agency. The following are data problems that I fixed - there are still likely issues remaining in the data so make sure to check yourself before running analyses.
    • Oneida, New York (ORI = NY03200) had multiple years that reported single arsons costing over $700 million. I deleted this agency from all years of data.
    • In January 1989 Union, North Carolina (ORI = NC09000) reported 30,000 arsons in uninhabited single occupancy buildings and none in any other month.
    • In December 1991 Gadsden, Florida (ORI = FL02000) reported that a single arson at a community/public building caused $99,999,999 in damages (the maximum possible).
    • In April 2017 St. Paul, Minnesota (ORI = MN06209) reported 73,400 arsons in uninhabited storage buildings and 10,000 arsons in uninhabited community/public buildings and one or fewer every other month.
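The approach described above (set the clearly incorrect columns to NA for the affected agency-month and leave everything else alone) can be sketched as follows. This is a minimal Python illustration, not the R code actually used: the record layout, the column names, and the blank_columns helper are hypothetical, and every value other than the 30,000 figure is made up for the example.

```python
from typing import Any, Dict, Iterable, List

Record = Dict[str, Any]


def blank_columns(
    records: List[Record],
    ori: str,
    year: int,
    month: int,
    columns: Iterable[str],
) -> List[Record]:
    """Return copies of the records with the named columns set to None (NA)
    for one agency-month; all other values are left unchanged."""
    columns = list(columns)
    cleaned = []
    for record in records:
        record = dict(record)  # copy so the caller's data is not modified
        if (record["ori"], record["year"], record["month"]) == (ori, year, month):
            for column in columns:
                record[column] = None
        cleaned.append(record)
    return cleaned


# Hypothetical rows; the column names are assumptions, not the real ones.
rows = [
    {"ori": "NC09000", "year": 1989, "month": 1,
     "uninhabited_single_occupancy": 30000, "vehicle": 2},
    {"ori": "NC09000", "year": 1989, "month": 2,
     "uninhabited_single_occupancy": 0, "vehicle": 1},
]

fixed = blank_columns(rows, "NC09000", 1989, 1, ["uninhabited_single_occupancy"])
print(fixed[0]["uninhabited_single_occupancy"])  # None
print(fixed[0]["vehicle"])                       # 2
```

Only the flagged column for January 1989 becomes NA; the other column and the February row are untouched.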

    When an arson is determined to be unfounded, the estimated damage from that arson is added as a negative value to zero out the previously reported estimated damages. This occasionally leads to some agencies having negative values for arson damages. You should be cautious when using the estimated damage columns as some values are quite large. Negative values in other columns are also due to adjustments (zeroing out the error) from month to month. Negative values are not meant to be NA in this data set.
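Because negative values are adjustments rather than missing-data codes, they should stay in any total. The Python sketch below uses made-up monthly damage figures and a hypothetical annual_total helper (None stands in for NA) to show how dropping negatives would overstate the annual figure.

```python
from typing import Optional, Sequence


def annual_total(monthly_values: Sequence[Optional[int]]) -> int:
    """Sum monthly values, skipping NA (None) but keeping negative adjustments."""
    return sum(value for value in monthly_values if value is not None)


# Made-up example: $50,000 of damage reported in January is later determined
# to be unfounded, so February carries a -50,000 adjustment.
monthly_damage = [50000, -50000, 1200] + [0] * 9

print(annual_total(monthly_damage))  # 1200

# Treating negatives as if they were missing keeps the unfounded damage.
overstated = sum(v for v in monthly_damage if v is not None and v >= 0)
print(overstated)  # 51200
```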

Temporal Coverage
  • 1979-01-01 / 2019-12-31
    Time Period: 1979-01-01 to 2019-12-31
Geographic Coverage
  • United States
Sampled Universe
Smallest Geographic Unit: Police agency
This study is freely available to the general public via web download.
  • Is version of
    DOI: 10.3886/E103540

Update Metadata: 2020-10-21 | Issue Number: 1 | Registration Date: 2020-10-21

Kaplan, Jacob (2020): Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program Data: Arson 1979-2019. Version: 8. ICPSR - Interuniversity Consortium for Political and Social Research. Dataset.