Uniform Crime Reporting (UCR) Program Data: County-Level Detailed Arrest and Offense Data

Resource Type
Dataset : administrative records data
  • Kaplan, Jacob (University of Pennsylvania)
Publication Date
Free Keywords
crime; violent crime statistics; victimless crimes; national crime statistics (USA); ucr; Uniform Crime Reports; arrest; arrest rates
  • Abstract

    !!! Important Note: The imputation process used to make these county-level files has a number of flaws. Maltz & Targonski's 2002 paper on these flaws, and on why they are such an issue, is included as one of the files to download (and in every zip file). I very strongly recommend that you read this paper in its entirety before working with this data. I am publishing this data only because people use county-level data anyway, and I want them to know the risks. Important Note !!!

    The following paragraph is the abstract to Maltz & Targonski's paper:
    County-level crime data have major gaps, and the imputation schemes for filling in the gaps are inadequate and inconsistent. Such data were used in a recent study of guns and crime without considering the errors resulting from imputation. This note describes the errors and how they may have affected this study. Until improved methods of imputing county-level crime data are developed, tested, and implemented, they should not be used, especially in policy studies.
    Version 3 release notes:
    • Adds a variable to all data sets indicating the "coverage", which is the proportion of the agencies in that county-year that report complete data, i.e. that aren't imputed (100 = no imputation, 0 = all agencies imputed for all months in that year). Thanks to Dr. Monica Deza for the suggestion. The following is taken directly from NACJD's codebook for county data and is an excellent explainer of this variable.
      • The Coverage Indicator variable represents the proportion of county data that is reported for a given year. The indicator ranges from 0 to 100. A value of 0 indicates that no data for the county were reported and all data have been imputed. A value of 100 indicates that all ORIs in the county reported for all 12 months in the year.
        • Coverage Indicator is calculated as follows: CI_x = 100 * ( 1 - SUM_i { [ORIPOP_i/COUNTYPOP] * [ (12 - MONTHSREPORTED_i)/12 ] } )
          • where CI = Coverage Indicator
          • x = county
          • i = ORI within county
    • Reorders data so it's sorted by year then county rather than vice versa as before.
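    The Coverage Indicator formula quoted in the Version 3 notes above can be checked by hand with a short sketch. This is an illustration only, not the code used to build these files: the function name coverage_indicator and the (population, months reported) tuple layout are assumptions made for this example.

```python
def coverage_indicator(agencies, county_pop):
    """Coverage Indicator for one county-year, following the quoted formula.

    agencies   -- iterable of (ori_pop, months_reported) tuples, one per ORI
                  (hypothetical layout chosen for this example)
    county_pop -- total county population (COUNTYPOP in the formula)
    """
    if county_pop <= 0:
        raise ValueError("county_pop must be positive")

    # Population-weighted share of the year that is missing:
    # SUM_i [ORIPOP_i / COUNTYPOP] * [(12 - MONTHSREPORTED_i) / 12]
    missing_share = sum(
        (ori_pop / county_pop) * ((12 - months_reported) / 12)
        for ori_pop, months_reported in agencies
    )
    return 100 * (1 - missing_share)


# Made-up county of 100,000 people with two agencies:
# one covering 60,000 people that reported all 12 months,
# one covering 40,000 people that reported only 6 months.
example = coverage_indicator([(60000, 12), (40000, 6)], 100000)
```

    In this made-up county the second agency covers 40% of the population and is missing half the year, so 20% of the county-year is imputed and the indicator comes out at about 80.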
    Version 2 release notes:
    • Fixes bug where Butler University (ORI = IN04940) had the wrong FIPS state and FIPS state+county codes from the LEAIC crosswalk, causing it to be counted in the wrong location. This agency has been removed entirely from the county data. Thanks to Dr. Wade Jacobsen for finding this bug.
    The agency-level data used to make these files are the Offenses Known and Clearances by Arrest 1960-2017 (https://www.openicpsr.org/openicpsr/project/100707/version/V10/view) and Arrests by Age, Sex, and Race 1974-2016 (https://www.openicpsr.org/openicpsr/project/102263/version/V7/view) data that I have released. For the code I used to create these files please see here: https://github.com/jacobkap/crime_data/blob/master/R/county_data.R.
    This data aggregates agency-level crime and arrest data into county-level counts. The county each agency is assigned to is based on the FIPS state-county code in the LEAIC (crosswalk) file, which is already joined with the agency-level data. I also add a column with the county name based on the census data set Annual Survey of Public Employment & Payroll (ASPEP) (https://www.openicpsr.org/openicpsr/project/101399/version/V5/view). For agencies that do not report, or that report fewer than all 12 months of the year, I use the following imputation procedure designed by NACJD. The imputation process is the same as NACJD's process, except that while they exclude offenses with zero months reported, I include them.
    • Agencies reporting between 3 and 11 months have their crimes/arrests multiplied by 12/number of months reported. For example, an agency that reports only 6 months out of the year and says there were 10 murders would be estimated to have had 20 murders in the year (10 murders * 12/6 months reported = 10 * 2 = 20).
    • Agencies reporting fewer than 3 months are simply assigned the average (mean) number of crimes/arrests for agencies in that state, year, and population group (e.g. cities population 250,000+, cities population 10,000-24,999). This average is generated only from agencies that reported all 12 months of the year! For example, if an agency reported 15 murders but reported only 2 months of the year, that agency would get the average number of murders for similar sized agencies (same population group) in that state during that year.
    • Agencies with a population of 0 (common in special agencies such as state police, universities, park police) and fewer than 3 months reported are dropped as they have no population group to match to.
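    The three imputation rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used to create these files (that is the R script linked above): the record keys (state, year, pop_group, population, months_reported, count) and the function names are hypothetical, and returning None when a population group has no 12-month reporters is an assumption the description does not address.

```python
from collections import defaultdict
from statistics import mean


def group_means(records):
    """Mean count per (state, year, pop_group), using only 12-month reporters."""
    buckets = defaultdict(list)
    for r in records:
        if r["months_reported"] == 12:
            buckets[(r["state"], r["year"], r["pop_group"])].append(r["count"])
    return {key: mean(values) for key, values in buckets.items()}


def impute(record, means):
    """Estimated full-year count for one agency, or None if it is dropped."""
    months = record["months_reported"]
    if months == 12:
        return record["count"]                 # complete data, nothing to impute
    if 3 <= months <= 11:
        return record["count"] * 12 / months   # scale up to 12 months
    if record["population"] == 0:
        return None                            # no population group to match: dropped
    key = (record["state"], record["year"], record["pop_group"])
    # Assumption: None when no agency in the group reported all 12 months.
    return means.get(key)


# Made-up agencies in one state, year, and population group.
base = {"state": "PA", "year": 2016, "pop_group": "cities 250,000+", "population": 300000}
records = [
    {**base, "months_reported": 12, "count": 30},
    {**base, "months_reported": 12, "count": 50},
    {**base, "months_reported": 6, "count": 10},   # scaled to 10 * 12/6
    {**base, "months_reported": 2, "count": 15},   # replaced by the group mean
    {**base, "population": 0, "pop_group": None, "months_reported": 1, "count": 4},
]
means = group_means(records)
estimates = [impute(r, means) for r in records]
```

    Summing the non-dropped estimates over all agencies that share a FIPS state-county code would then give the county-level count.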
Temporal Coverage
  • 1960-01-01 / 2017-12-31
    Time Period: 1960-01-01 to 2017-12-31 (1960-2017 for crime data, 1974-2016 for arrest data)
Geographic Coverage
  • Counties in the United States
Sampled Universe
Smallest Geographic Unit: County

Update Metadata: 2019-02-22 | Issue Number: 1 | Registration Date: 2019-02-22