Jacob Kaplan's Concatenated Files: Uniform Crime Reporting (UCR) Program Data: Hate Crime Data 1991-2018
- Kaplan, Jacob (University of Pennsylvania)
Abstract:
For any questions about this data, please email me at firstname.lastname@example.org. If you use this data, please cite it.
Version 6 release notes:
- Adds 2018 data
- Adds data in the following formats: SPSS, SAS, and Excel.
- Changes project name to avoid confusing this data for the ones done by NACJD.
- Adds data for 1991.
- Fixes bug where the bias motivation "anti-lesbian, gay, bisexual, or transgender, mixed group (lgbt)" was labeled "anti-homosexual (gay and lesbian)" prior to 2013, which created two separate columns with zero values in the years that had the wrong label.
- All data now comes directly from the FBI, not NACJD. The data initially comes as ASCII+SPSS Setup files and is read into R using the package asciiSetupReader. All work to clean the data and save it in various file formats was also done in R. For the R code used to clean this data, see https://github.com/jacobkap/crime_data.
- Adds data for 2017.
- Adds rows for agencies that submitted a zero-report (i.e. the agency reported no hate crimes in that year). This covers all years 1992-2017.
- Made changes to categorical variables (e.g. bias motivation columns) to make categories consistent over time. Different years had slightly different names (e.g. 'anti-am indian' and 'anti-american indian') which I made consistent.
- Added the 'population' column, which is the total population in that agency's jurisdiction.
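As a minimal sketch of the label harmonization described in these notes (this is not the R code actually used, which is in the GitHub repository linked above), inconsistent category names can be mapped to one spelling with a lookup table. The function name and the mapping table are hypothetical. Only the 'anti-am indian' pair is taken from the notes.

```python
# Hypothetical lookup table from older label variants to one consistent label.
# Only the 'anti-am indian' example comes from the release notes.
LABEL_FIXES = {
    "anti-am indian": "anti-american indian",
}


def harmonize_label(label):
    """Trim and lower-case a label, then map known variants to one spelling."""
    cleaned = label.strip().lower()
    return LABEL_FIXES.get(cleaned, cleaned)


# Illustrative values, not rows from the data.
labels = ["Anti-Am Indian", "anti-american indian", "anti-black"]
harmonized = [harmonize_label(x) for x in labels]
```

Labels that are not in the table pass through unchanged apart from trimming and lower-casing, so the mapping only needs entries for names that actually differ between years.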
Version 3 release notes:
- Adds data for 2016.
- Orders rows by year (descending) and ORI.
- Fixes bug where the Philadelphia Police Department had an incorrect FIPS county code.
Each row indicates a hate crime incident for an agency in a given year. I have made a unique ID column ("unique_id") by combining the year, agency ORI9 (the 9-character Originating Agency Identifier code), and incident number columns. Each column is a variable related to that incident or to the reporting agency.
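To illustrate how an ID of this kind can be built and checked for uniqueness, here is a minimal sketch. It is not the R code used for this data. The column names "year", "ori9", and "incident_number", the underscore separator, and the example values are all assumptions. Only "unique_id" is named in the description above.

```python
from collections import Counter


def make_unique_id(year, ori9, incident_number, sep="_"):
    """Combine year, ORI9, and incident number into one string ID.

    The separator is an assumption; the real data may join the parts differently.
    """
    return sep.join([str(year), str(ori9), str(incident_number)])


# Hypothetical rows; the ORI9 and incident numbers are placeholders.
rows = [
    {"year": 2018, "ori9": "xx0000100", "incident_number": "000000001"},
    {"year": 2018, "ori9": "xx0000100", "incident_number": "000000002"},
    {"year": 2017, "ori9": "xx0000100", "incident_number": "000000001"},
]

ids = [make_unique_id(r["year"], r["ori9"], r["incident_number"]) for r in rows]

# Any ID that occurs more than once would indicate a duplicated incident row.
duplicates = [i for i, n in Counter(ids).items() if n > 1]
```

Including the year matters because the same agency can reuse an incident number in a different year, as the first and third rows show.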
Some of the important columns are the incident date, what crime occurred (up to 10 crimes), the number of victims for each of these crimes, the bias motivation for each of these crimes, and the location of each crime. It also includes the total number of victims, total number of offenders, and race of offenders (as a group). Finally, it has a number of columns indicating if the victim for each offense was a certain type of victim or not (e.g. individual victim, business victim, religious victim, etc.).
The only changes I made to the data are the following: minor changes to column names to make all column names 32 characters or fewer (so the data can be saved in Stata format), renaming some UCR offense codes (e.g. from "agg asslt" to "aggravated assault"), making all character values lower case, and reordering columns. I also added state, county, and place FIPS codes from the LEAIC (crosswalk) and generated incident month, weekday, and month-day variables from the incident date variable included in the original data.
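The date-derived variables can be sketched as follows. This is an illustration only, not the R code used for this data. It assumes the incident date is available as an ISO "YYYY-MM-DD" string, and the output key names and the "MM-DD" month-day format are assumptions as well.

```python
from datetime import date

# Explicit lower-case names, so the result does not depend on the system locale.
MONTHS = ["january", "february", "march", "april", "may", "june",
          "july", "august", "september", "october", "november", "december"]
WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def derive_date_parts(incident_date):
    """Return month, weekday, and month-day values for an ISO date string."""
    d = date.fromisoformat(incident_date)
    return {
        "month": MONTHS[d.month - 1],
        "weekday": WEEKDAYS[d.weekday()],  # date.weekday(): Monday is 0
        "month_day": f"{d.month:02d}-{d.day:02d}",
    }


parts = derive_date_parts("2018-12-31")
# parts == {"month": "december", "weekday": "monday", "month_day": "12-31"}
```

The names are lower case to match the lower-casing of character values described above.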
Time Period: 1991-01-01 to 2018-12-31
Update Metadata: 2020-12-02 | Issue Number: 1 | Registration Date: 2020-12-02