metadata language: English

Census of Population and Housing, 2000 [United States]: Selected Subsets From Summary File 1, States

Resource Type
Dataset : census data
  • United States Department of Commerce. Bureau of the Census
  • Inter-university Consortium for Political and Social Research
Other Title
  • Archival Version (Subtitle)
Collective Title
  • Census of Population and Housing, 2000 [United States] Series
Publication Date
  • 2002
Funding Reference
  • National Science Foundation
Free Keywords
census data; demographic characteristics; employment; ethnicity; families; household composition; housing; housing conditions; occupations; population
  • Abstract

    Prepared by the Inter-university Consortium for Political and Social Research, this data collection consists of selected subsets extracted from the CENSUS OF POPULATION AND HOUSING, 2000 [UNITED STATES]: SUMMARY FILE 1, STATES (ICPSR 3194). Summary File 1 data contain information compiled from the questions asked of all people and of every housing unit enumerated in Census 2000: sex, age, race, Hispanic or Latino origin, type of living quarters (household/group quarters), household relationship, housing unit vacancy status, and housing unit tenure (owner/renter). The information is presented in 286 tables, one variable per table cell, plus additional variables with geographic information. Cases in the summary file data are classified by levels of observation, known as "summary levels" in the Census Bureau's nomenclature, which served as the selection criteria for the subsets. Each subset comprises all of the cases in one of two summary levels: whole census tracts (summary level 140) and census tracts in places (summary level 158). The latter covers whole tracts completely within places and portions of tracts that cross place boundaries. Five files are provided for each subset: one for each of the four census regions (Northeast, Midwest, South, and West) and a combined national file. Puerto Rico is included in the national and South files.
  • Table of Contents


    • DS0: Study-Level Files
    • DS1: Whole Census Tracts, the Nation
    • DS2: Whole Census Tracts, Northeast Only
    • DS3: Whole Census Tracts, Midwest Only
    • DS4: Whole Census Tracts, South Only
    • DS5: Whole Census Tracts, West Only
    • DS6: Census Tracts in Places, the Nation
    • DS7: Census Tracts in Places, Northeast Only
    • DS8: Census Tracts in Places, Midwest Only
    • DS9: Census Tracts in Places, South Only
    • DS10: Census Tracts in Places, West Only
    • DS100: Data Dictionary
    • DS101: SAS Data Definition Statements
    • DS102: SPSS Data Definition Statements
    • DS103: Stata Data Definition Statements
Temporal Coverage
  • Time period: 2000
  • Collection date: 2000
Geographic Coverage
  • United States
Sampled Universe
All persons and housing units in the United States.
Collection Mode
  • (1) The original Summary File 1, States data comprise 2,080 files. For each state (District of Columbia and Puerto Rico included), there is one column-delimited file that contains geographic identifiers (the geographic header record file or "Geo" file), plus 39 comma-delimited table files, each with a subset of tables in the data. For reasons of confidentiality, table files 12-36 do not contain data below the tract level. Consequently, they have fewer records than the Geo file and table files 1-11 and 37-39. The subsets were produced piece by piece, one state at a time. Initial steps in the production of a subset for a state involved sorting its Geo file and 39 table files in ascending order of the common identification variable LOGRECNO, reformatting the Geo file as a comma-delimited file, removing records with data below the tract level from table files 1-11 and 37-39, and stripping off the first five identification variables from each of the 39 table files (FILEID, STUSAB, CHARITER, CIFSN, and LOGRECNO). Next, the reformatted Geo file was merged with the stripped table files so that corresponding records in the Geo and table files were joined as a single record in the merged file. The state subset was generated by extracting from the merged file all cases with a given value for SUMLEV, the variable that identifies the summary level. Separate subsets were generated for summary levels 140 and 158. After subsets were produced for every state, the national and regional files were compiled by concatenating component state subsets in ascending order of their state Federal Information Processing Standards (FIPS) codes.
  • (2) The following states are included in the four regional files. Northeast: Connecticut, Massachusetts, Maine, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island, and Vermont. Midwest: Iowa, Illinois, Indiana, Kansas, Michigan, Minnesota, Missouri, North Dakota, Nebraska, Ohio, South Dakota, and Wisconsin. South: Alabama, Arkansas, District of Columbia, Delaware, Florida, Georgia, Kentucky, Louisiana, Maryland, Mississippi, North Carolina, Oklahoma, South Carolina, Tennessee, Texas, Virginia, West Virginia, and Puerto Rico. West: Alaska, Arizona, California, Colorado, Hawaii, Idaho, Montana, New Mexico, Nevada, Oregon, Utah, Washington, and Wyoming.
  • (3) The implied decimal places in variables INTPTLAT (latitude) and INTPTLON (longitude) were made explicit in the subsets. In addition, the values of all Geo variables were enclosed in quotes, except for variables AREALAND, AREAWATR, POP100, HU100, INTPTLAT, and INTPTLON.
  • (4) The data definition statements were tested with SAS 8, SPSS 10, and Stata/SE 7.0.
  • (5) The codebook documents data collection procedures, concepts, and individual variables in the original Summary File data as well as the ICPSR-produced subsets, but not the layout and structure of the subsets. That information is contained in the data dictionary file provided with this collection. In particular, the "Data Structure and Segmentation" section in chapter 2 of the codebook and the variable locations shown in chapter 7 do not apply to the subsets. Every subset file record begins with the Geo variables in their original order. The Geo variables are followed by the 6th to last variables in table file 1, then the 6th to last variables in table file 2, and so on up to the 6th to last variables in table file 39.
  • (6) The codebook is provided by the principal investigator as a Portable Document Format (PDF) file. The PDF file format was developed by Adobe Systems Incorporated and can be accessed using PDF reader software, such as the Adobe Acrobat Reader. Information on how to obtain a copy of the Acrobat Reader is provided on the ICPSR Web site.
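
The subsetting procedure described in note (1) can be illustrated with a minimal Python sketch. It is not the code ICPSR used. It assumes the Geo file has already been parsed into one dictionary per record with its variables in original order, and the function names (`read_table`, `build_subset`, `concatenate_states`) are hypothetical. Rather than deleting below-tract records from the table files first, it filters on SUMLEV and looks up matching table records by LOGRECNO, which should yield the same merged records.

```python
import csv
import io

# Each table file record starts with five identification variables:
# FILEID, STUSAB, CHARITER, CIFSN, LOGRECNO (per the collection notes).
N_ID_FIELDS = 5
LOGRECNO_POS = 4  # zero-based position of LOGRECNO among those five


def read_table(text):
    """Parse one comma-delimited table file into {LOGRECNO: data fields}.

    The five identification variables are stripped; only the 6th to
    last fields of each record are kept.
    """
    rows = {}
    for rec in csv.reader(io.StringIO(text)):
        rows[rec[LOGRECNO_POS]] = rec[N_ID_FIELDS:]
    return rows


def build_subset(geo_records, table_texts, sumlev):
    """Merge Geo records with table records for one summary level.

    geo_records: list of dicts (assumed already parsed from the
                 column-delimited Geo file), each with at least the
                 keys 'SUMLEV' and 'LOGRECNO'.
    table_texts: contents of the table files, in file order.
    sumlev:      summary level to extract, e.g. "140" or "158".
    """
    tables = [read_table(text) for text in table_texts]
    subset = []
    # LOGRECNO is zero-padded, so a string sort gives ascending order.
    for geo in sorted(geo_records, key=lambda g: g["LOGRECNO"]):
        if geo["SUMLEV"] != sumlev:
            continue
        record = list(geo.values())  # Geo variables first, original order
        for table in tables:         # then table file 1, 2, ... in turn
            record.extend(table[geo["LOGRECNO"]])
        subset.append(record)
    return subset


def concatenate_states(subsets_by_fips):
    """Stack state subsets in ascending order of state FIPS code."""
    combined = []
    for fips in sorted(subsets_by_fips):
        combined.extend(subsets_by_fips[fips])
    return combined
```

A national or regional file would then correspond to calling `concatenate_states` on the per-state results of `build_subset` for the states listed in note (2).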

2006-01-18 File CB13395.ALL.PDF was removed from any previous datasets and flagged as a study-level file, so that it will accompany all downloads. Funding institution(s): National Science Foundation (SES 0137019).
This version of the study is no longer available on the web. If you need to acquire this version of the data, you have to contact ICPSR User Support (
Alternative Identifiers
  • 13395 (Type: ICPSR Study Number)
  • Is previous version of
    DOI: 10.3886/ICPSR13395.v1

Update Metadata: 2015-08-05 | Issue Number: 6 | Registration Date: 2015-06-16

United States Department of Commerce. Bureau of the Census; Inter-university Consortium for Political and Social Research (2002): Census of Population and Housing, 2000 [United States]: Selected Subsets From Summary File 1, States. Archival Version. Census of Population and Housing, 2000 [United States] Series. Version: v0. ICPSR - Interuniversity Consortium for Political and Social Research. Dataset.