GEOSPEX | geospatial analysis | urban planning | environmental issues

NYC Census Tracts – Idiosyncrasies & Vagrancies

Thematic mapping with U.S. Census data is a common, valid approach to socioeconomic analysis across time and geography. The census tract is often an appropriate areal unit for dense urban areas as well as for less dense suburban and rural locations. But complications do arise, and several in particular hamper thematic mapping across greater NYC and its five boroughs: Kings County, Queens County, Bronx County, New York County and Richmond County.

The first issue arises from the contiguous nature of census tracts inherent in the TIGER (Topologically Integrated Geographic Encoding and Referencing) format developed by and for the U.S. Census. The geographic component for census tracts works well where topography is uniform; for example, rural areas in the Midwest where land is relatively uninterrupted. This is not the case in dense urban areas that abut water bodies. In greater NYC, the Hudson River, the East River and various harbor waterways pose a real complication to the contiguous nature of TIGER. In effect, the TIGER geography is too inclusive in these situations because it does not distinguish between land (where people live) and water (where people usually don’t live).

To address this issue, city planning utilizes an amended file essentially clipped to the shoreline. This clipping process introduces a complication of its own, though on balance it’s usually the preferable problem for general thematic mapping: an ‘overhang’ of census tracts that really belong in one borough but end up stranded along the shoreline of another. In the following case, a Manhattan tract is stranded across the East River in Brooklyn.

Census Tract ‘overhang’ across East River

The second issue, hardly unique among changing cities, is tract change over time, in this case from 2000 to 2010. This presents a significant complication to accurate longitudinal mapping. If the areal geometry or attribute identification differs in any way between the earlier and later tract states, an ‘apples to apples’ analysis cannot proceed in bulk. Many tracts remain stable, but those that change do so in predictable but variable ways, enough to demand alternative mapping approaches. In the following, the blue highlights are locations where a change has taken place in the 2010 tract; the orange represents change that has taken place in the 2000 tract:

2000 – 2010 census changes

The U.S. Census does a good job of documenting these changes online. The first resource is an overview of the 2000-2010 changes; the second is the relationship files themselves. With the relationship files (.csv) in hand, the changes noted by the U.S. Census can be tagged to the GEOID of the 2010 tracts. The following shows tracts that have changed in some way (light blue) vs. those that have remained stable across 2000 – 2010 in both their geometry and attributes (dark blue):


Tract changes essentially fall into several predictable categories. Interestingly, there are approximately a dozen geometry changes in greater NYC that the U.S. Census relationship file does not tag but that do in fact exist as 2000-2010 changes. These changes can be typified generally as REVISIONS, usually relatively small changes along the edges of a tract. The following example occurs at Holy Cross Cemetery in Brooklyn, where a half block of buildings is brought into the 2010 expression of tract 085200:

Tract 085200 Revision

The second change involves the consolidation of 2000 tracts into 2010 tracts, what the U.S. Census terms a MERGE:

Census Tract Consolidation

The third change is a SPLIT of a 2000 tract into multiple 2010 tracts, often due to a significant increase in population density within the 2000 tract geometry between 2000 and 2010:

Census Tract Split at 027400 2000 Tract
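
As a rough illustration of how the relationship files mentioned earlier might be used to separate these categories, the following minimal Python sketch counts how many 2000 tracts relate to each 2010 tract and vice versa. The column names GEOID00 and GEOID10, the sample rows and the function name are assumptions for illustration, not the actual file contents. A real relationship file also records sliver overlaps, so some area threshold would likely be needed, and small REVISIONS cannot be identified from counts alone.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; a real relationship file has many more columns.
SAMPLE = """GEOID00,GEOID10
36047000100,36047000100
36047000200,36047000301
36047000200,36047000302
36047000400,36047000500
36047000600,36047000500
"""

def classify_2010_tracts(rows):
    """Label each 2010 tract by how it relates to 2000 tracts."""
    parents_of = defaultdict(set)   # 2010 GEOID -> related 2000 GEOIDs
    children_of = defaultdict(set)  # 2000 GEOID -> related 2010 GEOIDs
    for row in rows:
        parents_of[row["GEOID10"]].add(row["GEOID00"])
        children_of[row["GEOID00"]].add(row["GEOID10"])

    labels = {}
    for geoid10, parents in parents_of.items():
        if len(parents) > 1:
            # Several 2000 tracts feed one 2010 tract.
            labels[geoid10] = "MERGE"
        elif len(children_of[next(iter(parents))]) > 1:
            # The single 2000 parent feeds several 2010 tracts.
            labels[geoid10] = "SPLIT"
        else:
            labels[geoid10] = "UNALTERED"
    return labels

labels = classify_2010_tracts(csv.DictReader(io.StringIO(SAMPLE)))
for geoid10, label in sorted(labels.items()):
    print(geoid10, label)
```

The resulting labels are keyed by the 2010 GEOID, so they could be joined to the 2010 tract geometry for mapping.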

In greater NYC, these tract changes can be further typified via GIS analysis using a tabulated intersection. The results can be joined with the 2010 tract geometry to quantify the percentage of change. This works well for REVISIONS and MERGES, but it does not capture SPLITS, because the input zone feature (2010 tracts) does not have a lesser class feature by which to quantify percentage of change. Of the 2,166 tracts in 2010, approximately 7% are REVISIONS and 10% are MERGES. If the U.S. Census relationship files are relied on to identify unaltered tracts, approximately 65% of tracts are UNALTERED, leaving approximately 18% as SPLITS. These are very approximate numbers resulting from loose SQL selections and reliance on the U.S. Census relationship file categorizations.
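
To make the tabulated intersection idea concrete, here is a minimal sketch that is not the GIS tool itself. It assumes tracts can be stood in for by axis-aligned rectangles (xmin, ymin, xmax, ymax), which real tract polygons are not; the tract IDs, coordinates and function names are all hypothetical. For each 2010 zone it reports the percentage of the zone's area covered by each overlapping 2000 tract.

```python
from typing import Dict, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

def rect_area(r: Rect) -> float:
    return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])

def overlap_area(a: Rect, b: Rect) -> float:
    """Area shared by two axis-aligned rectangles (0 if they do not overlap)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width * height if width > 0 and height > 0 else 0.0

def tabulate_intersection(
    zones: Dict[str, Rect], classes: Dict[str, Rect]
) -> Dict[Tuple[str, str], float]:
    """Percentage of each zone's area covered by each class feature."""
    table = {}
    for zone_id, zone in zones.items():
        zone_area = rect_area(zone)
        for class_id, cls in classes.items():
            shared = overlap_area(zone, cls)
            if shared > 0:
                table[(zone_id, class_id)] = 100.0 * shared / zone_area
    return table

# Hypothetical toy geometry: one 2010 tract overlapping two 2000 tracts.
zones_2010 = {"A10": (0, 0, 4, 2)}
classes_2000 = {"A00": (0, 0, 3, 2), "B00": (3, 0, 6, 2)}

table = tabulate_intersection(zones_2010, classes_2000)
for (zone_id, class_id), pct in sorted(table.items()):
    print(f"{zone_id} <- {class_id}: {pct:.1f}% of zone area")
```

In this toy case the 2010 zone draws three quarters of its area from one 2000 tract and one quarter from another, the pattern a MERGE or REVISION would produce. A 2010 tract that lies wholly inside a single 2000 tract always shows 100%, which is why SPLITS are not captured this way.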

Regardless of the exact breakdown of unaltered tracts, revisions, merges and splits, it’s clear that there’s a lot of discrepancy between the 2000 and 2010 census tracts for greater NYC. If one wants to map change over time, what’s the best way to proceed? There are several workarounds, each with its own pros and cons.

The first option is to conduct analysis and mapping using the smaller, often considered more stable U.S. Census block. This gives a finer-grained unit of analysis useful for both picking up generalized patterns at smaller scales, as well as capturing important ‘street level’ differences at larger scales. This is indeed the approach taken by CUNY to such great effect:

It is important to note, however, that changes between 2000 and 2010 have indeed occurred at the census block level (in named blocks alone, 2010 shows a net gain of 2,076 over 2000, approximately 5.5%); it’s just that the overall effect on mapping, especially at a smaller scale, is less of a burden than with census tracts. Regardless, one still faces the same normalization challenges if the goal is to map longitudinally within census blocks and forgo the ‘side-by-side’ approach adopted by CUNY.

Given this lingering normalization issue regardless of the particular U.S. Census areal unit, a second option is to conduct areal interpolation from one census areal unit into another, and then compare the interpolated 2000 data with 2010 census tracts or blocks. This approach has its own built-in accuracy issues, but it allows 2000 data to be normalized into 2010 geometry and sidesteps the main ‘apples to apples’ challenge discussed thus far.
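
One simple form of areal interpolation is area weighting, sketched below under stated assumptions. It assumes population is spread evenly within each 2000 tract (the main source of the accuracy issues noted above) and that the share of each 2000 tract's area falling inside each 2010 tract has already been computed, for example from an intersection. The tract IDs, counts and function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Tuple

def area_weighted_interpolation(
    source_values: Dict[str, float],
    area_shares: Dict[Tuple[str, str], float],
) -> Dict[str, float]:
    """Reallocate source-unit counts to target units.

    area_shares[(source_id, target_id)] is the fraction of the source
    unit's area that falls inside the target unit.
    """
    estimates = defaultdict(float)
    for (source_id, target_id), share in area_shares.items():
        estimates[target_id] += source_values[source_id] * share
    return dict(estimates)

# Hypothetical 2000 tract populations.
pop_2000 = {"S1": 4000, "S2": 1000}

# Hypothetical overlaps: S1 is split evenly between T1 and T2,
# while S2 falls entirely inside T2.
area_shares = {
    ("S1", "T1"): 0.5,
    ("S1", "T2"): 0.5,
    ("S2", "T2"): 1.0,
}

pop_2000_in_2010_geometry = area_weighted_interpolation(pop_2000, area_shares)
for tract_id, estimate in sorted(pop_2000_in_2010_geometry.items()):
    print(tract_id, estimate)
```

Because each source tract's shares sum to one here, the total 2000 population is preserved after reallocation; only its distribution across the 2010 geometry is estimated.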

A third and likely easiest approach is to rely on the good work of Spatial Structures in the Social Sciences at Brown University and use their open data for longitudinal mapping analysis. This is a great, easily accessible resource that fits data from as far back as the 1970s to 2010 census geometries for mapping. Further, the tools provided by the program allow users to bring in their own data to supplement the essential socioeconomic variables currently available.

Finally, a great resource for ‘snapshots’ of particular census geographies (having little to do with longitudinal analysis per se) is Census Reporter, which uses predominantly American Community Survey data from 2012.