Category Archives: Census

More on population-weighted density

Some time ago I did two posts on population-weighted density, which is the mean of the densities of small areas such as census tracts weighted by their populations. In the first post, About population-weighted density, I described this alternative to conventional density and discussed some of the issues surrounding the use of this measure. The second post was Population-weighted density and urban sprawl, which argued that both conventional density and population-weighted density were appropriate measures of the extent of urban sprawl, relevant to different consequences associated with sprawl.

Doing these posts got me more interested in population-weighted density and led to my writing a full paper exploring this alternative density measure. In addition to expanding on the topics addressed in those two earlier posts, the paper provides much new information. Two of the highlights are the demonstration of the relationship of population-weighted density to conventional density and the comparison of conventional densities and population-weighted (actually housing-unit-weighted) densities across the 59 large urban areas in my urban patterns research.

In the first post I said that to the extent that people are more heavily concentrated in some tracts with higher densities than in others, those higher density tracts will be given more weight in calculating the population-weighted average. This will cause the population-weighted density to be greater than the simple or conventional density. In the paper, I move beyond this qualitative statement to deriving the mathematical relationship between population-weighted density and conventional density. It is actually quite simple: population-weighted density is equal to conventional density plus the variance in density across the subareas divided by conventional density. To the best of my knowledge, this is the first time this relationship has been demonstrated in this manner.
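The relationship can be checked numerically. The sketch below is only an illustration with made-up tract figures, not the derivation from the paper. It assumes that the variance is taken around the conventional density with each subarea weighted by its share of total land area, which is the form for which the identity holds exactly. The function names are hypothetical.

```python
def conventional_density(pops, areas):
    """Total population divided by total land area."""
    return sum(pops) / sum(areas)


def population_weighted_density(pops, areas):
    """Mean of subarea densities, each weighted by the subarea's population."""
    total_pop = sum(pops)
    return sum(p * (p / a) for p, a in zip(pops, areas)) / total_pop


def area_weighted_variance(pops, areas):
    """Variance of subarea densities around the conventional density.

    Assumption of this sketch: each subarea is weighted by its share
    of total land area.
    """
    d = conventional_density(pops, areas)
    total_area = sum(areas)
    return sum(a * (p / a - d) ** 2 for p, a in zip(pops, areas)) / total_area


# Made-up tracts: populations and land areas in square miles.
pops = [9000, 2000, 1000]
areas = [1.0, 2.0, 1.0]

d = conventional_density(pops, areas)
pwd = population_weighted_density(pops, areas)
var = area_weighted_variance(pops, areas)

# Conventional density, weighted density, and the right-hand side of the identity.
print(d, pwd, d + var / d)  # prints 3000.0 7000.0 7000.0
```

In this example one dense tract holds three quarters of the population, and the weighted density comes out at more than twice the conventional density.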

Looking at conventional and housing-unit-weighted densities for the large urban areas, the distribution of the weighted densities is more highly skewed towards higher values, with New York being an extreme outlier on weighted density. The housing-unit-weighted densities were more strongly related to the size of the urban areas, including size in earlier years, suggesting that the presence of areas of concentrated high densities was established in some urban areas decades ago.

Much more information can be found in my paper, “On Population-Weighted Density,” which can be downloaded here.


Have we had our last good census?

As readers of this blog know, I make extensive use of census data in my research. Also, I am very concerned (some would say obsessed!) with the details of the census and the data I am working with. Given this, I am concerned about the prospects for the 2020 census and am wondering whether 2010 saw our last good census. The general state of the nation, combined with the actions (and inactions) of the current administration, is producing this concern.

Obtaining an accurate census count depends fundamentally on the cooperation of the entire population. And this cooperation is dependent upon people having a basic level of trust in the government. I don’t think I need to elaborate on the general decline in such confidence. I would especially note such things as the rhetoric against various groups, the travel ban, the cancellation (at least until now) of the program for the dreamers, and many other things. If I were Hispanic, an immigrant, or especially if I were undocumented, I don’t believe I would respond to the census.

Actually, I might respond while taking the risk into account. And I’m sure that this would be the case for others as well. Because of the importance of the census for redistricting and the allocation of funds, perhaps I would choose to respond but not truthfully and with lots of kids to up the count in my area.

The cuts in funding for the census will weaken its preparation and its ability to plan and conduct effective outreach. Given the shift in this census to conducting the bulk of the enumeration online, I am concerned that this could especially affect the count of lower-income, less-educated persons. Those with higher levels of online sophistication will have no problem filling out the forms. Others may fall between the cracks, census follow-up efforts notwithstanding.

Finally, there is the pernicious effort on the part of the current administration to include a question on citizenship on the census. First off, there is no reasonable need for this information from the census. The American Community Survey asks this question and provides all of the detail needed for any conceivable purpose (and in a more timely fashion than the census as well). I can see only two reasons that the administration might be pushing for the inclusion of this question. The more “benign” (!) purpose would be to discourage non-citizens, especially those who are undocumented, from participating in the census. Even more ominous would be an intent to violate the confidentiality of the census to use the information for immigration enforcement purposes. I believe that just the proposal to include the question, even if it does not become part of the census, will further erode trust and participation. Inclusion of the question in the census would be a disaster.

Some urban researchers are careless…and wrong

I have read a number of scholarly articles using census Urbanized Area data from 2000 or later whose authors described those areas as consisting of territory with a population density of 1,000 or more persons per square mile. And that is incorrect. The density threshold for adding blocks or other small areas to an Urbanized Area (or Urban Cluster) is 500 persons per square mile. I’m not into naming and shaming and won’t. But come on! If you can’t even describe the data you are using accurately, why should anyone trust anything else you are saying?

I know where the error comes from. Starting with the 2000 census, the Census Bureau dramatically changed how they defined the notion of “urban” and Urbanized Areas (for the most part greatly improving the definition). Under the old definition, it was the case that a small area had to have a population density of at least 1,000 persons per square mile to be included in an Urbanized Area. An excellent summary of how the census definition of “urban” has evolved can be found here.

I assume that a researcher making this error had read earlier articles that described Urbanized Areas as consisting of areas with densities of 1,000 or more (either correctly, if referring to pre–2000 Urbanized Areas or incorrectly, if referring to the later areas). I expect this would be the source, not the census definition of the earlier Urbanized Areas, for if these authors were too careless and lazy to look up the definition for their current work, they likely would not have done so in the past either.

The current Urbanized Area density minimum plays a key role in the definition of urban areas for my urban patterns research. And of course I am continuing to read new articles that deal with urban patterns, including those using Urbanized Area data. The first few times I read articles referring to the 1,000-person-per-square-mile cutoff for 2000 or 2010 Urbanized Areas, I panicked. Did I make a mistake in understanding the definition and get it wrong? (It is a complex definition.) Each of those times I went back and re-read the formal notices on urban area criteria for 2000 and 2010 in the Federal Register. After having assured myself several times that I was correct, I no longer have to repeat this.

Technical note

The 2000 and 2010 urban area criteria do make use of a population density minimum of 1,000 persons per square mile in the first stage of the delineation process. An urban area core is defined that includes small areas with population densities of 1,000 or more. Then additional areas are added with densities of 500 persons per square mile and above. The existence of an initial urban area core meeting the higher density threshold will not be an issue for Urbanized Areas.
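The two density thresholds can be illustrated with a deliberately simplified sketch. This is not the Census Bureau’s procedure, which has many further provisions that are ignored here. The sketch assumes a made-up set of small areas with known densities and adjacency, forms the core from every area at 1,000 or more, and then grows outward through adjacent areas at 500 or more. The function and variable names are hypothetical.

```python
from collections import deque

CORE_MIN = 1000  # persons per square mile, for the initial core
ADD_MIN = 500    # persons per square mile, for areas added to the core


def delineate(density, neighbors):
    """Return the set of small areas included in the urban area.

    density: dict mapping area id -> persons per square mile
    neighbors: dict mapping area id -> list of adjacent area ids
    """
    # Stage 1: the initial core is every area at or above the higher threshold.
    urban = {area for area, d in density.items() if d >= CORE_MIN}

    # Stage 2: grow outward, adding adjacent areas at or above the lower threshold.
    queue = deque(urban)
    while queue:
        current = queue.popleft()
        for nb in neighbors.get(current, []):
            if nb not in urban and density[nb] >= ADD_MIN:
                urban.add(nb)
                queue.append(nb)
    return urban


# Made-up areas arranged in a row: A - B - C - D.
density = {"A": 1500, "B": 700, "C": 400, "D": 800}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

result = delineate(density, neighbors)
print(sorted(result))  # prints ['A', 'B']
```

Area B is included at a density of 700 because it adjoins the core. Area D exceeds 500 but is cut off from the core by C, so this simplified version leaves it out.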

On the choice of Combined Statistical Areas

Last year, I wrote a post discussing why I chose to use the larger Combined Statistical Areas (CSAs) for my urban patterns research rather than the commonly used Metropolitan Statistical Areas (MSAs). I followed this up with a second post giving examples of how the sharing of transportation infrastructure–commuter rail and airports–could be an indicator of the integration of areas that should be considered together as a single, larger metropolitan area.

This decision to use the CSAs is of such fundamental importance to my research that I felt it deserved more extended, formal treatment. I prepared the paper “On the Choice of Combined Statistical Areas” that provides greater background, covers the topics addressed in those blog posts in more detail, and addresses some other implications of the choice of CSAs over MSAs. It also shows how the CSAs are comparable in extent to MSAs as they had been defined earlier for the 2000 census. This last topic was also addressed in an earlier post.

The paper is posted on the Research page of the website and can also be downloaded here.

Data for studying urban patterns over time

I want to study urban spatial structure over an extended period of time. Here are my data requirements:
Data for population or housing units that can show the level of urban development.
Data for small areas that enable the definition of the extent of urban development and the examination of distributions within the urban areas.
Data for multiple points in time–as many as possible.
Data for the same small areas at each point in time, to allow examination of changes in those areas over time.

My dataset begins with a unique resource, the Neighborhood Change Database created by the Urban Institute and Geolytics. This dataset includes census tract data from the 1970 through 2000 censuses, with the data for the years from 1970 through 1990 normalized for the 2000 census tract boundaries. So that’s 4 points in time. The block data from the 2010 census for population and housing units can be aggregated to the year 2000 tract boundaries, giving another year.
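The aggregation step amounts to summing block counts by tract. A minimal sketch follows, assuming a crosswalk that assigns each 2010 block to a single 2000 tract. The crosswalk, the identifiers, and the function name are all hypothetical, and the handling of blocks that straddle tract boundaries is not addressed here.

```python
from collections import defaultdict


def aggregate_blocks_to_tracts(blocks, block_to_tract_2000):
    """Sum 2010 block counts into totals for 2000 tract boundaries.

    blocks: iterable of (block_id, population, housing_units)
    block_to_tract_2000: dict mapping 2010 block id -> 2000 tract id
        (a hypothetical crosswalk; building it is outside this sketch)
    """
    totals = defaultdict(lambda: {"population": 0, "housing_units": 0})
    for block_id, population, housing_units in blocks:
        tract_id = block_to_tract_2000[block_id]
        totals[tract_id]["population"] += population
        totals[tract_id]["housing_units"] += housing_units
    return dict(totals)


# Made-up blocks and crosswalk.
blocks = [
    ("b1", 120, 50),
    ("b2", 80, 30),
    ("b3", 200, 90),
]
block_to_tract_2000 = {"b1": "t100", "b2": "t100", "b3": "t200"}

tracts = aggregate_blocks_to_tracts(blocks, block_to_tract_2000)
print(tracts["t100"])  # prints {'population': 200, 'housing_units': 80}
```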

While many studies use population and population densities to study urban patterns, I have chosen to use housing units (as have others). They are more fixed and I think better represent the pattern of urban development. (The Census Bureau uses a minimum population density threshold to define urban areas. It is literally possible for an area to go from rural to urban from one census to the next without any new housing being developed. All it would take is an increase in population, for example, some babies being born.)

Using housing units also provides the opportunity to extend the data back in time. The census and the Neighborhood Change Database include the distribution of housing units by the year in which they were built. One can use this information for 1970 to estimate numbers of housing units that existed in earlier years. There are errors, as this approach cannot take into account changes to the stock of the older units that have occurred in the interim. I did an analysis that considered the extent of the error and concluded that it was reasonable to estimate housing units for the tracts back two decades, to 1950 but not further. This is discussed in a note Year-built Estimates Analysis on the Research page.
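The basic arithmetic of the year-built approach can be sketched as follows. This is a simplified illustration, not the procedure documented in the note. It assumes the 1970 year-built counts for a tract have been collapsed into three made-up categories, and it ignores units demolished or converted before 1970, which is the source of error mentioned above.

```python
def estimate_earlier_units(built_1960s, built_1950s, built_before_1950):
    """Estimate 1960 and 1950 housing units from 1970 year-built counts.

    The three arguments are counts of units existing in 1970, grouped by
    the decade in which they were built (hypothetical grouping).
    """
    units_1970 = built_1960s + built_1950s + built_before_1950
    # Units that existed in 1960: the 1970 stock minus units built in the 1960s.
    units_1960 = units_1970 - built_1960s
    # Units that existed in 1950: the 1960 estimate minus units built in the 1950s.
    units_1950 = units_1960 - built_1950s
    return {1970: units_1970, 1960: units_1960, 1950: units_1950}


# Made-up tract: 300 units built in the 1960s, 500 in the 1950s, 1,200 earlier.
estimates = estimate_earlier_units(300, 500, 1200)
print(estimates)  # prints {1970: 2000, 1960: 1700, 1950: 1200}
```

Because lost units never appear in the 1970 counts, estimates of this kind understate the earlier stock, and the understatement grows the further back one goes.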

A remaining question was which areas to examine and what their extent should be. As I noted in an earlier post, I believe Combined Statistical Areas (CSAs) better represent the extent of metropolitan areas than Metropolitan Statistical Areas (MSAs). I am choosing to examine urban patterns within the 59 CSAs (or MSAs, for areas not included in a CSA) that had populations over 1,000,000 in 2010.
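The selection rule can be expressed in a few lines. The sketch below uses invented areas and populations, and the data layout and function name are assumptions made for illustration only.

```python
def select_study_areas(csas, msas, min_pop=1_000_000):
    """Pick CSAs over the threshold, plus qualifying MSAs not in any CSA.

    csas: list of dicts with keys "name", "pop_2010", "member_msas"
    msas: list of dicts with keys "name", "pop_2010"
    """
    # MSAs that belong to a CSA are represented by that CSA, not on their own.
    in_a_csa = {m for csa in csas for m in csa["member_msas"]}
    selected = [c["name"] for c in csas if c["pop_2010"] > min_pop]
    selected += [
        m["name"]
        for m in msas
        if m["name"] not in in_a_csa and m["pop_2010"] > min_pop
    ]
    return selected


# Invented areas and populations.
csas = [
    {"name": "CSA One", "pop_2010": 2_500_000, "member_msas": ["MSA A", "MSA B"]},
    {"name": "CSA Two", "pop_2010": 600_000, "member_msas": ["MSA C"]},
]
msas = [
    {"name": "MSA A", "pop_2010": 1_800_000},
    {"name": "MSA B", "pop_2010": 700_000},
    {"name": "MSA C", "pop_2010": 450_000},
    {"name": "MSA D", "pop_2010": 1_200_000},
    {"name": "MSA E", "pop_2010": 300_000},
]

study_areas = select_study_areas(csas, msas)
print(study_areas)  # prints ['CSA One', 'MSA D']
```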

Documentation of the urban patterns dataset is provided in a note Urban Patterns Dataset Description on the Research page.

Problems with the urban and metropolitan area definitions

The previous post described how the changes to the metropolitan area definition resulted in the splitting of a number of large Metropolitan Statistical Areas (MSAs) as they were delineated for the 2000 census into 2 or more MSAs in 2003. This raised the obvious question: what was it about the new definition that produced these changes? The answer proved to be complex and bizarre, and the situation was only made worse by a horrible decision made by the Census Bureau with the 2010 Urbanized Area (UA) definition.

A major change made in the MSA definition first used in 2003 was to begin the delineation with the Urbanized Areas, including all counties with substantial portions of a UA in the MSA. After that, commuting to those central counties was used to add outlying counties to the MSA. The definition included provisions for merging adjacent MSAs using the same commuting criterion, but this made it highly unlikely that adjacent large MSAs could ever be merged. So as a result, the general extent of MSAs was determined by the extent of the UAs. If an area of contiguous urban settlement were split into 2 or more UAs, this would likely produce multiple MSAs.
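The two steps described above can be sketched roughly as follows. This is a loose illustration and not the OMB standards: the thresholds are placeholders supplied by the caller, the county data are invented, and the merging provisions are left out. The function and argument names are hypothetical.

```python
def delineate_msa(ua_share, commute_share, ua_share_min, commute_min):
    """Return (central, outlying) county sets for one area.

    ua_share: dict mapping county -> share of its population in the UA
    commute_share: dict mapping (origin, destination) -> share of the
        origin county's workers commuting to the destination county
    ua_share_min, commute_min: illustrative thresholds, not official values
    """
    # Step 1: central counties are those with a substantial portion of the UA.
    central = {c for c, share in ua_share.items() if share >= ua_share_min}

    # Step 2: add outlying counties based on commuting to the central counties.
    outlying = set()
    for county in ua_share:
        if county in central:
            continue
        to_central = sum(commute_share.get((county, c), 0.0) for c in central)
        if to_central >= commute_min:
            outlying.add(county)
    return central, outlying


# Invented counties and shares.
ua_share = {"A": 0.9, "B": 0.6, "C": 0.05, "D": 0.0}
commute_share = {("C", "A"): 0.2, ("C", "B"): 0.1, ("D", "A"): 0.1}

central, outlying = delineate_msa(ua_share, commute_share, 0.5, 0.25)
print(sorted(central), sorted(outlying))  # prints ['A', 'B'] ['C']
```

The point to notice is that everything in the second step depends on the first: the set of central counties, and so the general extent of the MSA, follows from how the UA was drawn.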

So now we have to go to the UA definition. The new UA definition for the 2000 census provided for the splitting of large urban agglomerations into multiple UAs using the MSA (and CMSA and PMSA) boundaries as delineated for the census. So the extent of the MSAs depended on the UAs, and the extent of the UAs depended on the MSAs! It is a circular definition!

So what happens with the UAs for 2010? The Census Bureau proposed to maintain the status quo for the largest UAs, keeping those with populations over a million the same, but not to split smaller urban agglomerations, so contiguous UAs would be merged. This generated opposition from those in areas that would be merged, as this could affect the receipt of federal funding. The Census Bureau’s response was to surrender and make the decision that the set of UAs that were delineated in 2000 would be frozen! Each 2000 UA would continue to be a UA in 2010! And since the MSAs continued to be based on the UAs, they would be largely frozen as well.

Since the beginning of the UA and MSA definitions in the mid-twentieth century, the areas had been allowed to evolve, with areas being combined as formerly separate areas grew together and were more reasonably considered a single entity. For example, Dallas and Fort Worth started as separate UAs and MSAs but for decades have been considered to be a combined area. What the Census Bureau did with the 2010 UA definition was to say that the delineation of urban and metropolitan America would be frozen as it was in 2000 and would not be allowed to further evolve. This was a truly horrible decision. And the Census Bureau knew it, as was clear from the misleading obfuscation of what they were doing in the Federal Register notice for the 2010 UA definition.

Far more detail is provided in a research note discussing the problems with the urban and metropolitan area definitions that is posted on the Research page and can also be downloaded here.

The effect of the changed definition of Metropolitan Statistical Areas

In an earlier post I explained why I chose to use the larger Combined Statistical Areas (CSAs) for my urban patterns research rather than the more common and familiar Metropolitan Statistical Areas (MSAs). I felt that in some cases the MSAs did not encompass what I felt was the whole metropolitan area. Exhibit 1 was the New York MSA, which did not include any of the Connecticut suburbs.

Before this, I had no occasion to systematically look at the extent of all of the large MSAs. But my recollection was that the MSAs were not always this limited. For example, the New York MSA used to include areas in Connecticut, and Raleigh and Durham had been a single MSA. I decided to start digging to find what had happened. It turns out that the Office of Management and Budget (OMB) made major changes to the MSA definition in 2000, which was first used to delineate new MSAs in 2003. This is also when the CSAs were introduced.

I decided to do a systematic comparison of the last MSAs delineated under the old definition, which were used in reporting the 2000 census, and the areas delineated in 2003 using the new standards. I looked at the 49 MSAs (and CMSAs, which were nothing more or less than MSAs for which subdivisions had been delineated) with populations over a million in the 2000 census. For a majority of the 2000 MSAs, the 2003 MSAs produced with the new definition were similar, varying only in the outlying counties included. But 18 of the 2000 MSAs were split into 2 or more MSAs in 2003 (in one instance, into 6 different MSAs). These included New York and Raleigh-Durham.
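One way to carry out a comparison of this kind is to work county by county. The sketch below is an assumption about how it could be done, not a description of the method in the research note. It takes two made-up lookups from county to MSA, one for each vintage, and reports the 2000 areas whose counties fall in 2 or more 2003 MSAs.

```python
from collections import defaultdict


def find_splits(county_to_msa_2000, county_to_msa_2003):
    """Return 2000 MSAs whose counties land in 2 or more 2003 MSAs.

    Both arguments map a county id to an MSA name (hypothetical layout).
    Counties that are in no 2003 MSA are ignored.
    """
    successors = defaultdict(set)
    for county, old_msa in county_to_msa_2000.items():
        new_msa = county_to_msa_2003.get(county)
        if new_msa is not None:
            successors[old_msa].add(new_msa)
    return {old: new for old, new in successors.items() if len(new) >= 2}


# Invented counties and areas.
county_to_msa_2000 = {
    "c1": "Old Metro X", "c2": "Old Metro X", "c3": "Old Metro X",
    "c4": "Old Metro Y", "c5": "Old Metro Y",
}
county_to_msa_2003 = {
    "c1": "New Metro X1", "c2": "New Metro X1", "c3": "New Metro X2",
    "c4": "New Metro Y",  # c5 dropped out of any MSA
}

splits = find_splits(county_to_msa_2000, county_to_msa_2003)
print({old: sorted(new) for old, new in splits.items()})
# prints {'Old Metro X': ['New Metro X1', 'New Metro X2']}
```

A count like this would also flag a case where a single outlying county shifted to a neighboring MSA, so the results would still need to be reviewed area by area.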

For those areas where CSAs had been delineated in 2003, I compared their extent to the 2000 MSAs. In nearly all cases, the CSAs were quite comparable to the 2000 MSAs. They included the multiple new MSAs produced by the splitting of the older areas. No wonder I found the CSAs more reasonable than the MSAs. OMB thought those larger areas better represented the extent of metropolitan areas up through 2000!

A research note providing the complete results of these comparisons is posted on the Research page and can also be downloaded here.