On the choice of Combined Statistical Areas

Last year, I wrote a post discussing why I chose to use the larger Combined Statistical Areas (CSAs) for my urban patterns research rather than the commonly used Metropolitan Statistical Areas (MSAs). I followed this up with a second post giving examples of how the sharing of transportation infrastructure–commuter rail and airports–could be an indicator of the integration of areas that should be considered together as a single, larger metropolitan area.

This decision to use the CSAs is of such fundamental importance to my research that I felt it deserved more extended, formal treatment. I prepared the paper “On the Choice of Combined Statistical Areas” that provides greater background, covers the topics addressed in those blog posts in more detail, and addresses some other implications of the choice of CSAs over MSAs. It also shows how the CSAs are comparable in extent to MSAs as they had been defined earlier for the 2000 census. This last topic was also addressed in an earlier post.

The paper is posted on the Research page of the website and can also be downloaded here.

Defining exurban areas

For the urban patterns research, in addition to delineating the urban areas for each year, I wanted to delineate exurban areas beyond the urban areas that could reasonably be considered to be parts of the metropolitan area related to the urban core. Unlike the census Urbanized Areas, however, there is no accepted standard definition for exurban areas. Fortunately, a thorough review of past studies of exurban areas and how they were defined has been provided by Berube and others (Finding Exurbia, Brookings, 2006).

A minimum population or housing unit density–obviously much lower than the urban density threshold–was the most common criterion used in defining exurban areas. Other factors were also considered, especially commuting to the urban area. Data are not available over the entire period of the urban patterns dataset to allow the use of commuting. However, the maximum extent of the exurban area would be limited to the area of the Combined Statistical Area (CSA) or Metropolitan Statistical Area (MSA), which at a minimum guarantees interaction with the urban area for 2010 for the counties as a whole, if not for individual tracts.

I decided to define exurban areas as the sets of contiguous tracts that were adjacent to the urban areas and had housing unit densities greater than some value. The minimum density levels used to define exurban areas in various studies varied widely, from 40 acres per housing unit down to about 10 acres per unit. (For studies using the lowest densities, the extent of the exurban areas was most often limited by the commuting criterion rather than density.) I approached the problem by mapping the tracts meeting different minima in 2010 to make a judgment as to what looked reasonable.

The very low minimum density thresholds of 30 or 40 acres per unit frequently resulted in all or most of the CSA or MSA being considered exurban, with the tracts meeting these levels extending far beyond those areas, especially in the eastern U.S. On the other hand, a density minimum of 10 acres per unit produced much smaller exurban areas than seemed reasonable and consistent with personal observation.

The choice came down to thresholds of either 15 acres per unit or 20 acres per unit. The resulting exurban areas generally looked appropriate for most areas. The final choice of 15 acres per unit came down to a number of specific situations where the lower density level of 20 acres per unit produced areas that seemed too large. I’ll give two examples: The exurban area for Indianapolis in 2010 would have extended south at least halfway to Louisville, through areas I would never consider exurban. And the Portland exurban area would have encompassed a large portion of the Willamette Valley.
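The definition above can be sketched in Python. This is a minimal illustration rather than the code used for the research: the function name, the input dictionaries, and the breadth-first search are all assumptions. It grows outward from the urban tracts and keeps every tract that meets the density minimum and is connected to the urban area through other qualifying tracts.

```python
from collections import deque

def exurban_tracts(tracts, adjacency, urban, max_acres_per_unit=15.0):
    """Return ids of tracts meeting the density minimum that connect to the urban area.

    tracts:    {tract_id: (housing_units, land_acres)}  (hypothetical input layout)
    adjacency: {tract_id: set of neighboring tract ids}
    urban:     set of tract ids already classified as urban
    """
    def dense_enough(tract_id):
        units, acres = tracts[tract_id]
        # A minimum density of "15 acres per unit" means at most 15 acres per unit.
        return units > 0 and acres / units <= max_acres_per_unit

    exurban = set()
    queue = deque(urban)
    while queue:
        current = queue.popleft()
        for neighbor in adjacency.get(current, ()):
            if neighbor not in tracts:
                continue  # tract outside the study area (e.g. outside the CSA)
            if neighbor in urban or neighbor in exurban:
                continue
            if dense_enough(neighbor):
                exurban.add(neighbor)
                queue.append(neighbor)
    return exurban
```

Passing only the tracts inside the CSA or MSA as `tracts` caps the exurban area at that boundary, since neighbors missing from the dictionary are skipped.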

A further check reinforced my decision on the minimum density for exurban areas of 15 acres per unit, which is one-fifth of the urban density threshold. For CSAs or MSAs adjacent to other CSAs or MSAs, it was not uncommon for both exurban areas to extend to the common boundary. But for areas not adjacent to others, the extent of contiguous exurban density tracts was generally either confined within the boundaries of the CSA or MSA or extended beyond the boundary at only one or two points, with a string of exurban density tracts along a highway. (This is much like census Urbanized Areas, which frequently have such tendrils of urban development extending outward.) So the density threshold for exurban areas seems consistent with the areas of significant metropolitan interaction as indicated by the CSA and MSA boundaries.

Data for studying urban patterns over time

I want to study urban spatial structure over an extended period of time. Here are my data requirements: Data for population or housing units that can show the level of urban development. Data for small areas that enable the definition of the extent of urban development and the examination of distributions within the urban areas. Data for multiple points in time–as many as possible. Data for the same small areas at each point in time, to allow examination of changes in those areas over time.

My dataset begins with a unique resource, the Neighborhood Change Database created by the Urban Institute and Geolytics. This dataset includes census tract data from the 1970 through 2000 censuses, with the data for the years from 1970 through 1990 normalized for the 2000 census tract boundaries. So that’s 4 points in time. The block data from the 2010 census for population and housing units can be aggregated to the year 2000 tract boundaries, giving another year.
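Aggregating the 2010 block data to the 2000 tracts amounts to a grouped sum. The sketch below assumes a lookup that assigns each 2010 block to a single 2000 tract. That lookup, the function name, and the record layout are hypothetical, and the sketch says nothing about how blocks straddling a tract boundary would be handled.

```python
from collections import defaultdict

def aggregate_blocks_to_tracts(blocks, block_to_tract):
    """Sum block-level counts into tract-level totals.

    blocks:         iterable of (block_id, population, housing_units)
    block_to_tract: {block_id: tract_id}  (assumed one-to-one assignment)
    """
    totals = defaultdict(lambda: {"population": 0, "housing_units": 0})
    for block_id, population, housing_units in blocks:
        tract_id = block_to_tract[block_id]
        totals[tract_id]["population"] += population
        totals[tract_id]["housing_units"] += housing_units
    return dict(totals)
```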

While many studies use population and population densities to study urban patterns, I have chosen to use housing units (as have others). They are more fixed and I think better represent the pattern of urban development. (The Census Bureau uses a minimum population density threshold to define urban areas. It is literally possible for an area to go from rural to urban from one census to the next without any new housing being developed. All it would take is an increase in population, for example, some babies being born.)

Using housing units also provides the opportunity to extend the data back in time. The census and the Neighborhood Change Database include the distribution of housing units by the year in which they were built. One can use this information for 1970 to estimate numbers of housing units that existed in earlier years. There are errors, as this approach cannot take into account changes to the stock of the older units that have occurred in the interim. I did an analysis that considered the extent of the error and concluded that it was reasonable to estimate housing units for the tracts back two decades, to 1950 but not further. This is discussed in a note Year-built Estimates Analysis on the Research page.
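The year-built estimate is simple subtraction, sketched here in Python. The function and argument names are illustrative, and the two year-built groupings are simplified stand-ins for the census categories. As noted above, units demolished or converted before 1970 are invisible to this approach, which is the source of the error.

```python
def backcast_housing_units(total_1970, built_1960_to_1970, built_1950_to_1959):
    """Estimate a tract's 1960 and 1950 housing units from 1970 year-built counts.

    Units standing in 1970 that were built after a given year are removed
    to approximate the stock that existed in that year.
    """
    units_1960 = total_1970 - built_1960_to_1970
    units_1950 = units_1960 - built_1950_to_1959
    return {1970: total_1970, 1960: units_1960, 1950: units_1950}
```

For a tract with 1,000 units in 1970, of which 300 were built in the 1960s and 200 in the 1950s, the estimates would be 700 units for 1960 and 500 for 1950.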

A remaining question was which areas to examine and what their extent should be. As I noted in an earlier post, I believe Combined Statistical Areas (CSAs) better represent the extent of metropolitan areas than Metropolitan Statistical Areas (MSAs). I am choosing to examine urban patterns within the 59 CSAs (or MSAs, for areas not included in a CSA) that had populations over 1,000,000 in 2010.

Documentation of the urban patterns dataset is provided in a note Urban Patterns Dataset Description on the Research page.

The effect of the changed definition of Metropolitan Statistical Areas

In an earlier post I explained why I chose to use the larger Combined Statistical Areas (CSAs) for my urban patterns research rather than the more common and familiar Metropolitan Statistical Areas (MSAs). I felt that in some cases the MSAs did not encompass the whole metropolitan area. Exhibit 1 was the New York MSA, which did not include any of the Connecticut suburbs.

Before this, I had no occasion to systematically look at the extent of all of the large MSAs. But my recollection was that the MSAs were not always this limited. For example, the New York MSA used to include areas in Connecticut, and Raleigh and Durham had been a single MSA. I decided to start digging to find what had happened. It turns out that the Office of Management and Budget (OMB) made major changes to the MSA definition in 2000, which was first used to delineate new MSAs in 2003. This is also when the CSAs were introduced.

I decided to do a systematic comparison of the last MSAs delineated under the old definition, which were used in reporting the 2000 census, and the areas delineated in 2003 using the new standards. I looked at the 49 MSAs (and CMSAs, which were nothing more or less than MSAs for which subdivisions had been delineated) with populations over a million in the 2000 census. For a majority of the 2000 MSAs, the 2003 MSAs produced with the new definition were similar, varying only in the outlying counties included. But 18 of the 2000 MSAs were split into 2 or more MSAs in 2003, and one was split into 6 different MSAs. These included New York and Raleigh-Durham.
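One way such a comparison could be automated is sketched below. It assumes county-level lookups for the two delineations, with hypothetical names and layout. This is an illustration of the idea, not the procedure used for the research note. A 2000 MSA counts as split when its counties land in two or more different 2003 MSAs.

```python
from collections import defaultdict

def split_msas(county_to_msa_2000, county_to_msa_2003):
    """Return {old_msa: set of 2003 MSAs} for every 2000 MSA that was split.

    Counties that belong to no 2003 MSA are ignored, so dropping an
    outlying county does not by itself count as a split.
    """
    successors = defaultdict(set)
    for county, old_msa in county_to_msa_2000.items():
        new_msa = county_to_msa_2003.get(county)
        if new_msa is not None:
            successors[old_msa].add(new_msa)
    return {old: new for old, new in successors.items() if len(new) >= 2}
```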

For those areas where CSAs had been delineated in 2003, I compared their extent to the 2000 MSAs. In nearly all cases, the CSAs were quite comparable to the 2000 MSAs. They included the multiple new MSAs produced by the splitting of the older areas. No wonder I found the CSAs more reasonable than the MSAs. OMB thought those larger areas better represented the extent of metropolitan areas up through 2000!

A research note providing the complete results of these comparisons is posted on the Research page and can also be downloaded here.

Major transportation infrastructure and metropolitan extent

For defining both MSAs and CSAs, the Office of Management and Budget specifies (different) minimum commuting thresholds. Obviously the choice of the exact value is somewhat arbitrary. But OMB has to make definitions that use data that the Census Bureau collects for the entire country and must use a uniform, consistent standard.

CSAs involve the combination of MSAs (and Micropolitan Statistical Areas) into a single area. This got me to thinking whether other factors might be considered in judging whether areas should be combined and be considered part of a larger metropolitan area. I would like to offer one suggestion: Combined or shared major transportation infrastructure, specifically commuter rail systems and commercial airports.

Going through the major CSA combinations I discussed in the previous post, New York is connected by commuter rail service to its Connecticut suburbs. San Francisco has commuter rail service down the peninsula to San Jose and beyond. And there is additional service from San Jose extending to the east. The three major cities of the Riverside-San Bernardino-Ontario MSA are served by no fewer than three commuter rail lines, two to downtown Los Angeles and one to Orange County, a part of the Los Angeles MSA.

For the two large combinations about which I initially expressed some skepticism: Commuter rail lines connect Washington with Baltimore and Boston with Providence.

When a single airport provides all or most commercial airline service to two MSAs, it is reasonable to consider them to be part of a single, larger metropolitan area. Raleigh-Durham International Airport serves its two namesake MSAs, of course, as does Greenville-Spartanburg International. The Piedmont Triad International Airport serves the Greensboro, Winston-Salem, and High Point MSAs, hence the name.

The Provo and Ogden MSAs are combined with Salt Lake City into a single CSA. Provo and Ogden each have an airport providing service to just a handful of destinations (3 for Provo, 1 for Ogden). So it seems clear that the Salt Lake City airport provides much of the airline service to those MSAs.

The very largest MSAs and CSAs can be served by multiple airports, so one would not expect a single airport to serve both MSAs in those situations. But there are at least two examples of airports located in one MSA that are also seen as providing significant service to another MSA in the CSA. The first one I’d mention is Baltimore’s airport, Baltimore-Washington International. The name says it all. And I’ve flown into it multiple times when going to Washington.

In the greater Los Angeles area, Ontario International is obviously located in the Riverside-San Bernardino-Ontario MSA. Ontario International is literally owned and operated by Los Angeles World Airports, the authority that is also responsible for LAX and one other airport in the Los Angeles MSA. (As an aside, ownership and control of Ontario International is due to be transferred back to the City of Ontario and San Bernardino County later this year. They fought for the change believing that Los Angeles World Airports was favoring LAX over the Ontario airport.)

Common or shared major transportation infrastructure should not necessarily be the sole basis for determining that two areas should be considered part of a single, larger metropolitan area. But I believe it is a strong indicator.

Combined Statistical Areas instead of MSAs

Metropolitan Statistical Areas (MSAs) are by far the most commonly used units for reporting data and conducting analysis for urban or metropolitan areas. They are defined on a consistent basis by the Office of Management and Budget and are used throughout the federal government and by many, many others.

In my research project on changes in urban patterns in large urban areas in the U.S. over time, I delineate urban and exurban portions of the areas using census tracts. To do this, I needed a starting-point definition of metropolitan areas to indicate which tracts were within and beyond the area, which tracts belonged to a given area as opposed to a neighboring one, and which urban areas that may have developed separately should now be considered to be part of a single urban area.

MSAs were an obvious choice. They indicate the extent of the metropolitan area, of course. They provide boundaries between adjacent areas, for example, where the New York area stops and the Philadelphia area begins. (Of course, these boundaries are somewhat arbitrary as MSAs have been created using county boundaries, but at least they provide a basis for making the choice.) And finally, MSAs can specify that two or more previously separate urban areas should be considered to be part of a single, unified area, such as Dallas and Fort Worth.

But the OMB definitions provide an alternative unit, Combined Statistical Areas (CSAs). CSAs are combinations of Core-Based Statistical Areas (CBSAs), that is, of MSAs and the smaller Micropolitan Statistical Areas. As with the determination of which counties are to be included in an MSA, the combination of CBSAs to form a CSA is based on commuting interchange. For adding a county to an MSA, a 25 percent commuting threshold must be met. To combine two CBSAs into a CSA, a 15 percent commuting threshold must be met. The commuting considered is slightly different in the two cases, but the idea is the same: Areas will be combined and considered to be part of a single area if there is significant cross-commuting. The major difference between the MSAs and the CSAs is the level of such commuting required.
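The two thresholds can be shown with a deliberately simplified Python sketch. It treats the commuting measure as a single percentage of workers. The actual OMB measures are more involved and differ between the two cases, as noted above, so the function names and the measure itself are assumptions made for illustration.

```python
COUNTY_TO_MSA_THRESHOLD = 25.0  # percent, for adding a county to an MSA
CBSA_TO_CSA_THRESHOLD = 15.0    # percent, for combining two CBSAs into a CSA

def commuting_percent(commuters, workers):
    """Percent of workers who commute between the two areas (simplified measure)."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    return 100.0 * commuters / workers

def meets_threshold(commuters, workers, threshold):
    """True if the simplified commuting percentage reaches the given threshold."""
    return commuting_percent(commuters, workers) >= threshold
```

With 2,000 cross-commuters out of 10,000 workers (20 percent), an area would meet the 15 percent level for combination into a CSA but fall short of the 25 percent level for inclusion in an MSA.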

After much time spent considering the alternative areas, I have chosen to use the CSAs as the units for the delineation of my areas. I based my decision on the reasonableness of the areas created using the MSA and CSA definitions. I will describe my thinking and in doing so suggest that using CSAs rather than MSAs may make sense for other analyses as well. (Note that not all MSAs are combined into CSAs because no adjacent areas meet the commuting threshold. In those cases, I use the MSAs.)

In many of the cases, the areas combined with a large MSA to form a CSA are fairly small areas. This obviously extends the sizes of the areas. But this was not of great importance to my choice of CSAs over MSAs. I will address the combinations that I did see as being significant.

First, the New York CSA includes the Connecticut suburbs of New York. (The New York MSA does not extend into Connecticut.) To me, these suburbs are so obviously a part of the New York area that this one is a no-brainer.

The San Jose MSA is combined with the San Francisco-Oakland MSA. The peninsula from San Jose to San Francisco is an uninterrupted stretch of intense urban development. There are all the stories of Google and other Silicon Valley employees who work in the San Jose MSA living in San Francisco and taking company buses to work (and being resented by other San Francisco residents). And while the San Jose MSA is associated with Silicon Valley, Facebook and the venture capitalists on Sand Hill Road are located in the San Francisco MSA.

The Riverside-San Bernardino-Ontario MSA is east of and adjacent to the Los Angeles MSA. Urban development is continuous from east of San Bernardino to the Pacific Ocean, as anyone flying into LAX from the east will have observed. Rush-hour commuting on the three freeways extending east from Los Angeles makes the ties clear.

Moving to somewhat smaller areas, residents of the Raleigh and Durham MSAs have long considered themselves to be part of a single area. The Research Triangle Park is located in the middle, with Chapel Hill in the Durham MSA being the third point of the triangle. The area was a combined single MSA for a number of decades until the tweaking of the definition in 2003. So having them in a single CSA is highly appropriate. Staying within the Carolinas, the Greensboro, Winston-Salem, and High Point MSAs in North Carolina combine into one CSA, and the Greenville and Spartanburg MSAs in South Carolina combine into another.

The two large combinations that were a little more surprising to me were Baltimore with Washington and Providence with Boston. But the urban areas have certainly grown together. And they meet the commuting threshold. So maybe not so unreasonable. Way back in 1967, when I participated in a march from Baltimore to Washington, much of the area between was already developed. But that is another story…