Wednesday, August 25, 2010

The house history, Part II

Last post, we looked at the total number of House seats that both parties have held since 1988. This post, we look at what percentage of the vote each party got.

That is, if parties got votes, which they EMPHATICALLY do not, as I have tried to patiently explain to a very smart British scholar who seems to miss the point of American politics.
But let's pretend for a minute that they do. This is what percentage of the votes candidates for both parties have gotten since 1988:
This shows much the same trend as the last chart: in 1994, the Republican Party got a slight advantage in the vote, probably due to a demographic shift with union voters and southern voters, which then reversed in 2006 and 2008, which could have been either a demographic shift, or an event-driven shift.
What is even more significant is how tenuous both parties' hold on the House is, percentage-wise. Most significantly, the 1994 "Republican Revolution" was not caused by an upswing in the Republicans' percentage, but by a downswing in the Democratic percentage. In 1994, the Republicans got 47.8% of the vote, hardly a "revolution". In fact, in this graph, there is only one year in which the Republican Party got more than 50% of the House vote, and that was 2004. Not that the Democratic Party has done much better, although the last two elections have had pretty good numbers.

There are a number of things that can happen this November, but based on this graph, I am going to guess that neither party will have much of an advantage in percentage, or in seats. A lot depends on whether the last two elections were about changing demographics, or about swing voters reacting to events.

In any case, my own guess, based on both of these graphs, is that the election will go something like 50.1% to 49.9% (although, of course, numbers won't add to 100), with either party capturing the high number. Also, the party with the higher percentage might not win the House. Whichever party does win the House will win by a small margin of seats, probably around 5 seats.
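To see how the party with the higher national percentage can still fail to win the most seats, here is a minimal sketch in Python. The five districts and their vote counts are invented purely for illustration (this is not real election data), and the function name `tally` is my own.

```python
# Hypothetical two-party results in five made-up districts:
# (votes for party A, votes for party B). NOT real data.
districts = [
    (90_000, 10_000),   # A runs up a huge margin in one safe seat
    (45_000, 55_000),
    (46_000, 54_000),
    (47_000, 53_000),
    (48_000, 52_000),
]


def tally(results):
    """Return (A's share of the total vote in percent, seats for A, seats for B)."""
    votes_a = sum(a for a, _ in results)
    votes_b = sum(b for _, b in results)
    seats_a = sum(1 for a, b in results if a > b)
    seats_b = sum(1 for a, b in results if b > a)
    share_a = 100 * votes_a / (votes_a + votes_b)
    return share_a, seats_a, seats_b


share_a, seats_a, seats_b = tally(districts)
print(f"A: {share_a:.1f}% of the vote, seats A={seats_a}, seats B={seats_b}")
```

With these made-up numbers, party A piles up its votes in one district and loses the other four narrowly, so it wins the "national vote" comfortably and still ends up with one seat out of five.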

And, the national media being what it is, the week after the election, whoever wins will be spun as a new NARRATIVE. If the Republicans manage to get 100,000 extra voters and 5 extra seats, it will be a NARRATIVE that the mainstream of the United States has refudiated BIG GOVERNMENT. If the Democratic party manages to swing those 100,000 votes and hold on to a 5 point advantage, the NARRATIVE will be that the mainstream of the United States has refudiated TEA PARTY EXTREMISTS. In either case, no such thing will have happened, and a new narrative will be found.

Saturday, August 21, 2010

Back with a brief update

First, I am going to get around to deleting spam comments.

Second, I will be posting more data. At some point. I have had other things on my mind.
But! Here is something to tide you over.
As we look towards the 2010 house elections, there are two major trends: first, the demographic trend starting in the late 1980s, where the union voters in the midwest and the traditionally Democratic voters in the south switched to the Republican party. Second, there is the event driven growth of the Democratic Party in 2006 and 2008. The last two elections could also be demographic, as urban and educated voters switch to the Democratic Party.

So which is the "true" trendline? Ask me in a little under three months!

Saturday, June 26, 2010

I haven't forgotten.

I haven't forgotten this blog.

There will be more.

Maybe even soon.

Tuesday, May 4, 2010

And since I have already left the path: a bar graph

Once I have freed myself from the tyranny of confining myself to scatter plots, interesting questions have raised themselves. This one is demonstrated by a bar graph.
One of my conclusions about the "Core Electoral Votes" was that they showed that elections were indeed becoming more polarized. After thinking of ways to express this, I decided to make bar graphs showing the number of electoral votes that Obama scored a certain percentage of the popular vote in (This sentence does make sense). I then tried to find another electoral winner who had a similar-sized victory, and tried to break their electoral votes down similarly. Although it is far from a good comparison, the best comparison to Obama's '08 victory was probably the first Bush, in '88. (2 candidates, similar numbers in popular and electoral votes).
This diagram confirms some conventional wisdom: elections are indeed becoming more polarized, at least as far as different regions of the country having different values. Bush in 1988 got more electoral votes than Obama did in 2008, but a great many of them were in a narrow band between 50% and 55%. On either side of that, he fell off, getting below 45% in only a handful of states, and also getting over 60% in only a few states. The bar graphs form a triangle, of sorts. Obama, on the other hand, scored all over. His largest category for electoral votes was actually over 60%, and there are three separate peaks with two separate valleys. Even given his good popular vote and large electoral vote margins, there were areas that didn't move at all. The middle ground, between 45% and 55%, is much smaller than sense says it should be.
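For anyone who wants to build this kind of bar graph themselves, here is a rough sketch of the bookkeeping in Python. It is not the spreadsheet I actually used. The states and percentages below are made up, and the function name `electoral_votes_by_band` and the 5-point band width are my own choices for illustration.

```python
from collections import defaultdict


def electoral_votes_by_band(states, width=5):
    """Sum electoral votes, grouped by the candidate's vote-share band.

    `states` is a list of (name, electoral_votes, share_percent) for the
    states the candidate carried. A band covers [lower, lower + width).
    """
    bands = defaultdict(int)
    for _name, ev, share in states:
        lower = int(share // width) * width
        bands[lower] += ev
    return dict(sorted(bands.items()))


# Made-up example states, NOT real results.
example = [
    ("State A", 10, 51.2),
    ("State B", 20, 53.9),
    ("State C", 5, 57.0),
    ("State D", 12, 61.5),
    ("State E", 3, 48.7),   # carried with a plurality
]

width = 5
bands = electoral_votes_by_band(example, width)
for lower, ev in bands.items():
    # A crude text bar graph: one '#' per electoral vote.
    print(f"{lower}-{lower + width}%: {'#' * ev} ({ev})")
```

A "triangle" like Bush's in 1988 would show one tall bar in the 50-55% band with short bars on either side; a polarized result would show several separate peaks.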

I think this does speak of polarization: in 1988, Massachusetts and Oklahoma might have different results, but the results would move together. Now, that is not the case.

Monday, May 3, 2010

Since I seem to have abandoned the "daily" thing, let's abandon the scatter plots as well.

So, I slowed down updating this, and haven't updated this in two weeks.

This is mostly because I feel I have exhausted a lot of what I wanted to talk about. Actually, I should at some point summarize what I have learned, but I don't have a lot more to say about most of what I was covering.

But, I did, on a lark, think of a question that can be used with another type of graph. And so, I present a non-scatterplot to you.

In the US election system, the president is chosen by electoral votes. The electoral system can greatly magnify a candidate's success or failure. Also, electoral votes can be won by plurality, meaning a candidate can win the presidency in a landslide without actually winning any of the states by a majority. Clinton did this in 1992, only winning 9 electoral votes (Arkansas and DC) by majority, but getting 370 Electoral Votes.

So I decided to look at the history of how many Electoral Votes were won by over 60% of the vote. These show states that were won with what could be seen as a strong consensus. So from 1960 until 2008, here is the fate of both parties' ability to truly capture states:
Much as with my scatterplots, the first lesson to be learned from this graph is unpredictability. The biggest lesson seems to be that holding on to Core Electoral Votes is very difficult. Even after the biggest landslides ('64, '72 and '84), the number of core electoral votes drops. This also lends some credence to the belief that elections are more about candidates and circumstances than they are about the deep-seated philosophical leanings of the electorate. Was Reagan's victory in 1984 a sign of a deep-seated conservative bent in the US? According to this diagram, it would seem not, because just 4 years later, the number of states showing a really strong commitment to support the Republican candidate shrank down to just a few strongholds. Of course, the same could be said of the movement from 1964 to 1968.
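The "Core Electoral Votes" count described above is simple enough to sketch in a few lines of Python. This is only an illustration of the definition (electoral votes from states won with over 60% of the vote), with invented states and percentages; the function name and the tuple layout are assumptions of mine, not the way the chart was actually built.

```python
def core_electoral_votes(results, threshold=60.0):
    """Sum electoral votes per party for states won with MORE than `threshold` percent.

    `results` is a list of (state, electoral_votes, winning_party, winner_share).
    """
    core = {}
    for _state, ev, party, share in results:
        if share > threshold:
            core[party] = core.get(party, 0) + ev
    return core


# Made-up results for one hypothetical election, NOT real data.
results = [
    ("State A", 12, "D", 61.8),
    ("State B", 7, "R", 65.6),
    ("State C", 20, "D", 51.5),   # a majority, but not a "core" win
    ("State D", 11, "R", 49.4),   # a plurality win
    ("State E", 4, "R", 60.0),    # exactly 60% does not count as "over 60%"
]

core = core_electoral_votes(results)
majority = core_electoral_votes(results, threshold=50.0)
print("Core electoral votes:", core)
print("Won by majority:", majority)
```

Lowering the threshold to 50 gives the "won by majority" count, which is the number that was so small for Clinton in 1992.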

Another trend from popular political discourse that actually shows up here is that the talk of "Red State/Blue State" and "polarization" seems to have some evidence behind it. In previous elections, one party might get a lot of Core Electoral Votes, or both parties might get none or close to none... but only since 2000 have both parties managed to have strongholds. So there is some truth to it: the current political situation is one where, regardless of candidate, Massachusetts will probably go over 60% for the Democrat, and Oklahoma will go over 60% for the Republican.

Of course, since the unexpected is expected, I bet 2012 will have some interesting changes to make to this chart.

Saturday, April 17, 2010

Education and the election by region: The Northeast

And we come to what would be our final region, the Northeast, or what would be our final region except that it seems I didn't actually include the West when I posted earlier.
But the Northeast (Maine, New Hampshire, Massachusetts, Vermont, Connecticut, Rhode Island, New York, New Jersey, Pennsylvania, Delaware and Maryland), even though it is declining in importance, is still one of the most important regions in the country, both electorally and otherwise. It has 114 electoral votes, a large manufacturing base, and some of the country's largest and most ethnically diverse cities. It is home to much of the country's educated elite, and is also very strongly and consistently Democratic. Other than a curveball plurality decision in New Hampshire, every state in this region has voted Democratic since 1992.
This area does have some conservative counties, but they are mostly in the rural areas of Appalachia. There is a smattering of educated counties that McCain carried, and most of them are in suburban New Jersey. Other than that, all of the counties over the 30% mark voted for Obama. However, there is a pretty good chance that a lot of this is due to race: as we can see from the lower right of this diagram, this area's ethnically diverse population was probably just as important to Obama's success as its college-educated population. The fact that there is a conservative middle class in New Jersey is a small chink in the armor of Democratic dominance in this region.

Thursday, April 15, 2010

Education and the election by region: The Midwest

The next region presented is the rolling heartland of America, the Midwest, which (for my purposes) consists of Minnesota, Wisconsin, Iowa, Missouri, Illinois, Indiana, Michigan and Ohio. This region is very diverse, having both some very large metropolises, and many rural areas. It has a manufacturing base and an agricultural base. This area also gave 96 of its 107 electoral votes to Obama, who lost only Missouri, by a few thousand votes.
As with many areas, we have a large bulk of counties in the lower left, mostly small, rural counties. There are also some counties extending out to the lower right, but not extending too far. In this region, there were many rural, white, low-education counties that Obama won, although he won them by smaller margins. He won by 60-40 in rural Wisconsin and lost by 40-60 in rural Missouri. Other than that, the pattern seems to be a tie between the 20 and 30 marks, and then for education to become a pretty strong factor in favor of Obama above the 30% mark, although there are a few wealthy suburban counties that voted for McCain.

Wednesday, April 14, 2010

Education and the election by region: Coastal South

Our next region is the Coastal South, which comprises Virginia, North Carolina, South Carolina, Georgia, and Florida. Once again, this is a somewhat artificial region, since I believe South Carolina and Georgia might have more in common with Alabama and Mississippi than they do with the metropolitan states of the coast. However, that is a running caveat, and since this entire thing has gone on for quite a while, let's look at what we have:
This area does have some high-education counties, and actually quite a few of them, but they seem to be mostly located in Virginia (NoVA, to be precise), with a few in North Carolina. The high-education suburbs that voted strongly for McCain are mostly in Georgia. Although Florida is included here, none of its counties are strong outliers, which may be significant. There are also not quite as many heavily minority, low-education counties as there were in the Interior South.

Monday, April 12, 2010

Education and the election by region: Interior South

Due to a string of computer problems, I haven't been able to update people fully on the wonders of demographics and politics. But now I am able to do so, and I am going to move to the next region, the Interior South. I originally just did "The South", but that was too large of a region, so I decided to break it down into the Interior South and the Coastal South. The Interior South consists of Arkansas, Louisiana, Mississippi, Alabama, Tennessee, Kentucky and West Virginia. As always, this is a somewhat artificial group of states, but I think it makes sense in some ways. That being said, let's see what we have:
There are probably three major stereotypes that apply here: that compared to other regions, the Interior South is less educated, more conservative, and the vote is split more along racial lines than in any other region. And this plot shows that that is indeed the case.
There are only 11 counties with over 30% college graduates, and those counties voted for McCain over Obama 7-5. The most educated county in this region voted for McCain, and that seems to also be unique. Also, while the less-educated counties in the Great Plains and western regions hovered around the 15% mark, here they seem to be clumped around the 10% mark.
Even the counties that did vote for Obama that have higher education rates seem to be the most African-American, and so the education here is mostly an artifact.
So:
In the interior south, race and not education is probably the defining mark of the electorate.

Saturday, April 3, 2010

Education and the election by region: Texas

After having done the Great Plains, I reach my 4th region: Texas.
Yes, Texas gets to be a region all by itself. Popular belief (especially amongst Texans) would have it that Texas is unique amongst the states, and it is not an unfounded belief. Because Texas has some very urban areas, some very rural areas, some very Hispanic areas, and has aspects of being a Southern, a Great Plains, and a Western state, as well as the fact that it has lots and lots of counties, I decided to put it in a category of its own.
Texas has some of the patterns we know and (love), but has some patterns of its own. We have a lot of counties in the lower right that are strongly minority (most of which are Hispanic, but I think some African-American counties along the TX/LA border might be there too). We have a lot of rural, white, conservative counties in the lower left. We have some conservative suburban counties in the upper left. And we have our single "college town" county, Travis (home of Austin and the University of Texas) in the upper right.
One thing that is interesting is that while the conservative suburban counties in the upper left are slightly less conservative than the conservative rural counties in the lower left, they are still, compared to many other areas of the country, pretty conservative. In California, Orange County slipped through with a 2 point margin for McCain. Here, we have many counties over the 30 mark with big margins.
Second thing to notice: notice how close Dallas (Dallas), Bexar (San Antonio) and Harris (Houston) are to each other. These aren't the best educated counties in Texas, but they are three of the biggest. So it looks like the big urban areas are moving to the Democratic column, but I suspect that has to do with minority population: their better educated suburbs are still more conservative than them. Tarrant and Hays, which are both suburban counties, with good college rates, seem to be following the liberal pattern, though.
Also, notice in the bottom, there is a big hole in the middle. Low education counties in Texas are strongly conservative (if they are White) or strongly liberal (if they are Hispanic), but none of them seem to be contested.
I think the bottom line in Texas is that while the Black and Hispanic vote make it somewhat competitive, a state where the well-educated suburbs are still strongly conservative is fated to be conservative for a while longer.

Thursday, April 1, 2010

Education and the elections by region: The Great Plains

Since I have already done the mountain/Pacific Northwest states previously, I am skipping to the next region: the Great Plains. The Great Plains, which consists of North Dakota, South Dakota, Nebraska, Kansas and Oklahoma, is the smallest region in terms of population that I will be looking at. But I feel that it is indeed a unique region, because unlike the Western states, which tend to have geographically larger counties, the counties in this area tend to be smaller, and have no barriers between them. Thus, there are far fewer places where counties have distinctive demographics. This is also the case in the Midwest, but unlike the Midwest, the Great Plains has fewer enclaves that are either ethnically diverse or home to manufacturing. Thus, it makes sense to treat the Great Plains as a region.
The Great Plains region is also, electorally speaking, not very diverse or interesting.
As with other regions, the lower right corner is composed of ethnic communities, which in this region are all Native American. Otherwise, there is a big clump of counties between -80 and 0, and between 10 and 20, and then a smattering of counties running upwards, and ever so slightly to the right, above the 20 mark. On the face of it, there is very little tie between college education and voting patterns. However, this is another place where the population difference between counties makes a big difference. Douglas County, Nebraska, has a quarter of the population of Nebraska, making it just as important for Nebraska as Los Angeles County is for California. Douglas County, Nebraska also contains an entire congressional district, and under Nebraska's almost-unique system of giving electoral votes, that district was Obama's sole electoral vote in this region.
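Since Nebraska's system is unusual, here is a small Python sketch of how it works: each congressional district's winner gets one electoral vote, and the statewide winner gets two more (Maine does the same). The district names and vote totals below are invented for illustration, the function name is my own, and ties are not handled.

```python
from collections import Counter


def allocate_by_district(district_votes, at_large=2):
    """Split a state's electoral votes by congressional district.

    `district_votes` maps district name -> {candidate: votes}. Each district's
    winner gets one electoral vote; the statewide winner gets `at_large` more.
    Ties are not handled in this sketch.
    """
    electoral = Counter()
    statewide = Counter()
    for votes in district_votes.values():
        statewide.update(votes)                      # add to statewide totals
        electoral[max(votes, key=votes.get)] += 1    # one vote to the district winner
    electoral[statewide.most_common(1)[0][0]] += at_large
    return dict(electoral)


# Made-up vote totals for a three-district state, NOT real results.
example_state = {
    "District 1": {"R": 150_000, "D": 120_000},
    "District 2": {"R": 135_000, "D": 140_000},   # one urban district flips
    "District 3": {"R": 180_000, "D": 75_000},
}

allocation = allocate_by_district(example_state)
print(allocation)
```

In this made-up example the Republican carries the state easily and takes four of the five electoral votes, while the Democrat picks up a single vote by narrowly winning the urban district.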

From the information here, I can't tell whether there is a trend between education and voting patterns in the Great Plains, or how strong it is.

Wednesday, March 31, 2010

Education and the elections by region: the Hispanic Southwest.

Now I set about the continuation of my task to break the relationship between education and elections down into regions.
One of the first things to be mentioned about this is that separating the United States into regions is never a cut and dried task. Parts of California obviously have much more in common with parts of Oregon than they do with the Arizona/Mexico border, and yet I will be grouping all of California and Arizona together. I think that most of the groupings I made make sense, but I might have to look at the results closely before I decide. I am grouping states together based not just on demographics, but also on the number of counties that state has, which can skew results wildly. Texas and California have similar demographics, but it doesn't show up in a scatterplot because the 200 small, rural, white counties that Texas has make it look like they are quite different.

So I put together California, Nevada, Arizona, and New Mexico, and this is what I got:

The first thing to notice is that in contrast to the national map, this area has only two counties over the 30% mark that voted for McCain, and both of those were only barely over the 30% mark, and only narrowly voted for McCain. This might be significant, or it might be caused by the area's larger counties. In another state, where Orange County would be two or three counties, one of them might be much more wealthy, educated and Republican. But because of California's large counties, Orange County combines many different demographics.
At least, that is one theory.
Other than that, we have kind of the expected points: highly educated urban areas in the upper right, minority communities (mostly Hispanic) in the lower right, and rural, white counties in the lower left.
As I have pointed out before, counties make a bad unit sometimes. I specifically marked San Diego and Los Angeles Counties, because while they are not outliers, it is important to remember that most of the population in this plot lives in only a few of the data points.

Monday, March 29, 2010

Incidentally: the ACS versus census data

After doing my gigantic plot, I found out that there was actually newer census data than the 2000 census data I was using.

That data comes from the American Community Survey (ACS), which is done pretty much continuously, and generates new estimates of key demographics also pretty much continuously. Some of these 2008 numbers are different than the 2000 numbers!

However, having made this mistake, I will justify it on several grounds: first, the census data, while older, is more complete. The census really does try to count every single person in the country, while the ACS just estimates it. Also, the ACS doesn't seem to try to take a picture of every single county, which is one of the purposes of my plots.

Although, I am certainly waiting for the 2010 census data to come out.

The same, but different: How McCain did

So we looked at the total situation, we looked at the Obama states, and now it is time to look at the McCain states.


Once again, we have an almost random-looking assortment of dots, that when properly examined, show the breadth and depth of American politics and demographics. And, as I've said on the last two posts, a few things can be deceptive about this diagram. For one thing, the big dense ball below the 20% mark on the left: mostly rural counties across the south and east. Again, some of this may be clearer when I break down by regions.

Other than that:

In states that McCain won, McCain did pretty well in most counties, as well. Which is a fairly intuitive result. However, above the 30% mark, the counties started to even out, and above 40%, Obama won more counties than McCain.

What is especially interesting to me are the counties above the 35% mark. On the left, they seem to mostly occur in the south: Alabama, Georgia, Tennessee, and Texas. The two Kansas cases are somewhat nominal, since they occur pretty close to the line. On the right side, we have a few Southern middle class African-American counties (DeKalb, Fulton in Georgia), a few college counties (Clay in South Dakota, Boone in Missouri, Gallatin in Montana) and a few resort communities (Blaine in Idaho, also Gallatin in Montana). Oh, and also Travis County, Texas, of course.

So, what this tells us is that while the demographics of the middle-class and upper-class south are still probably pretty Republican, those same groups of the mountain west and great plains may be becoming more liberal. Is the rest of the northwest following Oregon and Washington solidly into the Democratic camp? Will the south stay the south?

And to those questions, I don't have an answer.

Saturday, March 27, 2010

Breaking down the megapost: Obama's states, and college graduation rates.

I said I would break down the plot of counties in 49 states by region, but before I did that, a simpler way to break them down is simply by states that Obama won versus states that McCain won. This also didn't involve much extra work, just some cutting and pasting, and "voila!"


This shows the margin vs. college numbers in the states that Obama won, that represent a great deal of America's population, but less than half of its counties. (Quite a few of America's counties are in the south and east).
Much of this graph looks like the larger graph, especially the right side: many of the counties that Obama won biggest in are also in states that Obama won. The top of the graph looks much the same, as well. Of America's counties with over 50% college graduation rates, all of them are in states that Obama won. At the 40 to 50 percent mark, the situation is not so clear cut, although Obama still leads there. From 30 to 40, the situation becomes less muddy, but it is only at 20 to 30 that the situation becomes more muddy.

Since (as I have said many times) Obama's success was based on him managing to win both an urban/minority vote and the "educated urban/suburban" vote, the place on the left side, above 40, shows his weakest spot. Highly educated counties that voted for McCain are located in Indiana, Ohio, Colorado, New Jersey and Virginia (and possibly Wisconsin). If suburban, educated (and white) voters do defect from Obama in 2012, those are all likely places that it could happen. Although I believe some of those places are more likely than others.

The bottom left quadrant seems to be mostly rural counties in urban states: while Obama might have some trouble in suburban Florida, he has a bigger problem in rural Florida. However, these rural areas in urban states make up a relatively small share of the electorate.

The bottom right quadrant is also interesting: although some of the low education, high Obama counties are still there, (and are mostly black, Hispanic or Native American areas), many of them aren't. Much of Obama's low-education support comes from rural counties across the deep South, or Hispanic or Native American areas inside otherwise conservative states. So those areas don't show up on this diagram.

Wednesday, March 24, 2010

I forgot how easy Oregon was: time for a break

After all that time where I was punching in data from all those southeastern states with lots of repeating counties...
I forgot how easy it is to put in data for Oregon's 36 counties.
So, as a little sidetrack from what I was doing:

Demographic data, as I have said sooooo many times, usually does not paint a clear picture. One of the clearest and most unfortunate pictures I have seen painted is that Hispanics are becoming America's new underclass. (I believe that America does have a class system, although I know many people disagree). Unfortunately, armed only with census data and open office, I can only diagnose the problem. Fixing it would take effort. And acknowledgment.
Ah, and after giving my speech, I will make a technical comment: high school rates also seem to have a lot to do with urbanization, (inversely), but here we see that urbanization doesn't seem to have a large impact: Multnomah and Clackamas counties are two of the largest counties, but they are far ahead of the rural, Hispanic counties in the upper left corner.

Tuesday, March 23, 2010

The 2008 election and college graduation rate: every county in the country.

I have done many plots in the past, showing the connection between Obama's margin in the 2008 election, and college graduation rates. After having done a number of these, I thought it would be worthwhile to go the whole distance, and get the data for every one of the United States' 3300+ counties, and see what it revealed.

As I wrote in the summer of 2008, the window of attack available on Obama during the campaign was narrow, and McCain's campaign chose not to portray Obama as a populist demagogue appealing to uneducated minorities. Instead, they did the more acceptable thing of painting Obama as a "celebrity" favorite of the Latte-sipping crowd. And even within the liberal camp, this was kind of accepted as truth: Obama was the favorite of the young, the college educated, and the urban, as well as of the poor and minorities.

But what does the data say? By looking at county-level data, we have a tremendous number of data points to look at. Having this many data points deals with many (but not all) of the limitations of working with only two variables. One of the biggest problems with looking at college graduation numbers is that high numbers often occur in urban areas that also have large numbers of poor, minority voters. Did New York County (also known as Manhattan) have a wide margin for Obama because it is one of the best-educated counties in the country, or because it is one of the most ethnically diverse? With enough data points, the answer emerges. (Although, of course, the answer is the somewhat predictable, and somewhat disappointing, "both"). Of course, looking at county-level data has its own problems: my data includes Loving County, Texas, population 56, and Los Angeles County, California, population 10,000,000, and both are given equal weight. This is especially a problem because certain areas of the country (mostly in the south and east) have lots and lots of little counties, whereas many of the Western states have far fewer counties. For that reason, I will be breaking these figures down into regions in upcoming posts.
Also, of course, the fact that a county becomes more politically liberal as its college-educated population goes up doesn't "prove" that it is the college-educated people who are becoming more liberal. However, with 3300+ data points, it certainly is a hard case to argue against.
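To make the equal-weight problem concrete, here is a minimal Python sketch comparing a plain average of county margins with a population-weighted one. The two populations echo the extremes mentioned above, but the county labels and margins are invented, and the function name `mean_margin` is my own; my plots themselves do not weight anything.

```python
def mean_margin(counties, weighted=False):
    """Average Obama margin (in percentage points) across counties.

    `counties` is a list of (name, population, margin). With weighted=True,
    each county counts in proportion to its population; otherwise every
    county counts the same, as each dot does on a scatterplot.
    """
    if weighted:
        total_pop = sum(pop for _name, pop, _margin in counties)
        return sum(pop * margin for _name, pop, margin in counties) / total_pop
    return sum(margin for _name, _pop, margin in counties) / len(counties)


# Two hypothetical counties; the margins are made up for illustration.
counties = [
    ("Tiny rural county", 56, -70.0),
    ("Huge urban county", 10_000_000, 40.0),
]

unweighted = mean_margin(counties)
weighted = mean_margin(counties, weighted=True)
print(f"Unweighted mean margin: {unweighted:+.1f}")
print(f"Population-weighted mean margin: {weighted:+.1f}")
```

Treated as two equal dots, these counties average out to a McCain lead; weighted by population, the tiny county all but disappears. That gap is the reason to be careful when reading a sea of small-county dots.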

With those caveats in mind:


The first thing this shows is what every student of American politics should know: there are few fast and easy rules of American politics. There are counties that Obama won that are above the national mean for college graduation (25%), counties below it, and the same is true for McCain.
However, Obama did seem to do better than McCain amongst the most elite-educated counties. Of the 11 counties where more than 50% of the people are college graduates, McCain won only one: Douglas County, Colorado, and that was a relatively narrow victory. Above the 40% line, and even the 30% line, the situation is a bit more murky, although Obama still seems to be ahead. This is especially interesting because college graduation rates often coincide with affluence, meaning that these counties should not be all that difficult for Republicans to win, based on economic self-interest. Also, the high-education counties that McCain did win by large margins tend to cluster in one region of the country: the south, with some in the midwest. (Which will be explained further when I break this down by region).

Of the low-education counties that Obama won, the most extreme examples, located in the lower right, tend to be heavily minority. They include Hispanic areas (Starr and Zavala, in the Texas border region), Native American counties (Shannon and Buffalo, in South Dakota), and African-American counties (The Bronx, Baltimore City). There actually are some low-education white areas that Obama won as well: Elliott County, Kentucky and Anaconda/Butte, Montana, for example. Also, large chunks of the upper midwest are not highly college-educated, are white and rural, but tended to go for Obama.

This diagram also reveals some of the political constraints that are put on Obama's policy decisions. Obama won through a coalition of some of the most educated, affluent people, and some of the least educated and least affluent people. Even if we are cynical enough to discard the idea of being President for all the people, Obama (and the Democratic Party), have to somehow manage to keep a coalition that includes Pitkin County, Colorado (one of the country's wealthiest counties) and Baltimore City happy. There are often large differences in economic interests, worldview, and values between the high and low-educated counties of the US.

The Republican party also has the same problem: it certainly wants to win back those affluent suburban counties, but it has to at the same time keep its base (that gigantic sea of data points below the 20% mark), which is largely white, rural and low-education, happy.

And this is one way that 2012 will be fought.

Monday, March 22, 2010

The master plan is done, but it will take some time to tease it all out:

I have actually been making a scatterplot every day, I just haven't been posting them here.

But finally I put them all together:
I will be explaining what this is about, and then breaking it down, in upcoming entries.

Wednesday, March 10, 2010

Education in the Midwest:

It took me a while to put this together, but I put together a plot of high school and college rates in the Midwest (Minnesota, Wisconsin, Iowa, Illinois, Indiana, Michigan & Ohio).

I figured that these states make a good cohort, since they have relatively small geographical counties, usually without significant geographical boundaries between them, and they have a wide mix of agriculture and industry. Of course, as with any grouping, they aren't a perfect fit, but I think they work well enough.

(The picture should probably be clicked on, it's a big picture).
After doing all this work, there isn't a lot of new information in this picture. Unlike in the Western states, there are quite a few counties with less than 10% college graduates. I think this might be an artifact of the small geographic size of midwestern counties: Western counties, because of their large geographic size, pretty much have to have certain facilities (such as hospitals) in every county. Midwestern counties, since there are no geographic barriers, don't. That is one possible explanation, at least.
Other than that, there are not a lot of surprises: big metropolitan areas tend to hang to the left, and the top right has four elite counties, three of which are college counties, and the other of which is a wealthy suburb of Indianapolis.

Monday, March 1, 2010

Race and education in Alabama

One thing I have wondered about is the generally low levels of education, both high school and college, in Southern states. I also know that many Southern states have high African-American populations, and this is a group that often is (statistically speaking) weak in education.

I bet the reader doesn't have to think very far for the various uncomfortable conclusions that could possibly be derived from looking at this.

But actually, when you look at things closely, it is better. Especially in this case:

Alabama is a good test bed for this, because it has a large number of counties, and they vary from 1 to 70% African-American. And across these counties, as you can see, the education tends to be fairly uniform: and for that matter, uniformly bad. Jus' sayin'. There are counties in Alabama between 60 and 65% high school graduation rate that are almost totally white, as well as counties with those numbers that are majority black. The one thing that is absent is any counties in the upper right: there are no majority African American counties with high high school graduation rates. Although Montgomery and Jefferson do come close.

What is interesting about this is there could be an assumption that is actually backwards. The reason why African-Americans, in the US, have lower education rates might actually be an artifact of the fact that many of them live in the rural south, where EVERYONE has low education rates. At least in part.

Sunday, February 28, 2010

Not a crescent: education patterns in the North East

This blog is becoming increasingly about two numbers: the high school graduation rate and college graduation rates of counties in the US. This could just be my peculiar obsession, but these two numbers together do tell a lot about a county.
It is harder to add them up across states, since "counties" have many different meanings in different states. In the eastern part of the US, counties are much smaller in area, and often much smaller in population.

After I did the Western states, I went through and entered the information for the New England/Middle Atlantic states. Which, for my purposes, are: Maine, Massachusetts, New Hampshire, Vermont, New York, Connecticut, New Jersey, Pennsylvania, Delaware, and Maryland. These states all have relatively small numbers of counties, and they also share common demographics, which makes them a good set to compare amongst.
The first thing of interest is that this is not a crescent: the Northeast doesn't seem to have many areas that have high high school rates and low college rates. It seems to have more of a traditional X=Y relationship. I am also wondering whether I chose the best grouping of states: it seems that what does occur in the lower right might be based heavily around rural Pennsylvania.
Another obvious thing is that there is one major grouping of counties, and then a bunch of points to the left. The points to the left make up some really significant outliers. They consist of Suffolk County (Boston), three of New York City's boroughs, Philadelphia County and Baltimore City. As could be expected, big metropolitan areas pull in both college educated people, and non-high school educated people.
Also, notice the trio at the top right: two counties in Maryland, just outside of Washington, DC, and Tompkins County, home of Cornell University.

Thursday, February 25, 2010

Hispanic education in California's counties:

As I mentioned in the New Mexico post, sometimes while scanning through data, I notice certain patterns, and then later on I get curious enough to do the data entry and see if they are true.
One pattern I have noticed is that counties with high Hispanic populations also have low high school graduation rates. So I decided to actually do the data and see if it proved true:
The data lines up surprisingly well, which data very rarely does. There are different ways to analyze this data, and some of them are pretty politically and socially charged. Another is that often this is an artifact: heavily Hispanic counties tend to be big, economically active counties that attract workers of every stripe.
And while you are thinking about that, we can also think about this:
Here, there is not a lot of real correlation. The biggest story here is that the cluster of counties that were at the bottom right are now at the bottom left. Much of California is like the rest of the Western states, where you have a lot of counties with high high school rates and low college rates. So again, as with so much, we have three quadrants filled up in the college diagram.

Anyway, while this is marinating, I am still working on my MASTER PLAN.

Wednesday, February 24, 2010

Making these things is hard: a special treat for my viewers. Both of you.

One of the problems with scatterplots is that for the most part, you can only plot two variables.

But, to quote RZA: "The fourth dimension is time, it comes alive, when the chakras energize up the back of your spine"

Or, in this case, the third dimension comes alive with a BADLY MADE ANIMATED GIF.
I am so proud of my animated gif, or rather proud of the idea and ashamed of the execution, that I don't know tooooo much to say about this, besides WTF California?

Tuesday, February 23, 2010

New Mexico, and the ALMOST joy of having a wildly counterintuitive result

While entering some other data, I had a chance to see that on paper, much as in real life, New Mexico is perhaps one of the most nuanced and diverse of states. Specifically, New Mexico has a high percentage of people who do not speak English at home. And in many places, a low percentage of people who are foreign born. I wanted to look at these numbers more closely, and I got:
It would be awesome if I could say there is a negative correlation between English speaking and native born people, but actually there is...no correlation. Don't let the blue line fool you, its very gradual slope is close enough to flat.
If I knew more about New Mexico, I could probably make more sense out of this.

Sunday, February 21, 2010

Sneak preview!

I have been working on something.
Here it is!
Okay, it is pretty obvious what this probably is, but more later!

Friday, February 19, 2010

Not a crescent

In the Grand Crescent post, I postulated that Denver, as an outlier, was where it was because Denver County (which is also Denver City) would be following the pattern of a large metropolis more than the pattern of a county in a western state.

I just made that up when I typed it, but it sounded good.

But of course, then I started wondering, so I wanted to plot the high school versus college rates of some of the US' biggest cities. I chose 30 as my number (mostly so Portland could be on there), and started digging for data.
Cities, as demographic units, are not very good. There tend to be lots of artifacts in the data, depending on how the city borders are drawn. Two metro areas might have similar demographics, but the largest city in each of them might exclude or include suburbs. For example, Detroit metro and Portland metro might be more similar than someone would guess, but whereas many of Portland's wealthy areas (The West Hills for example) are included in the city, in Detroit those areas are, I believe, separate suburbs. So this data has problems. All data has problems.
The first thing to notice about this data:
NOT A CRESCENT.
It has a more predictable X=Y shape, although one that is spread out irregularly.
There seems to be a little bit of evidence that parts of the Western States are both well educated, and egalitarian about it. Not a lot, though. Although the five cities in the upper right do have two things in common: they are smaller, and they are outside the most traditional urbanized areas of the United States.
Actually, the most obvious thing that jumps out at me is size, which I should probably do another plot for. NYC, LA, Chicago and Houston, the 4 biggest cities, are all clustered pretty close together. They all have low education levels, and within what they do have, are more "elitist" in the sense of having lots of college graduates for the amount of high school graduates.
I actually am probably going to run plots on these numbers for a number of different factors. Sometime soon!

Sunday, February 14, 2010

Drivers licenses and rurality.

An intuitive conclusion to draw is that people are more dependent on cars in rural areas. According to anecdotes, New York City is one of the few places where it is normal for an adult to not drive.
Ah, but what does the data say about this intuitive idea?
As usual, it says that this idea is not quite that absolute. There is a trend in that direction, but it is quite outdone by other things. New York and Connecticut are a good example: both are urbanized states, with much of Connecticut lying inside the NYC metro area. And yet their numbers of licensed drivers seem quite high.
(For whatever reason, Vermont and Alabama have more licensed drivers than they have driving age population, which could be an artifact of keeping their system updated, or could mean there are people who have fraudulent licenses.)

Saturday, February 13, 2010

Because Mouse is mad:

So I needed to update this, because I was informed of what "Daily" meant. I was confusing it with "Daly", as in "At the Daly Mansion, you can visit the world of yesterday today".
So I just clicked around on Statemaster until I could find some data to debunk one of those old stereotypes, about taxes and liberalism.

As people have pointed out, "Taxachusetts" is a stereotype. Massachusetts has a tax burden a penny less, per $10, than average. States' tax burdens don't seem to correlate with their national politics, at least from this data.
What is interesting about this data is that it doesn't show as much regionalism as could be expected. On most scatterplots, North Dakota and South Dakota show up pretty close to each other, for example. But not here! And although it isn't clear because I haven't labeled enough data points, Appalachia and the Prairie/Mountains are all mixed and matched up, instead of being separate and unequal like they usually are.

Thursday, February 11, 2010

It has been a while, but only because I have more exhaustive detail than ever before:

So when I did the political correlation for all of those Western states, I also did the high school/college numbers. And then, having those numbers, I also had the Colorado numbers, from previously. I added the Utah numbers in, and ended up with a bunch of data points. All the data points together showed something to me: that data entry is a lovely and fun hobby. They also showed me this:
There are 297 data points there. I didn't bother labeling many of them. I did label Denver, because it is a major metropolitan area, and also because its unusual place shows that it has a pattern different from most of what you would find in the Western States. This type of (relatively) high-college, low-high school pattern speaks of an urban area that attracts less skilled workers, and is more common in the urbanized east than in the Western states.
What is most interesting about this diagram for me is that there aren't a lot of outliers. And that it has a specific shape. For some reason, in my mind, not a lot of outliers would make more sense on an X=Y curve. In this case, we have this complicated crescent pattern, that seems to hold true across 297 counties in seven states.
One of the things I wondered is if this was actually several different graphs laid out on each other. Did the three parts of the crescent represent three different types of counties?
So what I did was sort these counties by "Rural Urban Continuum" code. This is a set of codes put out by the Economic Research Service of the USDA that sorts counties by how urban and rural they are. As with any demographic measure, they are not perfect, but they are a useful tool.
So here is the plot for counties in metropolitan areas of some sort, defined as RUC codes 1-3:
There is a lot of diversity in these counties, since some "metropolitan" counties can be fairly small in population. After all, the summit of Mt. Hood is a "metropolitan" area by this reckoning. In other words, Owyhee County and King County maybe shouldn't belong in the same plot.
But! Despite the fact that this diagram is more spread out, the shape remains. The four counties in the upper right are also not the most urbanized counties. By contrast, the three counties that are the most urbanized (Denver, King, Multnomah) all hang to the left, because like most urban counties, they have lots of college graduates, but also attract less educated workers as well.

So next we will look at counties with codes from 4 to 7: counties that are not metropolitan, but have some urban population. As with above "some urban population" can mean many different things.

Two things here: first, although once again the picture is somewhat blurred, it is also, again, vaguely crescent shaped. Secondly, there is a pretty big gap in this diagram. Most of the counties seem to be bunched up right below the 20% mark for college graduation, with a few over 20%. Then, between 30 and 40%: only two counties. Above 40%, there are a lot of counties showing up. From what I know about those counties over 40%, they seem to be mostly resort communities. Gallatin, Montana, for example, is the county adjacent to Yellowstone Park, and so has had a big influx of wealthy residents in the past few decades.

Finally, let's look at the truly rural counties, those considered to have no urban population whatsoever: codes 8 and 9.


And once again...crescent. In fact, if I do say so myself, this is the prettiest of all the crescents we have seen so far. I can't think of anything particularly interesting to say about this crescent, besides it's pretty, and how about those San Juans?
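
The three-way split used in this post (codes 1-3, 4-7, and 8-9) is simple enough to sketch in code. This is only a rough illustration in Python, not how the plots were actually made: the function names are invented for the example, and the sample counties and their codes are placeholders rather than real ERS values.

```python
# Sketch of the three-way split used in this post. The cut points
# (1-3, 4-7, 8-9) come from the post; the names and sample data
# below are illustrative assumptions only.

def ruc_group(code):
    """Map a Rural-Urban Continuum code (1-9) to one of three groups."""
    if not 1 <= code <= 9:
        raise ValueError(f"RUC codes run from 1 to 9, got {code}")
    if code <= 3:
        return "metropolitan"
    if code <= 7:
        return "nonmetro with some urban population"
    return "rural, no urban population"


def split_by_group(counties):
    """Group county names by RUC group, keeping input order."""
    groups = {}
    for name, code in counties.items():
        groups.setdefault(ruc_group(code), []).append(name)
    return groups


# Placeholder counties and codes, NOT real ERS data.
sample = {"County A": 1, "County B": 5, "County C": 9, "County D": 3}
groups = split_by_group(sample)
print(groups)
```

Each of the three lists could then be plotted separately, which is all the three diagrams above are doing.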

Sunday, January 31, 2010

Agriculture, agriculture, agriculture:

So, as with most of my digging into dry, obscure facts, the inspiration for this came from reading something that made me so annoyed that I was gritting my teeth. Said thing was an article written by a representative of a Montana trade group talking about how "backyard chickens" and "organic farms" couldn't really feed America.
Which, in part, I am sure he is right: all this hippy agriculture could indeed just be a fantasy, and a fantasy that seems to have some disconcerting implications, mostly involving dinosaur-riding.
But, as long as we are talking about agricultural fantasies, it is also fair to talk about large chunks of rural America playing cowboy. I knew that California was the country's largest agricultural state, and I guessed that much agriculture actually went on in big, mainly urbanized states, rather than in the mythical "heartland". But!
The thing to do is to actually look at data.
There seems to be a loose relationship between population and agricultural output, although a lot of that has to do with California and Texas.
But that chart is just a warm up, since of course population doesn't have a lot to do with agricultural output: most agricultural output goes on far away from cities.
So the next chart shows us the percentage of a state's population that is rural versus its per capita agricultural output.
Suddenly, the Dakotas become way more important: per capita, they are generating over 12,000 dollars of agricultural income. Other likely suspects, such as Nebraska and Iowa, are also performing quite well. However, up in the top left, notice a four-pointed triangle of states that otherwise don't have a lot in common: Wyoming, Montana, Vermont and Mississippi. All four are highly rural, but have a rather modest agricultural output per capita. California is quite lonely down in the corner.
But!
Of course it doesn't make much sense to look at states in terms of overall per-capita. All those stylists in Hollywood aren't adding much to California's agricultural output. So what if we just look at output per RURAL capita?


Even though everyone moves up with this, the effect is relatively different. California, with a very small rural population, manages to create 48,000 dollars of agricultural income per rural resident. Massachusetts has a similar effect, but that is largely an artifact. As would be Rhode Island and New Jersey, who have infinite output for every rural resident.
This also puts the Midwest/Great Plains in a slightly more modest perspective, and shows that our four-pointed triangle is quite disappointing. Considering how much of Montana is rural, rural Montana (or Wyoming) is not actually producing that much agriculture.
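
The difference between the last two charts is just a change of denominator, which can be sketched in a few lines of Python. The numbers below are made up for illustration (they are not any real state's figures), and the function name is an assumption of this example. Treating a state with zero rural residents as "infinite" mirrors the Rhode Island and New Jersey artifact mentioned above.

```python
import math


def output_per_person(output_dollars, people):
    """Dollars of agricultural output per person counted.

    Returns infinity when nobody is counted, which is how a state
    with zero rural residents ends up off the top of the chart.
    """
    if people < 0:
        raise ValueError("population cannot be negative")
    if people == 0:
        return math.inf
    return output_dollars / people


# Made-up example state, NOT real data.
output = 2_000_000_000   # agricultural output in dollars
population = 1_000_000   # total residents
rural_share = 0.25       # fraction of residents who are rural

per_capita = output_per_person(output, population)
per_rural_capita = output_per_person(output, population * rural_share)
all_urban = output_per_person(output, 0)

print(per_capita)        # 2000.0
print(per_rural_capita)  # 8000.0
print(all_urban)         # inf
```

The smaller the rural share, the more the second number is inflated, which is why heavily urban states shoot upward on this chart.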

So if organic farms are an illusion, agribusiness being an actual economic force in Montana is doubly so.

Of course, a lot of these charts are dealing with things that are very hard to operationalize. Many "rural" areas are not actually given to serious agriculture, and much agriculture goes on in urban areas (such as Fresno county, which is a metropolitan area, Class 2, and has by itself as much money from agriculture as all of Montana). And of course the amounts of arable land, and how much it can be used, vary greatly from state to state. (It is amazing that North Dakota could even have 1/10th of the agricultural output of California, since it snows in North Dakota from September to May). There are lots of different ways to operationalize this, but my first suspicion was correct: heavy duty agriculture goes on mostly in a number of states, some of which are quite populous and urban.

Saturday, January 30, 2010

I have way too much time on my hands: every county in five states, Obama, and education

There are two things wrong with this blog: first, I haven't updated every day, like I used to. Second, I seem to be getting more and more wrapped up in political minutiae, despite my occasional efforts to the contrary.

Well, sorry.

So I have followed even further down my obsessive path by plotting the college graduation rates of every single county in five states: Washington, Oregon, Idaho, Wyoming and Montana, against Obama's results there. I did this because I had already done three of those states, and noticed a recurring pattern. So I wondered what would happen if all 197 of those counties were plotted against each other.

And the results were:

What this diagram shows to me is that in these five states, the politics are not quite as different as could be imagined. Living in a county with lots of college educated people will probably make you more liberal in Idaho, just as it will in Washington. The only difference is that Washington state has a lot more of those counties. Also, notice the three quadrants: there is only one county, Ada, Idaho, with over 30% of college graduates that went Republican. There are, however, lots of Obama counties that are under 30%.
For those not familiar with the multisplendored thing that is the geography of the Northwest, those outliers in the lower right all have interesting stories:
Big Horn, Montana: is on the Crow Indian reservation
Glacier, Montana: is on the Blackfoot Indian reservation
Deer Lodge, Montana: has been working class leftist for 100 years. Full of the descendants of Irish miners. Along with the people in Silver Bow county, they invented the weekend and these counties have not voted Republican EVER.
Blaine, Montana: another reservation county
Lincoln, Clatsop and Columbia, Oregon: coastal counties. I don't know why the Oregon Coast, and the Washington Coast, tend so Democratic
Cowlitz, Pacific and Grays Harbor, Washington: Another trio of working class, rural, coastal counties. Grays Harbor is where Kurt Cobain came from.
Hood River, Oregon and Multnomah, Oregon: the power of HIPPIES. Multnomah is an interesting contrast with King: despite having less education, and fewer minorities, Obama did better there than he did in King County, Washington.

Outliers on the other side of the map:
Okay, I lied about one thing. Idaho probably is more conservative than Oregon, Washington or (most of) Montana, even with its generally lower college rates. Consider Madison county, which has close to a 25% college graduation rate, but which Obama lost by a margin of 70%. This area is, as could be guessed, heavily populated by the Mormon ethnicity. In fact, I suspect that most of the counties that skew to the left on this diagram are probably heavily ethnically Mormon.

So now we know!

Wednesday, January 27, 2010

Oregon Measures 66, 67, and the Obama Proxy

During a special election yesterday, the state of Oregon passed two ballot measures, one increasing taxes on people in the upper income bracket, and the other increasing corporate taxes. The measures passed by a narrow margin, and were only passed because of a large margin in Multnomah County, Oregon's largest and most liberal county.

The measures had very similar margins on the state level, 54.2 percent for Measure 66, increasing personal income taxes, and 53.5 percent for Measure 67, raising corporate taxes. Broken down by county, there was also, as could be expected, an almost perfect correlation.
In fact, because of the way I round, there might have been even more correlation than this chart suggests. In any case, that isn't very interesting, as the results could probably be expected.
What is slightly more interesting is when I run the numbers for 66 (which, as I said, are also pretty much the numbers for 67) against my best current proxy of Oregon's politics: Obama's 2008 margins.
There is still a high degree of correlation here, although the numbers are different: Obama did a lot better than either of the measures did. However, some areas overperformed Obama's numbers, whereas others underperformed. The counties on the right were underperformers, the counties on the left overperformers. Hood River, Washington, Clackamas and Deschutes counties are all underperformers, and the reason could be that they are counties with a lot of education, that might have been drawn to the image of "Professor Obama", but because they are also fairly wealthy, are less than enthusiastic about tax increases. On the other hand, Lincoln and Clatsop counties on the coast, as well as Umatilla, Wallowa and Harney in Eastern Oregon, are all more rural, traditional areas that might not like the image of liberalism, but might be more prone to economic populism.
At least, that is one way to look at the data. And of course, the skewing doesn't change the absolute numbers: rich, suburban Washington County still supported the measure, and poor, rural Umatilla County did not.

Thursday, January 21, 2010

The Oregon senate! Education! Massachusetts! This one has it all!

After coming to the (probably foregone) conclusion that the votes for two candidates for the same party will closely correlate with each other, even if the actual numbers are very different, I decided to test it out:
This shows the result of the 2008 Oregon senate race. Obama got a 17 point margin, Jeff Merkley got a 2 point margin and a plurality. Despite the differences, the numbers they scored lined up very well. This is some of the best correlated data I have ever seen. There are no significant outliers. There are not even any insignificant outliers.
(BTW, I will probably now be fishing around for a pair of elections that DOESN'T correlate like this)
Long, long ago, I posted a plot of Obama's margin versus college graduation rates in Oregon's counties. I did the same thing for Merkley's margins versus college graduation rates, and got this:

This chart looks very much like the Obama one, and shows that over all, education is pretty strongly correlated with Merkley's margin. (Which only makes sense, since Obama and Merkley's margins correlate perfectly, and Obama and education correlate, therefore Merkley and education correlate. There is probably a formula for it)
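
For what it's worth, there is a formula for it, or at least a bound. Given the correlation between A and B and the correlation between B and C, the correlation between A and C has to fall within a range fixed by those two numbers (this follows from the three-variable correlation matrix having to be positive semidefinite). Here is a minimal Python sketch; the function name is made up for this example, and the inputs are illustrative values, not the actual Oregon correlations.

```python
import math


def corr_bounds(r_ab, r_bc):
    """Range of possible corr(A, C) given corr(A, B) and corr(B, C).

    Uses the standard bound
        r_ab * r_bc +/- sqrt((1 - r_ab**2) * (1 - r_bc**2)).
    """
    for r in (r_ab, r_bc):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlations must lie between -1 and 1")
    center = r_ab * r_bc
    slack = math.sqrt((1 - r_ab ** 2) * (1 - r_bc ** 2))
    return center - slack, center + slack


# Illustrative inputs only, not measured values.
# If A and B correlate perfectly, C must correlate with A exactly
# as it does with B: the range collapses to a single point.
print(corr_bounds(1.0, 0.8))

# Two merely moderate correlations guarantee much less: the range
# here even includes negative values.
print(corr_bounds(0.6, 0.6))
```

So the "therefore" above only really goes through because the Obama/Merkley correlation is so close to perfect; with two weaker correlations, nothing would be guaranteed.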
Now, back to Massachusetts. Nationally, college graduation rates correlated with Obama's margin. In several of the states I looked at, college graduation rates correlated with margin, by county. So if you would have asked me, I would have guessed, quite strongly, that Massachusetts' most educated counties were the most Democratic. But guess what I found:

In Oregon, in Colorado, or even in Montana and WYOMING, the counties with the highest education are the most Democratic. In Massachusetts...nope! This is for Coakley's election, but as discussed, since Coakley and Obama's margins correlated very closely, Obama had pretty much this shape.
Now I have some MATHEMATICAL PROOF of something I figured out when I first went to school in Vermont: in the Northwest, education is used to stuff people full of LIBERAL PROPAGANDA so that they agitate for GREEN CITIES and JUSTICE FOR ALL. In Massachusetts, education is something you do to get some cocktail party talk, before you marry an investment banker and move to the suburbs.
(And for those who like this blog only for the MATHS, please forget that previous paragraph).

Tuesday, January 19, 2010

It's late, but I just have to show that I was right: Coakley vs. Brown

Not too much comment, because my neck kind of hurts right now, having been staring at a glowing box all day, but:
With a few minor exceptions, the counties all line up pretty directly, meaning that there is no major breach or rearrangement between 2008 and now. Not to say that the 30 point shift between Obama's margin and Coakley's margin isn't significant, but Brown didn't get there by targeting some special region or demographic.
When I was looking at the graph before, I thought Middlesex would be the cut-off point, just because of its name. It seems Middlesex was not EXACTLY in the middle (although if it would have followed the 30 point drop exactly, it would have been), but it was pretty close. I didn't really know how Massachusetts population broke down, but apparently the most liberal counties are not actually that populous.
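
The point that a big uniform shift leaves the counties lined up can be checked with a plain Pearson correlation. The sketch below is Python with invented margins (not the real Massachusetts county numbers), and the helper name is an assumption of this example: subtracting the same 30 points from every county changes every margin but leaves the correlation at 1.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Invented county margins in percentage points, NOT real results.
margins_2008 = [40, 25, 10, -5]
# The same counties after a uniform 30 point drop.
margins_2010 = [m - 30 for m in margins_2008]

r = pearson(margins_2008, margins_2010)
print(round(r, 6))
```

A high correlation therefore says nothing about who won; it only says the counties kept their relative order and spacing.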

Sunday, January 17, 2010

Where I actually make a prediction:

So, one of the biggest political issues as of late is the Massachusetts special election, which has taken an unexpected turn, in that Massachusetts seems to actually have a chance of electing a Republican senator.
Trying to do a qualitative analysis of what is going on in Massachusetts would probably take us into some pretty murky territory, but luckily we can look at numbers.
I charted Obama's margin in 2008 versus Kennedy's margin in 2006, and came up with this:
What this shows us is that the Democratic margins in two different races (and for that matter, I could have used any two races) correlate very strongly. Kennedy and Obama both got big margins, with Kennedy getting uniformly bigger margins, although the spread was different in different counties.
So the prediction I am going to make is that whatever the result of Tuesday's election, the plot will look a lot like this (and I will have one ready pretty soon after the election to see if I am telling the truth). Whatever the numbers turn out to be, they will correlate pretty well with either of the two numbers used here.
If there is some skewing of this diagram, it will probably be that the upper left 5 counties, which include the city of Boston, will stay more stubbornly liberal than the counties between Plymouth and Nantucket.
I guess only time will tell!

Wednesday, January 13, 2010

You will die someday:

You will die someday.

But it probably won't be bears.

I had a chart to show this, but it won't upload. LATER.

Sunday, January 10, 2010

Back to my old tricks: the 2008 election, poverty, and further comments on the coalition nature of American politics

After my last post, one of my occasional posts on something not state related, I decided to go back to my old tricks: the American states, demographics, and the 2008 election.
The election is fascinating for me because it allows an operationalization of attitudes that are hard to capture with other statistics. Vermont and Idaho are very different, but what exactly the difference is, is hard to describe until you have an election, and then you have a gigantic difference between the two, that can be plotted. Or, for that matter, Montana and Idaho are very different, but it is hard to say so until you have an election. I mean, besides it's obvious that Montana is better than Idaho.
The two variables I looked at today, poverty rate and Obama's margin, show again something that I have been harping on: the different states and areas of the United States have different demographics. This was especially obscured during the evil era of 2004-2005, and "red state/blue state". Despite some similarities in electoral patterns, there is not a lot else that Wyoming and West Virginia have in common. To wit:
What is interesting about this diagram (and I thought it was so interesting that I put it a larger size, and labeled every single state, because there is a lot going on here), is that there is some pretty clear geographic grouping. Especially over on the left side of the diagram, we have two different groups of states that supported McCain: a low-poverty group, consisting of mountain and prairie states, and a high-poverty group that consisted of southern and Appalachian states. Also, notice that almost all the mountain and prairie states are relatively McCain-supporting, (which depends on whether Nevada and New Mexico are considered mountain states) and relatively low in poverty. Likewise, all southern/Appalachian states are high in poverty, and all support McCain. There is also no middle ground: there are (almost) no states with a poverty rate close to the US average that were strongly McCain-supporting. (The exception to this is possibly Idaho).
Another interesting thing about this diagram is that none of the states that were close states (defined currently as within a 5 point margin either way) had a very high or very low poverty rate. What is even more interesting about these states is that they otherwise don't have much in common: Montana, Georgia, North Carolina, Ohio, Florida, Missouri and Indiana don't have much in common, besides all being close to the US median poverty rate, and being right on the fence in the election. Arizona was also probably in this group, but broke a little bit more for McCain than its "true" politics would suggest, because it was his home state. I think these eight states will continue to be 'in the middle'.
There is also a gap between the Florida/Ohio line and states down and to the right. My intuition is that states like Virginia and Colorado, which are to the right of that gap, are where the true Democratic electoral base begins. Although Virginia and Colorado were seen as "swing" states this election, I believe they were actually states that shifted into the base. They have high levels of education, and low levels of poverty, and I think that they are closer to the base than Florida and Ohio are.
Another point to make, especially with regard to my grouping together the prairie/mountain states, is that the difference between those states and, say, Colorado, Oregon and Washington might be less than you would expect either from stereotypes or from this chart. From what I have looked at before, the patterns in, say, Wyoming and Oregon are the same. Counties with large numbers of college educated people still go Democratic, sometimes dramatically so. The difference is, Oregon has a lot of those counties, and Wyoming has two of them, and those two counties have fewer people. The pattern in mountain/prairie states is for the major Democratic-leaning group to be college educated and urban people. In Appalachia and the South, the major Democratic-leaning group seems to be African-Americans. The two sides of the Republican coalition are also moving in opposite directions: the South/Appalachia is becoming more conservative (although partially this is just a result of not having a Clinton and a Gore on the ticket), while the mountain/prairie region seems to be becoming more liberal.
My own feeling is, if 2008 is the underlying picture, the Democrats have a really strong position. Greater education and urbanization seem to be moving some areas permanently into their column, such as Colorado and Virginia. It could be that Obama's success in 2008 was a reaction against Bush, and 2004 is closer to the underlying electoral picture. My own belief is that for some of the states, 2008 was the real picture, and for others it was a reaction. And this diagram actually, to me, gives a pretty good idea of which states are which. Ohio and Florida were reaction, Colorado and Virginia were demographic shift.
Of course, there are many other things that can come up. Will Hispanics remain a Democratic group? Will young and college educated people remain Democratic? Will the low-poverty, high education mountain/prairie states continue to become (at least slightly) more Democratic, and the high-poverty, low education Southern/Appalachian states become more Republican? Will incumbency be an advantage, or a disadvantage?
We don't know the future. We can only plot the past.

Saturday, January 9, 2010

Mostly because I like poking holes in this one: education around the world

Some years ago, I took a graph of educational attainment amongst the G-8, blanked out the names, and posted it on an internet community, and had people guess which countries they represented. For the most part, people guessed that the United States was the country with the lowest education, while a number of Western European nations had the highest education.
It was actually America with the most education, or at least up there.
Conservatives and liberals like to create a dichotomy between the United States and Western Europe that does not actually exist quite so much, as far as I can operationalize. Liberals believe that Western Europe is an enlightened land of socialism and the United States is a country of knuckle-dragging idiots, while Conservatives believe the United States is the last bastion of capitalism against the socialist realms of Europe. The truth is, the US and Western Europe (with "Western Europe" including Australia, Japan and Canada, of course) are market-driven countries with large social welfare programs.
Someday, I will give you some operational proof of that last statement.
Anyway, education wise, I guess American tourists go to Paris and meet lots of educated people, and assume everyone in France is educated.
That is not even what today's graph is about: today's graph is just about years of education versus college attainment. These are (as you know by now) related, but not as strongly as you might guess. As we showed with the United States, it's possible for an area to have a high number of high school graduates, but a low or medium number of college graduates.
Also, I took this data from nationmaster.com. Notice that not all the dots on there are named. The dot next to Norway is New Zealand, but it could be Madagascar or Costa Rica for all you know. This is why I like using the states of the US: they are easy to define. No trouble deciding which nations are comparable.

Thursday, January 7, 2010

The two states with the biggest political differences:

A while ago, I wrote that the electoral behavior of Oregon and Washington is fairly similar, which is not exactly a gigantic surprise.

I decided to look for the two states that have the least electoral similarity, and I probably have found them:

Vermont and South Carolina. Throughout much of its history, Vermont was a very Republican state, and South Carolina was a very Democratic state. (This may come as a surprise to some of you.) Since 1992, this has of course reversed. But in their history, Vermont and South Carolina have usually had gigantic differences in electoral margins.
In fact (and this is probably the only pair of states this can be said of), Vermont and South Carolina have never voted for a Democratic candidate in the same election.
Notice that these numbers are margins, not total votes. Notice that South Carolina is up around the 98 mark as a margin. During the years of the "solid south", South Carolina had elections with numbers like 99%-1%. Vermont's Republican margins were usually pretty big, but not quite THAT big.
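For anyone unsure what "margin" means here, this is a minimal sketch of the arithmetic, assuming the margin is simply one side's percentage minus the other's. The function name `margin` is my own label for illustration, not something from the chart's source data.

```python
def margin(first_pct, second_pct):
    """Return the margin in percentage points between two vote shares.

    Assumption: the margin is the plain difference between the two
    percentages, so a 99%-1% result is a 98 point margin.
    """
    return first_pct - second_pct


# The "solid south" example from the text: a 99%-1% election.
print(margin(99, 1))
```

A margin near 98 therefore means an almost unanimous vote, not that 98% of people voted one way.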
So, when do you think Vermont and South Carolina will finally agree?

Tuesday, January 5, 2010

Perot in 1992 and the coalition nature of American politics

I am sitting here wondering what is wrong with me, and why people don't like me.
But there is no answer to a question like that, so I will ignore it.

Instead, I will talk about what is on your mind: Ross Perot, and where exactly his support came from. As discussed in the last post, the data I have does not show that Perot gained votes predominantly from one party or the other.

One thing about the diagram yesterday is it looked suspiciously like a much earlier plot I had done: Obama's margin versus high school graduation rates. States that went for McCain tended to have either low or high graduation rates.

(I hope you are following my intuition, because I am actually not, so connect the dots for me. After all, connecting dots is what this blog is all about).

I decided to plot Perot's total vote (not his margin, obviously) against high school graduation rates. (Using data from the 1990 census)

If any of you have been paying attention to my futile quest to link together demographic data with election margins, this data should really stick out, because...there is actually a pretty strong correlation here. I actually ran Perot vs. college, and then the same diagrams for Bush and Clinton, and none of them were very conclusive. But this diagram shows that states with high High School graduation rates had a pretty meaningful tilt towards Ross Perot. In fact, there is something about the way I made this diagram that conceals this: the cluster of states just inside the upper right hand corner is mostly New England states, which have two things in common with prairie/mountain states: lots of high school graduates, and a liking for Ross Perot. Notice that there is not much else they have in common though: Massachusetts and Idaho are stereotypically the polar opposites of national politics.
Notice down in the lower left, low High School and low Perot support states. Now, if you can remember back 18 years, the Southern states were divided between Bush and Clinton. Arkansas, Tennessee and West Virginia were Clinton states. Alabama, Mississippi and South Carolina were Bush states. But all of them have low High School rates, and low Perot rates.

One of the ways that I look at the American "two party system" is as a "two coalition system". The coalitions are made of many different regional groups, with both official power structures and differing demographies. Perot carved up a big part of that coalition for himself in some parts, but not so in others.

This is still relevant, because the current coalition that makes up the Republican Party has two major geographic bases of support: the Prairie/Mountain states and the South/Appalachia. But these two groups have very different demographies, and different cultures and politics.

Monday, January 4, 2010

Perot "taking" votes in 1992

There are only two possible ways for a Democratic candidate to win an election: ACORN, or a third party "steals" them from the Republican Party.
Dog food isn't very tasty, neither is it nutritious.
Anyway, one of the pieces of CW thrown around about the 1992 election is that Clinton won because Perot peeled off votes from Bush. This is also disputed, but since the ballots, with their '2nd choice' bubbles, are all sequestered in a vault under Mt. Rushmore for 99 years, we won't know for a while who people would have voted for if Mr. Perot had not been in the race.
But, we do have a technology that can hint at it. And that technology is...scatterplotting! Of course!
First, let me apologize that the new version of openoffice done gone and thrown my Y-Axis labeling right in the middle where it confuses things. I updated to karmic koala because I was trying to get Mario Kart's sound to work right, and it just kind of happened...
Anyway, back in 1992, George Stephanopoulos has to rent a motel room to call Clinton and tell him when a bimbo is erupting, because they don't have cell phones, and they have no idea what KARMIC KOALA is. But they do know who Ross Perot is.
The above diagram shows basically no correlation between how much of a margin Clinton had, and what total percent Perot got. If Perot was mostly taking votes from conservatives, he would be getting a lot more votes in Nebraska, where there are plenty of conservatives, than in Massachusetts, where there are not quite as many. And yet Perot got 23% of the vote in both, even though Nebraska was 18 points against Clinton and Massachusetts was 18 points for him. Of course, looking at the plot a bit closer shows that there may be a little bit of a lean towards Perot in more conservative states: so maybe there were a few states where it made a difference. Overall, though, there doesn't seem to be much evidence either way from this data.
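Here is a rough sketch of how "no correlation" could be checked with numbers instead of by eyeballing the plot. It is not how the diagram was made (that was done in a spreadsheet); the function names `pearson_r` and `slope` are my own, and the only data used is the Nebraska and Massachusetts pair quoted above. Two points are far too few to prove anything, so treat this as a template to feed all the states into.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists.

    Returns None when either variable has no spread, because the
    coefficient is undefined in that case.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def slope(xs, ys):
    """Least-squares slope of ys against xs (change in y per unit of x)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs has no spread, slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Only the two states quoted in the text: Nebraska, then Massachusetts.
clinton_margin = [-18, 18]   # points for Clinton (negative = against)
perot_percent = [23, 23]     # Perot's share of the vote

print(slope(clinton_margin, perot_percent))      # flat line between the two
print(pearson_r(clinton_margin, perot_percent))  # None: Perot's share never varies
```

With just these two states the line is flat: a 36 point swing in Clinton's margin goes with no change at all in Perot's share. The slight lean towards Perot in conservative states would show up as a small negative slope once every state is included.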

Sunday, January 3, 2010

As promised: Canada oh Canada

One good thing about Canada is, there are only 13 demographic units to enter data for.
Of course, I am still looking for good sources of data, and for good sources of interesting data. Those lacking, I just looked at what Wikipedia could tell me, and decided to see if GDP per capita and land area were correlated:
There does seem to be some sort of trend here, or more than one. Also, the two big outliers are both very small territories. So I don't know. Also, yesterday I promised that we would get a plot that looked like a dragon's head. And of course this doesn't look like a dragon's head, but it does look somewhat like a pair of pliers. Or, as they say in Canada: "spanners".

Saturday, January 2, 2010

I keep on telling myself I will diversify, and then don't: ERS data on

I really do want to do more international stuff, but I tend to look at the US data, because I know where to start, and I know where the good data can be found. But some day, I will try to figure out how the Canadian census data works, and you will be able to compare Alberta to Nova Scotia all you want.

Another thing is, when I started this, I wanted to look for interesting shapes. But, truth be told, most data looks about the same: a big smudgy diagonal line. Such as this:
Rural and urban poverty go up together, and rural poverty is almost always higher, although by differing margins. While there seems to be the expected grouping in poverty along regional lines, the ratios themselves seem to be caused by different factors. Massachusetts, Nevada and Indiana all have higher urban poverty than rural poverty, but for very different reasons.

I will still be out there looking for an interesting shape. Life span versus miles of highway amongst Canadian provinces looks like a DRAGON HEAD! Maybe!