Monday, October 16, 2006

Election 2006 is really heating up these days. In the Senate, things have shifted from a slight lead for the Republicans to a slight lead for the Democrats. The House has gone from a dead heat to a slight lead for the Democrats. This is, no doubt, the result of fallout from the Foley scandal. I also suspect that there have been some non-linear effects here. Coming on top of the Abramoff/Cunningham, Bob Ney, and Tom DeLay scandals, the Foley affair has turned what looked like individual events into the revelation of a pattern of corruption in the Republican party, and thus the whole is worse than the sum of its parts.

For election news, electoral-vote.com has fancy maps to look at, along with the most recent projections. Projecting the election outcome from polls is an interesting problem. The guy behind the aforementioned site, Andrew Tanenbaum, best known as the author of MINIX, has chosen a plain unweighted average of the last week's polls as the best predictor of the outcome. This seems like a simple approximation to exponential weighting, which I can only assume would be better. One could certainly do a historical study to find the best decay rate to use for political contests. The best rate of course depends on how non-stationary the populace's opinion is, which may vary by type of election. There could be an incumbency effect there as well, in that people may be more likely to vote for one candidate or the other on election day than what they tell pollsters would suggest. Also, I believe these polls restrict the tally to those that they deem "likely voters," and the pollsters are continually updating these selection rules, so the polls should be correct on average.
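To make the comparison concrete, here is a minimal sketch of the two estimators. The poll numbers, the `half_life_days` parameter, and the function names are all made up for illustration; this is not how electoral-vote.com actually computes anything.

```python
def simple_average(polls):
    """Unweighted mean of the poll margins."""
    return sum(margin for _, margin in polls) / len(polls)


def exp_weighted_average(polls, half_life_days):
    """Mean of the poll margins, where a poll's weight halves
    every `half_life_days` days of age."""
    weights = [0.5 ** (age / half_life_days) for age, _ in polls]
    total = sum(w * margin for w, (_, margin) in zip(weights, polls))
    return total / sum(weights)


# Hypothetical polls: (age in days, Democratic margin in points).
polls = [(0, 4.0), (2, 3.0), (5, 1.0), (7, -1.0)]

print(simple_average(polls))
print(exp_weighted_average(polls, half_life_days=3.0))
```

With these made-up numbers the newest polls favor the Democrats more, so the exponentially weighted estimate comes out higher than the plain average. As the half-life grows very large, the two estimates converge, which is the sense in which the weekly average is a crude approximation to exponential weighting.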

Also, tradesports.com has a bet going for GOP control of the House and Senate. Currently, the predicted probability of GOP Senate control is 60.1% (down 9.4% just today). That number is 32.2% for the House. These markets seem to be fairly accurate. I looked at a study of another political betting exchange, somewhere in Iowa, and their numbers were correct on average with low variance. They are most wrong when everyone is surprised by the outcome, indicating they are a public information aggregation mechanism. This is as opposed to the global capital markets, where there is enough money at stake (the Iowa exchange had a $500 max per person) that people have the incentive to pursue private information. I doubt anyone is commissioning their own polls to make money on tradesports. I did a quick sanity check, where I calculated the probability of GOP control implied by the probabilities given on the individual Senate races, and the market passed very well (within 0.5%, I think).
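A sanity check of this kind can be sketched as follows. The big assumption, and it is only an assumption, is that the individual races are independent; in reality a national swing would move them together. The function names and the probabilities in the example are hypothetical, not the actual tradesports quotes.

```python
def seat_distribution(win_probs):
    """Return dist, where dist[k] is the probability of winning exactly
    k of the races, assuming the races are independent."""
    dist = [1.0]  # with zero races, we win zero seats with certainty
    for p in win_probs:
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1.0 - p)   # lose this race
            new[k + 1] += q * p       # win this race
        dist = new
    return dist


def prob_at_least(win_probs, needed):
    """Probability of winning at least `needed` of the races."""
    return sum(seat_distribution(win_probs)[needed:])


# Hypothetical example: three toss-up races, two wins needed for control.
print(prob_at_least([0.5, 0.5, 0.5], needed=2))
```

Feeding in the per-race market prices for the contested seats, with `needed` set to however many of them a party must win, gives an implied probability that can be compared to the price of the overall control contract.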

A good question on this topic is whether the opinion of the populace has any sort of momentum. Positive momentum would mean that poll numbers over time are not a martingale: when polls go up one period they tend more often to go up the next period, and the same for the negative side. For negative momentum, the opposite would be true, so polls would be self-corrective. Foley (Peter, not Mark) and I bumped into this in one of our projects, but the results were inconclusive for our data set.
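One simple way to look for momentum is the lag-1 autocorrelation of the period-to-period changes: positive suggests momentum, negative suggests self-correction, and near zero is what a martingale would give. The sketch below uses made-up series and is not the method from our project.

```python
def changes(series):
    """Period-to-period differences of a poll series."""
    return [b - a for a, b in zip(series, series[1:])]


def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation of xs."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    if var == 0:
        raise ValueError("series has no variation")
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return cov / var


# Hypothetical poll series (percent support per period).
trending = [50, 51, 52, 53, 52, 51, 50, 51, 52, 53]
zigzag = [50, 51, 50, 51, 50, 51, 50]

print(lag1_autocorrelation(changes(trending)))  # positive: runs of like-signed changes
print(lag1_autocorrelation(changes(zigzag)))    # negative: every change is reversed
```

One caveat: sampling error in each poll by itself pushes this statistic negative, because a poll that happens to read high is usually followed by one that reads lower. A negative value in real data is therefore not necessarily evidence of self-correcting opinion.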

In the context of a market, prices would tend to "overreact" to individual events, taking them as a signal of further movement. In the stock market, I tend to think of people who talk about momentum as idiots, which is probably still true. However, it is possible that the underlying generator of stock market activity (the world) has some kind of non-zero momentum, which may even explain why people often say that the market overreacts to news. In the case of the stock market, we do not have such a clear view of the underlying process being reflected in prices as we do in the case of political betting.

Back in the real world, the NYT says that the Republicans are focusing mainly on Missouri, Tennessee, and Virginia. They apparently suspect that New Jersey, though tied in the polls, will go Democratic regardless, as it has for the past 30 years. Supposing these are the only races truly up in the air, whoever wins two of three wins control. Should be an exciting month. The recent shakeup in the Reynolds-Davis race motivated me enough to sign up as a Davis volunteer. I'm not in the district, but I'm only a few miles from the border.

Also, I've been thinking about redistricting lately, in terms of geometric constraints that you can place on districts that would reduce bias and also increase the quality of representation, most likely through increased competitiveness. My initial thought was that you could require them to be convex, or at least convex at some level of granularity, since the smallest voting subunit, the precinct, is often non-convex. There would be an exception for state borders as well. Convexity would of course imply that districts are connected (contiguous), which I find to be a very reasonable constraint. Others have suggested going further and actually making this an optimization problem. They say you should maximize "compactness," not the mathematical kind, but some arbitrary measure of said colloquial idea. One measure I've seen of this is the average distance between two people in the district, which you would want to minimize. This would also be the average value of the two-point correlation function.
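The average-distance measure is easy to compute for a toy district. This is a minimal sketch that assumes each person is a point in flat planar coordinates; real data would need a map projection and population weights per census block. The function name and the two example districts are made up.

```python
import math
from itertools import combinations


def mean_pairwise_distance(points):
    """Average Euclidean distance over all unordered pairs of points."""
    pairs = list(combinations(points, 2))
    if not pairs:
        raise ValueError("need at least two points")
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


# Two hypothetical districts with four residents each.
square = [(0, 0), (0, 1), (1, 0), (1, 1)]    # compact
strip = [(0, 0), (1, 0), (2, 0), (3, 0)]     # long and thin

print(mean_pairwise_distance(square))
print(mean_pairwise_distance(strip))
```

The square scores lower than the strip, which matches the colloquial idea that it is the more compact of the two. Looping over every pair is quadratic in the number of people, so anything at the scale of a real district would need sampling or aggregation by block.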

This of course requires that I figure out how to use GIS software and make sense of the TIGER files provided by the US Census Bureau. I have to say, I'm a little dismayed at the state of GIS software. It's obviously very complex, but I always come away thinking that there has to be a better way to do it. Maybe if people could just agree on one format. Anyway, some guy has already done some of this for a few states, including California. He finds the set of districts with nearly equal population that minimizes the aforementioned compactness measure. I forget the website, maybe next post. I would like to do this myself. I would really like to see what the two-point correlation function looks like, and if it differs between Democrats and Republicans.