Friday, May 22, 2009

Correlation = Causation?

One thing that you hear all the time is that correlation does not equal causation. Basically, just because two things have outcomes that appear to be correlated doesn't mean that the underlying reasons for the outcomes are truly dependent. The lesson that you're supposed to take away from this is that just because you see a correlation, you shouldn't place too much weight on it, unless you can show causation.

So what the heck do you do when you have a very good conceptual argument for causation, but the correlation is nil?

A case study: 2008-09 O.N. Thugs, UPL Basketball.

If you think about the stats that we use (PTS, 3PM, FG%, FT%, AST, BLK, STL, REB, OREB, A/TO), you'd probably guess that 3PM and PTS would be correlated (though not necessarily strongly), that the accuracy stats would be correlated with PTS, and that AST would be correlated with A/TO. But two categories should stand out: OREB and REB, because each OREB is also a REB. So, if a player is a good offensive rebounder, then he'll likely be a good rebounder. In fact, it would be somewhat shocking if a team was very good at OREB, but bad at REB. Yet somehow, halfway through the season, I was something like 2nd in offensive rebounding (and had the highest rate by far), but was 9th in overall rebounding (with a rate that was about 7th). Eventually, things caught up a bit, and I finished 1st in OREB and 4th in REB, but even that was a bit strange. So maybe it's reasonable to expect things to even out over time.

The current problem: 2009 O.N. Thugs, UPL Baseball.

If you think about the stats in baseball (R, HR, RBI, SB, OBP, SLG, W, L, SV, K, ERA, and WHIP), the ones that you'd figure to be the most correlated are HR with RBI and HR with R, since each HR you hit guarantees 1 R and 1 RBI. And you'd expect a slightly weaker correlation between HR and SLG. But take a look at this:

Team                   R    HR  RBI  SB  OBP    SLG    RBI/HR  R/HR
'90 Reds               249  55  229  30  0.388  0.506  4.164   4.527
O.N. Thugs             256  47  253  43  0.397  0.471  5.383   5.447
IamJabrone             248  78  238  41  0.358  0.495  3.051   3.179
Westy's Sluggers       247  74  259  37  0.389  0.522  3.500   3.338
Black Sox              231  66  249  26  0.361  0.476  3.773   3.500
Cheeseheads            230  61  223  43  0.346  0.458  3.656   3.770
Muddy Mush Heads       239  51  195  54  0.360  0.444  3.824   4.686
IStillSuckCurveballs   234  58  244  14  0.382  0.489  4.207   4.034
Phatsnapper            189  54  219  23  0.346  0.456  4.056   3.500
TheJimmyDixLongballs   202  54  226  40  0.334  0.450  4.185   3.741
Benver Droncos         244  63  238  29  0.338  0.460  3.778   3.873
Hats for Bats          224  51  212  40  0.338  0.424  4.157   4.392

Somehow, I have managed to lead the league in R, and am a close 2nd in RBI. But I am dead last in HR. Compared to the Jabrones, I have 31 fewer HR, which means that I've managed to score 8 more R, despite giving away 31 R from my lack of HR. And historically, R:HR and RBI:HR ratios come in around 4 (just a quick glance suggests that 3.5 to 4.5 are reasonable values to expect). Note that R:HR can be a lot further off, since guys who score 90+ runs on only 10 or so HR aren't that rare, whereas for RBI:HR, someone like Adam Dunn, with 100 RBI on 40 HR, is about as low as you'd probably get. But overall, as you look at how UPL teams are put together, you see some stability in these ratios. And then you have the '09 O.N. Thugs, who are at about 5.4 to 1 for both R and RBI.
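For anyone who wants to check the numbers, here's a minimal Python sketch, assuming the R, HR, and RBI columns from the table above are typed in by hand. The names (`TEAMS`, `pearson`, `ratios`) are made up for this illustration and aren't part of any UPL tool. It recomputes the two ratio columns and also the plain Pearson correlation of HR with R and with RBI across the 12 teams.

```python
from math import sqrt

# (team, R, HR, RBI) copied by hand from the standings table above
TEAMS = [
    ("'90 Reds", 249, 55, 229),
    ("O.N. Thugs", 256, 47, 253),
    ("IamJabrone", 248, 78, 238),
    ("Westy's Sluggers", 247, 74, 259),
    ("Black Sox", 231, 66, 249),
    ("Cheeseheads", 230, 61, 223),
    ("Muddy Mush Heads", 239, 51, 195),
    ("IStillSuckCurveballs", 234, 58, 244),
    ("Phatsnapper", 189, 54, 219),
    ("TheJimmyDixLongballs", 202, 54, 226),
    ("Benver Droncos", 244, 63, 238),
    ("Hats for Bats", 224, 51, 212),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# (team, RBI/HR, R/HR), sorted from the highest RBI/HR to the lowest
ratios = sorted(
    ((name, rbi / hr, r / hr) for name, r, hr, rbi in TEAMS),
    key=lambda row: row[1],
    reverse=True,
)
for name, rbi_per_hr, r_per_hr in ratios:
    print(f"{name:<22} RBI/HR={rbi_per_hr:.3f}  R/HR={r_per_hr:.3f}")

# Cross-team correlation of HR with R and with RBI
run_totals = [t[1] for t in TEAMS]
hr_totals = [t[2] for t in TEAMS]
rbi_totals = [t[3] for t in TEAMS]
print(f"corr(HR, R)   = {pearson(hr_totals, run_totals):.3f}")
print(f"corr(HR, RBI) = {pearson(hr_totals, rbi_totals):.3f}")
```

With only 12 teams and a partial season, the correlations are a rough gauge at best, so treat whatever this prints as a conversation starter rather than a conclusion.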

So what does this mean? I have no clue. Moving forward, you can make the case that either a) I'm due for a bunch of HR since my team is good, but just underachieving right now in power, or b) my team sucks and has been overachieving with everything other than HR. I think that a) is more likely than b), although I'm definitely biased on this one.

I don't really know what to make of this right now, and will think about this more, but some interesting questions have definitely opened up regarding how teams should be constructed, if you take a statistical look at the way the UPL is structured. For example, I'd bet that if you were to take a poll that asked fantasy players which of the 6 offensive categories would be the most useful in predicting fantasy success, you'd probably get HR as the overwhelming answer, with SB being the worst. However, my first look at the numbers suggests that if you were to use only one criterion in evaluating offense, you should look at RBI over anything else (though this is very preliminary, and restricted to historical UPL numbers).

-Chairman (aka O.N. Thugs)

Tuesday, May 5, 2009

Doomed to Repeat

Note, this was originally written on 4/21, but wasn't posted.

As I was looking through the discussion regarding the proposed Mauer/Lester/Dukes for Berkman/Holliday trade in the last post, I came across a fun little gem, addressed to me from Rup:

"NOT EVERYTHING in FANTASY. IS THREE YEAR HISTORICAL PROJECTIONS IF THAT WAS THE CASE YOU WOULD WIN THE LEAGUE EVERY YEAR. "

Now, what's funny is that, as much as I win, if you ignore the random use of punctuation/capitalization and accept Rup's assertion that my strategy is based exclusively on 3-year historical projections, then you sort of have to accept that 3-year historical projections are a phenomenally accurate and consistent measure. This is particularly the case if you make the assertion that everyone else uses a different model for their analysis.

Let's look at order of finish in every season of the UPL, and we'll assign points like we do in Roto. N points for finishing first in a league of N teams, N-1 for finishing 2nd, ... and 1 point for finishing last.
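The scoring rule can be sketched in a few lines of Python. `points_for_place` and `points_for_tie` are hypothetical helper names for this illustration. The tie rule (tied teams split the points for the places they occupy) is an assumption on my part; it isn't spelled out here, though it would explain the 10.5 entries in the 2007 row of the table below.

```python
def points_for_place(place, n_teams):
    """Roto-style points: N for 1st, N-1 for 2nd, ..., 1 for last."""
    if not 1 <= place <= n_teams:
        raise ValueError("place must be between 1 and n_teams")
    return n_teams - place + 1


def points_for_tie(first_place, n_tied, n_teams):
    """ASSUMED tie rule: tied teams share the points of the places they span."""
    places = range(first_place, first_place + n_tied)
    return sum(points_for_place(p, n_teams) for p in places) / n_tied


# A 10-team league: the winner gets 10 points, last place gets 1
print(points_for_place(1, 10), points_for_place(10, 10))

# Two teams tied for 1st in an 11-team league split 11 + 10 points
print(points_for_tie(1, 2, 11))
```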

Year     Max      Thugs    Jabrone  Westy    90 Reds  Phat     Cheese
2001     10.0     9.0      8.0      7.0      10.0     6.3      6.0
2002     10.0     10.0     9.0      8.0      5.0      6.3      5.0
2003     11.0     11.0     10.0     7.0      2.0      6.9      9.0
2004     14.0     14.0     10.0     12.0     7.0      8.8      8.0
2005     12.0     12.0     11.0     10.0     7.0      6.0      6.3
2006     11.0     9.0      8.0      10.0     7.0      11.0     5.0
2007     11.0     10.5     9.0      6.0      10.5     7.0      5.8
2008     12.0     11.0     12.0     4.0      8.0      5.0      3.0
Total    91.0     86.5     77.0     64.0     56.5     57.4     48.2
% Total  1.000    0.951    0.846    0.703    0.621    0.630    0.529
AVG      11.375   10.813   9.625    8.000    7.063    7.169    6.021
STDEV    1.302    1.646    1.408    2.563    2.705    1.892    1.857
Correl   1.000    0.904    0.555    0.471    -0.008   0.194    0.253

Here, we're looking at results from some of the UPL regulars who have been there from the start (O.N. Thugs, Jabrones, Westy, '90 Reds, Cheeseheads), as well as Phatsnapper (Rup's franchise), who's a more recent addition. You'll note that for Cheeseheads and Phatsnapper, there are some strange results - in the years that they didn't play, I just used their historical average as that year's entry. Also, I can't quite remember the 2002 order of finish after me. I recall C-Lauff finishing 2nd, Westy in the upper half, and Greg about middle of the pack. So, those are guesses. In any case, you'll notice something. The Correl row shows each franchise's correlation with the number of teams per season (the max score), and some of those correlations are high. This stat speaks to how consistently a team performs: a team that finishes near the top every year tracks the max score closely. If you combine this information with the average score, then you get a nice picture of how consistent a team is, as well as how successful. Now, this was sort of bragging. I already knew how this was going to turn out, since the UPL Bill keeps track of a lot of this stuff for us.
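The summary rows are easy to reproduce. Here's a rough Python sketch, assuming the yearly points are typed in from the table above; `POINTS`, `MAX_POINTS`, and `summarize` are names invented for this example. I've limited it to the four franchises with an entry for every season, since the fill-in values for Phatsnapper and Cheeseheads are rounded in the table. The sample standard deviation (dividing by n - 1) is what appears to match the STDEV row.

```python
from statistics import mean, stdev

# Points per season, 2001-2008, copied by hand from the table above
MAX_POINTS = [10, 10, 11, 14, 12, 11, 11, 12]  # league size each year
POINTS = {
    "Thugs": [9, 10, 11, 14, 12, 9, 10.5, 11],
    "Jabrone": [8, 9, 10, 10, 11, 8, 9, 12],
    "Westy": [7, 8, 7, 12, 10, 10, 6, 4],
    "90 Reds": [10, 5, 2, 7, 7, 7, 10.5, 8],
}


def summarize(points, max_points):
    """Total, share of the maximum possible, average, and sample stdev."""
    total = sum(points)
    return {
        "total": total,
        "pct_of_max": total / sum(max_points),
        "avg": mean(points),
        "stdev": stdev(points),  # sample standard deviation (n - 1)
    }


summary = {team: summarize(pts, MAX_POINTS) for team, pts in POINTS.items()}
for team, s in summary.items():
    print(
        f"{team:<8} total={s['total']:.1f} pct={s['pct_of_max']:.3f} "
        f"avg={s['avg']:.3f} stdev={s['stdev']:.3f}"
    )
```

A high average with a low standard deviation is the "consistent and successful" profile; a middling average with a high standard deviation is closer to feast or famine.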

And, one conclusion that you can make is that year-to-year in UPL Baseball, the optimization based on 3-year projections is a pretty solid strategy. Now that isn't quite what I do, but I'd guess that's a big chunk of my evaluation. But more interesting is what happens if you start to look at the correlations between the different teams.


Correl   Max      Thugs    Jabrone  Westy    90 Reds  Phat     Cheese
Max      1        -        -        -        -        -        -
Thugs    0.904    1        -        -        -        -        -
Jabrone  0.555    0.613    1        -        -        -        -
Westy    0.471    0.440    -0.237   1        -        -        -
90 Reds  -0.008   -0.181   -0.199   -0.185   1        -        -
Phat     0.194    -0.054   -0.557   0.649    -0.068   1        -
Cheese   0.253    0.442    -0.132   0.482    -0.473   0.234    1

If you have a correlation close to zero, that suggests that there is a very different strategy being employed. It's no surprise that the O.N. Thugs (me) have virtually no correlation with Phatsnapper (Rup). What's also interesting is how small the correlation is between the O.N. Thugs and the '90 Reds (Greg). And in turn, the '90 Reds have virtually no correlation with Phatsnapper. So, it's like there are three distinct strategies going on (or at least, three distinctly different sets of outcomes). Interestingly, the '90 Reds and Phatsnapper have very similar track records (though Greg's is over 8 seasons, and Rup's is over 4), each with a ring (2 for Greg).
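Here's a small Python sketch of how a pairwise matrix like this can be computed, assuming the yearly points from the first table are entered by hand. `SCORES`, `pearson`, and `pairs` are names made up for the example. The Phat and Cheese rows include the rounded fill-in values as printed, so their correlations may differ from the matrix in the last digit.

```python
from itertools import combinations
from math import sqrt

# Points per season, 2001-2008, copied by hand from the first table
SCORES = {
    "Max": [10, 10, 11, 14, 12, 11, 11, 12],
    "Thugs": [9, 10, 11, 14, 12, 9, 10.5, 11],
    "Jabrone": [8, 9, 10, 10, 11, 8, 9, 12],
    "Westy": [7, 8, 7, 12, 10, 10, 6, 4],
    "90 Reds": [10, 5, 2, 7, 7, 7, 10.5, 8],
    "Phat": [6.3, 6.3, 6.9, 8.8, 6.0, 11.0, 7.0, 5.0],    # 2001-04 are fill-ins
    "Cheese": [6.0, 5.0, 9.0, 8.0, 6.3, 5.0, 5.8, 3.0],   # includes fill-ins
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# One entry per unordered pair, keyed in the order the columns appear
pairs = {
    (a, b): pearson(SCORES[a], SCORES[b]) for a, b in combinations(SCORES, 2)
}

# List the pairs from the correlation closest to zero to the strongest
for (a, b), r in sorted(pairs.items(), key=lambda kv: abs(kv[1])):
    print(f"{a:>8} vs {b:<8} r = {r:+.3f}")
```

Sorting by absolute value puts the "very different outcomes" pairs at the top of the output and the look-alikes at the bottom. With only 8 seasons per team, none of these values is very stable.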

We've classified the O.N. Thugs as basing value on recent historical performance. Now, if you were to classify the other strategies, I'd label each value something like this:

Phatsnapper - value based on youth and potential upside.
'90 Reds - value based on personal perception.

In any case, it appears that the Jabrones have similar results to the Thugs. Imitation. Flattery. Yada, yada. Westy has high correlations w/ both the Thugs and Phatsnapper. So maybe some hybrid of both strategies? I'm guessing that it's a blend of historical numbers and the upside of Minnesota players doing well. Sadly, that strategy doesn't work.

The closest thing that I can interpret about Greg's strategy is that it's the opposite of the Cheeseheads'. Of course, the Cheeseheads' outcomes are based on finishing in the middle of the pack every year, so we can't really learn much about his strategy here. I'm not sure what the opposite of that is - maybe something like feast or famine? But that's more of an outcome, not necessarily a strategy. Basically, my best guess is that Greg's in his own little world, and when his perceptions (think Bret Boone and Paul LoDuca in 2001, not that I'm bitter about it or anything) are all on for a given year, then he does really well. And when his perceptions are off, he finishes in the bottom half of the league.

Obviously, we don't have a lot of historical data - 8 years' worth isn't a ton, though if you break this down further, it may be of interest. As I think about it, maybe it's not order of finish, but total points in a given year that's more useful. And an interesting question may be which stats teams value (or at least evaluate well), historically. I can already guess that the O.N. Thugs manage to do well in OBP, SLG, HR, and RBI, a little bit less so in R, and terribly in SB. And with regard to pitching, I can see the O.N. Thugs doing well in SV, W, K, and ERA, and so-so in L and WHIP...

In any case, I think that this idea of looking at the broad, macro level of the UPL is worth revisiting.

-Chairman (aka O.N. Thugs)