Wednesday, November 10, 2010

Assumptions

I'm copying and pasting parts of an e-mail exchange that I'm having with Westy about some basketball stats. Mainly, it's me making fun of John Hollinger's power rating statistic, which has somehow become a fixture on ESPN.com. A second issue worth thinking about is how perfectly smart people (like Westy) can fall into common traps. Basically, on the front page of ESPN.com, there was this slick graphic that showed that the Heat were the top team in the NBA so far, based on John Hollinger's Power Ratings. And of course, I thought that was crap.

My initial e-mail was intended to give a little ribbing to Westy about his love for "advanced" stats like the stuff that John Hollinger uses. Just one sentence and a picture from ESPN.com:

Westy - I guess that I see why you love your advanced stats and hate winning :-)

Note the "winning isn't everything" lead-in. We don't need wins; we have numbers with decimal points.

Also, note that this is before my man D-Will led a 4th quarter comeback, which in turn led to an upset of the Heat by the Jazz in OT.

Of course, Westy's response is that Hollinger uses strength of schedule (SOS), with the implication that a statistician like Hollinger wouldn't post anything as ridiculous as a 5-2 team being better than a 7-0 team unless there was a good reason. To be honest, I had no idea what Hollinger's methodology was, other than that it was spitting out shady results. So I took a peek under the hood at the methodology. And of course, I'm appalled.

So, I shoot off an e-mail to Westy. I make some smart-ass comment about how going 4-0 against the Nets, T'Wolves, and Sixers while going 1-2 against the Magic, Celts, and Hornets clearly makes you the best team in the NBA. More importantly, I noticed that the Hornets have a higher SOS than the Heat, yet are ranked well below them by Hollinger (despite winning head-to-head). I'm OK with using margin of victory as a component in team evaluation, but you probably shouldn't use it straight up. At this point, I hadn't scrolled far enough down to see the actual equation being used, but I did see that Hollinger starts off by talking about margin of victory, which suggested it was the major component of the ranking (which is what it looked like at first glance). Part of my comment to Westy:

Seems that Hollinger's first criterion is margin of victory, probably w/out any sort of cap or deeper view beyond the final score. In that sort of scenario, you're rewarded more for winning by 54 points against the T'Wolves and Nets and losing twice by "only" 3 and 8 points (only 2-2, but still +43) than you would be for, say, going 4-0 and winning by 9 points each game (+36). Don't get me wrong. Margin of victory/loss should count, but once you get beyond, say, 15 points, you'd have to show me a strong case for why it matters.

Now, I'm sure that Westy didn't look at the Hollinger methodology very closely, and my guess is that he assumed that a statistician who's associated with the "advanced stat" movement wouldn't use a crude tool. At least, certainly not to the extent that I was insinuating. Westy suggests that Hollinger is actually using offensive and defensive efficiency, which would be the best (simple) predictor of team performance. Westy's probably right that team efficiency stats would be a pretty good predictor, at least compared to the readily available stats. But the assumption is that Hollinger is using something built on top of those stats.

I had a suspicion that Hollinger's stuff just couldn't be as useful as Westy was suggesting, given the results it was spitting out. So I went back to the explanation of the methodology. And sure enough, we see this:

RATING = (((SOS-0.5)/0.037)*0.67) + (((SOSL10-0.5)/0.037)*0.33) + 100 + (0.67*(MARG+(((ROAD-HOME)*3.5)/(GAMES))) + (0.33*(MARGL10+(((ROAD10-HOME10)*3.5)/(10)))))

Basically, here's what this equation says. Everyone starts with 100. Then you look at strength of schedule and measure how much it differs from .500. Then you divide this difference by 0.037 (no explanation for why he uses this number). Do this for the entire season, weighting it 2/3, and for the last 10 games, weighting it 1/3 (so that you're placing extra emphasis on the last 10 games). Note that he doesn't explain why he chooses the 2/3 and 1/3 weighting (or 10 games, for that matter). Finally, you take the per-game scoring margin and adjust it by 3.5 points for each road game (and -3.5 for each home game), spread across the games played. Again, do this for the entire season, weighting it 2/3, and for the last 10 games, weighting it 1/3. Seems simple enough. But what does this really tell you?
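Before answering that, here's the formula as code. A minimal Python sketch: the constants come straight from Hollinger's equation, and the argument names are my own.

def power_rating(sos, sos_l10, marg, marg_l10, road, home, games, road10, home10):
    # strength-of-schedule piece: distance from .500, scaled by 0.037,
    # weighted 2/3 for the full season and 1/3 for the last 10 games
    sos_part = ((sos - 0.5) / 0.037) * 0.67 + ((sos_l10 - 0.5) / 0.037) * 0.33
    # scoring-margin piece: per-game margin, plus 3.5 points per road game
    # (minus 3.5 per home game) spread across games, same 2/3-1/3 weighting
    marg_part = 0.67 * (marg + (road - home) * 3.5 / games) \
        + 0.33 * (marg_l10 + (road10 - home10) * 3.5 / 10)
    return 100 + sos_part + marg_part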

Look at some recent NBA history, readily available on ESPN.com. Since 2002, NBA teams' end-of-season SOS has ranged from 0.484 to 0.514. The reason that everyone doesn't have a .500 SOS is that schedules are unbalanced, and certain conferences/divisions are stronger than others. But if you take the difference from 0.500 and then divide by 0.037, you find that SOS will impact a team's power rating anywhere from -0.432 to +0.378. Okay. So what?

Well, remember that we're starting off at 100 points. SOS impacts you by less than half a point either way. So how do you get teams with ratings of 86.814 (the Wizards so far this year) and 116.15 (the Heat, before their loss to the Jazz)? Well, SOS can be a bit skewed this early in the season, but even with a 0.600 SOS, the contribution to the power rating would only be about 2.7. The only other component is scoring margin. Again, if you look at the stats since 1999, you'll see that the lowest/highest scoring margins have been -11.5 and +10.2 points per game. Given that SOS contributes somewhere between -0.432 and +0.378 points of power rating, scoring margin contributes about 25 times more. This suggests that over 96% of the power rating comes from scoring margin, and less than 4% comes from SOS. Basically, Hollinger's power rating is just point differential with a slight tweak based on SOS. Not quite as advanced as something based on offensive and defensive efficiency.
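You can check those ranges on the back of an envelope:

# SOS contribution at the historical extremes cited above
print((0.484 - 0.5) / 0.037)   # about -0.432
print((0.514 - 0.5) / 0.037)   # about +0.378
# scoring margin runs from -11.5 to +10.2 per game, so it swamps
# the SOS term by a factor of roughly 25
print(11.5 / 0.432)            # about 26.6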

As for the Heat, and their 116.15 power rating? So far this year, their SOS was a very high .595. Subtract 0.5, divide by .037, and you get 2.57. So, if 100 is the baseline, their SOS contributes 2.57, and their scoring margin contributes the other 13.58. If you place this in context and look at the actual results, the interpretation is that the Heat are the best team in the NBA because they've smoked the Nets, T'Wolves, and Magic, and have lost two relatively close games. Now, if we were playing a game where you took all of the points that a team scored in a season, subtracted the points that were scored on them, and then awarded a trophy to the team with the greatest differential, then scoring margin by itself would be a great stat. But in a game with discrete wins and losses, you really should capture the variance of margins in wins and losses. I'm not inclined to look in depth at creating these measures right now (mainly because I'm lazy). But you should be able to tell that once you look under the hood, this Hollinger power rating isn't quite as advanced as the slick graphics and front-page placement on ESPN.com would have you believe.
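The Heat decomposition, in the same back-of-envelope style (assuming the season-long and last-10 numbers are equal, so the 2/3 and 1/3 weights collapse):

sos_contrib = (0.595 - 0.5) / 0.037            # about 2.57
margin_contrib = 116.15 - 100 - sos_contrib    # about 13.58, nearly all of it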

In an ideal world, Westy would be safe in his assumption that something that makes the front page of ESPN.com as the work of an "advanced stat guy" would be truly useful. Smart people would actually be putting together useful information that extends existing knowledge. Unfortunately, in our world, people have an incentive to sandbag on the truly useful stuff, and instead, we're exposed to the stuff that can fool most of the people most of the time. I'll give Hollinger the benefit of the doubt, and assume that he's got better stuff that he's keeping under wraps, hopefully because it's proprietary for some NBA team that he's consulting for. In fact, he even admits that this power ranking needs some caveats. But I don't think that most folks realize that it's as crude as I've (hopefully) demonstrated it to be. After all, when you see that the Heat are 116.150 and the Wizards are 86.814, you have all these decimal points that suggest that there's something smart going on under the hood.

-Chairman (aka O.N. Thugs)

13 comments:

Robby said...

Heat vs Lakers regular season wins.

Heat vs Lakers NBA finals champions.

Heat vs any other team regular season wins/Finals champions.

I'll take any of those bets for up to $2k each if you're interested.

Robby said...

Alternative bet. You have to not look at Hollinger's Rankings for 2 weeks and at that point come up with your own rankings. Feel free to use Stein's rankings. Then we use a pre-determined point system factoring in regular season final win totals and NBA Playoff finish position. I will take Hollinger vs you in this bet for up to $500, interested?

Chairman said...

Robby - with regard to the Heat v. Lakers, absolutely not. Odds are running something like 8 to 5 for the Heat, while they're 7 to 2 on the Lakers. You could arbitrage that way too easily.

As for the alternative bet, I'm intrigued. A number of questions. Some general ones:

Are you laying me odds, or is this even money?

Can we run it a few times? I.e., having me come up with a few different methods, each for a portion of the $500 bet, where maybe I'd have 5 different methodologies that we'd put up against Hollinger's Power Rankings, for some smaller amount each.

Can we separate regular season and playoff positions?

More technical questions. Does my method have to be some sort of numerical ranking, or are we doing just rank order?

For the regular season, we can do something like just measuring the correlation to actual W's. But how do you propose we measure playoff finish?

Chairman said...

Okay - nobody panic, just because we lost at home again. We only lost by 5, so we should still be on top of Hollinger's Power Rankings after we go pound Toronto on Saturday.

Robby said...

You're right that Hollinger probably doesn't have the best stuff, but I don't think any NBA team does either; it's probably some sports bettor, and certainly not more than 5 teams, if any, are even close. In the case of the NBA, it's likely Haralabos Voulgaris, who took 2-3 years off from betting to pursue a job in the NBA, but either nobody was willing to pay him enough or nobody was willing to give him enough control, and he has decided to return to sports betting this year.

These aren't US-facing sportsbooks, but this was the only comprehensive link I could find; currently, it looks like the best you can get on Miami is 9/4 and the best on LA is 3/1. Futures markets aren't known to be the most efficient in the sports-betting world, but there's definitely a reason why Miami has shorter odds, and it's not because the public thinks they're the favorites. LA is the huge favorite in the public's eye, as well as in the eyes of most NBA writers, non-betting analysts, and even GMs, as shown by this laughable poll (http://sports.espn.go.com/nba/news/story?id=5654644).

Your blog seemed to be arguing against the use of statistics, so I feel as though the bet shouldn't allow any statistical system of your own. Would you be able to come up with different methods not derived from some sort of statistical ranking? If so, yes, I would allow that. Regular season and playoffs would be separated. Records would be counted from after the date that the bet picks were made. I'm not sure of the exact scoring system, but it would have to be based on the numbers Hollinger provides (you would need to come up with your own somehow?). Playoffs, I think, could be handled similarly to NCAA brackets; in Hollinger's case, the bracket would be made by ranking the 1-8 Eastern/Western conference teams and choosing each winner based on the rankings set at the time of the bet, after the NBA playoff schedule is set.

Robby said...

http://odds.bestbetting.com/basketball/nba/winner/

Forgot odds link

Chairman said...

Robby - in the interest of fairness, whatever I came up with would actually be very statistical. I actually came up with a modified team-level scoring efficiency stat that seems to be a better predictor of wins than the efficiency stats that you normally see (Wins Produced, Offensive Rating, etc.). Ryan and I had a discussion about this stuff about 2 years back.

My point isn't that stats are bad. My point is that when you sell very simple stats (really, what Hollinger is doing is just taking point differential, and tweaking it slightly) as something that they aren't, then you're being disingenuous. I'm actually OK with using box score stats to evaluate team-level performance. I'm just arguing that the way Hollinger's Power Ranking does it is crude.

Some simple methods that I'd bet would be awfully close (no statistically significant difference) would be a) scoring margin, b) scoring margin adjusted for home/away, c) scoring margin w/ a cap of 15 points max per game, d) scoring margin with an additional penalty for losses, e) RPI, f) winning percentage.
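For what it's worth, most of those are trivial to compute. A rough sketch of a few of them in Python, assuming each game is just a (points_for, points_against) pair; the function names and the size of the loss penalty are my own placeholders:

def avg_margin(games):
    # method (a): plain per-game scoring margin
    return sum(pf - pa for pf, pa in games) / len(games)

def capped_margin(games, cap=15):
    # method (c): clip each game's margin at +/- cap before averaging
    return sum(max(-cap, min(cap, pf - pa)) for pf, pa in games) / len(games)

def margin_with_loss_penalty(games, penalty=3):
    # method (d): knock a few extra points off for each loss
    return sum((pf - pa) - (penalty if pf < pa else 0) for pf, pa in games) / len(games)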

Other more complicated methods would look at offensive and defensive scoring efficiency (basically points scored/allowed per possession) with some minor adjustment for pace. If I recall correctly (and maybe Ryan can refresh my memory), the various efficiency and team "quality" ratings had something like a .80 correlation w/ winning (a pretty high number, actually). I think that the methodology that I came up with was a slight improvement (maybe a .85) across the 2 or 3 seasons that I calculated it for.
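The efficiency calculation itself is simple once you estimate possessions. A sketch using the standard box-score possession approximation (the common formula, not necessarily the exact one any particular rating system uses):

def possessions(fga, orb, tov, fta):
    # common approximation: field goal attempts, minus offensive rebounds,
    # plus turnovers, plus 0.44 * free throw attempts
    return fga - orb + tov + 0.44 * fta

def efficiency(points, fga, orb, tov, fta):
    # points per 100 possessions; compute this for offense and defense
    # and take the difference for a team "quality" rating
    return 100 * points / possessions(fga, orb, tov, fta)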

Westy said...

Note that Hollinger's certainly not the only one ranking the Heat ahead of the Lakers:
link

Robby said...

OK, I slightly misread the situation, but alright, I'll bet on Hollinger vs your methods if you would like. What do you think of the basketball-reference rankings? Can I take both Basketball-Reference.com's rankings and Hollinger's rankings vs 2-6 of your own rankings separately?

What are your personal thoughts on LA/MIA/other contenders?

Chairman said...

I'm actually more comfortable with the basketball-reference.com ratings, since they're using data on possessions from the ground up. As far as betting against their method? I'm much less interested in doing so :-) That bet would be purely an empirical question, whereas the comparison with Hollinger's Power Ranking is very much a conceptual difference.

What I'd throw out there would be awfully similar to the BBR rankings, with only minor tweaks (maybe some different weightings for team "momentum," maybe some different coefficients). It would be strange if there was any statistically significant difference. And if anything, they'd have access to better coefficients based on historical data than I would.

That said, my working hypothesis is that if you had a simple way to track who the possessions came against, and then weighted the efficiency numbers (overweighted for possessions against good teams' top lineups and underweighted against bad teams and good teams backups), you'd have a better playoff predictor. At that point, you could just optimize the algorithm based on historical performance to figure out how much to over/under weight results. Unfortunately, I don't know of a really easy place to get that data.
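In code, the hypothesis is just a weighted average. A sketch, with everything hypothetical (including where the opponent weights would come from):

def weighted_efficiency(possession_log):
    # possession_log: list of (points_scored, opponent_weight) pairs, with
    # weight > 1 against good teams' top lineups and weight < 1 against
    # bad teams and good teams' backups
    total_weight = sum(w for _, w in possession_log)
    return 100 * sum(pts * w for pts, w in possession_log) / total_weight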

How does this sound? I'll have a running item on the UPL Blog, starting in early January (so that we can get a little more into the season), where I'll list Hollinger's rankings and my DelTSC statistic (Delta True Scoring Efficiency), and we'll compare regular season win totals (correlation across all NBA teams). If Hollinger beats me, I'll donate $25 on behalf of the UPL to a charity of your choice and buy you a burger next time we cross paths. If I beat Hollinger, you do the same.

What will be more interesting is that I'll also put up some "stats for idiots" up there (average scoring margin, winning percentage, etc.) along with some "advanced stats" (BBR's rankings, Wins Produced, or whatever else I can easily find). My suspicion is that the "stats for idiots" won't be much worse off than some of the "advanced stats," in a statistically significant (p<.05) way.
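The scoring itself would be simple enough. A sketch of the comparison, with made-up placeholder numbers standing in for real ratings and win totals (statistics.correlation needs Python 3.10+):

from statistics import correlation

wins = [60, 55, 47, 41, 33, 25]  # hypothetical final win totals
systems = {
    "Hollinger": [112.0, 108.5, 103.2, 100.1, 96.4, 90.8],
    "DelTSC": [8.1, 6.4, 3.0, 0.5, -2.2, -6.9],
    "win_pct": [0.73, 0.67, 0.57, 0.50, 0.40, 0.30],
}
for name, ratings in systems.items():
    print(name, round(correlation(ratings, wins), 3))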

Robby said...

Sounds good.

Robby said...

http://espn.go.com/blog/truehoop/post/_/id/21723/blowing-out-bad-teams-matters

Chairman said...

Robby - the original article is:

http://www.basketball-reference.com/blog/?p=8159

That analysis would get nowhere within sniffing distance of being published in anything resembling a peer-reviewed journal.

The easiest way to test the hypothesis with any sort of rigor would be to compare various models that incorporate scoring margin, measuring the difference in the effectiveness of the model when you cap the scoring margin of any given win/loss. The question of how much it matters is an empirical one - just optimize to see what cap gives you the best predictor, based on historical data.
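Something like this sketch, where the per-game margins and win totals stand in for real historical data:

from statistics import correlation  # Python 3.10+

def capped_avg_margin(margins, cap):
    return sum(max(-cap, min(cap, m)) for m in margins) / len(margins)

def best_cap(team_margins, team_wins, caps=range(5, 31)):
    # team_margins: per-game margin lists, one per team;
    # team_wins: final win totals, aligned with team_margins.
    # Pick the cap whose capped-margin predictor correlates best with wins.
    return max(caps, key=lambda c: correlation(
        [capped_avg_margin(m, c) for m in team_margins], team_wins))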