FAQ
Q: What does 'CJB Ratings' stand for?
A: My name is Chris J. Breisch. You can probably figure out the rest.
Q: Your ratings are goofy! How can you possibly have [PatheticTeam] in the top 5?
A: Depending on what you're viewing, it could be too early in the season. Most of the ratings need about 5-6 weeks of data before they start to make sense. If it's past that and you still think the ratings are wrong, perhaps it's you. :) However, if you doubt that, let me know. I'll try to look into it and see why [PatheticTeam] is rated so highly.
Q: How are your ratings calculated?
A: All of the ratings are calculated using a Colley matrix as described here. I do a few things differently from Colley, so my outputs are a bit different. Also, as far as I know, Colley only produces one rating. Colley's is comparable to my "Win" rating. The algorithm for the "Win" rating is almost exactly what is described in chapter 4 of Dr. Kenneth Massey's article on sports ratings (pdf). The only significant difference is that I add a home field factor as well. For the other ratings, the Colley matrix is just the first step. Later steps minimize the effect of surprises.
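For readers who want to see the idea in code, here is a minimal sketch of the basic, published Colley method only. It is not the modified version used for these ratings: it has no home field factor and none of the later surprise-damping steps. The standard Colley system is C r = b, where each diagonal entry of C is 2 plus the team's games played, each off-diagonal entry is minus the number of games between the two teams, and each entry of b is 1 + (wins - losses) / 2. The function name and the input format are assumptions made for illustration.

```python
from fractions import Fraction


def colley_ratings(teams, games):
    """Solve the basic Colley system C r = b.

    teams: list of team names.
    games: list of (winner, loser) pairs (hypothetical input format).
    """
    index = {team: i for i, team in enumerate(teams)}
    n = len(teams)
    # C starts with 2 on the diagonal, b starts at 1 for every team.
    C = [[Fraction(2) if i == j else Fraction(0) for j in range(n)]
         for i in range(n)]
    b = [Fraction(1) for _ in range(n)]
    for winner, loser in games:
        w, l = index[winner], index[loser]
        C[w][w] += 1          # one more game played by the winner
        C[l][l] += 1          # one more game played by the loser
        C[w][l] -= 1          # one more meeting between the two teams
        C[l][w] -= 1
        b[w] += Fraction(1, 2)
        b[l] -= Fraction(1, 2)
    # Forward elimination. The Colley matrix is symmetric positive
    # definite, so every pivot is positive and no row swaps are needed.
    for col in range(n):
        for row in range(col + 1, n):
            factor = C[row][col] / C[col][col]
            for k in range(col, n):
                C[row][k] -= factor * C[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    r = [Fraction(0)] * n
    for row in range(n - 1, -1, -1):
        rest = sum(C[row][k] * r[k] for k in range(row + 1, n))
        r[row] = (b[row] - rest) / C[row][row]
    return {team: float(r[index[team]]) for team in teams}


# One game, A beats B: A is rated 0.625 and B is rated 0.375.
print(colley_ratings(["A", "B"], [("A", "B")]))
```

Colley ratings always average 0.5 across all teams, which makes a handy sanity check on any implementation.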
Q: I've heard people talk about "Elo Chess" ratings. Do you do that?
A: No, but I will probably add an "Elo Chess" based version of my "Win" rating at some point in the future. Elo is an iterative calculation, as opposed to the matrix method promoted by Colley, and originally by Massey (I believe Massey no longer uses the matrix for his BCS ratings, but don't quote me on that). Many people consider the Elo method superior for Win-only based ratings. I'm not wholly convinced, and I do know that some of the post-processing I do would be more difficult with Elo. Wikipedia has a good write-up of Elo.
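To show what "iterative" means here, this is a minimal sketch of the standard Elo update as it is usually described for chess. It is not part of these ratings. The 400-point scale and the K-factor of 20 are conventional values chosen for illustration, and the function names are hypothetical.

```python
def elo_expected(rating_a, rating_b):
    """Expected score (between 0 and 1) for the team rated rating_a."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(winner_rating, loser_rating, k=20.0):
    """Return the new (winner, loser) ratings after a single game.

    k is the K-factor; 20 is an assumed, conventional value.
    """
    expected_win = elo_expected(winner_rating, loser_rating)
    delta = k * (1.0 - expected_win)   # bigger reward for an upset
    return winner_rating + delta, loser_rating - delta


# Two evenly rated teams: the winner gains exactly what the loser drops.
print(elo_update(1500.0, 1500.0))   # (1510.0, 1490.0)
```

Each game nudges only the two teams involved, one game at a time, whereas a matrix method solves for every team at once from the whole schedule.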
Q: Wow, Sagarin, Massey, Colley...I had no idea there were so many different rating systems. Why is yours any better?
A: Well, I can't say that it is. Certainly they've all been doing this longer than I have. Sagarin doesn't talk much about what goes into his ratings, but Massey has written and spoken many times over the years about his system and about rating systems in general. I believe Colley's method is published in detail. I started with some of the information available about their systems (and others) and produced my own, so I believe that I have improved upon their methods. As far as I know, none of the others produce projections, so at least the quality of mine can be verified. If you're interested in just how many rating systems there are out there, visit Massey's site. He tries to keep track of all of the published ones.
Q: How do you do your projections?
A: In addition to the ratings calculation, I calculate a standard deviation for each team by looking at each individual game and comparing the team's performance in that game to what would be expected based upon its rating. I then use the rating and the standard deviation to run Monte Carlo simulations of each game. For each simulation run, I save data on which team won, by how much, and what the total score was. I can then compare these results to the Vegas lines, and project which team will cover, and whether the final score will be Over or Under.
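The general shape of such a simulation can be sketched as follows. This is an illustration under stated assumptions, not the actual projection code: it assumes ratings are expressed in points, that each team's single-game performance is normally distributed around its rating, and that the point spread is quoted from the home team's side (so -3.5 means the home team is favored by 3.5). The function, its parameters, and the dictionary keys are all hypothetical. Total score is left out because it would need a scoring model that the answer above does not describe.

```python
import random


def simulate_game(home, away, home_spread, runs=10000, home_edge=0.0, seed=0):
    """Monte Carlo sketch of one matchup.

    home, away: dicts with "rating" and "sd" keys (assumed format).
    home_spread: betting line from the home side, e.g. -3.5.
    home_edge: assumed home field advantage in points.
    """
    rng = random.Random(seed)   # seeded so repeated runs agree
    home_wins = 0
    home_covers = 0
    margin_sum = 0.0
    for _ in range(runs):
        home_perf = rng.gauss(home["rating"], home["sd"]) + home_edge
        away_perf = rng.gauss(away["rating"], away["sd"])
        margin = home_perf - away_perf
        margin_sum += margin
        if margin > 0:
            home_wins += 1
        if margin + home_spread > 0:   # home team beats the spread
            home_covers += 1
    return {
        "home_win_prob": home_wins / runs,
        "home_cover_prob": home_covers / runs,
        "mean_margin": margin_sum / runs,
    }


result = simulate_game({"rating": 10.0, "sd": 5.0},
                       {"rating": 0.0, "sd": 5.0},
                       home_spread=-3.5)
print(result)
```

A team with a large standard deviation is less predictable, so its simulated margins spread out more and its cover probability drifts toward 50% even when its rating is clearly higher.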
Q: Do you include a recency effect for trending?
A: Recency effects are a big issue in sports ratings. Teams do change over the course of a season, sometimes significantly. However, significant changes are rarer than you might think, and in my mind adding recency effects may diminish the quality of the rating. There are two problems. The first is that any time you try to weight a subset of the games, you're drastically reducing the amount of data you have available, thus reducing your accuracy. The second is the problem of surprises. Sometimes a team will play way over its head or way below its norm for a game or two. You see this quite often in college basketball, where you can pretty much guarantee that every team will have 2 or 3 "clunkers" in a given season. An ideal rating system would maximize the effects of trends but minimize the effects of surprises. The solutions to the two problems aren't diametrically opposite, but they're not parallel either. For lack of a better term, I call the solutions perpendicular. If a team plays outside the norm for a couple of games, is that a trend or a surprise? My ratings consider those games to be surprises, and I make some small tweaks to minimize the effects of those games on the overall rating. If the team continues to play at the new level, eventually those games will be considered the "right" ones, and the previous games will be the surprises. So, my ratings are slow to react to trends, but I believe they are more accurate in the general sense.
Q: What about capping margin of victory?
A: I do not cap margin of victory, exactly. But there is a diminishing returns effect, at least in MLE. Essentially the margin of victory is applied along a logarithmic curve. Therefore, winning 50-10 is more significant than winning 40-10, but it's not very much more significant.
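A small sketch shows how a logarithmic curve produces diminishing returns. The exact curve used in the MLE rating is not published, so the natural log of (1 + margin) below is only an assumed stand-in, and the function name is hypothetical.

```python
import math


def damped_margin(points_for, points_against):
    """Signed, log-damped margin of victory (assumed curve: ln(1 + margin))."""
    margin = points_for - points_against
    return math.copysign(math.log1p(abs(margin)), margin)


# A 40-point win is worth only slightly more than a 30-point win.
print(damped_margin(50, 10) - damped_margin(40, 10))
# The first 10 points of margin are worth far more than the last 10.
print(damped_margin(20, 10) - damped_margin(10, 10))
```

Under this assumed curve nothing is ever capped, since a bigger win always scores higher, but each extra point adds less than the one before it.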
Q: Can you be more specific about how your ratings and projections are calculated?
A: No. :)
Q: How long have you been doing this?
A: I just stumbled upon the e-mail I sent out after I produced my first rating. It was on October 10, 2008.
Q: If you've been doing this since 2008, how come I've never heard of you before?
A: Well, 2011 is the first year that my ratings have been published anywhere. I started out with an Excel spreadsheet, which quickly morphed into a Windows C# application with an Access back end. This year I finally took the plunge and rewrote the application in a form that can be published to the web. This year's version is almost a complete rewrite from the ground up of the Windows/Access application.
Q: You're weird! Why do you do this?
A: I'm a numbers guy. I've always felt that you could apply math to sports to determine the quality of teams. I wrote my first rating system for the NFL when I was about 15. I tinkered with them off and on for about 25 years after that, before getting serious about it.
Q: What are your plans for the future?
A: My future plans include producing an "Elo Chess" style version of my Win ratings, as well as adding more sports. For now, the sports that I'm considering adding are NBA, Horse Racing, Division II college football and basketball, tennis, hockey, soccer, and baseball. It may be quite some time before all of these things get added, and I don't guarantee the order. That's basically the order that I expect, but each sport poses at least one new problem, and it may depend upon when inspiration hits me on the remaining problems. However, the hard work has been done. The rating system is general purpose enough that it can handle most anything I might like to throw at it. The sport-specific problems are relatively trivial in comparison to the problems which have already been solved.
Q: The computer component of the BCS is ridiculous. Do you really want to compare yourself to THAT?
A: The computers used in the BCS do occasionally produce some rather odd results. This is the fault of the powers-that-be behind the BCS, not the rating systems themselves, however. There are two big problems with the BCS ratings: 1) a few years ago they got rid of the MoV based computer ratings and now only consider Win-based rating systems, and 2) there aren't nearly enough computers being used. Instead of 6 computers, there should be 60. That would minimize the effect of outliers and give much higher quality results.
Q: Hey, TeamA beat TeamB, but TeamB is rated higher in your system! What's up with that?
A: My ratings look at the season in its entirety, not just one game. And even if they didn't, it would be impossible to rate head-to-head winners ahead of losers in all cases. What if TeamA beats TeamB who beats TeamC who beats TeamA? Who's ahead? Having said that, what you are describing is called a ranking violation. There are adjustments that could be made to my Win rating to minimize these ranking violations. I'm not making them currently, but likely will some day. Elo is better about minimizing ranking violations than matrix methods.
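Counting ranking violations is simple enough to sketch in a few lines. The function name and the input format below are assumptions made for illustration, and the ratings in the example are made-up numbers.

```python
def ranking_violations(ratings, games):
    """Return the games whose winner is rated below the team it beat.

    ratings: dict of team name -> rating (higher is better).
    games: list of (winner, loser) pairs (hypothetical input format).
    """
    return [(winner, loser) for winner, loser in games
            if ratings[winner] < ratings[loser]]


# The circular case from the answer above: A beat B, B beat C, C beat A.
games = [("TeamA", "TeamB"), ("TeamB", "TeamC"), ("TeamC", "TeamA")]
ratings = {"TeamA": 3.0, "TeamB": 2.0, "TeamC": 1.0}
print(ranking_violations(ratings, games))   # [('TeamC', 'TeamA')]
```

However the three teams in a cycle like this are ordered, at least one game ends up as a violation, which is why no rating system can eliminate violations entirely.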
Q: I went to that Massey site you linked, but now I'm more confused than ever. What are retrodictive and predictive rankings, and which category fits yours?
A: Sports rating systems are designed by geeks, and geeks like to make up words. Predictive rankings are exactly what they sound like: rankings that can be used to predict the outcomes of future games. Retrodictive rankings reflect what has happened in the past, with no attempt to predict the future. Rating systems based on Win/Loss only are highly retrodictive. Adding in factors that include head-to-head matchups, as discussed in the previous question, makes them more retrodictive. Rating systems that use MoV or other factors are more predictive. My MLE system is highly predictive and is the most predictive rating system I produce. That's why I like my Composite rating for a ranking of "how good" teams are. It is a blend of a highly predictive rating and a highly retrodictive one.