Update Dec 16, 2002. I wrote this about 5 years ago and pretty much had not reread it until now. This is just a general description. The method of summing z-scores of statistics has not changed, though which variables I include in the sum may have; for example, I no longer use passing yards, I now use team quarterback rating. The NFL model currently uses more inputs than the NCAA model. I find it interesting to test different rating methods, hence the creation of the Prediction Tracker. It is my intention to write up a brief description of each model: PerformanZ, Elo, least squares regression, least absolute value regression, logistic regression, scoring efficiency, pythagorean, etc., just enough to let you know what each is doing mathematically. When I get that done remains to be seen.

PerformanZ got its start a little differently than many of the other computer systems that are around today. What interested me was the non-transitivity of football: Team A beats Team B, Team B beats Team C, and Team C beats Team A. Which is the better team? The other thing I was interested in was comparing the results (predictions) of different types of systems. The obvious first choice for a model would be least squares regression. This would work best if teams played each other both home and away (like the NBA, perhaps?). I think least squares is a little more appropriate for modeling the NFL than college football, where there are over 100 teams and about 10 leagues that have little interaction, if any, with each of the other leagues. Least squares (or least absolute value) regression makes more sense in the setting of the NFL, which has (had) 30 teams with more interdivisional linkage than college football. So I took least squares to be my standard for comparisons. I then looked to see if I could come up with something totally different that would have an accuracy at least as good as least squares.
Another thing I would like from a system is that it be able to predict games more accurately than the Vegas line. Now, trying to come up with a system that can beat both least squares and the Vegas line is not an easy task. My first thought went back to the non-transitivity question; different people will give you different answers when asked why Team B is rated above Team A even though Team A has beaten Team B. To me the answer is so simple and obvious that people tend to forget about it. Simply put, the better team does not always win. Whether it is from injuries, weather, overconfidence, or turnovers, it certainly happens in football and all other sports that the better team sometimes loses to the lesser team. And that implies that the scoreboard may not always be the best place to turn when you want to compare two teams. This flies in the face of all the other systems that believe winning is the only thing that matters. If that were the case, why not just rank teams according to winning percentage and break ties based on difficulty of schedule?

Where else can we look other than the scoreboard? Well, I am one of those who believe in 'winning on the field'. I believe that in the majority of games the team that plays better on the field wins the game, so most of the time the scoreboard is a good place to look. But where I differ from the 'just win baby' crowd is that, in my mind, a team can still 'win' even if they lose: the better team may have been the better team during the game, just not on the scoreboard at the end. So I decided to take a look at game performances.

These are the factors in my college PerformanZ Ratings. The factors are similar for the NFL, and in basketball important game statistics such as points and rebounds can be used.

1. Team winning percentage.
2. How well a team can score points.
3. How well a team can stop the other team from scoring.
4. How well a team runs the ball.
5. How well a team stops the run.
6. How well a team throws the ball.
7. How well a team stops the pass.
8. Turnovers.

I use the team yardage data, such as offensive rushing yards per carry, to measure a team's offensive running ability. These things get blown out of proportion if a good team plays a bad team, so they are weighted by a difficulty-of-schedule factor. Thus, Nebraska is not rewarded for piling up 500+ rushing yards against a weak team like Pacific. Each of the eight factors is then transformed to a standard normal distribution (they are already approximately normal to begin with), so all the factors are on the same scale. Summing the eight factors gives a raw total, and teams can then be ranked based on these totals. Generally the higher-ranked teams are good all across the board; a team with a great offense won't be at the top unless they also have a good defense. The totals themselves are just meaningless numbers, but by mapping them to a points-per-game scale I am able to make comparisons to the point spread and make game predictions.

So basically I feel that PerformanZ is a measure of a team's true ability through their games to date. For example, in 1997 I had a 1-2 UCLA team in the top 5, which seemed very odd. But then UCLA proceeded to win all of their remaining games and finished in the top 5 in the national rankings. So it doesn't necessarily take long for teams to reach their true ability. And even if a team is losing, it is still possible that they are a very good team and would have a legitimate chance to win any form of playoff tournament. I'll get a handful of teams like this every year that deviate from where the consensus would place them. I think these are the teams to look at closely. These teams are often over- or underrated by the public at large. Or they could just be a team with great players where some other factor, such as coaching, keeps them from reaching their winning potential.
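The z-score construction described above can be sketched in a few lines. The numbers here are random stand-ins for the eight schedule-adjusted factors (in the real ratings the defensive and turnover factors would first be oriented so that higher always means better); what the sketch shows is the standardize-then-sum step that puts all eight factors on a common scale.

```python
import numpy as np

rng = np.random.default_rng(0)
n_teams = 112  # roughly the Div I-A field mentioned in the text

# Hypothetical schedule-adjusted per-team values for the eight factors
# (win pct, scoring offense/defense, rush offense/defense, pass
# offense/defense, turnovers), already oriented so higher is better.
factors = rng.normal(size=(n_teams, 8))

# Transform each factor to a standard normal scale: subtract the mean
# across teams and divide by the standard deviation. Every column now
# has mean 0 and SD 1, so the factors are directly comparable.
z = (factors - factors.mean(axis=0)) / factors.std(axis=0)

# The raw PerformanZ-style total is the sum of the eight z-scores;
# a team must be strong across the board to land near the top.
raw_totals = z.sum(axis=1)
ranking = np.argsort(-raw_totals)  # team indices, best first
```

The raw totals are unitless, which matches the author's point that they are "meaningless numbers" until mapped onto a points-per-game scale for comparison with point spreads.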
PerformanZ is created to measure past ability, but obviously past ability is the best predictor of future ability. So how well does this system predict? Last season in the rec.sports.football.college college football pool (cfpoool.com) I was the highest rated system of all. I was 73.7% on their preselected matchups, compared to the BCS systems: Sagarin 70.6% and Dunkel 68.2%. (I encourage everyone with a system to enter their picks as a system this year.) In 1998 my straight-up picking percentage was 80.7% for all Div IA games and 70.4% for NFL games. The normal approximations work better with the 112+ teams in the NCAA than the 30+ of the NFL; that and the parity in the NFL explain why there is such a large difference.

Notice from my 1998 NFL bias/variance plot that my PerformanZ ratings make unbiased predictions. That is, when I make predictions for next week's games, on average I am off by 0 points. Ideally the best system would be one that is unbiased (the Vegas point spread was shown to be unbiased in Stern, 1991) and also has the smallest variance. Also, notice from the bias/variance plot how large the standard deviations are: about 14 points for all of the NFL systems measured. That is huge. So if Vegas says Dallas is a 3-point favorite over Chicago, a 95% confidence interval on that game ranges anywhere from Dallas winning by about 30 to Chicago winning by about 24. That gives you an indication of why a team can be ranked higher than a team they have lost to: the ratings/lines may be correct on average, but the variances are extremely large. For college football these variances are even larger than the 14 points seen in the NFL.
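The Dallas/Chicago interval above is just the usual normal confidence interval around an unbiased prediction, and is easy to reproduce. The spread and standard deviation are taken from the text; the 1.96 multiplier is the standard two-sided 95% normal quantile.

```python
# Back-of-the-envelope 95% interval for a single game's margin:
# the line favors Dallas by 3, and the bias/variance plot puts the
# standard deviation of (actual margin - predicted margin) near 14.
spread = 3.0   # Dallas favored by 3 points
sd = 14.0      # approximate prediction standard deviation
z95 = 1.96     # two-sided 95% normal quantile

low = spread - z95 * sd    # most extreme Chicago outcome in the interval
high = spread + z95 * sd   # most extreme Dallas outcome in the interval
print(f"95% interval: Dallas by {high:.1f} down to Chicago by {-low:.1f}")
# Roughly Dallas by 30 down to Chicago by 24.
```

A swing of 54+ points on a 3-point line is the author's point in a nutshell: even a perfectly calibrated rating says very little about any single game, which is how a team can reasonably be ranked above a team that beat them.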