I've been working on this for some time now. I'm planning to write this up and submit it to the Journal of Quantitative Analysis in Sports. I'll publish a more thorough explanation of my methodology within the next couple of weeks. For now, here's the quick version.
The inputs to the model are:
Strength of Schedule (shamelessly lifted from Sagarin)
Conference Average Margin of Victory
Team Margin of Victory
Scoring Defense (points per game)
Scoring Offense (points per game)
Turnover Margin (per game)
All data other than strength of schedule is downloaded directly from the NCAA stats repository. The raw statistics are normalized by subtracting the average of all 120 teams from each team's statistic and dividing the result by the standard deviation across the 120 teams.
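Here's a minimal sketch of that normalization step for a single statistic. The team names and numbers are made up, and I'm assuming the population form of the standard deviation; the sample form would only rescale the values slightly.

```python
from statistics import mean, pstdev

def zscores(values):
    """Return (x - mean) / SD for each value of one statistic."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population SD; use stdev() for the sample form
    return [(x - mu) / sigma for x in values]

# Made-up scoring offense (points per game) for three hypothetical teams
scoring_offense = {"Team A": 35.0, "Team B": 28.0, "Team C": 21.0}

normalized = dict(zip(scoring_offense, zscores(list(scoring_offense.values()))))

for team, z in normalized.items():
    print(f"{team}: {z:+.3f}")
```

A team sitting exactly at the average comes out at 0, and a value of +1 means one standard deviation above the average, which puts all the inputs on a common scale.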
Using data from 709 games played in 2011, I fit a multiple regression model with the 6 inputs as the independent variables and a team's actual margin of victory in a game as the dependent variable. Each game was entered in the model twice, once for each team, so each team is evaluated independently. This provided a regression model with 1417 df.
The base model has an r-squared of .636.
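To make the setup concrete, here is a rough sketch of a regression fit this way using NumPy. Everything in it is synthetic: the team statistics, the pairings, the margins, and the coefficients in `made_up_beta` are invented for illustration, so the r-squared it prints will not match the .636 above. It also assumes each row uses only that team's own 6 inputs, which is one reading of "each team is evaluated independently".

```python
import numpy as np

rng = np.random.default_rng(0)
n_teams, n_inputs, n_games = 120, 6, 709

# Synthetic stand-in for the 6 normalized inputs of each team
team_stats = rng.normal(size=(n_teams, n_inputs))
made_up_beta = np.array([5.0, 1.0, 6.0, -3.0, 3.0, 2.0])  # invented, not fitted values
strength = team_stats @ made_up_beta

# Random pairings (team_i never equals team_j) and noisy synthetic margins
team_i = rng.integers(0, n_teams, size=n_games)
team_j = (team_i + rng.integers(1, n_teams, size=n_games)) % n_teams
margin = strength[team_i] - strength[team_j] + rng.normal(scale=7.0, size=n_games)

# Each game is entered twice: once per team, with the sign of the margin flipped
X = np.vstack([team_stats[team_i], team_stats[team_j]])
y = np.concatenate([margin, -margin])

# Ordinary least squares with an intercept column
A = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

# R-squared = 1 - SS_residual / SS_total
residuals = y - A @ beta
ss_res = float(residuals @ residuals)
ss_tot = float(((y - y.mean()) ** 2).sum())
r_squared = 1.0 - ss_res / ss_tot

total_df = len(y) - 1  # 2 * 709 rows minus 1
print(f"rows={len(y)}, total df={total_df}, r-squared={r_squared:.3f}")
```

Entering 709 games twice gives 1418 rows, which is where the 1417 total degrees of freedom come from.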
With the computed regression equation I can run a simulated season in which every team (i) plays every other team (j). A positive predicted y-value means that a team is predicted to win; a negative predicted y-value means that a team is predicted to lose. There are 14280 (i,j) combinations (120 x 119). After the model is run, all that is left to do is total up the number of predicted wins to establish a 1-n ranking of all 120 teams. Using the data from 2011, the following is how my model ranked all 120 BCS teams. I'm pretty comfortable with the results.
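The win-counting step can be sketched like this. The `predict_margin` function and the four ratings are hypothetical placeholders: in the real model the predicted margin comes from the regression equation, and here I simply assume it can be written as a difference of two made-up team ratings.

```python
from itertools import permutations

def count_predicted_wins(teams, predict_margin):
    """Play every ordered pair (i, j), i != j, and count predicted wins for i."""
    wins = {team: 0 for team in teams}
    for i, j in permutations(teams, 2):
        if predict_margin(i, j) > 0:  # positive predicted margin = predicted win for i
            wins[i] += 1
    return wins

# Hypothetical ratings for four made-up teams (not output from the real model)
ratings = {"Team A": 1.2, "Team B": 0.3, "Team C": -0.4, "Team D": -1.1}

def predict_margin(i, j):
    # Assumption for illustration: predicted margin is a simple rating difference
    return ratings[i] - ratings[j]

wins = count_predicted_wins(list(ratings), predict_margin)
ranking = sorted(wins, key=wins.get, reverse=True)

for rank, team in enumerate(ranking, start=1):
    print(f"{rank}. {team}: rated higher than {wins[team]} teams")

# With 120 teams the same loop covers 120 * 119 ordered pairs
n_pairs_120 = 120 * 119
```

The win total for each team is the "Rated higher than X teams" number, so the top team in a 120-team field would show 119.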
My plan is to publish a weekly ranking this fall.
|Team||Rated higher than X teams|
|San Diego State||51|
|North Carolina State||49|
|San Jose State||24|
|New Mexico State||11|