Tuesday, February 11, 2014

All FBS Recruits from 2002-2014

The map shows recruits for the last 4 years.  Use the filters to make the map more usable.

The size of the logo corresponds to the number of stars given to the recruit by Scout.com.  I assigned 1 star to all unranked recruits, since Scout.com does not rate recruits below 2 stars.

This visualization is copyrighted material.  You're welcome to use it as you'd like, but please credit me, Paul Dalen, as the author and provide a link back to this blog.

Monday, October 21, 2013

HuskerMath's FBS Power Ranking Win Percentage Matrix

Use the filters at the bottom to make the matrix a bit easier to handle, as well as easier to load.

Friday, April 12, 2013

Which Teams Overachieved and Which Teams Underachieved in 2012?

Tonight, while putting together the final validation of the 2013 FBS Prediction Model, my analysis of the results highlighted some very interesting, and kind of disconcerting, things about Nebraska's 2012 season.

The model simulates every FBS game for the entire season.  

As validation, I used the methodology to simulate the 2012 season and prepared a comparison between the model results and the real game results.  

This first chart shows some of those results: predicted versus actual wins. 

A positive number on the vertical axis means the model predicted more wins than a team achieved.  Likewise, the negative numbers mean the model predicted fewer wins than the team achieved.

The average delta between the predicted win total and the actual win total is 1.15 games per team. 

21% of teams were predicted correctly.  20% were under-predicted by one win, and 29% were over-predicted by one win.  The model predicted more than 70% of seasons accurately to within one win.

The model under-predicted 12% of seasons by two wins and 5% by three wins.  It over-predicted 7% of seasons by two wins and 6% of seasons by three wins.
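For anyone who wants to reproduce these summary numbers from the full table at the end of this post, here is a minimal sketch. It assumes that "average delta" means the mean absolute difference between predicted and actual wins, and that the one-win buckets are built from the rounded delta. The function name `summarize` is mine for illustration, and the five rows are copied from the table just to keep the example short.

```python
from collections import Counter

# (team, predicted wins, actual wins): five rows copied from the results table
rows = [
    ("Northern Illinois", 13.1, 12),
    ("Alabama", 12.7, 13),
    ("Ohio State", 9.1, 12),
    ("Nebraska", 6.9, 10),
    ("Akron", 4.0, 1),
]

def summarize(rows):
    """Return mean absolute delta, counts per rounded delta, and share within one win."""
    # Positive delta: the model predicted more wins than the team achieved.
    deltas = [pred - actual for _, pred, actual in rows]
    mean_abs_delta = sum(abs(d) for d in deltas) / len(deltas)
    rounded = Counter(round(d) for d in deltas)
    within_one = sum(n for d, n in rounded.items() if abs(d) <= 1) / len(deltas)
    return mean_abs_delta, rounded, within_one

mean_abs_delta, rounded, within_one = summarize(rows)
print(f"mean |delta| = {mean_abs_delta:.2f}, within one win = {within_one:.0%}")
```

Run against all 123 rows instead of this small sample, the same function should land close to the percentages quoted above, allowing for rounding in the published table.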

Can you guess which team was the most over-achieving?  

You betcha. Dear Old NU.  

The model predicted 6.91 wins but the Huskers finished with 10 wins, for a delta of -3.09 wins.  

While some may be inclined to see this as a positive, and it's certainly better than winning three fewer games than the model anticipated, the fact remains that Nebraska's scoring offense and scoring defense (the basis of the model) should have resulted in a 7-win season...not a 10-win season.

Remember the amazing streak of 4th quarter heroics in the Wisconsin, Northwestern, Michigan State, and Penn State games?  Remember how Denard Robinson left the game before halftime?   

Sometimes, it's better to be lucky than good.

Interestingly, Ohio State is the #2 most overachieving team, with a predicted win total of 9.12.  Had they played in the Big Ten Championship Game or a bowl game, they would almost certainly have been the most overachieving team by a large margin.

The Top-10 overachievers and Bottom-10 underachievers are:


I put a table with the full results of the model at the end of this post.

Finally, the model provides some insight into the consistency of a team's on-field performance.  The standard deviation of the predicted wins can be used as a proxy for a team's consistency.
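To show why the spread of simulated win totals says something about consistency, here is a rough sketch. It is not the actual prediction model. It assumes each game can be reduced to a single win probability, and the two schedules below are hypothetical. Both teams are expected to win 6 games, but the team that plays twelve coin-flip games has a much wider spread of season outcomes than the team with six near-certain wins and six near-certain losses.

```python
import random
import statistics

def simulate_seasons(win_probs, n_sims=10_000, seed=0):
    """Simulate n_sims seasons; return (mean wins, standard deviation of wins)."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    totals = [sum(rng.random() < p for p in win_probs) for _ in range(n_sims)]
    return statistics.mean(totals), statistics.pstdev(totals)

# Hypothetical per-game win probabilities for two 12-game schedules.
coin_flip_team = [0.5] * 12
lopsided_team = [0.95] * 6 + [0.05] * 6

for name, probs in [("coin flip", coin_flip_team), ("lopsided", lopsided_team)]:
    mean_wins, sd_wins = simulate_seasons(probs)
    print(f"{name}: mean wins = {mean_wins:.2f}, std dev = {sd_wins:.2f}")
```

With independent games the standard deviation is the square root of the sum of p(1-p) over the schedule, so a team with a lot of toss-up games will always show a larger spread.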

So, can you guess which team had the largest standard deviation in the model results, and by proxy, was the least consistent and predictable?  

Yup, Dear Old NU.  Again.

Now, guess which team had the lowest standard deviation in model results.  

Hint:  they won the National Championship.

I'll let my readers draw their own conclusions about this one.

By the way, I'm now a writer over at Football Study Hall.  Stop by and check it out.  I'll be writing about more than just the Huskers over there.

GBR!
@HuskerMath







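| Rank | Team | Conf | Pred. Wins | Actual Wins | Delta (rnd) | Pred. Delta | Abs(Pred Delta) |
|---|---|---|---|---|---|---|---|
| 1 | Northern Illinois | MAC | 13.1 | 12 | 1 | 1.1 | 1.1 |
| 2 | Alabama | SEC | 12.7 | 13 | 0 | -0.3 | 0.3 |
| 3 | Florida State | ACC | 12.5 | 12 | 1 | 0.5 | 0.5 |
| 4 | Utah State | WAC | 11.2 | 11 | 0 | 0.2 | 0.2 |
| 5 | Oregon | Pac-12 | 11.0 | 12 | -1 | -1.0 | 1.0 |
| 6 | UCF | C-USA | 11.0 | 10 | 1 | 1.0 | 1.0 |
| 7 | Georgia | SEC | 10.9 | 12 | -1 | -1.1 | 1.1 |
| 8 | Boise State | MWC | 10.8 | 11 | 0 | -0.2 | 0.2 |
| 9 | Arizona State | Pac-12 | 10.7 | 8 | 3 | 2.7 | 2.7 |
| 10 | Cincinnati | Big East | 10.6 | 10 | 1 | 0.6 | 0.6 |
| 11 | Tulsa | C-USA | 10.5 | 11 | 0 | -0.5 | 0.5 |
| 12 | BYU | Ind | 10.5 | 8 | 2 | 2.5 | 2.5 |
| 13 | Texas A&M | SEC | 10.4 | 11 | -1 | -0.6 | 0.6 |
| 14 | Kansas State | Big 12 | 9.9 | 11 | -1 | -1.1 | 1.1 |
| 15 | North Carolina | ACC | 9.8 | 8 | 2 | 1.8 | 1.8 |
| 16 | Rutgers | Big East | 9.8 | 9 | 1 | 0.8 | 0.8 |
| 17 | Clemson | ACC | 9.8 | 11 | -1 | -1.2 | 1.2 |
| 18 | Oklahoma State | Big 12 | 9.8 | 8 | 2 | 1.8 | 1.8 |
| 19 | Bowling Green | MAC | 9.7 | 8 | 2 | 1.7 | 1.7 |
| 20 | Ohio | MAC | 9.7 | 9 | 1 | 0.7 | 0.7 |
| 21 | Vanderbilt | SEC | 9.7 | 9 | 1 | 0.7 | 0.7 |
| 22 | San Jose State | WAC | 9.6 | 11 | -1 | -1.4 | 1.4 |
| 23 | Fresno State | MWC | 9.6 | 9 | 1 | 0.6 | 0.6 |
| 24 | Stanford | Pac-12 | 9.6 | 12 | -2 | -2.4 | 2.4 |
| 25 | Kent State | MAC | 9.5 | 11 | -1 | -1.5 | 1.5 |
| 26 | Wisconsin | Big Ten | 9.4 | 8 | 1 | 1.4 | 1.4 |
| 27 | South Carolina | SEC | 9.2 | 11 | -2 | -1.8 | 1.8 |
| 28 | Notre Dame | Ind | 9.2 | 12 | -3 | -2.8 | 2.8 |
| 29 | Arkansas State | Sun Belt | 9.2 | 10 | -1 | -0.8 | 0.8 |
| 30 | Ohio State | Big Ten | 9.1 | 12 | -3 | -2.9 | 2.9 |
| 31 | San Diego State | MWC | 9.1 | 9 | 0 | 0.1 | 0.1 |
| 32 | Penn State | Big Ten | 8.8 | 8 | 1 | 0.8 | 0.8 |
| 33 | Florida | SEC | 8.7 | 11 | -2 | -2.3 | 2.3 |
| 34 | LSU | SEC | 8.7 | 10 | -1 | -1.3 | 1.3 |
| 35 | Oregon State | Pac-12 | 8.7 | 9 | 0 | -0.3 | 0.3 |
| 36 | Louisiana-Lafayette | Sun Belt | 8.6 | 9 | 0 | -0.4 | 0.4 |
| 37 | Northwestern | Big Ten | 8.6 | 10 | -1 | -1.4 | 1.4 |
| 38 | Oklahoma | Big 12 | 8.5 | 10 | -2 | -1.5 | 1.5 |
| 39 | Michigan | Big Ten | 8.3 | 8 | 0 | 0.3 | 0.3 |
| 40 | UCLA | Pac-12 | 8.2 | 9 | -1 | -0.8 | 0.8 |
| 41 | Louisiana-Monroe | Sun Belt | 8.2 | 8 | 0 | 0.2 | 0.2 |
| 42 | Pittsburgh | Big East | 8.1 | 6 | 2 | 2.1 | 2.1 |
| 43 | Western Kentucky | Sun Belt | 8.1 | 7 | 1 | 1.1 | 1.1 |
| 44 | Louisville | Big East | 8.1 | 11 | -3 | -2.9 | 2.9 |
| 45 | Mississippi State | SEC | 8.1 | 8 | 0 | 0.1 | 0.1 |
| 46 | Rice | C-USA | 7.9 | 7 | 1 | 0.9 | 0.9 |
| 47 | TCU | Big 12 | 7.8 | 7 | 1 | 0.8 | 0.8 |
| 48 | SMU | C-USA | 7.8 | 7 | 1 | 0.8 | 0.8 |
| 49 | Louisiana Tech | WAC | 7.7 | 9 | -1 | -1.3 | 1.3 |
| 50 | USC | Pac-12 | 7.7 | 7 | 1 | 0.7 | 0.7 |
| 51 | Nevada | MWC | 7.6 | 7 | 1 | 0.6 | 0.6 |
| 52 | Michigan State | Big Ten | 7.5 | 7 | 1 | 0.5 | 0.5 |
| 53 | Syracuse | Big East | 7.4 | 8 | -1 | -0.6 | 0.6 |
| 54 | North Carolina State | ACC | 7.4 | 7 | 0 | 0.4 | 0.4 |
| 55 | Toledo | MAC | 7.2 | 9 | -2 | -1.8 | 1.8 |
| 56 | East Carolina | C-USA | 7.2 | 8 | -1 | -0.8 | 0.8 |
| 57 | Navy | Ind | 7.2 | 8 | -1 | -0.8 | 0.8 |
| 58 | Air Force | MWC | 7.2 | 6 | 1 | 1.2 | 1.2 |
| 59 | Georgia Tech | ACC | 7.1 | 7 | 0 | 0.1 | 0.1 |
| 60 | Texas | Big 12 | 7.1 | 9 | -2 | -1.9 | 1.9 |
| 61 | Texas Tech | Big 12 | 7.0 | 8 | -1 | -1.0 | 1.0 |
| 62 | Nebraska | Big Ten | 6.9 | 10 | -3 | -3.1 | 3.1 |
| 63 | Western Michigan | MAC | 6.8 | 4 | 3 | 2.8 | 2.8 |
| 64 | UTSA | WAC | 6.6 | 8 | -1 | -1.4 | 1.4 |
| 65 | Central Michigan | MAC | 6.6 | 7 | 0 | -0.4 | 0.4 |
| 66 | New Mexico | MWC | 6.5 | 4 | 3 | 2.5 | 2.5 |
| 67 | Virginia Tech | ACC | 6.4 | 7 | -1 | -0.6 | 0.6 |
| 68 | Connecticut | Big East | 6.4 | 5 | 1 | 1.4 | 1.4 |
| 69 | Iowa State | Big 12 | 6.4 | 6 | 0 | 0.4 | 0.4 |
| 70 | Washington | Pac-12 | 6.3 | 7 | -1 | -0.7 | 0.7 |
| 71 | Middle Tennessee | Sun Belt | 6.3 | 8 | -2 | -1.7 | 1.7 |
| 72 | Ball State | MAC | 6.3 | 9 | -3 | -2.7 | 2.7 |
| 73 | Baylor | Big 12 | 6.2 | 8 | -2 | -1.8 | 1.8 |
| 74 | Mississippi | SEC | 6.2 | 7 | -1 | -0.8 | 0.8 |
| 75 | Minnesota | Big Ten | 6.2 | 6 | 0 | 0.2 | 0.2 |
| 76 | Troy | Sun Belt | 6.1 | 5 | 1 | 1.1 | 1.1 |
| 77 | Utah | Pac-12 | 6.1 | 5 | 1 | 1.1 | 1.1 |
| 78 | Memphis | C-USA | 5.9 | 4 | 2 | 1.9 | 1.9 |
| 79 | Iowa | Big Ten | 5.6 | 4 | 2 | 1.6 | 1.6 |
| 80 | Miami (Florida) | ACC | 5.4 | 7 | -2 | -1.6 | 1.6 |
| 81 | West Virginia | Big 12 | 5.3 | 7 | -2 | -1.7 | 1.7 |
| 82 | Purdue | Big Ten | 5.2 | 6 | -1 | -0.8 | 0.8 |
| 83 | Arizona | Pac-12 | 5.2 | 8 | -3 | -2.8 | 2.8 |
| 84 | Texas State | WAC | 5.1 | 4 | 1 | 1.1 | 1.1 |
| 85 | Houston | C-USA | 5.1 | 5 | 0 | 0.1 | 0.1 |
| 86 | Wyoming | MWC | 4.9 | 4 | 1 | 0.9 | 0.9 |
| 87 | South Alabama | Sun Belt | 4.8 | 2 | 3 | 2.8 | 2.8 |
| 88 | Marshall | C-USA | 4.8 | 5 | 0 | -0.2 | 0.2 |
| 89 | UNLV | MWC | 4.8 | 2 | 3 | 2.8 | 2.8 |
| 90 | Virginia | ACC | 4.8 | 4 | 1 | 0.8 | 0.8 |
| 91 | Maryland | ACC | 4.8 | 4 | 1 | 0.8 | 0.8 |
| 92 | North Texas | Sun Belt | 4.7 | 4 | 1 | 0.7 | 0.7 |
| 93 | Colorado State | MWC | 4.7 | 4 | 1 | 0.7 | 0.7 |
| 94 | Buffalo | MAC | 4.6 | 4 | 1 | 0.6 | 0.6 |
| 95 | Indiana | Big Ten | 4.5 | 4 | 0 | 0.5 | 0.5 |
| 96 | Tennessee | SEC | 4.4 | 5 | -1 | -0.6 | 0.6 |
| 97 | UTEP | C-USA | 4.4 | 3 | 1 | 1.4 | 1.4 |
| 98 | Florida International | Sun Belt | 4.3 | 3 | 1 | 1.3 | 1.3 |
| 99 | UAB | C-USA | 4.3 | 3 | 1 | 1.3 | 1.3 |
| 100 | Duke | ACC | 4.2 | 6 | -2 | -1.8 | 1.8 |
| 101 | Akron | MAC | 4.0 | 1 | 3 | 3.0 | 3.0 |
| 102 | Army | Ind | 4.0 | 2 | 2 | 2.0 | 2.0 |
| 103 | South Florida | Big East | 3.9 | 3 | 1 | 0.9 | 0.9 |
| 104 | Temple | Big East | 3.9 | 4 | 0 | -0.1 | 0.1 |
| 105 | Boston College | ACC | 3.8 | 2 | 2 | 1.8 | 1.8 |
| 106 | Florida Atlantic | Sun Belt | 3.8 | 3 | 1 | 0.8 | 0.8 |
| 107 | Missouri | SEC | 3.5 | 5 | -2 | -1.5 | 1.5 |
| 108 | Arkansas | SEC | 3.5 | 4 | -1 | -0.5 | 0.5 |
| 109 | Auburn | SEC | 3.4 | 3 | 0 | 0.4 | 0.4 |
| 110 | Wake Forest | ACC | 3.3 | 5 | -2 | -1.7 | 1.7 |
| 111 | Miami (Ohio) | MAC | 3.3 | 4 | -1 | -0.8 | 0.8 |
| 112 | Washington State | Pac-12 | 3.1 | 3 | 0 | 0.1 | 0.1 |
| 113 | California | Pac-12 | 2.8 | 3 | 0 | -0.2 | 0.2 |
| 114 | Hawaii | MWC | 2.7 | 3 | 0 | -0.3 | 0.3 |
| 115 | Kentucky | SEC | 2.5 | 2 | 1 | 0.5 | 0.5 |
| 116 | New Mexico State | WAC | 2.4 | 1 | 1 | 1.4 | 1.4 |
| 117 | Illinois | Big Ten | 2.4 | 2 | 0 | 0.4 | 0.4 |
| 118 | Eastern Michigan | MAC | 2.3 | 2 | 0 | 0.3 | 0.3 |
| 119 | Tulane | C-USA | 1.8 | 2 | 0 | -0.2 | 0.2 |
| 120 | Kansas | Big 12 | 1.8 | 1 | 1 | 0.8 | 0.8 |
| 121 | Southern Mississippi | C-USA | 1.7 | 0 | 2 | 1.7 | 1.7 |
| 122 | Idaho | WAC | 1.3 | 1 | 0 | 0.3 | 0.3 |
| 123 | Colorado | Pac-12 | 0.9 | 1 | 0 | -0.1 | 0.1 |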

Saturday, April 6, 2013

Jack Hoffman's Touchdown Run by the numbers

Jack's 69 yards per carry is at least 11 standard deviations above the mean for 7-year-olds.  What more can you say?




Wednesday, March 27, 2013

Building NCAA Tournament Bracket Spreadsheets

Recently I wanted to do some analysis on the NCAA tournament brackets, but I couldn't find a simple source of data.  There are lots of .pdfs of previous years' brackets on the internet, but those don't convert to Excel very well.
So, let's do something about it.

I have a spreadsheet template built on one designed by the great folks at Vertex42.com that does a lot of the calculations.  We just have to input the first round's games and the scores for the following rounds.  I built spreadsheet brackets for 2000-2012.   I want to build brackets all the way back to 1985.  

When we do, we'll make them available for anyone to use.  There's great analytics here...we just need to get the data into a usable format.

So, download the .zip file.  Open the template file, input the first round games on the second tab, then flip back to the first tab and add scores for the rest of the rounds.  The spreadsheet will populate the winners into the next round and add the info to the game summaries at the bottom.  Remember, make sure you get the seeds right...the box scores usually have the winner first.  We need to make sure the correct seed is first.


If you want to complete a year, please add a comment below about which year you're working on.  When you're done with a spreadsheet, email it to me at paul@huskermath.com and I'll add it to the .zip file.

I've been using the CBSSports site to fill in first round data.  

Thanks for the help!

Paul

2000-2012 - complete
1999
1998
1997
1996
1995
1994
1993
1992
1991
1990
1989
1988
1987
1986 - complete
1985 - complete

Monday, March 25, 2013

A crazy Sweet 16?

This year saw the first #15 seed, Florida Gulf Coast, advance to the Sweet 16.  In all, it feels like an absolutely crazy year, with one upset after another.  

But is the current Sweet 16 lineup really that much crazier than past years?


If the top 4 seeds in each region advance to the Sweet 16 then the average seeding in the tournament is 2.5.  So, an average seeding higher than 2.5 indicates that a lower seeded team advanced in place of one of the top four in each region.
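The 2.5 figure is just the mean of seeds 1 through 4 repeated across the four regions. Here is a quick sketch of the arithmetic. The `with_one_upset` lineup is a hypothetical example, not any particular year's field.

```python
from statistics import mean

def average_seed(seeds):
    """Average seed of the teams in a Sweet 16 field."""
    return mean(seeds)

# A 'perfect' Sweet 16: seeds 1-4 advance in each of the four regions.
perfect = [1, 2, 3, 4] * 4

# Hypothetical field: one 2 seed is replaced by a 15 seed, everything else is chalk.
with_one_upset = [1, 15, 3, 4] + [1, 2, 3, 4] * 3

print(average_seed(perfect))         # 2.5
print(average_seed(with_one_upset))  # 3.3125
```

One double-digit seed is enough to move the average for the whole field by most of a point.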



This chart shows two data points for the last 11 Sweet 16 rounds.  The line in blue is the average seed of the 16 teams playing each year.  The red line marks a hypothetically perfect year...a year in which the 1, 2, 3, and 4 seeds in each region advance to the Sweet 16.

Over the 11 years, the average seed of the Sweet 16 round is 4.35.  From 2007-2009 the average seed decreased significantly, but then it increased dramatically in 2010 and 2011, dropped a bit in 2012, and reached an 11-year high this year.  So maybe you aren't imagining things.  There really is a #15 seed in the Sweet 16, and it really is crazy.


Each year the NCAA publishes a bracket with 32 teams on each side.  Although the names of the regionals have changed depending on the locations, the design of the bracket doesn't.  If I give each region a number looking like this I can break down the averages even further.




Remember, the average seed for a 'perfect' region is 2.5.  From 2003-2013 there have been only five 'perfect' regions in the Sweet 16, and two of those five were in 2009.  


So, next year, when you plan your brackets before the tournament starts, you might want to remember this...pick at least one team in the top four of each region to lose in the first or second round.  



This year's West Regional, which appears in this chart as Region 2, had the 2nd highest average seed of the 11 years. 

And FGCU wasn't even playing in the West.  It is in the South, or Region 3 on the chart for 2013.



If I assign a 'bracket position' to each game, numbered 1-16, it looks like the chart below.



Returning to the idea of a 'perfect region', for each bracket position, there is an expected seed.  And from that we can calculate a variance each year from the expected or perfect seed.  
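Here is a small sketch of that calculation for a single region. It assumes the delta is the expected seed minus the actual seed, so that a 15 seed sitting where a 2 seed was expected gives 2 - 15 = -13. The order of the bracket positions and the example region are illustrative only.

```python
def seed_deltas(expected, actual):
    """Delta per bracket position: expected seed minus the seed that actually advanced."""
    return [e - a for e, a in zip(expected, actual)]

# Expected seeds for the four Sweet 16 slots of one region (position order is assumed).
expected = [1, 4, 3, 2]

# Illustrative region: the 1, 4, and 3 seeds advance and a 15 seed takes the 2 seed's slot.
actual = [1, 4, 3, 15]

deltas = seed_deltas(expected, actual)
print(deltas)           # [0, 0, 0, -13]
print(deltas.count(0))  # 3 positions went to the expected seed
```

Counting the zeros across all 16 positions gives the number of expected seeds that advanced in a given year.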


Over the 11 years, the delta between the team actually sitting in the bracket position and the expected perfect seed looks like this:












See that bright red -13 on the bottom line?  Yeah, that's FGCU.  

The green zeroes represent bracket positions where the expected team (#1, 2, 3, or 4 seed) advanced to the Sweet 16.  The average number of expected seeds advancing over the 11 years was 9.9.  In 2013, 10 expected seeds advanced.  


Based on that, one has to conclude that FGCU's advance to the Sweet 16, while very Cinderella-ish, is largely responsible for the 11-year high average seed in the Sweet 16 for 2013.  


So, how did your bracket turn out?  Raise your hand if you picked FGCU in the Sweet 16.


Liar.


GBR!




Tuesday, March 12, 2013

Comparing Nebraska's and Alabama's Dynasties

The first part of this analysis was published on CornNation.com in January.  The second and third parts, however, have not been published yet.

Intro

There are myriad ways to compare the 5-year dynasties that Nebraska and Alabama have put together.  Some indicate Nebraska’s was more impressive, others indicate that Alabama’s was more so.   This is my attempt to compare the two using defensible statistical analysis.  It is not the final word on this issue; I’m doing it simply to attempt to get at the question of who accomplished more during their 5-year run.


I organized the analysis along offense, defense, margin of victory, win-loss record, and strength of schedule.

Where I state that there is sufficient evidence to conclude 'X', the statement is based on standard hypothesis testing (t-tests) and evaluated at the alpha=.10 level of significance.  The conclusions I draw regarding win/loss records and the overall conclusions are subjective and not based on hypothesis testing.
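For readers who want to see what a test like this looks like, here is a rough sketch using made-up numbers, not the actual game data. It computes Welch's t statistic for two samples and, to stay within Python's standard library, approximates the two-sided p-value with the normal distribution. That approximation is an assumption of this sketch and is only reasonable with a large number of games per team; the function name `welch_test` and both score lists are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance

def welch_test(a, b, alpha=0.10):
    """Two-sided test of equal means; returns (t statistic, approximate p-value, significant?)."""
    # Standard error of the difference in means, allowing unequal variances.
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (mean(a) - mean(b)) / se
    # Normal approximation to the t distribution (an assumption of this sketch).
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p, p < alpha

# Hypothetical points scored per game for two teams.
team_a = [45, 38, 52, 41, 49, 35, 44, 56]
team_b = [31, 27, 38, 35, 24, 41, 33, 29]

t, p, significant = welch_test(team_a, team_b)
print(f"t = {t:.2f}, p = {p:.4f}, significant at alpha=.10: {significant}")
```

When the p-value is below .10 the difference is called significant; otherwise there is insufficient evidence to say one team was better.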


Offense 1


This first chart illustrates the average points that Nebraska and Alabama scored when you categorize opponents by end of season ranking.  For simplicity's sake, I used the end of season Congrove Composite Index.  Using the end of season ranking is better because it’s actually available and goes a long way toward identifying teams that were ranked at one point in the season but should not have been, or that were unranked or lower ranked but proved to be better that season.

 The number next to each point on the graph is the number of games Nebraska and Alabama played against teams of that rank.

As you would expect, as the opponents’ rank goes down, the average score that Nebraska and Alabama scored climbs.  For opponents of all ranks, Nebraska’s average points scored is markedly higher.  For all ranks, the average Nebraska score is 42.82 and the average Alabama score is 33.84.  There is sufficient evidence to conclude that Nebraska's offense was superior to Alabama's.

Offense 2

This next chart compares Nebraska and Alabama scoring as a percentage of the average scoring allowed by their opponents.  While this is much the same as the chart above, it factors in the additional information of their opponent's defenses.  Obviously, if Nebraska's average scoring difference came because it played a 5-year slate of defensive duds then the argument that Nebraska scores more points is suspect.


Considering Nebraska and Alabama scoring as a percentage of their opponents’ scoring defense, one would expect that the percentage would remain basically steady, or show a slight increase as the rank of an opponent decreases.  This, however, does not seem to be the case.  If anything, there is a slight negative correlation between percentage scored and opponent rank.  It’s difficult to say why this is the case, but I would speculate that it’s because 150% of an opponent ranked 1-10 is, in real points, much less than 150% of an opponent ranked 100-110.  These games would be the times that 3rd and 4th string is played, which sometimes leads to offensive mistakes and ‘garbage time’ points for the opponent.
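To make the "150%" point concrete, here is a tiny sketch of the percentage calculation. The points-allowed averages are hypothetical and were picked only to show how the same percentage translates into very different point totals.

```python
def scoring_pct(points_scored, opp_avg_allowed):
    """Points scored as a percentage of the opponent's average points allowed."""
    return 100 * points_scored / opp_avg_allowed

def points_at_pct(pct, opp_avg_allowed):
    """Points needed to reach a given percentage of the opponent's average allowed."""
    return pct / 100 * opp_avg_allowed

# Hypothetical season averages: a top-10 defense and a defense ranked 100+.
strong_defense_allowed = 14.0
weak_defense_allowed = 35.0

print(points_at_pct(150, strong_defense_allowed))  # 21.0 points
print(points_at_pct(150, weak_defense_allowed))    # 52.5 points
print(scoring_pct(42, strong_defense_allowed))     # 300.0 percent
```

Hanging 150% on a weak defense takes more than 30 extra points in this example, which is one reason the percentage may flatten or dip once the backups are in.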

The flatter trend of the Alabama line may indicate that Alabama was more consistent on offense than Nebraska over the five years.  While they did not put up the sheer number of points that Nebraska did, their offensive production was remarkably steady.  Nebraska, on the other hand, was less consistent, and their performance against teams ranked 101-110 actually fell below Alabama’s against comparably ranked opponents.  This notwithstanding, there is sufficient evidence to conclude that NU's offense was superior to Alabama's.


Margin of Victory



As one would expect, the average margin of victory by Nebraska and Alabama increases as their opponents’ rank decreases.  Both show a steady and reasonably linear relationship between average margin of victory and opponent rank.  The exception is Nebraska’s average margin of victory against opponents ranked 21-30.  This is because there are only two games here, and Nebraska lost one of them, leading to a much smaller average.

Generally, we can state that both teams did what great teams are supposed to do…they consistently beat other teams up.  As their opponents’ rank decreases, those beatings are more severe.  It’s worth noting that Nebraska's average margin of victory against teams ranked 1-10 was about 2.8 times that of Alabama’s (17.5 vs 6.3).  Against teams ranked 11-20, Nebraska’s average margin of victory was almost three times that of Alabama (20.0 vs 7.3). For opponents of all ranks, Nebraska’s average margin of victory was 28.2 points and Alabama's was 22.0 points.  There is sufficient evidence to conclude that Nebraska had a greater average margin of victory than Alabama.


Defense 1

This next chart illustrates a very similar comparison to Offense 1, but it compares the defenses that Nebraska and Alabama put on the field by measuring the average opponent score, again broken down by end of season rank.  As above, the numbers on the chart indicate the number of teams Nebraska and Alabama played in that rank group.




For teams ranked in the top-10, the difference in scoring defense is small. For Nebraska, it’s 18.2; for Alabama, it’s 20.2. For most other opponent rank categories Alabama has a slight performance advantage, ranging from about 3-7 points.  For opponents of all ranks, Nebraska's opponents averaged 14.41 points and Alabama’s averaged 11.82 points. There is insufficient evidence to identify one defense as better than the other over the entire five years.



Defense 2

The data points illustrated in the next chart are the average opponents’ score as a percentage of Nebraska and Alabama season scoring defense. A lower percentage indicates a better performance for Nebraska and Alabama.




At first blush, I would have expected there to be a negative correlation between the average percentage score by an opponent and the opponent’s rank…better opponents should do better offensively against Nebraska and Alabama than crummy opponents should. Both teams’ opponent scoring shows this general trend, with a decreasing effect as the quality of opponent decreases. This might be explained by the fact that games against far inferior opponents present opportunities to play the 3rd and 4th string, often meaning the opponent has opportunities to score that would not have otherwise been presented had the starters remained in the game.

Nebraska’s defensive performance shows no obvious correlation between rank and opponent points.  Alabama shows a strong negative correlation between opponent rank and the percent of scoring they allowed their opponents.  In other words, Alabama held inferior opponents to well under their season scoring offense average but gave up points to highly ranked teams.

For opponents of all ranks, Nebraska’s opponents' average score as a percentage of NU’s scoring defense was 102%.  Alabama’s opponents' average score as a percentage of Alabama’s scoring defense was 101%.  As with Defense 1, there is insufficient evidence to conclude that one defense is better than the other.

Wins and Losses

This one is simple: Nebraska had a better overall win-loss record (95.2% vs 89.6%) and a much better winning percentage versus top-20 teams (95% vs 83%). Nebraska also had three undefeated seasons while Alabama had one undefeated season. Only twice did Nebraska's winning percentage dip to 92% for the season (’93 and ’96). Alabama had three seasons at or below 92% (’08-86%, ’10-75%, and ’11-92%). 




For Alabama, six of their seven losses (86%) were to teams ranked 1-20 (UF-2008-#1, Utah-2008-#4, Auburn-2010-#2, LSU-2011-#2, LSU-2010-#11, and A&M-2012-#5) while two of Nebraska’s three losses (67%) were to teams in the top 20 (FSU-1993-#1, ASU-1996-#4). Alabama and Nebraska both lost to one team ranked 21-30 (South Carolina-2010-#27 and Texas-1996-#25).

Looking at where losses occurred, Alabama lost three at home, two in Bowl/CCGs, and two away. Nebraska lost zero at home, one away, and two in Bowl/CCGs. 




Though it is a subjective assessment, NU's three undefeated seasons, zero home losses, and the same number of bowl and conference championship game losses are sufficient to conclude that NU's win-loss record is better than Alabama's.

Strength of Schedule

Considering the season average rank of Nebraska’s and Alabama’s opponents, Nebraska’s opponents' five-year season average is 48.29 and Alabama’s is 53.82.  There is sufficient evidence to conclude that NU’s schedule, measured by average season opponent ranking, was more difficult than Alabama’s. It follows, therefore, that Nebraska’s dynasty was established during seasons of greater difficulty than Alabama’s.


Combined with Nebraska’s better win-loss record, this may present the strongest evidence that Nebraska’s dynasty was more impressive than Alabama’s. It’s hard to ignore that Nebraska's three undefeated seasons were all more difficult than the average of the 10 seasons considered, while Alabama’s were less difficult than the average of the 10 seasons considered. Alabama’s one season more difficult than average was 2010…the season in which it lost three games and finished with a 75% winning percentage. Finally, the most recent two seasons, in which Alabama claimed its back-to-back National Championships, were the two least difficult seasons considered in this analysis.


Conclusion 

Nebraska was better on offense; neither team demonstrated a clear superiority in defense; and Nebraska had a better win-loss record and a stronger average strength of schedule. 

I'll allow my readers to make the final conclusions.  Who has the best 5-year dynasty?