As I've said before, I'm keeping detailed records of how teams have performed relative to what the rankings indicated, in all four leagues where I'm doing ratings this year. I haven't yet entered the week 1 or week 2 data for either AAA league into the spreadsheet, and I didn't calculate ratings at all for week 1 of USA Pro. Here's how the data looks for the 112 games I do have in my spreadsheet so far, though (all 3 weeks of Canada Pro, weeks 2 and 3 of USA Pro, and week 3 only for Canada AAA#2 and USA AAA#2):
In 112 games, the favorite has covered the spread 50 times (44.6%) and won outright 81 times (72.3%). The correlation coefficient between adjusted rating differential and result is 0.906 (the correlation between raw roster rating differential and result is only 0.892, so the adjustments ARE increasing the accuracy of the rankings, though not by a huge margin). The correlations between the other key rating differentials and results are:
Offense: 0.840 (WR = 0.752, TE = 0.677, OT = 0.655, HB = 0.628, FB = 0.618, G = 0.591, C = 0.555, QB = 0.477)
Defense: 0.873 (CB = 0.723, LB = 0.709, DE = 0.686, DT = 0.681, FS = 0.680, SS = 0.556)
Sp. Teams: 0.687 (K = 0.614, P = 0.418)
Chemistry: 0.477
As for favorites only covering the spread 44.6% of the time, it appears to be primarily due to blowout games, where the linear formula I use to predict the spread breaks down in the face of the blowout prevention adjustments Bort made this year. There were 7 games in the sample with a spread of 100+, and if you exclude those (where the favorite only covered once), then in the other 105 games the favorite covered 49 times (46.7%), and won outright 74 times (70.5%). There were an additional 9 games with spreads between 60 and 99, where the favorite only covered 3 times, and excluding those as well leaves us with a 96 game sample in which the favorite covered 46 times (47.9%) while winning outright 65 times (67.7%). There were another 12 games where the spread was in the 30-59 range, in which the favorite only covered thrice, so in the 84 games where the spread was less than 30 points the faves actually covered 43 times (51.2%), though they only won 54 (64.3%) of those games.
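The arithmetic in that paragraph can be re-derived with a short sketch. The game and cover counts come straight from the text; the outright-win counts per bucket are not stated directly, so they are an assumption obtained by subtracting consecutive totals (81 - 74, 74 - 65, 65 - 54). The function and variable names are just for illustration.

```python
# Full sample from the post: (games, covers, outright wins by the favorite).
TOTAL = (112, 50, 81)

# Buckets removed one at a time, largest spreads first.
# Each entry: (spread range, games, covers, outright wins).
# The win counts are derived by subtraction from the stated totals,
# not quoted directly from the post.
BUCKETS = [
    ("100+", 7, 1, 7),
    ("60-99", 9, 3, 9),
    ("30-59", 12, 3, 11),
]


def remaining_samples(total, buckets):
    """Return the running sample after each bucket is dropped."""
    games, covers, wins = total
    rows = [("all games", games, covers, wins)]
    for label, g, c, w in buckets:
        games, covers, wins = games - g, covers - c, wins - w
        rows.append((f"after dropping {label}", games, covers, wins))
    return rows


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


if __name__ == "__main__":
    for label, games, covers, wins in remaining_samples(TOTAL, BUCKETS):
        print(f"{label:22s} {games:3d} games  "
              f"cover {pct(covers, games):4.1f}%  win {pct(wins, games):4.1f}%")
```

Running it walks through the same four samples (112, 105, 96 and 84 games), which makes it easy to re-check the percentages as more weeks get added.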
It's also worth noting that the correlation between rating difference and result is somewhat inflated by those games with huge spreads. As the spread drops, so does the correlation; and in the process the gap between the adjusted and unadjusted correlations also grows. As I said, in the entire 112 game sample, the correlation between rating difference and result is 0.892 unadjusted or 0.906 adjusted. In the 105 games with spreads <100, the correlation is only 0.809 unadjusted or 0.839 adjusted. In the 96 games with spreads <60, the correlation drops to 0.638 unadjusted or 0.695 adjusted. In the 84 games with spreads <30, the correlation is 0.454 unadjusted or 0.562 adjusted.
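For anyone who wants to repeat this kind of check, here is a minimal sketch of computing the correlation on spread-limited subsets. It assumes the coefficient is an ordinary Pearson correlation between rating differential and final margin, which the post doesn't spell out. The `GAMES` records are made-up placeholder numbers, not rows from my spreadsheet, and the names `pearson` and `correlation_below` are hypothetical.

```python
from math import sqrt

# Hypothetical records: (predicted spread, rating differential, final margin),
# all from the favorite's point of view. These values are invented purely
# to make the example runnable.
GAMES = [
    (120, 40.0, 95),
    (75, 25.0, 60),
    (45, 15.0, 20),
    (25, 8.0, 10),
    (20, 7.0, -3),
    (12, 4.0, 14),
    (8, 3.0, 1),
    (5, 2.0, -7),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def correlation_below(games, max_spread):
    """Correlate rating differential with margin for games under max_spread."""
    subset = [g for g in games if g[0] < max_spread]
    diffs = [g[1] for g in subset]
    margins = [g[2] for g in subset]
    return len(subset), pearson(diffs, margins)


if __name__ == "__main__":
    for cutoff in (float("inf"), 100, 60, 30):
        count, r = correlation_below(GAMES, cutoff)
        print(f"spread < {cutoff}: {count} games, r = {r:.3f}")
```

Running the same function once with raw roster ratings and once with adjusted ratings would give the unadjusted and adjusted columns for each cutoff.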
This basically just tells us what was already obvious, which is that while these rankings do have some predictive merit, they aren't telling the entire story, and they particularly don't do too well at distinguishing between teams of relatively similar talent levels. The adjustments, however, even after just a couple of weeks, do seem to be noticeably increasing the accuracy of the ratings at predicting games involving teams ranked within ~10 points of each other, so there is reason to expect that after another several weeks' worth of adjustments these rankings will become much more accurate. The system does appear to be working, albeit slowly.