College Football Ranking System: The Logical Strength Rating (LSR)
What This Is
This is a college football ranking system built on philosophical logic rather than statistical correlations. Instead of asking "what patterns predict wins?", we ask a more fundamental question: "What does it mean for one team to be superior to another?"
The result is the Logical Strength Rating (LSR), a ranking system that combines what teams have actually accomplished on the field (~63%) with the structural realities of talent and resources in college football (~37%).
Why We Need This
The Problem with Existing Rankings
Most ranking systems fall into two camps:
- Pure Results Systems (like raw win-loss records): A 12-0 team from the Sun Belt automatically ranks above an 11-1 SEC team, even if the Sun Belt team hasn't played anyone ranked and the SEC team lost by 3 points to the #2 team in the country.
- Pure Predictive Systems (like computer models): These try to predict who would win head-to-head matchups, often relying on statistical correlations (e.g., "teams that rush for 200+ yards win 73% of the time"). But correlation isn't causation, and these models can contradict what actually happened on the field.
Our Approach: Logic Over Statistics
We built this system from first principles — starting with the fundamental definition of what a game is, what winning means, and what makes a team superior. No shortcuts, no statistical tricks. Just pure logical reasoning applied to the structure of college football.
The Core Philosophy: Six Logical Principles
1. Victory is the Goal (But Margin Matters Too)
The Logic: The fundamental purpose of a team is to win games. A win is better than a loss, period. But how you win provides additional evidence.
- A 50-0 victory demonstrates greater dominance than a 10-9 victory
- But a 10-9 victory is categorically better than a 45-50 defeat
- Solution: We track both wins/losses AND point differential, but wins count much more (scaled 15x higher than point differential)
- The Dynamic Cap: We cap blowouts, but the cap changes based on opponent quality:
- vs. elite teams (#1-20): Full cap at 31 points (dominating great teams matters!)
- vs. average teams: Cap at ~22 points
- vs. terrible teams (#200+): Cap at ~12 points (running up the score doesn't impress)
Why dynamic? Beating Ohio State 31-0 is genuinely more impressive than beating Morgan State 60-0. The dynamic cap rewards quality performance, not stat padding.
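The dynamic cap can be sketched as a simple linear taper. The breakpoints (rank 20 at a 31-point cap, rank 200 at a 12-point cap) come from the text above; the linear interpolation between them is an assumption, and the function names are illustrative rather than the production code:

```python
def margin_cap(opponent_rank: int) -> float:
    """Dynamic point-differential cap based on opponent quality.

    Full 31-point cap vs. elite opponents (ranks 1-20), tapering
    linearly to a 12-point cap vs. teams ranked 200 or worse.
    The linear taper between the breakpoints is an assumption.
    """
    if opponent_rank <= 20:
        return 31.0
    if opponent_rank >= 200:
        return 12.0
    # Linear taper between rank 20 (cap 31) and rank 200 (cap 12)
    frac = (opponent_rank - 20) / (200 - 20)
    return 31.0 - frac * (31.0 - 12.0)

def capped_margin(points_for: int, points_against: int, opponent_rank: int) -> float:
    """Clamp the raw margin to the opponent-adjusted cap."""
    raw = points_for - points_against
    cap = margin_cap(opponent_rank)
    return max(-cap, min(cap, raw))
```

Note that a mid-pack opponent (rank ~110) lands at a cap of about 21.5, matching the "~22 points vs. average teams" figure above.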
2. Strength of Schedule Matters (Recursively)
The Logic: Beating a great team means more than beating a bad team. But how do we know which teams are great? We can't just look at their records—we need to consider who they played too.
Solution: We define each team's strength in terms of all other teams' strength. This creates a system of equations that must be solved simultaneously. Every team's rating depends on:
- How often they won
- How much they won by (capped)
- The average strength of their opponents
This is recursive—your opponents' strength depends on their opponents' strength, and so on. The system iterates until all the ratings stabilize into a logically consistent solution.
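A minimal sketch of that fixed-point iteration, assuming a simplified rating formula. The 15x win scaling comes from Principle 1; the `sos_weight` constant, data layout, and iteration count are illustrative stand-ins, not the production values in `calculate_rankings.py`:

```python
def solve_ratings(games, teams, sos_weight=0.25, iterations=100):
    """Iteratively solve the mutually dependent team ratings.

    games: list of (team, opponent, capped_margin, won) tuples, one
    entry per team per game. Each pass recomputes every team's rating
    from its own results plus the *current* ratings of its opponents,
    so opponent strength feeds back recursively until values stabilize.
    """
    ratings = {t: 0.0 for t in teams}
    for _ in range(iterations):
        new = {}
        for t in teams:
            rows = [g for g in games if g[0] == t]
            if not rows:
                new[t] = 0.0
                continue
            # Wins scaled 15x relative to the capped point differential
            perf = sum(15.0 * (1 if won else -1) + margin
                       for _, _, margin, won in rows) / len(rows)
            # Average current strength of this team's opponents
            sos = sum(ratings[opp] for _, opp, _, _ in rows) / len(rows)
            new[t] = perf + sos_weight * sos
        ratings = new
    return ratings
```

Because `sos_weight` is below 1, each pass shrinks the remaining error, so the ratings converge to a single self-consistent solution.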
3. Recent Performance Matters More (But Not Too Much)
The Logic: Teams change over time. A team in Week 1 is not the same team in Week 10—players improve, get injured, coaching adjusts, chemistry develops.
Solution: We weight games using a square root progression. Game 1 gets weight 1.0, Game 2 gets weight 1.41, Game 3 gets 1.73, and so on. By Game 12, recent games matter about 3.5x more than your first game.
Why square root? It's the Goldilocks solution:
- Linear weighting (1, 2, 3, 4...) would make Game 12 worth twelve times Game 1—too extreme, erases your body of work
- No weighting (all games equal) ignores that teams improve/decline
- Square root (1, 1.41, 1.73, 2...) balances both: recent games matter more, but early games still count
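The square-root progression is straightforward to compute; this helper is an illustrative sketch of the weighting, not the production code:

```python
import math

def recency_weights(n_games: int) -> list[float]:
    """Square-root recency weights: game k gets weight sqrt(k)."""
    return [math.sqrt(k) for k in range(1, n_games + 1)]
```

For a 12-game season this yields 1.00, 1.41, 1.73, 2.00, ..., 3.46, so the season finale counts roughly 3.5x the opener while early games still carry real weight.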
4. Context Must Be Neutralized (Home Field Advantage)
The Logic: If every home team in the league wins by an average of 3 points more than they "should," then winning at home by 3 doesn't prove you're 3 points better—it just proves you played at home.
Solution: We calculate the league-wide Home Field Advantage (HFA) by averaging all games. Then we subtract that advantage from every home team's performance and add it to every away team's performance.
This "neutralizes" all games to equivalent conditions.
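A minimal sketch of the neutralization step, assuming games arrive as (home score, away score) pairs and that neutral-site games have already been excluded:

```python
def home_field_advantage(games):
    """League-wide HFA: mean home-minus-away margin across all games."""
    margins = [home - away for home, away in games]
    return sum(margins) / len(margins)

def neutralize(home_points, away_points, hfa):
    """Margin from the home team's perspective, shifted so the game
    reads as if it were played on a neutral field."""
    return (home_points - away_points) - hfa
```

So if the league-wide HFA is 3 points, a 24-21 home win neutralizes to an even game: the home team performed no better than the venue alone predicts.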
5. Sample Size Matters (Confidence Adjustment)
The Logic: A team that has played 1 game provides minimal evidence. A team that has played 12 games provides much more reliable evidence.
Solution: We apply a piecewise linear confidence adjustment. Teams with fewer games get their rating "pulled toward zero" (the average) until they accumulate sufficient evidence:
- 1 game: 14.5% confidence
- 3 games: 28.7% confidence
- 6 games: 50% confidence
- 9 games: 70% confidence
- 12 games: 90% confidence
- 13+ games: 95% confidence (plateau)
6. Structural Advantage is Real (The Talent Component)
The Logic: College football differs from professional sports. In the NFL, salary caps and draft systems create competitive parity. In college football, there is systematic, persistent inequality in talent and resources.
The Reality:
- Alabama recruits top-10 classes every year with a $200M athletic budget
- A mid-major program recruits 2-star players with a $30M budget
- These advantages don't disappear after one game—they're structural features of the competitive landscape
Our Solution: Hybrid Approach (~63% Demonstrated + ~37% Structural)
We calculate two separate ratings:
Demonstrated LSR (~63% weight)
What the team has actually accomplished on the field this season—wins, margins, strength of schedule, adjusted for context.
Structural Advantage (~37% weight)
The measurable, persistent advantages a team possesses:
- Talent (60% of this component): Recruiting rankings averaged over the last 3-4 classes
- Resources (40% of this component): Athletic department budget, coaching salaries, facilities
We combine these into a Talent Index (0-100 scale), then transform it to a Structural Advantage score (±10 scale).
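A sketch of the blend and transform, with one caveat: the exact 0-100 to ±10 mapping isn't specified here, so the linear form below is reverse-engineered from the example rankings later in this document (Ohio State's ~98.8 index yielding -0.24 SA matches `(index - 100) / 5`). Treat both functions as assumptions:

```python
def talent_index(recruiting_score: float, resource_score: float) -> float:
    """Blend talent (60%) and resources (40%), each already on 0-100."""
    return 0.6 * recruiting_score + 0.4 * resource_score

def structural_advantage(index: float) -> float:
    """Map the 0-100 Talent Index onto the SA scale.

    Assumed linear form, inferred from the example table: the
    top-talent team (index 100) sits at 0.0 and each 5 index points
    below that costs one SA point.
    """
    return (index - 100.0) / 5.0
```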
The Final Formula:
Final LSR = (0.6313 × Demonstrated Results) + (0.3687 × Structural Advantage)
How These Weights Were Determined:
Rather than choosing weights arbitrarily, we derived them empirically through optimization:
- Objective: Maximize prediction accuracy on historical head-to-head matchups
- Method: Golden section search minimizing log-loss
- Data: 531 games split into training (75%) and test (25%) sets
- Result: Optimal weights = 63.13% demonstrated, 36.87% structural
- Validation: 87.12% test accuracy
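The optimization can be sketched as follows. `golden_section` is the standard algorithm; `log_loss` assumes a logistic win-probability model with an illustrative scale constant of 5, since the actual probability model isn't documented here:

```python
import math

def log_loss(weight, games):
    """Mean log-loss of win predictions for a given demonstrated weight.

    games: (dem_a, sa_a, dem_b, sa_b, a_won) tuples. The logistic
    scale constant (5.0) is illustrative, not the production value.
    """
    total = 0.0
    for dem_a, sa_a, dem_b, sa_b, a_won in games:
        lsr_a = weight * dem_a + (1 - weight) * sa_a
        lsr_b = weight * dem_b + (1 - weight) * sa_b
        p_a = 1.0 / (1.0 + math.exp(-(lsr_a - lsr_b) / 5.0))
        p = p_a if a_won else 1.0 - p_a
        total -= math.log(max(p, 1e-12))
    return total / len(games)

def golden_section(f, lo=0.0, hi=1.0, tol=1e-5):
    """Golden-section search for the minimum of a unimodal function.

    Naive version: re-evaluates f at both probes each pass; a cached
    version would halve the evaluations.
    """
    inv_phi = (math.sqrt(5) - 1) / 2  # ~0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c            # minimum lies in [a, d]
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d            # minimum lies in [c, b]
            d = a + inv_phi * (b - a)
    return (a + b) / 2
```

Running `golden_section(lambda w: log_loss(w, training_games))` over a training split is the shape of the search that produced the 0.6313 weight.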
Why This Balance Works:
- ✅ Prevents empiricist extremes: A 12-0 FCS team with no talent can't rank #1 overall just because they haven't lost yet
- ✅ Prevents realist extremes: A 2-10 Alabama can't stay ranked highly just because they have elite recruiting
- ✅ Addresses CFB-specific reality: Unlike MLB or the NFL, college football has extreme talent stratification. Ignoring this creates absurd rankings. Acknowledging it creates honest ones.
- ✅ Data-driven, not arbitrary: The 63-37 balance maximizes prediction accuracy, not philosophical preference
Example: How to Read the Rankings
| Rank | Team | Record | Demonstrated | Structural Adv. | Final LSR |
|------|------|--------|--------------|-----------------|-----------|
| 1 | Ohio St | 7-0 | 23.11 | -0.24 | 14.50 |
| 2 | Indiana | 7-0 | 22.33 | -5.56 | 12.05 |
| 3 | Alabama | 6-1 | 16.34 | 0.00 | 10.32 |
| 14 | South Florida | 6-1 | 15.10 | -7.44 | 6.79 |
Ohio State (#1):
- Demonstrated: 23.11 (very high—perfect record, strong opponents)
- Structural Advantage: -0.24 (near-neutral, they have elite talent ~98.8 index)
- Final: 14.50 (best in nation)
Indiana (#2):
- Demonstrated: 22.33 (nearly as good as OSU on the field)
- Structural Advantage: -5.56 (significant penalty for mid-tier talent ~72 index)
- Final: 12.05 (still #2 because demonstrated results dominate at 63%)
The Story: Indiana's on-field dominance keeps them #2 despite the talent gap. Alabama's elite talent keeps them #3 despite a loss. South Florida's Group of Five talent pool keeps them out of the top 10 despite matching Alabama's record. The system balances what you've done (63%) with what you have (37%).
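The arithmetic behind the Final column can be checked directly against the published weights:

```python
def final_lsr(demonstrated: float, structural: float) -> float:
    """Combine the two components using the optimized weights."""
    return 0.6313 * demonstrated + 0.3687 * structural

# Each row of the example table above reproduces its Final value
assert abs(final_lsr(23.11, -0.24) - 14.50) < 0.01   # Ohio State
assert abs(final_lsr(22.33, -5.56) - 12.05) < 0.01   # Indiana
assert abs(final_lsr(16.34, 0.00) - 10.32) < 0.01    # Alabama
assert abs(final_lsr(15.10, -7.44) - 6.79) < 0.01    # South Florida
```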
Philosophy: Why This Approach Matters
The Problem with Pure Empiricism
If you only rank based on game results, you get absurdities:
- 12-0 teams from weak conferences rank above 11-1 teams from elite conferences
- A lucky team that won 5 close games ranks above an unlucky team that lost 2 close games
- The schedule you were assigned (random) matters more than your actual capability
The Problem with Pure Realism
If you only rank based on talent/resources, you get different absurdities:
- 8-4 Alabama ranks above 12-0 Indiana because "they have better players"
- Games don't matter—it becomes a recruiting ranking, not a performance ranking
- Upsets become impossible to explain
The Hybrid Solution
By weighting demonstrated results at roughly 63% and structural advantage at 37%, we acknowledge both realities:
- What happens on the field matters most (empiricism)
- But structural inequality is real and measurable (realism)
This isn't "giving blue bloods a bonus"—it's acknowledging that college football has extreme talent stratification that creates genuine capability differences. The team that wins the national championship almost always has a top-10 recruiting class. That's not bias; that's empirical reality.
Transparency & Reproducibility
Everything about this system is transparent:
- All formulas are documented in logic.md
- All code is in calculate_rankings.py
- All constants are explicitly stated (no hidden tuning)
- All data sources are cited (recruiting rankings, game results)
You can run this yourself, modify the weights, and see how it changes. That's the difference between a logical framework and a black box.
Frequently Asked Questions
Q: Why not 50-50 demonstrated vs. talent?
A: We optimized the weights empirically by maximizing prediction accuracy on historical games. The data showed that 63-37 produces the best predictions. This makes demonstrated results almost twice as important as talent.
Q: Why cap point differential at ±31, and why is it dynamic?
A: Because winning 70-0 vs. 31-0 against the same opponent doesn't prove you're twice as good. But the cap adjusts based on opponent quality—full credit (31 pts) vs. elite teams, minimal credit (~12 pts) vs. terrible teams.
Q: Doesn't this favor big-name programs?
A: No—it favors programs with elite talent, which happen to be big-name programs. The correlation isn't arbitrary; it reflects decades of recruiting success. And demonstrated results still count almost twice as much (63% vs 37%).
Q: What if a G5 team goes undefeated?
A: They'll rank highly if they dominate their competition. An undefeated G5 team with strong demonstrated LSR (say, 20.0) and mid-tier talent (SA = -5.0) would typically rank top 10-15. That's fair—they get credit for dominating while acknowledging talent limitations.
Conclusion: Logic Over Luck
This ranking system is built on a simple premise: superiority should be defined logically, not statistically.
We started with first principles—what is a game, what is victory, what makes a team better—and built upward from there. The result is a system that:
- ✅ Respects what happens on the field (~63%)
- ✅ Acknowledges structural reality (~37%)
- ✅ Accounts for strength of schedule (recursively)
- ✅ Adjusts for context (home field, recency, sample size)
- ✅ Remains transparent and reproducible
It's not perfect—no ranking system is. But it's honest about what it measures and rigorous about how it measures it.
In a sport where controversy over rankings dominates the conversation every year, we offer something different: a philosophically coherent answer to the question "who's better?"