6/28/2026

My Model Says Argentina. My Judgment Says France.

The 2026 World Cup group stage is done. Here’s how the Risk Premium Research Champion Predictor is calling the knockouts, and the one place I’m overruling my own machine.

Forecast Wc2026 V6 28062026
612KB ∙ PDF file

Seventy-two matches. Three weeks. Not a single group winner the model didn’t see coming, and one team I can’t stop thinking about that the model keeps insisting is only the second-best in the field.

The group stage of the 2026 World Cup is complete, which means the Champion Predictor has now been fed every result and re-run from the ground up. Before the knockouts begin, this is where things stand: what the model is confident about, what it got right when it actually counted, and the one call where my own judgment and my own code refuse to agree.

The state of the race

After conditioning on all 72 group-stage results, the title picture looks like this:

🇦🇷 Argentina, 21.8%
🇫🇷 France, 14.5%
🇪🇸 Spain, 14.4%
🇧🇷 Brazil, 7.4%
🇨🇴 Colombia, 6.3%
🏴󠁧󠁢󠁥󠁮󠁧󠁿 England, 6.3%

Argentina sits clear at the top, with France and Spain locked together a length behind and a tightly-packed chasing pack after that. The single most-likely final the model spits out is Spain vs Argentina.

What’s moved since the last update

Compared with the forecast I last published, Version 4, just before the second round of group matches, the group stage has done what good information should do: sharpened the picture.

  • Argentina surged, 17.7% to 21.8%. They won Group J at a canter and the bracket fell kindly for them.

  • France firmed, 12.2% to 14.5%. Three wins from three, and a clear, rising No. 2.

  • Spain held at 14.4%, having recovered completely from a nervy opening draw.

  • England slipped, 8.5% to 6.3%, on a limp finish to the groups.

The part I’m actually proudest of

It’s easy to update a forecast as results roll in and look smart in hindsight. The real test is the version you commit to before you know anything.

I locked one forecast the night before kickoff and never touched it again. Graded against reality across the entire group stage, that frozen forecast went:

  • Group winners: 12 out of 12.

  • Round-of-32 qualifiers: 26 of 32 (81%).

  • Match-outcome accuracy: 61% (44 of 72).

  • Mean Ranked Probability Score: 0.156.

That last number deserves a sentence, because it’s the one I trust most. The Ranked Probability Score grades not just whether you were right but how confidently and how close you were. Getting a result wrong “by a draw” costs far less than calling a home win that turns into the opposite. A blind, uninformed guess scores about 0.278. My model’s long-run benchmark is 0.169. Across these 72 games it landed at 0.156, comfortably better than its own average. In plain terms: the model wasn’t just lucky on the winners, it was well-calibrated on the probabilities underneath them.
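
For readers who want to see the mechanics, here is a minimal sketch of the Ranked Probability Score for a single three-way football forecast. The function name and the example probabilities are illustrative assumptions, not the Champion Predictor's code or its numbers.

```python
from itertools import accumulate

def rps(forecast, outcome_index):
    """Ranked Probability Score for one match.

    forecast: probabilities ordered (home win, draw, away win), summing to 1.
    outcome_index: 0 for a home win, 1 for a draw, 2 for an away win.
    Lower is better; 0 is a perfect forecast.
    """
    observed = [1.0 if i == outcome_index else 0.0 for i in range(len(forecast))]
    cum_forecast = list(accumulate(forecast))
    cum_observed = list(accumulate(observed))
    # The final cumulative term is always 1 - 1 = 0, so it is left out.
    squared_gaps = [
        (f - o) ** 2 for f, o in zip(cum_forecast[:-1], cum_observed[:-1])
    ]
    return sum(squared_gaps) / (len(forecast) - 1)

# Hypothetical forecast: a confident home-win call.
confident_home = (0.7, 0.2, 0.1)
score_if_home = rps(confident_home, 0)
score_if_draw = rps(confident_home, 1)
score_if_away = rps(confident_home, 2)
```

With these made-up numbers the score is about 0.05 if the home side wins, 0.25 if the match is drawn, and 0.65 if the away side wins. Being wrong "by a draw" costs far less than being wrong by a full reversal, because the score works on cumulative probabilities across the ordered outcomes.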

But the line I keep coming back to is the first one. Twelve group winners out of twelve, called before a single ball was kicked.

Where I overrule the machine

Here’s the honest part. The model’s pick is Argentina. I don’t think it’s right.

For my money, France are the strongest team in this tournament and my pick to lift the trophy. They won all three group games, 3-1, 3-0, 4-1, and carry the best goal difference in the entire field at +8. More than the numbers, they simply look the most complete side from front to back. The model has them a close, climbing second. My judgment puts them first. I’m comfortable with the disagreement. A forecast you only ever nod along with isn’t worth much.

And the flip side, the team that’s let me down most: Portugal. A genuine pre-tournament dark horse who never got going. Second in their group behind Colombia, draws against DR Congo and Colombia, a single comfortable win, and now a distant eighth (3.6%) in the title race. For a side many of us fancied, it’s been a real disappointment.

Under the hood

For the curious: the engine is an ensemble of two models, a Dixon-Coles goals model and a World Football Elo rating, blended together and run through 10,001 Monte Carlo simulations of the remaining tournament, conditioned on every actual result so far.
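
To make the simulation step concrete, here is a minimal sketch of a Monte Carlo run over a knockout bracket. It covers only an Elo-style win probability and the simulation loop; the Dixon-Coles half and the blend are left out. The team labels, ratings, four-team bracket, and function names are all hypothetical, and treating the Elo expected score as the probability of advancing is a simplifying assumption.

```python
import random
from collections import Counter

def elo_win_prob(rating_a, rating_b):
    """Standard Elo expected score for team A against team B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def simulate_knockout(bracket, ratings, rng):
    """Play one single-elimination bracket and return the champion.

    bracket lists teams in draw order: positions 0 and 1 meet first,
    then 2 and 3, and winners meet in the next round.
    """
    alive = list(bracket)
    while len(alive) > 1:
        next_round = []
        for team_a, team_b in zip(alive[0::2], alive[1::2]):
            p_a = elo_win_prob(ratings[team_a], ratings[team_b])
            next_round.append(team_a if rng.random() < p_a else team_b)
        alive = next_round
    return alive[0]

def title_probabilities(bracket, ratings, n_sims=10_001, seed=0):
    """Title probability per team = share of simulated tournaments won."""
    rng = random.Random(seed)
    wins = Counter(
        simulate_knockout(bracket, ratings, rng) for _ in range(n_sims)
    )
    return {team: wins[team] / n_sims for team in bracket}

# Hypothetical ratings and bracket, for illustration only.
ratings = {"A": 2100, "B": 1900, "C": 2000, "D": 1800}
bracket = ["A", "B", "C", "D"]
probs = title_probabilities(bracket, ratings)
```

Conditioning on actual results fits the same loop: a match that has already been played is fixed to its real outcome and only the remaining fixtures are drawn at random.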

Two small acts of rigour worth mentioning, because details are the whole game in this work. First, FIFA quietly changed the 2026 tiebreaker rules so that head-to-head now outranks overall goal difference, a break from every previous World Cup, and one that materially changed who topped a group. Second, the Round-of-32 third-place allocation follows FIFA’s official deterministic table, not a convenient approximation. Both were caught and coded properly. A forecast is only as trustworthy as the plumbing beneath it.
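
A simplified sketch of why the ordering of tiebreakers matters. It ranks a group by points, then by head-to-head points among the teams level on points, then by overall goal difference. This is an illustration of the ordering described above, not the full FIFA procedure, which has further steps; the teams, scores, and function names are hypothetical.

```python
from collections import defaultdict

def build_table(matches):
    """matches: list of (home, away, home_goals, away_goals)."""
    points = defaultdict(int)
    goal_diff = defaultdict(int)
    for home, away, home_goals, away_goals in matches:
        goal_diff[home] += home_goals - away_goals
        goal_diff[away] += away_goals - home_goals
        if home_goals > away_goals:
            points[home] += 3
        elif home_goals < away_goals:
            points[away] += 3
        else:
            points[home] += 1
            points[away] += 1
    return points, goal_diff

def rank_group(matches, head_to_head_first=True):
    """Order teams by points, then (optionally) head-to-head, then goal difference."""
    points, goal_diff = build_table(matches)
    teams = sorted(goal_diff)  # every team that played appears here
    tied_groups = defaultdict(list)
    for team in teams:
        tied_groups[points[team]].append(team)
    h2h_points = {}
    for tied in tied_groups.values():
        tied_set = set(tied)
        # Only matches played between the teams level on points count here.
        mini = [m for m in matches if m[0] in tied_set and m[1] in tied_set]
        mini_points, _ = build_table(mini)
        for team in tied:
            h2h_points[team] = mini_points[team]
    def sort_key(team):
        h2h = -h2h_points[team] if head_to_head_first else 0
        return (-points[team], h2h, -goal_diff[team], team)
    return sorted(teams, key=sort_key)

# Hypothetical group: X and Y finish level on 6 points.
# X beat Y head-to-head, but Y has the better overall goal difference.
matches = [
    ("X", "Y", 1, 0), ("Y", "Z", 5, 0), ("Y", "W", 2, 0),
    ("X", "Z", 1, 0), ("W", "X", 1, 0), ("W", "Z", 0, 0),
]
new_order = rank_group(matches, head_to_head_first=True)
old_order = rank_group(matches, head_to_head_first=False)
```

In this made-up group X tops the table when head-to-head comes first and Y tops it when overall goal difference comes first, which is exactly the kind of swap that changes a bracket.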

What’s next

The knockouts begin now, and the bracket is fixed: no redraws. Argentina open against Cape Verde, France against Sweden, Spain against Austria. The model will be re-run after every round, and I’ll keep publishing where it lands, and where, like now, I quietly disagree with it.

Model says Argentina. My perception and football awareness say France. We’re about to find out which one of us was right.


The Champion Predictor is a quantitative model built for research and curiosity, not betting advice. If you’d like the round-by-round updates through the knockouts, subscribe. The next one lands after the Round of 32.



from Risk Premium https://ift.tt/E8p5T4g
via IFTTT

6/10/2026

How My World Cup Model Changed Its Mind

What 1 week of refitting taught me about Spain, Argentina, and the limits of my own model

The 2026 World Cup kicks off tomorrow. My model thinks Spain wins it. So does Opta. We agree on almost nothing else at the top.

Opta’s three favorites are Spain, France, England. Mine are Spain, Argentina, France. We both put Spain first, then we part ways immediately. Opta has England third at 11.2 percent. My model has England fifth at 6.6 and Argentina second at 14.2, a team Opta ranks below France. Two models, the same public data philosophy, a genuine disagreement about who the second-best team in the world is.

That gap is the most interesting thing I can show you, and it is worth understanding where it comes from. It comes from a week of watching the model change its mind.

Forecast Wc2026 V10
491KB ∙ PDF file

The first run was confident in a way it had not earned

When I first stood the engine up, Spain led comfortably. The gap to the field at the top was 7.4 points. That felt clean. It also felt like the kind of number that should make you suspicious, because a 48-team tournament with the variance football carries does not usually hand you a runaway favorite this early.

The model was not wrong, exactly. It was just reading a thinner slice of evidence than it would have a week later. Friendlies were doing more work in the fit than they deserved to. The match-importance weighting treated a June tune-up too much like a real fixture. So when Spain looked good in low-stakes football, the model believed it a little too readily.

The warm-ups did the teaching

Then the teams actually played, and the model got to watch.

Argentina was the story. Three nil over Iceland, two nil over Honduras. Not glamorous opposition, but the manner mattered, and the model rewards a team that wins the way it is supposed to win. Spain, meanwhile, drew Iraq before beating Peru. A draw with Iraq is not a crisis, but it is information, and the model took it as such.

By the final refit the 7.4-point lead had more than halved, to 3.4. Spain 17.6 percent, Argentina 14.2. The top of the board had gone from a procession to a contest. None of this was me overriding the model. It was the model doing what it is built to do, which is update when the world gives it new results.

The chasing pack reshuffled at the same time. Colombia firmed from 4.6 to 5.6 on strong tune-ups. Morocco moved from 3.5 to 4.5. England slid to 6.6 after only edging New Zealand one nil, the kind of narrow win over weak opposition that the model reads as a mild warning rather than a triumph. France held at 11.0, Brazil at 8.8.

I also changed the engine, not just the inputs

Two things under the hood moved between the first run and the last.

The first was match-importance weighting. I rebuilt how the model values different kinds of fixtures. The Nations League now counts as a near-major rather than something closer to a friendly, which is the correct reading of how seriously teams now take it. Friendlies are discounted harder. This is the change that pulled the early over-confidence out of the system.

The second was calibration discipline. I held the validation work to the same standard I would hold any quantitative claim to. Across 990 internationals over the last twelve months the ensemble runs a mean ranked probability score of 0.169, against 0.278 for an uninformed baseline. That is the number I trust most in the whole exercise, because it is the one that tells me the model is actually adding information rather than dressing up noise.

Why I disagree with Opta, and why I am comfortable with it

Back to the hook. Opta has France second and England third. I have Argentina second and England fifth.

The honest answer is that Opta sees things I do not. Their models carry player-level data: injuries, suspensions, recent-form weighting at the individual level. Mine works from team-level signals only. It does not know which winger pulled up in training on Tuesday. So when we diverge materially, the smart prior is usually that the difference traces to player information they have and I do not.

But that cuts both ways. A team-level model has its own discipline. It is harder to talk into a narrative. It saw Argentina win the way contenders win and moved them up, without being anchored to a preseason consensus that had France as the European challenger. I am not claiming my second-place call is better than Opta’s. I am claiming it is honestly arrived at, and that the disagreement is the kind worth publishing rather than hiding.

What the single path says, and why you should not believe it exactly

The model’s most-likely path ends Spain 1-0 Argentina, with France taking third. Read that as a story the model finds plausible, not a prediction it is confident in. Every fixture in that bracket is the single highest-probability outcome at that node, and the joint probability of all of them landing exactly as drawn is small. The championship numbers, built from 10,001 simulated tournaments, are where the real belief lives. Spain 17.6, Argentina 14.2, France 11.0, Brazil 8.8, England 6.6, Colombia 5.6.

That is where the model stands the night before kickoff. Tomorrow the tournament starts feeding it real results again, and the refitting starts over. The next update lands after Round 1.

I will let the model keep changing its mind in public. That is the whole point of doing this where you can watch.

Risk Premium Research. Forecasts are probabilistic, built from public data only, and for research purposes rather than betting advice. The model carries no insider or player-level information.




from Risk Premium https://ift.tt/Yku6rEZ
via IFTTT

6/06/2026

What a Football Model Teaches You About Forecasting Markets

In five days the World Cup kicks off. I built a model to forecast it. The model is not the point. The point is that football, unlike markets, grades you in public and on a deadline.

Most forecasting lives in a comfortable fog. You make a call, the world moves, and by the time the outcome arrives the question has changed enough that nobody checks. Markets are the worst offender. A view on equities in June is unfalsifiable by December because ten other things happened in between. Football has no such mercy. The whistle blows, the score is the score, and four weeks from now everyone can see whether the model was right.

So I treated the World Cup as a stress test of the same discipline I apply everywhere else.

Forecast Wc2026
490KB ∙ PDF file

The approach, in outline

The model is an ensemble. It combines more than one independent statistical engine, each estimating team strength and match outcomes a different way, then blends them. It layers in external strength signals beyond raw results, and it recalibrates for the things simple models get wrong, draws chief among them. On top of that sits a Monte Carlo simulation: the tournament is played forward 10,001 times over the new 48-team, 104-match format, and the championship probabilities are the frequencies that fall out.
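
As a generic illustration of what blending two engines can look like, here is a minimal sketch that averages two three-way forecasts and measures how far apart they are. The actual blend is not disclosed here, so the equal weight, the example probabilities, and the function names are all assumptions.

```python
def blend(forecast_a, forecast_b, weight_a=0.5):
    """Weighted average of two probability vectors, renormalised to sum to 1."""
    mixed = [
        weight_a * a + (1.0 - weight_a) * b
        for a, b in zip(forecast_a, forecast_b)
    ]
    total = sum(mixed)
    return [m / total for m in mixed]

def disagreement(forecast_a, forecast_b):
    """Total variation distance: half the sum of absolute probability gaps."""
    return 0.5 * sum(abs(a - b) for a, b in zip(forecast_a, forecast_b))

# Hypothetical (home win, draw, away win) forecasts from two engines.
goals_model = (0.50, 0.28, 0.22)
rating_model = (0.60, 0.22, 0.18)

blended = blend(goals_model, rating_model)
gap = disagreement(goals_model, rating_model)
```

With these made-up inputs the blend comes out at roughly (0.55, 0.25, 0.20) and the disagreement at 0.10. A small gap says the engines see the match the same way; a large one flags a fixture worth a second look.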

I am keeping the internals to myself. The value of a model is not the idea, which is freely available in any sports-analytics paper; it is the calibration, the choices, and the hours of getting the details right. What I will share is the output and, more importantly, the evidence that it works.

The discipline that actually matters

Anyone can produce a forecast. The question is whether you can produce one that beats doing nothing. So the model is validated the way I validate anything before I trust it.

Walk-forward backtest, no peeking: every prediction is made using only data available before that match. Over the last 12 months of international football, 997 matches, the model scored a mean Ranked Probability Score of 0.165. An uninformed forecast scores 0.278. A perfect one scores 0. Outcome accuracy came in at 50.8 percent against a 33.3 percent baseline for blind three-way guessing. Across five recent major tournaments it beat both naive benchmarks and every single component model in the battery.
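
The "no peeking" rule is easiest to see as a loop. Below is a minimal sketch of a walk-forward harness: each match is forecast from the state built on earlier matches only, scored, and only then fed into the state. The toy model (running outcome frequencies with a flat starting count), the match list, and all names are hypothetical stand-ins, not the real engine.

```python
from itertools import accumulate

def rps(forecast, outcome_index):
    """Ranked Probability Score for one (home, draw, away) forecast."""
    observed = [1.0 if i == outcome_index else 0.0 for i in range(len(forecast))]
    gaps = [
        (f - o) ** 2
        for f, o in zip(list(accumulate(forecast))[:-1],
                        list(accumulate(observed))[:-1])
    ]
    return sum(gaps) / (len(forecast) - 1)

def predict(state, home, away):
    """Toy model: forecast the outcome frequencies seen so far."""
    total = sum(state["counts"])
    return tuple(c / total for c in state["counts"])

def update(state, home, away, outcome_index):
    """Fold a finished match into the state."""
    state["counts"][outcome_index] += 1

def walk_forward(matches, state):
    """matches must be in chronological order: (home, away, outcome_index)."""
    forecasts, scores = [], []
    for home, away, outcome_index in matches:
        forecast = predict(state, home, away)   # sees past matches only
        forecasts.append(forecast)
        scores.append(rps(forecast, outcome_index))
        update(state, home, away, outcome_index)  # the result enters afterwards
    return sum(scores) / len(scores), forecasts

# Hypothetical results: 0 = home win, 1 = draw, 2 = away win.
matches = [("A", "B", 0), ("C", "D", 0), ("A", "C", 1)]
state = {"counts": [1, 1, 1]}  # flat starting count, an assumption
mean_rps, forecasts = walk_forward(matches, state)
```

The first forecast is the flat one-third each, because the model has seen nothing yet, and every later forecast moves only on results that were already in the books. Swapping the order of the predict and update calls is the leak this discipline exists to prevent.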

That last point is the one I care about most. A model has to earn its complexity. If the elaborate version cannot beat a simple rule, the elaborate version is vanity. This one clears the bar, but only modestly, and I will say so plainly: the edge is real and it is small.

What it says

Spain, champions, at 19.6 percent. Then Argentina at 12.2, France at 12.1, Brazil at 8.6, England at 7.6. The top four hold 52 percent of the title probability between them. The most likely final is Spain against Argentina, with France and Brazil meeting in the third-place game.

Read those numbers correctly. Spain at 19.6 percent means Spain loses this tournament four times out of five. The single most likely bracket, the one where every favorite advances exactly as projected, has a joint probability close to zero. That is not a flaw in the model. It is the truth about football, and the same truth holds for markets. The headline call is the least interesting number on the page. The distribution is the forecast.
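
The arithmetic behind that point is short enough to write down. This sketch assumes, purely for illustration, that the favorite advances from every knockout match with the same probability and that matches are independent; the 0.65 figure and the 32-team knockout field are assumptions, not model outputs.

```python
import math

def joint_probability(step_probs):
    """Probability that every step lands as called, assuming independence."""
    return math.prod(step_probs)

# A 32-team single-elimination bracket needs 31 matches to crown a champion.
n_matches = 32 - 1
p_favourite_advances = 0.65  # hypothetical, same for every match

chalk_bracket = joint_probability([p_favourite_advances] * n_matches)

# The headline number read the other way round.
p_spain_wins = 0.196
p_spain_misses = 1.0 - p_spain_wins
```

Even with a healthy 65 percent favorite in every tie, the all-favorites bracket comes out at well under one in 100,000, while the 19.6 percent favorite misses out 80.4 percent of the time.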

Why this connects to the day job

Three habits carry directly from this exercise into how I think about markets.

First, ensemble over conviction. Two independent methods that disagree tell you more than one method you happen to like. Where they agree, lean in. Where they diverge, the gap is information, not noise to be smoothed away.

Second, first principles on uncertainty. A probability is a statement about long-run frequency, not a prediction. Spain at 19.6 percent and a portfolio position sized to a 20 percent base rate are the same kind of claim. Treat them the same way.

And backtest honestly or do not bother. The temptation in every model is to let a little future information leak into the fit and admire the result. The whole value of the football version is that the leak gets exposed in public, on a fixed date, with no second question to hide behind. If a process cannot survive that, it should not be running your money either.

What happens next

This is version one. I will keep iterating up to kickoff and through the tournament as results land. The model will be wrong about specific matches, often. The test is not whether Spain lifts the trophy. The test is whether, across 104 matches, the probabilities turn out to be calibrated. That is the only thing worth measuring, and unlike most of what I do, you will all get to watch it happen.



from Risk Premium https://ift.tt/QabToZv
via IFTTT

6/01/2026

A Reminder That Coherence Exists

Listening to Bill Burns's insider interview is such a positive and uplifting exercise. It seems we can still have voices from the US that are intellectually rigorous, with coherent, mature, and measured speech. Perhaps hope is not lost after all, but these kinds of voices need to return to the US administration urgently. https://www.economist.com/insider/inside-defence/how-to-handle-americas-adversaries

- Pedro


5/29/2026

Who actually created the value?

Separating the CEO from the hand they were dealt

A new Claude Code skill that grades any public-company CEO on what is theirs, plus ten anonymized cases to show it working.

I have been having a lot of fun building skills in Claude Code lately. A skill is just a packaged set of instructions and tools that teaches the model to do one thing well and repeatably. Once it exists, I can point it at a new input and get the same structured output every time, without rebuilding the logic by hand.

The latest one grades CEOs. Give it any public-company chief executive and it returns the same disciplined read: a setup, a scorecard, two composite scores, and a verdict.

The problem it fixes

Most CEO commentary is captured by the company’s own narrative. Management foregrounds the metrics it hit and quietly reframes the ones it missed. Headline results get flattered by cyclical tailwinds, by one-time items, by recovery off a depressed base, and by the execution of a strategy that someone else designed. Read enough earnings calls and you start scoring the press release rather than the person.

The skill is built to refuse that. It strips the headline back to what is attributable to the CEO. It credits real value-add fully, and declines to credit luck, inertia, or accounting noise.

How it works

The conceptual core is two measures that look similar and are not.

The first is the delta from baseline. I score the company’s qualitative state at handover, then again today, and take the difference. It answers a narrow question: does the company look better or worse than when this person took over? Useful as a sanity check. Not a measurement of value creation.

The second is the residual. It asks whether the CEO beat a peer-median replacement holding the same hand, over the same calendar window. Actual performance minus expected performance, where expected is what a competent peer would have delivered with the same inheritance in the same cycle. This is the number that matters, because it controls for both the cycle and the hand.

The two often disagree, and the disagreement is the interesting part. A CEO can post a positive delta and a deeply negative residual: the company looks better on paper, yet lagged every peer in the same cycle. The reverse happens too. A steward can inherit a near-perfect franchise, watch the qualitative score barely move, post a slightly negative delta, and still compound far ahead of the sector. Positive residual, negative delta. The franchise does the work, and the question becomes whether the leader added anything on top of it.
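
The two measures reduce to two subtractions, which makes the disagreement easy to show. This is a minimal sketch with invented numbers: the qualitative scores, the returns, the peer set, and the function names are hypothetical, and using total return over the window as "performance" is an assumption made only for the example.

```python
from statistics import median

def delta_from_baseline(score_at_handover, score_today):
    """Change in the company's qualitative score over the tenure."""
    return score_today - score_at_handover

def residual(actual_performance, peer_performances):
    """Actual minus expected, with expected taken as the peer median
    over the same calendar window."""
    return actual_performance - median(peer_performances)

# Hypothetical peer total returns (%) over the same window.
peers = [35.0, 40.0, 50.0]

# Case 1: the company looks better on paper but lagged every peer.
improver_delta = delta_from_baseline(5.0, 6.5)
improver_residual = residual(20.0, peers)

# Case 2: a steward of a near-perfect franchise that still beat the sector.
steward_delta = delta_from_baseline(9.0, 8.8)
steward_residual = residual(90.0, peers)
```

In the first invented case the delta is +1.5 and the residual is -20 points; in the second the delta is slightly negative and the residual is +50. Same two formulas, opposite verdicts, which is why both are reported.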

Around those two measures sits the machinery: a thirteen-dimension scorecard running from strategic vision to succession quality, and four adjustment disciplines that separate recurring from one-time, cyclical from structural, inherited from originated, and input metrics from output metrics. There are a couple of hard rules I cannot override. An ouster under cause caps the governance score. Only an originated, non-consensus vision earns a 9 or a 10, which keeps the top of the scale honest.

The whole thing is built to fight my own bias toward a generous 7. Most CEOs, in most tenures, land at 4 to 6 once the rigor is applied evenly.

Ten cases, no names

To show the framework working without grading anyone by name, I ran ten anonymized cases across industries, from semiconductors to media and entertainment. The industry stays visible. The identity does not. Anonymizing also lets me publish it freely and let the method speak rather than the personalities.

The spread runs from 2.8 to 8.8, and the ordering is clean rather than collapsing into a noncommittal middle.

Founder-originators sit at the top. They built the position the company holds, so the cycle helped them but did not create them. There is also a hired CEO near the top, which matters: the high marks are not reserved for founders, they are reserved for originated theses, and a non-founder who originates a winning, non-consensus strategy earns the same credit.

Value-destroyers ousted under cause sit at the bottom, where a high inherited baseline makes the destruction worse, not better, because the head start was squandered.

The most useful finding sits in the middle. The framework is consistently harsher than sell-side on competent operators of inherited premium franchises, and it refuses to read a cyclical recovery as skill. The market prices the franchise and the rebound. The framework grades the leader. Both views are defensible, and the gap between them is exactly the signal worth surfacing.

Ceo Assessment Anonymized
363KB ∙ PDF file

What it cannot do

A note on honesty, because a framework that pretends to see everything is worse than useless. This one scores observable performance. It cannot detect undisclosed fraud, hidden accounting irregularities, or operational misbehavior that has not surfaced yet. When I tested it against historical cases where the outcome is now known, four matched cleanly and one only partially: the executive who looked strong right up until a concealed scandal broke. That case is the blind spot, and it is a standing reminder to pair this kind of scorecard with separate fraud-and-governance diligence rather than treat the composite as the whole picture.

The point

The point is not to dunk on chief executives. It is to keep score in a way that survives the disciplines, that treats a turnaround CEO and a fortress steward by the same rules, and that answers the only question worth asking: who created value relative to the hand they were dealt, rather than whose company has the nicer headline.

The full ten-case deck is below. Built, as ever, with a fair amount of fun, in Claude Code.


This is published for informational and educational purposes only. It is not investment advice, nor a recommendation, offer, or solicitation. The cases are illustrative and anonymized; no company or individual is named, and any identification a reader infers is the reader’s own. Views are my own, current only as of the date shown, and may change. Past performance is not indicative of future results. Do your own research.




from Risk Premium https://ift.tt/huArlU5
via IFTTT

5/22/2026

Just read this Bartleby column in The Economist. Witty, sharp, painfully on point. It got me thinking about all the corporate buzzwords and slop we hear in meetings and town halls. And, full confession, that we sometimes deploy ourselves. Whoever has never sinned, cast the first stone 😀. So I turned the laugh into a game: Town Hall Bingo (attached). Six cards, five in a row to win, "Velocity pivot" enshrined as the free center square in honor of the article. Print it, hand it out before the next all-hands, and let me know how it goes. Introducing "Velocity pivot": the corporate world's Lorem ipsum 👇 [link] https://www.economist.com/business/2026/05/14/introducing-velocity-pivot

- Pedro


5/10/2026

Fela Kuti: Fear No Man by Jad Abumrad

Sometimes in life you get overwhelmed, in this case positively overwhelmed, by a book, a piece of music, a painting, a movie… Something sweeps you off your feet and makes you stop, think and reflect on what you just experienced.

Fela Kuti – Fear No Man was one of those cases. An extremely well-crafted podcast series (13 episodes in total) about the life and work of Fela Kuti, it gave me so much more than I was expecting when I started it. Its narrative quality and care sometimes made me enter a state of flow while listening, that state where time becomes relative and goes by so quickly.

Why did I like it so much?

1. It introduced me to a major 20th-century artist I had been completely ignorant of.

2. It introduced me to his music and work, which, on top of everything, I really liked.

3. It gave me a brief walk-through of recent Nigerian history, through the lens of his life.

4. It made me stop and think, more than once, about how people live their lives in an environment so different, so rich and, by the same token, so difficult and complex, and so far from my own reality.

5. I learnt about musical concepts I was not aware of: the “ostinato” rhythm, which makes me feel at home, and “counterpoint,” which is the basis of his music and, surprise, surprise, ties it to Bach’s.

6. The author did not sugar-coat the most controversial areas of Fela Kuti’s life, which strengthens the whole narrative.

If you want to jump to a completely different world and reality without leaving yours, if you want to get to know Fela Kuti, or if you already know him and want to deepen that knowledge, do not waste this opportunity. Start the journey.

I hope you like it as much as I did, and that by the end you just feel a little sad because the series is over.



from Risk Premium https://ift.tt/6QYRdro
via IFTTT