4/11/2026

The First Pillar Stands: Building an AI-Powered Equity Valuation Engine from Scratch


From blank Python file to institutional-grade model — and the bigger ambition behind it.

Four weeks ago, I set out to answer a simple question: can a rigorous, institutional-quality equity valuation be built entirely from scratch, using AI as a co-pilot, without compromising on methodology?

The answer, it turns out, is yes. Here is what I built, why it matters, and where it goes next.

The problem I was trying to solve

This project started with an ambition, not a gap. I wanted to build something sophisticated and scientifically sound — a model where every methodological choice reflected my own knowledge and judgment, not a template someone else had designed. Something that was genuinely mine: in its architecture, its assumptions, and its intellectual foundations.

The objective was also strategic. I wanted a tool rigorous enough to anchor a broader investment platform, something I could leverage across future projects rather than rebuild from scratch each time. A first pillar, in the most literal sense.

The context that made it necessary is well known. Professional equity valuation is expensive and largely inaccessible. The models used by investment banks and asset managers are built over months, maintained by teams of analysts, and guarded behind paywalls. Retail investors and smaller funds are left making decisions with far less rigor, often relying on a P/E ratio and a gut feeling. If the platform I am building is to have any credibility, it needs a valuation engine that can hold up against professional standards. That is what I set out to build.

What I built

Over four weeks, working session by session with Claude, I built a fully automated equity valuation engine in Python. Here is what it does, in plain language.

It reads ten years of official data. The model connects directly to SEC EDGAR, the US government's official financial filing database, and parses 40 quarterly reports and 10 annual reports for the company being analyzed. No manual data entry. No spreadsheet copy-paste. Audited numbers, machine-read.

It forecasts revenues using Geometric Brownian Motion, calibrated separately for each business segment on ten years of actual quarterly data. To prevent the model from over-reacting to recent quarters, Bayesian shrinkage blends short-term momentum with long-run sector drift. A Markov regime-switching layer adds economic realism: the model alternates between expansion and recession states, with recession volatility set 30% higher than in normal periods. Terminal revenue growth of 0.50% per year is derived from the GBM drift itself, not assumed arbitrarily.
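To make the mechanics concrete, here is a minimal sketch of a regime-switching GBM revenue path of the kind described above. All parameter names and values (transition probabilities, shrinkage weight `w`, volatilities) are illustrative assumptions, not the model's actual calibration:

```python
import numpy as np

def simulate_segment_revenue(r0, mu_recent, mu_sector, w, sigma,
                             p_exp_to_rec=0.05, p_rec_to_exp=0.30,
                             rec_vol_mult=1.3, n_quarters=28, seed=0):
    """Sketch: regime-switching GBM revenue path for one segment.

    Drift blends short-term momentum with long-run sector drift
    (Bayesian shrinkage, weight w on the recent estimate); recession
    quarters carry 30% higher volatility, per the text above.
    """
    rng = np.random.default_rng(seed)
    mu = w * mu_recent + (1 - w) * mu_sector   # shrunken drift
    dt = 0.25                                   # quarterly time step
    rev = np.empty(n_quarters + 1)
    rev[0] = r0
    in_recession = False
    for t in range(n_quarters):
        # Two-state Markov regime switch (expansion <-> recession)
        if in_recession:
            in_recession = rng.random() >= p_rec_to_exp
        else:
            in_recession = rng.random() < p_exp_to_rec
        vol = sigma * (rec_vol_mult if in_recession else 1.0)
        z = rng.standard_normal()
        rev[t + 1] = rev[t] * np.exp((mu - 0.5 * vol**2) * dt
                                     + vol * np.sqrt(dt) * z)
    return rev
```

The exponential update is the standard exact GBM discretization, so paths stay strictly positive by construction.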

It models costs through an Ornstein-Uhlenbeck mean-reverting process, anchored to the historical average cost ratio using an exponentially weighted moving average. This is an important design choice: it prevents the model from projecting runaway margin compression or perpetual expansion, keeping cost trajectories grounded in economic reality while still allowing for uncertainty. EUR/USD currency sensitivity is applied directly to the relevant segment's cost ratio throughout the simulation.
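A mean-reverting cost ratio of this kind can be sketched as a discretized Ornstein-Uhlenbeck process. The reversion speed `kappa`, volatility, and FX handling below are illustrative assumptions; the real model anchors `c_bar` to an EWMA of history:

```python
import numpy as np

def simulate_cost_ratio(c0, c_bar, kappa=1.5, sigma=0.02,
                        fx_sensitivity=0.0, fx_shock=0.0,
                        n_quarters=28, seed=0):
    """Sketch: Ornstein-Uhlenbeck cost-ratio path reverting to the
    historical anchor c_bar; an FX shock shifts the exposed
    segment's ratio directly each period."""
    rng = np.random.default_rng(seed)
    dt = 0.25
    c = np.empty(n_quarters + 1)
    c[0] = c0
    for t in range(n_quarters):
        drift = kappa * (c_bar - c[t]) * dt      # pull toward the anchor
        shock = sigma * np.sqrt(dt) * rng.standard_normal()
        c[t + 1] = c[t] + drift + shock + fx_sensitivity * fx_shock
    return np.clip(c, 0.0, 1.0)  # keep the ratio a sensible fraction
```

The drift term is what rules out runaway margin trajectories: the further the ratio strays from its anchor, the harder it is pulled back.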

It applies distress guard-rails at multiple levels. CapEx is governed by a three-layer dynamic framework: a phase-gate structure tied to free cash flow thresholds, overlaid with a D&A-parity health floor that activates when net debt falls below 2.5 times EBITDA. The cost of equity carries an explicit distress premium, calculated as the spread between the cost of debt and the risk-free rate, scaled by a 60% loss-given-default assumption, reflecting elevated near-term financial risk. The conditional terminal value rule is a further guard-rail: if the company is forecast to earn below its cost of capital at the end of the horizon, the Gordon Growth model is suppressed and replaced entirely with a market multiple approach.
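The distress premium described above reduces to a one-line calculation. The function name and example rates are illustrative:

```python
def distress_premium(cost_of_debt, risk_free_rate, lgd=0.60):
    """Distress premium added to the cost of equity: the credit
    spread scaled by a loss-given-default assumption (60% here)."""
    spread = max(cost_of_debt - risk_free_rate, 0.0)
    return lgd * spread

# e.g. a 6% cost of debt against a 4% risk-free rate, at 60% LGD,
# adds 1.2 percentage points to the cost of equity
```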

It runs 10,001 simulated futures simultaneously. This is the Monte Carlo engine. Rather than producing a single "correct" number, the model runs 10,001 different scenarios using Sobol quasi-Monte Carlo sequences, a low-discrepancy sampling technique that converges faster than standard random methods. Eight correlated risk drivers (revenue per segment, cost ratios, currency, and economic regime) are jointly simulated through a Cholesky decomposition, ensuring the correlations between them are realistic rather than assumed to be independent. Every single path applies the conditional terminal value rule independently. With an odd number of paths, the median outcome is always a single, exact path, not an average.
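The Sobol-plus-Cholesky step can be sketched in a few lines with SciPy's quasi-Monte Carlo module. The two-driver correlation matrix below is purely illustrative (the real engine uses eight drivers with an estimated correlation structure):

```python
import numpy as np
from scipy.stats import norm, qmc

def correlated_normal_draws(corr, n_paths=10_001, seed=0):
    """Sobol low-discrepancy uniforms, mapped to standard normals
    via the inverse CDF, then correlated through a Cholesky factor
    of the correlation matrix."""
    dim = corr.shape[0]
    sampler = qmc.Sobol(d=dim, scramble=True, seed=seed)
    u = sampler.random(n_paths)                        # uniforms in (0, 1)
    z = norm.ppf(np.clip(u, 1e-10, 1 - 1e-10))         # independent normals
    L = np.linalg.cholesky(corr)
    return z @ L.T                                     # correlated normals

# Illustrative example: two drivers correlated at 0.5
corr = np.array([[1.0, 0.5], [0.5, 1.0]])
draws = correlated_normal_draws(corr)
```

Each row of `draws` is one joint scenario; feeding these into the revenue and cost processes is what ties the 10,001 paths together.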

It checks itself. Four automated sanity checks run after every execution: return on capital vs cost of capital, terminal value anchor, gap between the simulation median and the discounted cash flow estimate, and regime drift. If something looks wrong, it flags it.
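A sanity-check layer like this is essentially a list of flag conditions. The thresholds below are hypothetical stand-ins, chosen only to illustrate the four checks named above:

```python
def sanity_checks(roic, wacc, tv_share, mc_median, dcf_value,
                  regime_freq, expected_recession_freq=0.15):
    """Sketch: post-run sanity flags (all thresholds illustrative)."""
    flags = []
    if roic < wacc:
        flags.append("return on capital below cost of capital")
    if tv_share > 0.80:
        flags.append("terminal value dominates total value")
    if abs(mc_median / dcf_value - 1) > 0.25:
        flags.append("simulation median far from DCF estimate")
    if abs(regime_freq - expected_recession_freq) > 0.10:
        flags.append("regime frequency drifted from calibration")
    return flags
```

An empty list means a clean run; anything returned is surfaced for human review rather than silently accepted.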

It grades its own methodology. Across ten analytical dimensions (data quality, revenue modeling, cost structure, cost of capital, cash flow construction, terminal value, simulation design, stress testing, earnings quality, and internal consistency), the model scores its own rigor. The current score: 9.1 out of 10, grade A.

What makes this sound valuation, not just a number

Anyone can build a spreadsheet that spits out a price target. What makes this different is the methodology underneath.

The cost of capital updates live each run — fetching the current equity risk premium directly from Professor Damodaran's database, re-levering beta annually as the company's debt changes, and letting the risk-free rate evolve along a mean-reverting path rather than staying frozen at today's level.
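The annual beta re-levering is the standard Hamada adjustment. A minimal sketch, assuming a 21% tax rate for illustration:

```python
def relever_beta(beta_unlevered, debt, equity, tax_rate=0.21):
    """Hamada re-levering: levered beta rises with the debt/equity
    ratio, net of the debt tax shield."""
    return beta_unlevered * (1 + (1 - tax_rate) * debt / equity)

# e.g. an unlevered beta of 1.0 with debt at half of equity
# re-levers to roughly 1.40
```

Re-running this each forecast year, as the simulated debt path evolves, is what keeps the cost of equity consistent with the balance sheet instead of frozen at today's leverage.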

The terminal value, the single most impactful assumption in any DCF, applies a conditional rule: if the company is expected to earn below its cost of capital at year seven, the Gordon Growth model is suppressed and replaced with a market multiple anchor. This is the correct thing to do financially, and most models simply don't do it.
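The conditional rule itself is a simple branch. Parameter names here are illustrative, not the model's actual interface:

```python
def terminal_value(fcf_terminal, wacc, g, roic,
                   exit_multiple, terminal_ebitda):
    """Conditional terminal value: Gordon Growth only when the
    company earns its cost of capital at the horizon; otherwise
    fall back to a market multiple anchor."""
    if roic >= wacc:
        return fcf_terminal * (1 + g) / (wacc - g)  # Gordon Growth
    return exit_multiple * terminal_ebitda           # multiple anchor
```

The financial logic: a perpetuity-growth formula implicitly assumes value-creating reinvestment, which is incoherent for a business earning below its cost of capital, so the model refuses to apply it in that state.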

The earnings quality checks are independent of the valuation itself. The Beneish M-Score, an academic model with an 85%+ accuracy rate at detecting earnings manipulation, runs on the same official filing data and produces a manipulation probability score for every year in the sample.
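The Beneish M-Score is a published eight-variable linear model; its coefficients are fixed in the literature. A sketch of the score itself (the eight input ratios are computed from the filing data upstream):

```python
def beneish_m_score(dsri, gmi, aqi, sgi, depi, sgai, tata, lvgi):
    """Beneish (1999) eight-variable M-Score. Scores above roughly
    -1.78 suggest a higher likelihood of earnings manipulation."""
    return (-4.84
            + 0.920 * dsri   # days sales in receivables index
            + 0.528 * gmi    # gross margin index
            + 0.404 * aqi    # asset quality index
            + 0.892 * sgi    # sales growth index
            + 0.115 * depi   # depreciation index
            - 0.172 * sgai   # SG&A index
            + 4.679 * tata   # total accruals to total assets
            - 0.327 * lvgi)  # leverage index
```

With all index ratios at 1.0 and zero accruals (a "nothing changed" baseline), the score sits around -2.48, comfortably below the manipulation threshold.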

The model was built to be right, and to support your critical thinking, not just to produce a number.

What Claude brought to this

I want to be direct about this: I could not have built this in four weeks without Claude.

Not because the ideas were Claude's: the financial methodology, the architecture decisions, and the analytical judgments were all mine. But because building approximately 12,000 lines of Python, across a dozen interconnected modules, while simultaneously reasoning about financial theory, statistical methodology, and implementation correctness, is simply beyond what one person can hold in their head at once.

Claude served as a tireless co-pilot: catching logical errors before they propagated, translating financial intuition into working code, identifying edge cases I hadn't considered, and — critically — pushing back when my assumptions were financially unsound.

The result is a model I can genuinely defend. Every number has a traceable source. Every assumption has a documented rationale. That combination — human financial judgment, AI execution rigor — is where the real value lies.

The output on a real company

The model has been running live on a major publicly listed industrial company. The key outputs from the April 2026 live run:

  • DCF implied price: conservative single-path intrinsic value estimate
  • Monte Carlo median: the central outcome across 10,001 simulated futures — x% above the current market price
  • Downside scenario: a severe recession path implies −y% from current levels
  • Earnings quality: Beneish M-Score of −3.28 — manipulation unlikely, probability 3.6%
  • Model rating: A across 9 of 10 dimensions

The point is not the specific numbers. The point is that they emerge from a methodology that would hold up in a professional investment committee.

Where this goes next

This is the part that genuinely excites me.

Step one: multi-company. The engine is already company-agnostic: it reads from SEC EDGAR using any ticker. The next build extends it to run across an entire watchlist simultaneously: for example, screen 50 pre-selected companies, flag the ones where the Monte Carlo median diverges most significantly from the current market price, and surface the most mispriced opportunities automatically.

Step two: AI agents on top. This is the bigger ambition. Rather than a human analyst manually tweaking assumptions and re-running the model, I want a layer of autonomous agents that can monitor live earnings releases and update model inputs in real time, identify which assumptions are most sensitive and run targeted scenario analyses, compare a company's valuation across different economic regimes, and generate natural-language investment memos from model outputs.

The vision: give any investor — institutional or retail — the ability to ask "what is this company worth, and why?" about any publicly listed company in the world, and get a rigorous, methodology-grounded answer in minutes.

Why this matters

Equity research today is a privilege. It is expensive to produce and expensive to access. The analysis that moves capital is concentrated in a small number of institutions.

AI changes that equation. Not by replacing financial judgment, but by making it scalable. The same methodology that takes an analyst team weeks to build can, with the right architecture, run overnight on every company in an index.

I am not suggesting algorithms should replace human investment decisions. I am suggesting that the quality of information available to make those decisions should not depend on the size of your research budget.

That is what this project is about.

Follow along

I will be publishing the methodology in more detail (the valuation framework, the stress testing approach, the earnings quality models) on my Substack, on my blog, and here on LinkedIn.

If you are interested in quantitative equity research, AI-assisted financial modeling, or the intersection of the two, follow along. And if you are building something in the same space, or thinking about it, I would genuinely love to connect.

A presentation providing additional detail on the project, methodology, architecture, and outputs is attached to this post.

Substack: https://substack.com/@pedrosantospinto

Blog: https://premioderisco.blogspot.pt

Personal page: https://pedrosantospinto.netlify.app
