Jeremiah Thompson — trustworthy frontier AI systems

01 — Research

Manifold Destiny: learning by verification.

Today's AI is wasteful. It starts with billions of random numbers and grinds them, again and again, against mountains of data, burning electricity until the answers happen to improve. Most of that effort is thrown away on information that doesn't matter, and nothing it produces is ever checked for actually being right.

With my research partner, Justin Horowitz, I built the opposite: a system that keeps only what it can prove. It proposes an idea, a strict checker tests whether the idea is correct, and every idea that passes becomes a permanent building block for the next. No parameters to tune. No training. No guessing. Only ideas that have been proven true, stacked one on the next. We tested it in three very different places, each one chosen because it has a checker that can say, with certainty, whether an answer is right or wrong:

Formal math

We pointed it at Lean 4, a program that acts like a referee that never makes a mistake, checking whether a mathematical proof really holds. Working from a corpus of 218,866 real proofs, it rebuilt them one correct step at a time.

Quantum

On data from a real quantum computer, it found the one pattern that marks the edge between ordinary physics and quantum physics: the angle difference α − β. It found it all on its own, without being told what to look for.

Synthetic

On made-up logic puzzles, it cracked each hidden answer in the fewest guesses mathematically possible.

Then it did something nobody expected. Left running on Lean, the system discovered 158 new mathematical facts that were missing from mathlib, Lean's community mathematics library. Each one is the missing twin of a fact we already knew. We didn't write them. The system found them and proved them.

The technical detail, and an invitation to verify

In technical terms: a verifier-mediated learning architecture in which a bounded grammar hypothesizes candidate abstractions and a hard truth-checker accepts or rejects each one under a consumer-relative admissibility criterion. Formally: ∀ x, y: q(x) = q(y) ⇒ c(x) = c(y). Accepted abstractions extend the grammar itself, with a soundness proof preserved across generations. No weights, no gradients, no fitting. Validated across formal mathematics, synthetic GF(2), and quantum CHSH.
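A minimal sketch of how the condition above could be checked by brute force on a small finite domain. It is an illustration only, not the code in the repository. It assumes q is the candidate abstraction and c is the consumer whose answers must be preserved, a reading the text does not spell out. The names `is_admissible`, `consumer`, `keeps_bits_0_and_2`, and `keeps_bit_0_only` are hypothetical.

```python
from itertools import product


def is_admissible(q, c, domain):
    """Return True if q(x) == q(y) implies c(x) == c(y) for all x, y in domain."""
    seen = {}  # abstraction value q(x) -> consumer value c(x) first seen with it
    for x in domain:
        key, value = q(x), c(x)
        if key in seen and seen[key] != value:
            # Two inputs share an abstraction value but the consumer tells them apart.
            return False
        seen.setdefault(key, value)
    return True


# Toy domain (an assumption for illustration): all 3-bit vectors over GF(2).
domain = list(product((0, 1), repeat=3))


def consumer(x):
    """Hypothetical consumer: parity of bits 0 and 2."""
    return x[0] ^ x[2]


def keeps_bits_0_and_2(x):
    """Candidate abstraction that discards only the irrelevant bit."""
    return (x[0], x[2])


def keeps_bit_0_only(x):
    """Candidate abstraction that discards a bit the consumer depends on."""
    return (x[0],)


print(is_admissible(keeps_bits_0_and_2, consumer, domain))  # True
print(is_admissible(keeps_bit_0_only, consumer, domain))    # False
```

The first abstraction passes because the consumer's answer is fully determined by what it keeps. The second fails because two inputs that look identical after abstraction receive different answers from the consumer.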

We are asking the community to read the paper, run the reproduction, and verify the result. The full code, tests, and reproduction artifacts are public.

git clone https://github.com/sumofagents/manifold-destiny.git && cd manifold-destiny && bash reproduce/all.sh

→ Read the paper (PDF): Manifold Destiny · preprint
→ Repository & reproduction: github.com/sumofagents
→ mathlib PR #41508: leanprover-community/mathlib4

02 — Timeline

The arc, on the record.

2026

The paper

Manifold Destiny. Thompson & Horowitz.

A verifier-mediated learning architecture: a bounded grammar hypothesizes candidate abstractions; a hard truth-checker accepts or rejects each. No weights, no gradients, no fitting. Demonstrated across formal mathematics, synthetic GF(2), and quantum measurements.

Paper · Code

Preprint

2026

Three-domain validation

GF(2) · quantum CHSH · Lean 4

Structure recovery in 2^k−1 probes against a zero-information floor. CHSH violation S ≈ 2.29 (classical bound: 2) on real superconducting hardware, where the system constructs the reduction itself. Compositional proof recovery on a 218,866-record Lean corpus.

Reproduce

Validated
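For readers unfamiliar with CHSH, here is a textbook sketch of how the statistic S is computed. It is not the reproduction code. It assumes an ideal two-qubit singlet state, whose correlator depends only on the angle difference α − β, and the standard measurement angles. The actual state, angles, and hardware data are in the repository and are not reproduced here. The function names are hypothetical.

```python
import math

CLASSICAL_BOUND = 2.0                    # maximum S for any local hidden-variable model
TSIRELSON_BOUND = 2.0 * math.sqrt(2.0)   # maximum S allowed by quantum mechanics


def singlet_correlation(alpha: float, beta: float) -> float:
    """Ideal singlet-state correlator; depends only on alpha - beta."""
    return -math.cos(alpha - beta)


def chsh_s(a, a_prime, b, b_prime, correlation=singlet_correlation):
    """CHSH statistic |E(a,b) - E(a,b') + E(a',b) + E(a',b')|."""
    return abs(
        correlation(a, b)
        - correlation(a, b_prime)
        + correlation(a_prime, b)
        + correlation(a_prime, b_prime)
    )


# Standard textbook angles that maximize S for the ideal singlet state.
s_ideal = chsh_s(0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)
print(f"ideal S = {s_ideal:.3f}, classical bound = {CLASSICAL_BOUND}")
```

In this idealized setting S reaches 2√2, about 2.83. A measured value such as S ≈ 2.29 sits between the classical bound and that ceiling, which is what one would expect from noisy hardware that still violates the classical limit.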

Jul 2026

Autonomous theorem discovery

241 candidates → 158 proven

Left running on Lean, the system discovered 241 candidate theorems absent from mathlib. Each is the missing dual of an existing result, constructed by transporting a sibling proof through a duality axis. 158 survived, each one certified by the Lean kernel.

mathlib PR #41508

Discovered
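To make "missing dual" concrete, here is a pair of well-known mathlib lemmas that are order duals of each other. This is an illustration of the idea only: neither lemma is one of the 158 results, and the text does not say which duality axes the system uses, so order duality is an assumption here.

```lean
import Mathlib

-- A fact about joins: each element lies below the join.
example {α : Type*} [Lattice α] (a b : α) : a ≤ a ⊔ b := le_sup_left

-- Its order dual, a fact about meets: the meet lies below each element.
example {α : Type*} [Lattice α] (a b : α) : a ⊓ b ≤ a := inf_le_left
```

If a library contained only the first lemma, the second would be its missing dual: the same statement with the order reversed and join swapped for meet.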

03 — About

A short introduction, on the record.

I'm an AI engineer and patent holder with an extraordinary background spanning large-scale production AI, high-stakes emergency response, and frontier AI infrastructure.

One career, three rooms. I am a named inventor on US11176557B2, a behavioral-transaction model for suspicious-activity detection and SAR workflows built in an institution-scale environment serving 100M+ customers [1]. I delivered a baby on scene [2]. I am now focused on trustworthy AI infrastructure: agents, evals, model-routing control planes, and transaction-intelligence research at trillion-record scale.

The sector is not the boundary; it is the proving ground. The through-line is AI behavior under consequence: opacity, false confidence, escalation, auditability, and control.

[1] US Patent US11176557B2, behavioral anomaly detection · Google Patents
[2] "Newborn couldn't wait for", on-scene delivery · Gaston Gazette · 2013
[3] Gastonia firefighters help pregnant woman deliver baby · WBTV · 2013

04 — Focus

Four areas where the work lives.

01

Verifier-mediated learning

The core architecture: knowledge built only from verification. No weights, no training, no fitting. Proven across three domains with a self-extending-grammar soundness theorem.

02

Automated discovery

The system, pointed at formal mathematics, discovers new theorems that no human wrote. Each one is the missing dual of an existing result, certified by the Lean kernel.

03

Production ML under consequence

Machine learning that has shipped under real operational and regulatory pressure: patented neural-network work in global financial-crimes detection, built in an environment serving 100M+ customers.

04

Frontier AI with custody

Agents, model-routing control planes, and evaluation loops built for AI that stays observable, reviewable, and under human control. The same discipline, applied to machine reasoning.

05 — Work

Selected work, shipped and cited.

01

Manifold Destiny

A weightless, verifier-mediated learning architecture. With Justin Horowitz.

A learning architecture with no neural network and no training: a bounded grammar hypothesizes abstractions and a hard truth-checker keeps only the ones it can prove. Validated across formal mathematics, synthetic GF(2), and quantum CHSH. Then left running on Lean, where it discovered 158 new theorems absent from mathlib.

Paper · Code · mathlib #41508

Public

02

Behavioral transaction CNN

A patented behavioral-transaction model for suspicious-activity detection and SAR workflows.

A convolutional neural network for suspicious-activity detection on transaction data, developed in a top-tier global bank environment serving 100M+ customers and governed by regulated change-control requirements.

Proof: US Patent US11176557B2

Shipped

03

Enterprise AI & skills intelligence

AI systems for interpreting labor, skill, and organizational signals.

Principal‑engineer work on AI across a global enterprise's human‑capital function, turning public labor data and internal signals into a calibrated forward view used for workforce and contractor strategy.

Shipped

04

AI control planes & autonomous engineering systems

Agents, model gateways, eval loops, and workflows that improve software and ML systems under review.

Frontier coding agents, model‑routing control planes, research agents, and evaluation loops connected to review, custody, and change‑control rails. The same platform pattern needed when AI becomes shared engineering infrastructure.

In flight

06 — Pipeline

Autonomous research, with a custody trail.

01

Research control plane

Agents propose hypotheses, but the platform owns the rails.

Domain-specific research agents, retrieval agents, and coding agents turn questions into candidate experiments, implementation plans, and review artifacts. The goal is not unchecked autonomy; it is a system where every suggested change has context, scope, and an escalation path before it becomes execution.

In flight

02

Experiment execution layer

A GPU-backed service for turning ideas into repeatable model runs.

The execution layer accepts structured job specs, validates them, generates experiment programs, runs training in isolated workspaces, and records code, metrics, artifacts, and logs. It connects frontier-agent reasoning to real ML execution without letting the agent become the source of truth.

Research build
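A minimal sketch of the "accept a structured job spec, then validate it" step described above. The schema is not given in the text, so every field name here (`experiment_id`, `dataset`, `max_epochs`, `gpu_count`) and the `validate` helper are hypothetical, chosen only to show the pattern of rejecting a malformed request before any training runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobSpec:
    """Hypothetical job spec; the real schema is not described in the text."""
    experiment_id: str
    dataset: str
    max_epochs: int
    gpu_count: int = 1


def validate(spec: JobSpec) -> list:
    """Return a list of problems; an empty list means the spec is accepted."""
    problems = []
    if not spec.experiment_id.strip():
        problems.append("experiment_id must be non-empty")
    if not spec.dataset.strip():
        problems.append("dataset must be non-empty")
    if spec.max_epochs < 1:
        problems.append("max_epochs must be at least 1")
    if spec.gpu_count < 1:
        problems.append("gpu_count must be at least 1")
    return problems


good = JobSpec("exp-001", "holdout-v1", max_epochs=5)
bad = JobSpec("", "holdout-v1", max_epochs=0, gpu_count=0)

print(validate(good))  # no problems reported
print(validate(bad))   # one message per invalid field
```

Returning every problem at once, rather than raising on the first, keeps the rejection reviewable: the agent that proposed the job and the human reviewing it see the same complete list.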

03

Evaluation contracts

Every run has to leave behind evidence.

Experiments are compared through validation and holdout metrics, calibration summaries, saved model artifacts, result journals, and failure notes. The pattern is deliberately reviewable: generated code, data splits, scores, and decisions stay attached to the run.

Measured

04

Why it matters

A platform problem, not a model-call problem.

The hard part is giving engineers powerful frontier-model capabilities while preserving secure access, observability, evaluation, lifecycle discipline, and escalation boundaries.

Platform

07 — Pattern

The common pattern: capability with custody.

Different surfaces, same control problem.

Whether the surface is a suspicious‑activity model, an internal developer platform, a research agent, or a frontier‑model evaluation loop, the system problem is the same: increase capability without losing control.

My work sits at that boundary — model behavior, engineering velocity, evaluation, auditability, and human escalation.

01

High‑stakes engineering

AI systems for environments where mistakes have operational consequence.

02

Frontier model behavior

Evals, controls, and review loops for agentic systems and model‑assisted work.

03

Production platform discipline

Shared infrastructure, model routing, logging, custody, and rollout discipline.

08 — Proof

The same leadership, pressure‑tested in two very different chambers.

01 Internal · enterprise

Jeremiah is a strong technologist, especially in the AI/ML field, and a leader. He has a lot of expertise in AI and ML techniques and is an expert in the field. As a technology leader, Jeremiah directs the work of his team, guiding them toward the techniques that should be pursued, coaching them, and ensuring the success of the project.

He has successfully established relationships with partner teams, and Jeremiah is clear in his communication, does not hesitate to speak up, and is independent. There are a number of initiatives upcoming, and Jeremiah will be key in a lot of those efforts.

Amar Keshani · Jeremiah's former manager, Bank of America · Now Head of Technology Governance and Transformation, SMBC Capital Markets, overseeing a $200M+ technology budget and a 150+ person organization.

02 External · high‑stakes

Jeremiah excels in levelheaded leadership when it matters most. When precise decisions are required to maintain operational understanding while the world is burning down around him, Jeremiah has the ability to focus through the noise and see the problem for what it is.

He treats simulations and training as if they are the real thing, because he takes his role extremely seriously — as the first, last, and only line of defense between the circumstances of the worst day of someone's life and the outcome.

Rich Gasaway · Former Fire Chief, Roseville Fire Department · Appointed by President George W. Bush to the Medal of Valor Review Board for Firefighting; nationally recognized voice on situational awareness and decision-making under stress.

Let's build compounding discovery at scale.

I read and reply to every inquiry.

Email Jeremiah →

Direct jeremiah@jeremiahai.com