Karpathy's autoresearch method, productized

Put every idea
through the loop.

An adversarial AI finds the weakest point in your thinking. A drafter strengthens it. A regression check ensures nothing was lost. Round after round, until the gaps are gone.

adversary loop
ROUND 1
Testing: pitch deck claim

"We 10x developer productivity"

Initializing adversarial critique...

The Premise

LLMs are better at finding problems than they are at avoiding them. So instead of asking one to “write something good,” we ask one to attack what you've already written, then fix what it found, then attack again.

This is how Karpathy's autoresearch produces research-grade papers autonomously. We took the same adversarial loop and made it work for anything with stakes.

The Loop

Critique. Score. Fix. Verify.
Repeat until converged.

01

Adversarial Critique

An LLM is tasked with one job: find the weakest parts. Logical gaps, unsubstantiated claims, missing context, internal contradictions, audience mismatches. Each finding gets a severity — HIGH, MEDIUM, LOW.

02

Binary Score

A separate pass decides: are these real problems, or manufactured nitpicks? Returns 0 or 1. If 0, the loop stops — there's nothing substantive left to fix. If 1, the issues are real and we continue.

03

Targeted Fix

A drafter applies surgical changes — only what the critique identified. Every edit is logged with a reason. Your voice, your structure, your intent stay intact. Nothing changes that wasn't flagged.

04

Regression Check

Before accepting any fix, key assertions from the original are re-verified. Did the edit accidentally drop a data point? Soften a critical claim? Introduce a new contradiction? Nothing gets lost in the process.
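
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the product's implementation: the names `run_loop`, `Finding`, `LoopResult`, and the `toy_*` helpers are hypothetical, the critic, scorer, and drafter are plain functions standing in for LLM calls, and the regression check is reduced to a substring test on key assertions.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Finding:
    text: str
    severity: str  # "HIGH", "MEDIUM", or "LOW"


@dataclass
class LoopResult:
    draft: str
    rounds_run: int
    log: List[str] = field(default_factory=list)


def run_loop(
    draft: str,
    assertions: List[str],
    critique: Callable[[str], List[Finding]],
    score: Callable[[List[Finding]], int],
    fix: Callable[[str, List[Finding]], str],
    max_rounds: int = 3,
) -> LoopResult:
    """Critique, score, fix, verify; stop when the scorer returns 0."""
    log: List[str] = []
    rounds_run = 0
    for round_no in range(1, max_rounds + 1):
        rounds_run = round_no
        findings = critique(draft)              # 01 adversarial critique
        if score(findings) == 0:                # 02 binary score
            log.append(f"round {round_no}: nothing substantive, stopping")
            break
        candidate = fix(draft, findings)        # 03 targeted fix
        # 04 regression check (assumption: assertions are literal substrings)
        lost = [a for a in assertions if a not in candidate]
        if lost:
            log.append(f"round {round_no}: fix rejected, lost {lost}")
            continue
        draft = candidate
        log.append(f"round {round_no}: accepted fix for {len(findings)} finding(s)")
    return LoopResult(draft, rounds_run, log)


# Toy stand-ins for the three LLM roles, for illustration only.
def toy_critique(text: str) -> List[Finding]:
    if "10x" in text and "benchmark" not in text:
        return [Finding("'10x' claim has no benchmark", "HIGH")]
    return []


def toy_score(findings: List[Finding]) -> int:
    return 1 if any(f.severity in ("HIGH", "MEDIUM") for f in findings) else 0


def toy_fix(text: str, findings: List[Finding]) -> str:
    return text + " (per our benchmark)"


result = run_loop(
    "We 10x developer productivity", ["10x"], toy_critique, toy_score, toy_fix
)
print(result.draft)
print(result.log)
```

In this toy run the first round accepts one fix and the second round stops because the scorer returns 0. A fix that dropped the "10x" assertion would be rejected and the draft left unchanged.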

It knows when to stop

The loop detects diminishing returns automatically. It stops when no substantive issues are found, when consecutive rounds only surface low-severity nitpicks, or when the changes between rounds drop below 2%. You set the max rounds. The system finds the natural convergence point.
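
As a rough sketch of how those stopping rules could be combined, assuming the 2% figure refers to how much of the text changed between rounds (the page does not say how change is measured): `should_stop` and `change_ratio` are hypothetical names, the change measure uses Python's standard `difflib`, and "consecutive" is assumed to mean two rounds.

```python
import difflib
from typing import List


def change_ratio(previous: str, current: str) -> float:
    """Fraction of text that changed: 1 minus difflib's similarity ratio."""
    matcher = difflib.SequenceMatcher(None, previous, current, autojunk=False)
    return 1.0 - matcher.ratio()


def should_stop(
    round_no: int,
    max_rounds: int,
    severities_by_round: List[List[str]],
    previous: str,
    current: str,
    min_change: float = 0.02,   # assumed meaning of "below 2%"
    nitpick_rounds: int = 2,    # assumed meaning of "consecutive rounds"
) -> bool:
    """Call after each round; severities_by_round must not be empty."""
    if round_no >= max_rounds:
        return True                      # user-set cap reached
    if not severities_by_round[-1]:
        return True                      # no substantive issues found
    recent = severities_by_round[-nitpick_rounds:]
    only_low = all(r and all(s == "LOW" for s in r) for r in recent)
    if len(recent) == nitpick_rounds and only_low:
        return True                      # only low-severity nitpicks lately
    return change_ratio(previous, current) < min_change
```

Character-level comparison is the simplest choice here; comparing word lists instead would be cheaper on long documents and would shift where the 2% threshold falls.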

Where it works

Anything with stakes
benefits from adversarial pressure.

The loop doesn't care about the format. It cares about the claims, the logic, and the gaps. If you need something to survive scrutiny, run it through.

Fundraising & Investor Comms

Pitch decks, investor updates, board emails, term sheet memos

Catches unsubstantiated growth claims, missing context for board members, and asks that are buried below the fold.

Research & Academia

Papers, grant proposals, lit reviews, methodology sections

Finds logical gaps that a hostile reviewer would flag, unsupported claims, weak methodology framing, and missing citations.

Strategy & Planning

Business plans, roadmaps, RFPs, competitive analyses

Stress-tests assumptions, catches contradictions between sections, and hardens plans against the questions stakeholders will ask.

Sales & Client Work

Proposals, SOWs, case studies, pricing justifications

Ensures every stakeholder concern is addressed, pricing logic holds, and nothing in the fine print contradicts the pitch.

Content & Thought Leadership

Blog posts, newsletters, op-eds, launch announcements

Tightens structure, eliminates filler, strengthens your unique angle — while preserving the voice your audience expects.

Anything Else

Emails, memos, applications, policy docs, apologies

If someone important is going to read it, the loop finds what you missed. Even a 3-paragraph email can have blind spots.

Transparency

Watch every decision
as it happens.

No black box. The reasoning feed streams every step: what the critic scanned, what it found, why the scorer said yes or no, what the drafter changed, and whether all assertions survived. You see the loop think.

Live scan log · Severity tags · Change reasoning · Assertion audit
Reasoning Feed
0:02  SCAN   Checking claims against evidence...
0:04  SCAN   Testing internal consistency...
0:06  FOUND  "40% faster": no benchmark or baseline  [HIGH]
0:08  FOUND  Pricing contradicts "start free" promise  [MED]
0:10  SCORE  2 substantive issues → 1 → continue
0:14  FIX    Added industry benchmark to speed claim
0:16  FIX    Aligned pricing section with intro
0:18  PASS   6/6 assertions preserved
0:19  Round 1 complete, starting round 2...
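
One way to model entries in a feed like this is a small record with a timestamp, a step type, a message, and an optional severity tag. The sketch below is an assumption about structure only: `FeedEvent` and `format_event` are hypothetical names, and the real feed format is not documented here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeedEvent:
    seconds: int                    # elapsed time since the run started
    kind: str                       # e.g. "SCAN", "FOUND", "SCORE", "FIX", "PASS"
    message: str
    severity: Optional[str] = None  # set only on findings


def format_event(event: FeedEvent) -> str:
    """Render one event as a fixed-width line: time, step, message, tag."""
    stamp = f"{event.seconds // 60}:{event.seconds % 60:02d}"
    line = f"{stamp}  {event.kind:<5}  {event.message}"
    if event.severity:
        line += f"  [{event.severity}]"
    return line


print(format_event(FeedEvent(2, "SCAN", "Checking claims against evidence...")))
print(format_event(FeedEvent(8, "FOUND", 'Pricing contradicts "start free" promise', "MED")))
```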

Your Controls

Tune the loop to the task.

Pick the model, set the temperature, choose how many rounds to run. The loop adapts to what you need — from a quick sanity check to a deep adversarial stress-test.

Model

GPT-4o, Claude, Gemini

Free & Pro tiers

Temperature

0.0 — 1.0

Critique intensity

Rounds

1 — 7

Depth of pressure

Mode

Auto / Review

Approve each round
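
The four controls map naturally onto a small settings object with range checks. This is a hedged sketch: `LoopSettings` is a hypothetical name, the default values and the model identifier string are assumptions, and only the ranges (temperature 0.0 to 1.0, rounds 1 to 7, Auto or Review mode) come from the controls listed above.

```python
from dataclasses import dataclass

ALLOWED_MODES = ("auto", "review")


@dataclass(frozen=True)
class LoopSettings:
    model: str = "gpt-4o"       # assumed identifier, not a documented value
    temperature: float = 0.3    # assumed default
    rounds: int = 3             # assumed default
    mode: str = "auto"          # "review" means approving each round

    def __post_init__(self) -> None:
        if not 0.0 <= self.temperature <= 1.0:
            raise ValueError("temperature must be between 0.0 and 1.0")
        if not 1 <= self.rounds <= 7:
            raise ValueError("rounds must be between 1 and 7")
        if self.mode not in ALLOWED_MODES:
            raise ValueError(f"mode must be one of {ALLOWED_MODES}")


quick_check = LoopSettings(rounds=1)
deep_test = LoopSettings(temperature=0.8, rounds=7, mode="review")
print(quick_check)
print(deep_test)
```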

Stop shipping ideas
with blind spots.

Your next pitch, proposal, or paper is one adversarial loop away from surviving any scrutiny. Free to start.

Try the Loop →

No credit card required · 3 free runs per day