One DNA score is not enough. Why we use two radars.

Judgment and execution are different muscles. A founder can be brilliant under simulated pressure and inconsistent in real life — or the reverse. One number hides the contradiction. Founders DNA shows two radars, side by side, deliberately — Judgment DNA from the simulator and Execution Profile from real activity. Here is why.

Karthik Balaji · 30 Apr 2026 · 6 min read

TL;DR

Judgment and execution are different muscles, and the most useful information about a founder is whether the same person has both. We render Judgment DNA (six dimensions, from the simulator) and Execution Profile (four dimensions, from real activity) as two separate radars on the same page — equal visual weight, no average, no composite. The contradiction is the signal.

A pattern we kept seeing in the alpha

We ran the Founders DNA simulator with about forty founders during the alpha. The radar at the end of the run told us a lot: Risk Calibration, Capital Discipline, Growth Instinct, Operational Rigor, People & Leadership, Crisis Response, all scored from real decisions made under time pressure. The radar was genuinely useful.

But two specific patterns kept showing up that the radar alone could not explain.

Pattern A. A founder finishes the simulator with a 92 on Crisis Response. They handled the simulated supply-chain shock with composure, picked the measured option, replied within forty seconds. Then in the post-sim conversation we ask how their actual business is going, and they have not shipped anything in three weeks. The simulator told us how they would respond to a crisis. It did not tell us whether they show up on a Tuesday morning when nothing is on fire.

Pattern B. The reverse. A founder finishes with a flat-ish radar — nothing exceptional, nothing alarming — but their LinkedIn shows daily posts, their GitHub is a wall of green, their Stripe shows seven new customers since we last spoke. The simulator did not catch what their week looked like. The radar undersold them.

In both cases the issue was not the simulator. The issue was the implicit framing that one number could capture a founder. It cannot, because judgment and execution are different muscles, and the same person can have very different amounts of each.

Why averaging breaks

The first instinct, and the wrong one, is to combine the two into a single composite — "Founder Score: 78 / 100" — that smooths out the contradiction. Every product I have seen in this space does some version of this, because a single number is easy to reason about and easy to put on a profile.

The cost of the smoothing is the part that matters. Consider two founders both with a composite of 78:

  • Founder A: Judgment 92, Execution 64. Decides well under pressure, but the contributions heatmap is sparse. Maybe great as a co-founder paired with an operator. Probably risky as a first-time CEO.
  • Founder B: Judgment 64, Execution 92. Ships every day, never misses, but plays the simulator on autopilot — picks the obvious option, freezes on edge cases. Probably great as an executor; needs serious mentorship if they will be the one calling shots when something breaks.

Both are 78. They are not interchangeable bets. The composite erased the most important thing a viewer could know about either of them.
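The arithmetic behind that example can be sketched in a few lines. This is only an illustration: it assumes the composite is a plain unweighted mean on a 0 to 100 scale (the post does not say how other products compute theirs), and `FounderScores`, `composite`, and `divergence` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FounderScores:
    name: str
    judgment: int   # assumed 0-100, from the simulator
    execution: int  # assumed 0-100, from real activity


def composite(f: FounderScores) -> float:
    """Unweighted mean: the kind of single score the post argues against."""
    return (f.judgment + f.execution) / 2


def divergence(f: FounderScores) -> int:
    """Signed gap; positive means judgment outruns execution."""
    return f.judgment - f.execution


founder_a = FounderScores("Founder A", judgment=92, execution=64)
founder_b = FounderScores("Founder B", judgment=64, execution=92)

for f in (founder_a, founder_b):
    # Same composite for both founders, opposite divergence.
    print(f.name, composite(f), divergence(f))
```

The composite is identical for both founders, while the signed gap points in opposite directions. That gap is exactly the information the average throws away.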

The fix is not to weight the two more cleverly. It is to stop combining them at all. Show both. Let the viewer read the divergence.

What each radar measures

Judgment DNA — six dimensions, sourced from the simulator.

The existing six. Risk Calibration, Capital Discipline, Growth Instinct, Operational Rigor, People & Leadership, Crisis Response. Each scored from -10 to +10 per decision, aggregated across thirty to forty decisions over a 90-day simulated business. The simulator delivers events on Telegram, the founder picks a choice, the local LLM scores the choice with reasoning. Response time is tracked and feeds into Crisis Response.

This radar answers: how does this founder think when something is on the line?
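To make the aggregation concrete, here is a minimal sketch of turning per-decision scores into a six-axis radar. The aggregation rule is an assumption: it takes the mean score per dimension and rescales it linearly from the -10..+10 range onto 0..100. The real scoring, including how response time feeds into Crisis Response, is not described here and is not modelled. `judgment_radar` is a hypothetical name.

```python
from collections import defaultdict
from statistics import mean

DIMENSIONS = (
    "Risk Calibration", "Capital Discipline", "Growth Instinct",
    "Operational Rigor", "People & Leadership", "Crisis Response",
)


def judgment_radar(decisions):
    """Aggregate per-decision scores into one value per dimension.

    `decisions` is a list of dicts mapping dimension -> score in [-10, 10].
    A decision may touch only some of the six dimensions.
    Returns dimension -> 0..100 value, or None where there is no data.
    """
    per_dim = defaultdict(list)
    for scores in decisions:
        for dim, score in scores.items():
            if dim not in DIMENSIONS:
                raise ValueError(f"unknown dimension: {dim}")
            if not -10 <= score <= 10:
                raise ValueError(f"score out of range: {score}")
            per_dim[dim].append(score)

    radar = {}
    for dim in DIMENSIONS:
        scores = per_dim.get(dim)
        # Assumed rescale: -10 maps to 0, 0 maps to 50, +10 maps to 100.
        radar[dim] = round((mean(scores) + 10) * 5) if scores else None
    return radar


sample = [
    {"Crisis Response": 8, "Risk Calibration": 2},
    {"Crisis Response": 10},
]
radar = judgment_radar(sample)
```

Returning None for an untouched dimension, rather than 0 or 50, keeps "no data yet" distinct from "scored badly" or "scored neutrally".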

Execution Profile — four dimensions, sourced from typed founder events.

The new four. Velocity, Consistency, Depth, Reach. Built from the universal founder_events stream — every webhook from GitHub, LinkedIn, Notion, Calendly, Stripe, Razorpay, Vercel, every manual proof on Telegram, every sim decision (sim decisions paint the heatmap too).

  • Velocity — speed from intent to shipped artifact.
  • Consistency — streaks, contribution density, do they show up on boring days.
  • Depth — substance per action; the LLM judges whether the artifact survives a closer read.
  • Reach — does the action travel — replies, signups, payments, genuine attention from someone with no reason to be polite.

This radar answers: what does this founder actually do all week?
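As an illustration of how one of these dimensions could be derived from the event stream, the sketch below computes two Consistency inputs: the longest run of consecutive active days, and the share of active days in a window. The `FounderEvent` fields are hypothetical (the actual founder_events schema is not shown here), and the real Consistency score may combine these signals differently or use others.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class FounderEvent:
    source: str  # hypothetical, e.g. "github", "stripe", "telegram_proof", "sim"
    kind: str    # hypothetical, e.g. "commit", "payment", "decision"
    day: date    # calendar day the event landed on


def longest_streak(events):
    """Longest run of consecutive calendar days with at least one event."""
    days = sorted({e.day for e in events})
    best = run = 0
    prev = None
    for d in days:
        consecutive = prev is not None and d - prev == timedelta(days=1)
        run = run + 1 if consecutive else 1
        best = max(best, run)
        prev = d
    return best


def active_day_density(events, start, end):
    """Fraction of days in [start, end] that have at least one event."""
    total_days = (end - start).days + 1
    if total_days <= 0:
        raise ValueError("end must not be before start")
    active = {e.day for e in events if start <= e.day <= end}
    return len(active) / total_days


events = [
    FounderEvent("github", "commit", date(2026, 3, 1)),
    FounderEvent("github", "commit", date(2026, 3, 2)),
    FounderEvent("stripe", "payment", date(2026, 3, 2)),
    FounderEvent("sim", "decision", date(2026, 3, 3)),
    FounderEvent("telegram_proof", "manual", date(2026, 3, 5)),
]
streak = longest_streak(events)
density = active_day_density(events, date(2026, 3, 1), date(2026, 3, 5))
```

Counting days rather than events means two events on the same day do not inflate the streak, which matches the idea of showing up on boring days.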

Why four dimensions for execution and not six

We thought hard about this. The simulator's six dimensions are tied to the structure of the event itself — every event has a decision under pressure that exposes one or more of risk, capital, growth, operations, people, crisis. The dimensions are behavioural.

Execution events are different. They are artifacts of work, not decisions. The right dimensions for an artifact are about the artifact's properties — how fast did it ship, how often does it ship, how substantive is it, how far does it travel. Four dimensions cover the space cleanly. Six would be padding.

I would rather have four well-defined dimensions with sharp scoring than six that overlap. The radar is a tool for reading at a glance; redundant dimensions blur the read.

The patterns the dual radar makes visible

Once you have two radars side by side, certain founder archetypes become visible in a way one radar could not show:

  • The strategist who does not ship. Judgment radar high and balanced; Execution radar low Velocity, low Consistency. Brilliant in conversation, slow in reality. Pair with an operator co-founder.
  • The grinder who needs supervision. Execution radar high and balanced; Judgment radar low on Risk Calibration and Capital Discipline. Will work harder than anyone in the room, but needs guardrails on the big decisions.
  • The operator-CEO. Both radars balanced and reasonably high. The rarest pattern, and usually the kind of founder for whom an investor stops looking at other deal flow.
  • The first-timer with potential. Both radars uneven but with at least one high spike each — a strong Crisis Response paired with strong Reach, for example. The pattern that says "this person has the raw material; they need the right context to develop."
  • The talker. Judgment radar high on people-pleasing dimensions (People & Leadership inflated by safe choices), Execution radar flat across the board. The composite would have hidden this; the dual radar surfaces it within seconds.

None of these patterns require interpretation training to see. They show up in the shape of the two radars next to each other. The viewer reads the divergence and forms a picture.
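For readers who think in rules rather than shapes, the archetypes above can be approximated with a handful of threshold checks. This is purely illustrative: the product shows the radars and leaves the reading to the viewer, and every cut-off below (70, 85, 50, a 30-point spread) is a made-up assumption, as are the function names.

```python
def strong(radar):
    """Balanced and reasonably high: every axis at or above 70 (assumed)."""
    return all(v >= 70 for v in radar.values())


def has_spike(radar):
    """At least one axis at or above 85 (assumed)."""
    return any(v >= 85 for v in radar.values())


def uneven(radar):
    """Spread between best and worst axis of 30 points or more (assumed)."""
    return max(radar.values()) - min(radar.values()) >= 30


def flat_low(radar):
    """Every axis below 50 (assumed)."""
    return all(v < 50 for v in radar.values())


def classify(judgment, execution):
    """Map two radars (dimension -> 0..100) to one of the named archetypes."""
    if strong(judgment) and strong(execution):
        return "operator-CEO"
    if strong(judgment) and execution["Velocity"] < 50 and execution["Consistency"] < 50:
        return "strategist who does not ship"
    if (strong(execution) and judgment["Risk Calibration"] < 50
            and judgment["Capital Discipline"] < 50):
        return "grinder who needs supervision"
    if flat_low(execution) and judgment["People & Leadership"] >= 85:
        return "talker"
    if uneven(judgment) and uneven(execution) and has_spike(judgment) and has_spike(execution):
        return "first-timer with potential"
    return "unclassified"


J_DIMS = ("Risk Calibration", "Capital Discipline", "Growth Instinct",
          "Operational Rigor", "People & Leadership", "Crisis Response")
E_DIMS = ("Velocity", "Consistency", "Depth", "Reach")

strategist = classify(
    dict.fromkeys(J_DIMS, 85),
    {"Velocity": 35, "Consistency": 40, "Depth": 70, "Reach": 60},
)
talker = classify(
    {**dict.fromkeys(J_DIMS, 55), "People & Leadership": 90},
    dict.fromkeys(E_DIMS, 30),
)
```

Note that every rule needs both radars as input. None of them could be written against a single composite number.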

What this means for product design

The dual radar is not a UI choice — it is the central thesis of the product. Specifically:

  • The founder profile page at /founder/<slug> is laid out in two columns, with the Judgment radar on the left and the Execution radar on the right. Same size, same visual weight.
  • There is no composite score anywhere in the product. We will not ship one even if users ask, because shipping one would undo the whole reason we run two radars.
  • The integrations marketplace suggests connectors based on which Execution dimensions are sparse — if your Reach is low because you have nothing connected that reports replies or audience, we suggest LinkedIn or Twitter; if your Depth is low because the LLM has nothing substantive to read, we suggest Notion or GitHub.
  • The simulator's cadence engine can shift the balance between sim events and real-world tasks on a given day, partly based on which radar is currently undertested. If your Judgment data is thin, you get more events. If your Execution data is thin, you get more tasks.

The whole product loops around the gap between the two radars and tries to close it — by giving the founder more ways to demonstrate both, by surfacing the divergence to the founder so they can see it, and by giving viewers the layout that lets them read it.
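The connector suggestions and the cadence rule described in the list above can be sketched roughly as follows. The Reach and Depth mappings come from the examples given there; everything else is an assumption for illustration, including the sparseness threshold of 50, the use of raw sample counts to decide which radar is thinner, and the function names. The real cadence engine is only partly driven by this signal.

```python
# Connector suggestions per sparse Execution dimension (from the examples above).
SUGGESTIONS = {
    "Reach": ("LinkedIn", "Twitter"),
    "Depth": ("Notion", "GitHub"),
}


def suggest_connectors(execution_radar, connected, sparse_below=50):
    """Suggest connectors for sparse dimensions, skipping ones already connected."""
    suggestions = []
    for dim, connectors in SUGGESTIONS.items():
        if execution_radar.get(dim, 0) < sparse_below:
            for connector in connectors:
                if connector not in connected and connector not in suggestions:
                    suggestions.append(connector)
    return suggestions


def next_prompt(judgment_samples, execution_samples):
    """Pick what to send today based on which radar has less data behind it.

    Thin Judgment data -> a sim event; otherwise a real-world task.
    Ties go to a real-world task, which is an arbitrary choice in this sketch.
    """
    return "sim_event" if judgment_samples < execution_samples else "real_task"


picks = suggest_connectors(
    {"Velocity": 70, "Consistency": 60, "Depth": 40, "Reach": 30},
    connected={"GitHub"},
)
```

In this example the founder already has GitHub connected, so only Notion is suggested for Depth, after the two Reach connectors.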

Why this matters for the credential

The endgame for Founders DNA is a credential that an investor, accelerator, or future co-founder can read in five minutes and act on. The dual radar plus the contributions heatmap plus the pinned ventures together form that credential.

A single number would have been an easier sell. It would also have been a worse credential. The signal value of "Founder X has Judgment 92, Execution 64, with a 31-day Stripe streak in March" is much higher than "Founder X scored 78." The viewer can do something with the first sentence. They cannot do anything with the second.

We are building for the read, not for the headline.

See your dual radar. Reserve a Founders DNA profile.

— Karthik

Frequently Asked Questions

Why not just average the two radars into one founder score?

Because the contradiction is the signal. A founder who decides well in the simulator and ships nothing in real life is a different bet from a founder who ships every day and freezes when the simulator throws a crisis at them. Averaging the two hides the very pattern an investor most needs to see.

Are the two radars weighted differently?

No. They sit side by side at equal visual weight. The viewer decides which matters more for their specific decision — a seed-stage investor looking at a pre-product founder may weight Execution heavily because the simulator has not yet been tested against reality; a Series A investor may weight Judgment because Execution has plenty of real-world signal already.

Can the two radars contradict each other?

Yes, and the contradiction is the most useful pattern. A founder with Crisis Response 92 in the simulator and Consistency 38 in the activity graph is telling you something important about how they think versus how they show up. The two-radar layout is designed for that exact reading.

Does the simulator score change as my real activity grows?

No. The two radars are independent by design. The simulator radar is a function of decisions made in the simulator; the Execution radar is a function of typed founder events from your integrations and proofs. Mixing the two would defeat the purpose.

Karthik Balaji

Founder, CopilotVerse — ex-Microsoft Copilot engineer