SynthPanel · open source · backed by SynthBench

Run a synthetic panel. Know what it's worth.

When you can't run real user research, a synthetic panel is a reasonable stand-in. It won't give you the truth, but it will give you a direction. The catch is knowing how much to read into any given result, which is the problem SynthBench works on. SynthPanel is the open-source harness you run. SynthBench is the benchmarking service that tells you how well it holds up.

Visit synthpanel.dev · See SynthBench

The pair

Run it, then weigh it.

SynthPanel is the harness you run. SynthBench is the service that tells you how much to read into what it says.

synthpanel.dev · open-source harness

SynthPanel

Runs the panel.

SynthPanel lets an agent run a synthetic survey the way a researcher would set one up. It draws from persona packs, selects the population you want, surfaces the known biases, and suggests how to tune things like the ensemble blend. Underneath it is an API built for agents to operate, so the orchestration stays out of the way.

Visit synthpanel.dev →

synthbench.org · benchmarking service

SynthBench

Measures how representative it is.

SynthBench checks whether synthetic answers actually resemble real people. It runs models and configurations against curated test datasets where the real human responses are already known, then measures how closely they match, where they fall short, and which groups they represent well. The setups that score well feed back into SynthPanel.

Visit synthbench.org →

Two projects, one feedback loop.

How they fit

Each one makes the other better.

You can use either on its own. Together they form a loop: SynthBench scores runs against its test datasets, and the setups that score well shape how SynthPanel runs elsewhere.

01 — BENCHMARK

SynthBench scores the runs

SynthPanel runs against SynthBench's test datasets (questions where the real human answers are already known), so each model and configuration gets a clear score on how closely it tracks real people.
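To make "how closely it tracks real people" concrete, here is a minimal sketch of one way to score a single question. It compares the share of each answer in the synthetic panel with the share among real respondents, using total variation distance. The metric and the names `answer_shares` and `match_score` are assumptions for illustration; this page does not say which metric SynthBench actually uses.

```python
from collections import Counter


def answer_shares(answers, options):
    """Fraction of respondents who picked each option."""
    counts = Counter(answers)
    return {opt: counts[opt] / len(answers) for opt in options}


def match_score(synthetic, human, options):
    """Illustrative score: 1.0 means identical answer shares, 0.0 means no overlap.

    Computed as 1 minus the total variation distance between the two
    answer distributions. This is an assumed metric, not SynthBench's.
    """
    p = answer_shares(synthetic, options)
    q = answer_shares(human, options)
    tvd = 0.5 * sum(abs(p[o] - q[o]) for o in options)
    return 1.0 - tvd


# Made-up data for one question with a known human answer distribution.
options = ["yes", "no", "unsure"]
human = ["yes"] * 50 + ["no"] * 30 + ["unsure"] * 20
synthetic = ["yes"] * 70 + ["no"] * 20 + ["unsure"] * 10

score = match_score(synthetic, human, options)
print(f"match score: {score:.2f}")
```

With this made-up data the synthetic panel overstates "yes" by 20 points, so the score lands at about 0.80.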

02 — RECOMMEND

SynthPanel learns

Whatever holds up best on those datasets becomes a recommendation inside SynthPanel, so the agent running your panel starts from a setup that's already been measured, not a guess.

03 — RUN

You run on what scored well

Your own panel runs on those vetted setups. As SynthBench adds datasets and tests new models, the recommendations underneath you keep improving.

↻ It compounds over time. SynthBench gives you a measurement instead of a guess, SynthPanel is what you run day to day, and as the datasets grow, the recommendations behind every panel get a little more reliable.

What's inside

What each one does.

SynthPanel is the harness you run. SynthBench is the service that measures how representative it is.

synthpanel

An agent-first API

SynthPanel is built for an agent to operate. The API handles the orchestration, batching, and map-reduce, so the agent can stay focused on the research itself.
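As a rough picture of what batching and map-reduce mean here, the sketch below splits a panel into batches, tallies each batch in parallel (the map step), and merges the tallies (the reduce step). It is not SynthPanel's API: `ask_persona`, `run_batch`, and `run_panel` are hypothetical names, and `ask_persona` is a stub standing in for a real model call.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def ask_persona(persona, question):
    """Hypothetical stand-in for a model call; returns a canned answer."""
    return "yes" if persona["age"] < 40 else "no"


def batched(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_batch(batch, question):
    """Map step: tally the answers for one batch of personas."""
    return Counter(ask_persona(p, question) for p in batch)


def run_panel(personas, question, batch_size=2):
    """Run batches in parallel, then reduce the partial tallies into one."""
    totals = Counter()
    with ThreadPoolExecutor(max_workers=4) as pool:
        partials = pool.map(lambda b: run_batch(b, question),
                            batched(personas, batch_size))
        for partial in partials:  # reduce step
            totals.update(partial)
    return totals


# Made-up panel of five personas.
personas = [{"name": f"p{i}", "age": age}
            for i, age in enumerate([22, 35, 41, 58, 63])]
totals = run_panel(personas, "Would you try this product?")
print(dict(totals))
```

The point of the split is that the caller only sees `run_panel`; batch sizes and parallelism stay behind it.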

synthpanel

Persona packs & populations

Ready-made batches of personas to draw from, plus controls to choose who's in your panel, so you're sampling the group you care about rather than a generic crowd.
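A minimal sketch of what choosing a population from a pack could look like: filter the pack by the attributes you care about, then draw a reproducible sample. The persona fields and the `select_population` helper are hypothetical, chosen for illustration rather than taken from SynthPanel.

```python
import random


def select_population(pack, size, seed=0, **criteria):
    """Sample `size` personas from `pack` that match every criterion.

    Hypothetical helper: a fixed seed keeps the draw reproducible.
    """
    matches = [p for p in pack
               if all(p.get(key) == value for key, value in criteria.items())]
    if size > len(matches):
        raise ValueError(f"only {len(matches)} personas match, need {size}")
    return random.Random(seed).sample(matches, size)


# Made-up persona pack.
pack = [
    {"id": 1, "region": "north", "age_band": "18-39"},
    {"id": 2, "region": "north", "age_band": "40+"},
    {"id": 3, "region": "south", "age_band": "18-39"},
    {"id": 4, "region": "north", "age_band": "18-39"},
    {"id": 5, "region": "north", "age_band": "18-39"},
]

panel = select_population(pack, size=2, region="north", age_band="18-39")
print([p["id"] for p in panel])
```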

synthpanel

Bias surfaced up front

SynthPanel shows the known biases before you read too much into a result, and points the agent toward tuning that has tested well, like ensemble blends.
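One simple reading of an ensemble blend is a weighted average of the answer distributions from several models, so that no single model's bias dominates. The sketch below assumes that reading; the `blend` function, the model names, and the weights are all made up for illustration and are not SynthPanel's implementation.

```python
def blend(distributions, weights):
    """Weighted average of per-model answer distributions.

    `distributions` maps model name -> {option: share};
    `weights` maps model name -> non-negative weight.
    """
    total_weight = sum(weights[m] for m in distributions)
    options = sorted({opt for dist in distributions.values() for opt in dist})
    return {
        opt: sum(weights[m] * distributions[m].get(opt, 0.0)
                 for m in distributions) / total_weight
        for opt in options
    }


# Made-up distributions from two hypothetical models.
distributions = {
    "model_a": {"yes": 0.8, "no": 0.2},
    "model_b": {"yes": 0.4, "no": 0.6},
}
weights = {"model_a": 1.0, "model_b": 3.0}

blended = blend(distributions, weights)
print(blended)
```

Here `model_b` carries three times the weight of `model_a`, which pulls the blended "yes" share down to about one half.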

synthbench

Representativeness testing

Runs against curated datasets where the real human answers are known, measuring how closely synthetic answers follow actual people, including the demographic subgroups that aggregate numbers tend to hide.
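A small worked example shows how an aggregate number can hide a subgroup miss. In the made-up data below, the synthetic panel matches the overall "yes" share exactly while being well off for each age group. The overlap score (sum of the smaller share per option) and the helper names are assumptions for illustration, not SynthBench's method.

```python
from collections import Counter, defaultdict


def share_overlap(synthetic, human):
    """Illustrative score in [0, 1]: overlap between two answer distributions."""
    s, h = Counter(synthetic), Counter(human)
    return sum(min(s[o] / len(synthetic), h[o] / len(human))
               for o in set(s) | set(h))


def scores_by_group(synthetic_rows, human_rows):
    """Return the overall score and a score per subgroup.

    Rows are (group, answer) pairs.
    """
    syn, hum = defaultdict(list), defaultdict(list)
    for group, answer in synthetic_rows:
        syn[group].append(answer)
    for group, answer in human_rows:
        hum[group].append(answer)
    overall = share_overlap([a for _, a in synthetic_rows],
                            [a for _, a in human_rows])
    per_group = {g: share_overlap(syn[g], hum[g]) for g in hum if g in syn}
    return overall, per_group


# Made-up data: real groups disagree, the synthetic panel splits evenly.
human_rows = ([("under_40", "yes")] * 80 + [("under_40", "no")] * 20
              + [("over_40", "yes")] * 20 + [("over_40", "no")] * 80)
synthetic_rows = ([("under_40", "yes")] * 50 + [("under_40", "no")] * 50
                  + [("over_40", "yes")] * 50 + [("over_40", "no")] * 50)

overall, per_group = scores_by_group(synthetic_rows, human_rows)
print(f"overall: {overall:.2f}", {g: round(v, 2) for g, v in per_group.items()})
```

The overall score comes out at 1.00 while each subgroup scores about 0.70, which is the kind of gap a per-group breakdown is meant to expose.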

synthbench

Cross-model bias

Covers how bias shifts from one model to the next, which groups a large set of agents represents well, and the cases where nondeterminism helps rather than hurts.

synthbench

Public findings

SynthBench's results are public. They're worth reading on their own, and they feed back into the recommendations everyone running SynthPanel sees.

Source: SynthPanel on GitHub · SynthBench on GitHub · PyPI
