The agents behind the toolkit

Most of the toolkit works on documents. The harder questions need an agent that can actually touch the data. The ones I've built run on a small, reusable engine, and that same engine is what the rest could be built on. I've been practicing it on synthetic claims before it matters on real ones.

The engine

It's one loop. I ask for something in plain English, it works out the codes, runs a single read-only query, and hands back the members and the count. When the ask is ambiguous, like what counts as a GLP-1, it stops and asks instead of guessing. That loop is the reusable part. Everything below is the same loop with one more step on the cohort it returns.

The loop

01 plain-English ask → 02 resolve the codes → 03 one read-only query → 04 counted cohort

Every built use case = the cohort, plus one step

04 · cohort + break down by plan type and geography = Coverage Landscape

04 · cohort + put the claims in order over time = Treatment Pathway

04 · cohort + count a product's scripts by quarter and plan = Product Performance

04 · cohort + …one more step = where it could go

Ask-first gate: when the ask is ambiguous, like what counts as a GLP-1, it stops and asks instead of guessing.
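To make the shape of the loop concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the plugin's code: the `CODE_SETS` and `AMBIGUOUS` lookups, the `diagnoses` table, and the `resolve` function are all hypothetical names invented for this example. The point is only the control flow: an ambiguous ask returns a question, and a clear one returns exactly one query.

```python
from dataclasses import dataclass

# Hypothetical lookup: terms that resolve cleanly to one code list.
CODE_SETS = {
    "diabetes": ["E10", "E11", "E13"],
}

# Hypothetical list of terms with more than one reasonable reading.
AMBIGUOUS = {
    "glp-1": "Diabetes-indicated GLP-1s only, or weight-management brands too?",
}


@dataclass
class Clarification:
    """Returned instead of a query when the ask needs a decision first."""
    question: str


@dataclass
class Plan:
    """The resolved codes plus the single read-only query to run."""
    codes: list
    sql: str


def resolve(ask: str):
    """Return a Clarification if the ask is ambiguous, otherwise a Plan."""
    text = ask.lower()
    # Ask-first gate: stop on any ambiguous term before building a query.
    for term, question in AMBIGUOUS.items():
        if term in text:
            return Clarification(question)
    codes = [c for term, cs in CODE_SETS.items() if term in text for c in cs]
    if not codes:
        return Clarification("Which diagnosis or drug codes define this cohort?")
    marks = ", ".join("?" for _ in codes)
    sql = (
        "SELECT COUNT(DISTINCT member_id) FROM diagnoses "
        f"WHERE substr(icd10, 1, 3) IN ({marks})"
    )
    return Plan(codes, sql)
```

In a real agent the resolution step would be done by the model rather than a dictionary lookup; the sketch only shows where the gate sits relative to the query.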

Use case 01 · built

Claude Code plugin

Payer Practice · Cohort Builder

The first thing I pointed it at is building the cohort itself. It's a Claude Code plugin: a read-only connection to a synthetic claims database, and a subagent that knows how to query it. I ask for a cohort in plain English, it works out the codes, and it pulls the members in one query. When the ask is ambiguous, like what counts as a GLP-1, it stops and asks me instead of guessing. It hands back the count and a small sample rather than the whole member list, so the same ask works no matter how big the panel gets.
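The read-only connection and the count-plus-sample return can be sketched with Python's standard `sqlite3` module. This is an assumption-heavy illustration: the real database engine, schema, and function names are not stated here, so the `diagnoses` table, `build_synthetic_db`, and `count_and_sample` are hypothetical. SQLite's `mode=ro` URI parameter is used as one way to make writes fail at the connection level.

```python
import os
import sqlite3
import tempfile


def build_synthetic_db(path):
    """Create a tiny synthetic table (hypothetical schema) for the demo."""
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE diagnoses (member_id INTEGER, icd10 TEXT)")
    # Even-numbered members get a type 2 diabetes code, the rest hypertension.
    rows = [(i, "E11.9" if i % 2 == 0 else "I10") for i in range(1, 101)]
    con.executemany("INSERT INTO diagnoses VALUES (?, ?)", rows)
    con.commit()
    con.close()


def count_and_sample(path, prefixes, sample_size=5):
    """Return (member count, small sample of member ids), never the full list."""
    # mode=ro opens the file read-only, so any write raises OperationalError.
    con = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    try:
        marks = ", ".join("?" for _ in prefixes)
        where = f"substr(icd10, 1, 3) IN ({marks})"
        count = con.execute(
            f"SELECT COUNT(DISTINCT member_id) FROM diagnoses WHERE {where}",
            list(prefixes),
        ).fetchone()[0]
        sample = [
            row[0]
            for row in con.execute(
                f"SELECT DISTINCT member_id FROM diagnoses WHERE {where} "
                "ORDER BY member_id LIMIT ?",
                [*prefixes, sample_size],
            )
        ]
        return count, sample
    finally:
        con.close()
```

Because only the count and a fixed-size sample cross back to the caller, the size of the response does not grow with the panel.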

The kind of ask it takes

find diabetic patients (E10/E11/E13) on a diabetes-indicated GLP-1 only, continuously enrolled in 2025, age 18-75, with no HbA1c test in 2025

It shows the exact query it ran and the count it got back, on a synthetic panel.

It runs on synthetic claims by design. The mechanics transfer, the moat doesn't.

38-second clip of it running

Use case 02 · built

Claude Code plugin

Payer Practice · Coverage Landscape

The next thing I pointed it at: once it has a cohort, who pays for those patients and where they are. Same plugin, same ask-first discipline, with one more step on the cohort the loop returns. I give it a cohort in plain English, it builds the cohort the same way, then it breaks the members down by plan type and by state, with counts and percentages and a plan-by-state cross-tab. It counts each member once, and it suppresses any cell too small to share, the way you have to with real data. When it's not clear who belongs in the denominator, everyone with the diagnosis or only the continuously enrolled, it stops and asks.
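A minimal sketch of the breakdown step, assuming rows shaped like dictionaries with `member_id`, `plan_type`, and `state` keys. The threshold of 11 is an assumption chosen for illustration (small-cell rules vary by data owner), and `breakdown` is a hypothetical name. It counts each member once and hides any cell below the threshold.

```python
from collections import Counter

MIN_CELL = 11  # assumed suppression threshold; the real rule depends on the data owner


def breakdown(rows, column, min_cell=MIN_CELL):
    """Count distinct members per value of `column`, suppressing small cells.

    Returns {value: (count, percent)} with None for suppressed cells.
    """
    # Deduplicate so a member with several claim rows is counted once.
    members = {}
    for row in rows:
        members.setdefault(row["member_id"], row)
    counts = Counter(m[column] for m in members.values())
    total = len(members)
    result = {}
    for value, n in sorted(counts.items()):
        if n < min_cell:
            result[value] = None  # too small to share
        else:
            result[value] = (n, round(100 * n / total, 1))
    return result
```

This sketch only does primary suppression. With real data, a hidden cell can sometimes be recovered by subtracting the visible cells from the total, so a production version would likely need complementary suppression as well.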

It also holds at real volume. The plan-by-state breakdown on a two-million-member panel comes back in about two seconds. The first cut was slow, and the slow part was never the database; it was handing the whole member list back through the model on every step. Now it builds the cohort once and passes back a handle and a count instead of the list, so the size of the cohort stops mattering.
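One way to picture the handle-and-count pattern is a temporary table whose name serves as the handle. This is a sketch under assumptions, not the plugin's mechanism: the `diagnoses` and `enrollment` tables and the `build_cohort` and `by_plan` functions are hypothetical, and SQLite stands in for whatever database is actually used.

```python
import sqlite3
import uuid


def build_cohort(con, prefixes):
    """Materialize the cohort once; return (handle, count), not the member list."""
    handle = f"cohort_{uuid.uuid4().hex[:8]}"  # generated internally, safe to interpolate
    marks = ", ".join("?" for _ in prefixes)
    con.execute(f"CREATE TEMP TABLE {handle} (member_id INTEGER PRIMARY KEY)")
    con.execute(
        f"INSERT INTO {handle} "
        "SELECT DISTINCT member_id FROM diagnoses "
        f"WHERE substr(icd10, 1, 3) IN ({marks})",
        list(prefixes),
    )
    count = con.execute(f"SELECT COUNT(*) FROM {handle}").fetchone()[0]
    return handle, count


def by_plan(con, handle):
    """A later step joins against the handle; member ids never leave the database."""
    query = (
        "SELECT e.plan_type, COUNT(*) "
        f"FROM {handle} c JOIN enrollment e ON e.member_id = c.member_id "
        "GROUP BY e.plan_type"
    )
    return dict(con.execute(query))
```

The caller only ever sees a short string and an integer, so the payload passed between steps is the same size for a cohort of fifty members or two million.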

The kind of ask it takes

take everyone with a diabetes diagnosis (E10/E11/E13) and break them down by plan type and state

It shows the plan-type and geography split with counts and percentages, small cells suppressed, on a synthetic panel.

Use case 03 · built

Claude Code plugin

Payer Practice · Treatment Pathway

The third thing I pointed it at: once it has a cohort, put their pharmacy claims in order over time. Same plugin, same ask-first discipline, one more step on the cohort. I give it a cohort and a therapy in plain English, it builds the cohort the same way, then it walks each member's fills across the year and reports lines of therapy, switches from one drug to another, add-ons where a second drug starts alongside the first, and how long people stay on before they stop. When the ask doesn't name a therapy to follow, or says "a GLP-1" without saying which, it stops and asks.

The one real choice here is what counts as a switch. By default it counts at the brand level, the way payers usually do, so Ozempic to Rybelsus counts even though both are semaglutide. Ask for the molecule and that kind of same-drug brand change drops out; ask for the therapeutic class and the within-class switches drop out too. It's the same cohort either way; the granularity is a real decision, and the agent states the one it used. Small groups are suppressed, the way you have to with real data.
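The effect of the granularity choice can be shown with a small sketch. The `DRUGS` reference table and `count_switches` are hypothetical, and the logic is deliberately simplified: it treats any change between consecutive fills as a switch and does not try to detect add-ons, which would need fill dates and days supply.

```python
# Hypothetical drug reference: each product mapped to brand, molecule, and class.
DRUGS = {
    "Ozempic": {"brand": "Ozempic", "molecule": "semaglutide", "class": "GLP-1"},
    "Rybelsus": {"brand": "Rybelsus", "molecule": "semaglutide", "class": "GLP-1"},
    "Trulicity": {"brand": "Trulicity", "molecule": "dulaglutide", "class": "GLP-1"},
    "metformin": {"brand": "metformin", "molecule": "metformin", "class": "biguanide"},
}


def count_switches(fills, level="brand"):
    """Count changes between consecutive fills at the chosen granularity.

    fills: one member's drug names in fill-date order.
    level: "brand", "molecule", or "class".
    """
    labels = [DRUGS[drug][level] for drug in fills]
    return sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)


fills = ["Ozempic", "Ozempic", "Rybelsus", "Trulicity", "metformin"]
print(count_switches(fills, "brand"))     # 3
print(count_switches(fills, "molecule"))  # 2
print(count_switches(fills, "class"))     # 1
```

The same fill history gives three different answers, which is why it helps for the agent to state the level it used next to the number.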

The kind of ask it takes

take type 2 diabetics continuously enrolled in 2025 and put their GLP-1 and metformin therapy in order: lines of therapy, switches, and how long they stay on

It shows the lines-of-therapy split, the switch and add-on rates, and median time on treatment, small cells suppressed, on a synthetic panel.

Use case 04 · built

Claude Code plugin

Payer Practice · Product Performance

The fourth thing I pointed it at: once it has a cohort, how a product is doing over time. Same plugin, same ask-first discipline, one more step on the cohort. I give it a product and a cohort in plain English, it builds the cohort the same way, then it counts that product's scripts by quarter and by plan, and breaks out the most recent quarter by plan. When I want a specific plan's number for last quarter, it reads that cell. When the ask doesn't name a product, or asks how something is performing without saying whether I mean total scripts or new patients, it stops and asks.

The one real choice here is what to count. Total scripts is every fill; new starts is each patient's first one. They answer different questions: how much is being dispensed versus how fast a product is being picked up. The agent makes me pick and says which it used. The trend only means something if the data has a real one. Most of the synthetic fills land in January, so almost any product looks like it's sliding off through the year, which is just an artifact of how the data was built. So the panel seeds a real shift, one product steadily gaining new patients across 2025 and another losing them, and the new-start curve picks it up. Small groups are suppressed, the way you have to with real data.
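The difference between the two metrics is easy to see in a short sketch. The input shape, a list of `(member_id, fill_date)` pairs for one product, and the function names are assumptions made for this example.

```python
from collections import Counter
from datetime import date


def quarter(d):
    """Label a date with its calendar quarter, e.g. 2025Q1."""
    return f"{d.year}Q{(d.month - 1) // 3 + 1}"


def scripts_by_quarter(fills, metric="total"):
    """Count fills per quarter.

    metric="total" counts every fill.
    metric="new_starts" counts only each member's earliest fill.
    """
    if metric == "total":
        dates = [d for _, d in fills]
    elif metric == "new_starts":
        first = {}
        for member_id, d in fills:
            if member_id not in first or d < first[member_id]:
                first[member_id] = d
        dates = list(first.values())
    else:
        raise ValueError(f"unknown metric: {metric}")
    return dict(sorted(Counter(quarter(d) for d in dates).items()))


fills = [
    (1, date(2025, 1, 5)),
    (1, date(2025, 4, 5)),
    (2, date(2025, 2, 1)),
    (2, date(2025, 3, 1)),
    (3, date(2025, 7, 10)),
]
print(scripts_by_quarter(fills, "total"))       # {'2025Q1': 3, '2025Q2': 1, '2025Q3': 1}
print(scripts_by_quarter(fills, "new_starts"))  # {'2025Q1': 2, '2025Q3': 1}
```

Note that "first fill" here means first within the data supplied. A stricter definition of a new start would usually add a lookback window with no prior fills, which this sketch does not attempt.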

The kind of ask it takes

take everyone with an Ozempic fill and show scripts by quarter and by plan, with last quarter broken out by plan

It shows the scripts-per-quarter trend and the latest quarter by plan, total scripts or new starts, small cells suppressed, on a synthetic panel.

Where it could go

Same engine, same ask-first discipline, pointed at a different question. Each one is the cohort plus a single step, aimed at external teams in pharma and at payers. None of these are built yet. They are possibilities, not a roadmap.

On the claims

Burden of Illness

HEOR / Market Access

Cost the cohort against a matched comparator. Utilization and total cost of care.

Adherence & Care Gaps

HEOR / Payer quality

Who stays on therapy, and who has an open gap.

Risk Adjustment · HCC

Payer / Medicare Advantage

Surface suspected HCCs and what they do to the risk score.

If EHR or clinical notes were linked in

These don't ask the agent to read the notes. They assume the notes and EHR are already linked in as structured fields, and point the same engine at them. Same loop, richer data.

Real-World Outcomes

HEOR / Medical

Response, progression, and time on treatment from clinical fields, not claims stand-ins.

Biomarker Cohorts

Precision medicine / Oncology

Filter the cohort by biomarker and line of therapy. EGFR, PD-L1, that kind of thing.

Severity & Lab-Anchored Cohorts

HEOR

Split the cohort by stage, ECOG, and the actual lab value, not just whether a test was billed.