mergedelta measures whether two AI systems actually agree on the same input — frame by frame — through vocabulary convergence, reasoning alignment, and evidence integrity. When they diverge, the delta tells you which one is performing.
Pick a prompt. We run it through two AI substrates, measure the D-score, and surface the verdict. Five curated cases to start — pattern-matched to what most users ask first.
Substrate A · reasoning-focused
Substrate B · completion-focused
D-score scale: 0.0 (diverged) · 0.70 (threshold) · 1.0 (converged)
Rate limit reached — join the waitlist
3 free runs per day. Full access (unlimited cross-substrate queries, your own prompts, API integration) ships 2027 Q2 after the non-provisional patent files. Join the waitlist and we'll notify you.
audio substrates · live playback
The same instrument on audio.
mergedelta extends to any two substrates. Here they're audio outputs. Hit async to play both streams at once and let your ears do the delta in real time, or sequential for careful comparison. The D-score is computed from the sidecar acoustic probe (RMS level, fundamental frequency F0, vocal fry, kurtosis). Three applications, one measurement.
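To make the probe features concrete, here is a minimal sketch of two of them, RMS level and kurtosis, computed over raw samples. It is not mergedelta's probe: F0 and fry detection are left out, and `feature_similarity` is a hypothetical way to compare one feature across two substrates, assumed here purely for illustration.

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of audio samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def kurtosis(samples):
    """Population kurtosis (fourth standardized moment); 0.0 for a constant signal."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    if var == 0:
        return 0.0
    return sum((x - mean) ** 4 for x in samples) / (n * var ** 2)

def feature_similarity(a, b):
    """Hypothetical per-feature similarity, clamped to [0, 1]; 1.0 means identical."""
    denom = max(abs(a), abs(b))
    if denom == 0:
        return 1.0
    return max(0.0, 1.0 - abs(a - b) / denom)

# A square-ish test signal: RMS 1.0, kurtosis 1.0
signal = [1, -1, 1, -1]
print(rms(signal), kurtosis(signal))
```

A real probe would work on framed audio and combine several features; how mergedelta weights them into the acoustic D-score is not stated on this page.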
Substrate A
Substrate B
Acoustic D-score scale: 0.0 (diverged) · 0.70 (threshold) · 1.0 (converged)
Acoustic rate limit reached
3 free acoustic comparisons per day. Full access — unlimited queries, your own prompts, bring-your-own-TTS pipeline — ships 2027 Q2.
compose · tag · measure
Tone Shaper
deliberate per-phrase FX shaping for AI voice
Type a line. Drag tags onto phrases — the tag attaches invisibly to the words, no bracket syntax. Preview any highlighted phrase to hear the shaped output. Render through two substrates and see the D-score.
tagged preview · click a tag to arm (◆) then click a word · or drag tag → word
tag palette · drag onto selected word
Free: 6 compressive tags. Pro Max unlocks 20 more (expansive, emotional, scene-directives).
Substrate A · Gemini TTS
Substrate B · @Grok TTS beta
the three axes
What D-score measures
Three axes composed into one float. Above 0.70, the models have genuinely converged. Below, one is likely performing — say, producing confident-sounding output without the reasoning to back it.
axis 1
Vocabulary convergence
Do the two responses use the same terminology to describe the same thing? Surface similarity — necessary but not sufficient.
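One simple stand-in for this axis is Jaccard overlap between the two responses' word sets. The sketch below assumes that measure and a naive tokenizer; the function names are hypothetical, and mergedelta's actual vocabulary metric is not specified here.

```python
import re

def vocabulary(text):
    """Lowercased set of word tokens (letters, digits, apostrophes)."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def vocabulary_overlap(response_a, response_b):
    """Jaccard overlap of the two vocabularies, in [0, 1]."""
    va, vb = vocabulary(response_a), vocabulary(response_b)
    if not va and not vb:
        return 1.0  # assumption: two empty responses count as identical
    return len(va & vb) / len(va | vb)

# 3 shared words out of 5 distinct words
print(vocabulary_overlap("the cache is stale", "the cache is warm"))  # 0.6
```

High overlap only shows that the words match, which is why the page calls this axis necessary but not sufficient.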
axis 2
Reasoning alignment
Do the stated reasoning chains lead to the stated conclusions? Catches performed confidence — surface certainty the reasoning underneath doesn't earn.
axis 3
Evidence integrity
Do claims cite checkable sources? Are those sources real? Hallucinations surface as integrity zeros — one substrate fabricates, the other says "I don't know."
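As a rough illustration of how three axis scores could fold into one float, the sketch below assumes an unweighted mean and the 0.70 threshold quoted above. The averaging rule, the function names, and the handling of a score of exactly 0.70 are assumptions; the page does not say how the axes are actually composed.

```python
THRESHOLD = 0.70  # from the page: above this, the substrates count as converged

def d_score(vocabulary, reasoning, evidence):
    """Combine three axis scores in [0, 1] into one float (assumed: plain mean)."""
    axes = (vocabulary, reasoning, evidence)
    for value in axes:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"axis score out of range: {value}")
    return sum(axes) / len(axes)

def verdict(score):
    """'converged' strictly above the threshold, otherwise 'diverged'."""
    return "converged" if score > THRESHOLD else "diverged"

# Agreement on all three axes
print(verdict(d_score(0.9, 0.8, 0.7)))   # converged
# Matching words and reasoning, but an evidence integrity zero
print(verdict(d_score(1.0, 1.0, 0.0)))   # diverged
```

Under this assumed rule, a single integrity zero is enough to pull an otherwise perfect pair below the threshold.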