Benchmarking
Structured evaluation across controlled task environments.
Comprehensive tools for rigorous prompt evaluation and discovery.
Structured evaluation framework for deterministic prompt testing.
Risk-adjusted scoring that computes performance delta, volatility, and Sharpe ratio.
Persistent storage of runs and metrics for transparent ranking.
Capped signal allocation influencing discovery without compromising fairness.
Provider-neutral evaluation across evolving inference systems.
Multi-factor leaderboard prioritizing stability and risk-adjusted performance.
Create a prompt profile inside PDX. Each prompt includes its task category, model configuration, and evaluation target. This establishes the unit of analysis — a measurable reasoning strategy.
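As a rough sketch, the prompt profile described above can be modeled as a small record type. The field names and example values here are assumptions for illustration, not PDX's actual schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PromptProfile:
    """The unit of analysis: a measurable reasoning strategy."""
    prompt_id: str          # hypothetical identifier
    task_category: str      # e.g. "summarization"
    model_config: dict      # model name, temperature, etc.
    evaluation_target: str  # what the grading pass scores against

# Example profile (illustrative values only)
profile = PromptProfile(
    prompt_id="p-001",
    task_category="summarization",
    model_config={"model": "example-model", "temperature": 0.0},
    evaluation_target="faithful three-sentence summary",
)
```

Freezing the dataclass keeps the profile immutable, so every run of the same profile refers to an identical configuration.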
PDX executes the prompt against predefined benchmark inputs. For each test case, the model generates an output and a grading pass scores it for relevance, clarity, and task completion. All runs are deterministic (temperature = 0) and reproducible.
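The execution step might look like the loop below. `generate` and `grade` are stand-in callables for the model call and the grading pass (not PDX's real API); temperature is pinned to 0 as stated above:

```python
def run_benchmark(profile, test_cases, generate, grade):
    """Run a prompt profile over predefined inputs and grade each output.

    `generate` and `grade` are injected stand-ins for the model call and
    the grading pass; temperature is fixed at 0 for reproducibility.
    """
    results = []
    for case in test_cases:
        output = generate(profile, case, temperature=0)
        # Grading pass: relevance, clarity, task completion -> one score
        score = grade(output, case)
        results.append({"input": case, "output": output, "score": score})
    return results
```

Because both the inputs and the decoding settings are fixed, re-running the loop yields the same result records.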
Evaluation scores are aggregated into quantitative metrics: mean score (average task performance), volatility (variance across runs), ROI (performance above baseline), and Sharpe ratio (risk-adjusted score). This converts prompt quality into a measurable signal.
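The aggregation step can be sketched with standard-library statistics. The baseline value and the zero-volatility guard are assumptions; the Sharpe form used is the classic (performance minus baseline) divided by volatility:

```python
from statistics import mean, pstdev

def aggregate_metrics(scores, baseline=0.5):
    """Collapse per-run scores into the four leaderboard metrics.

    `baseline` is a hypothetical reference score. Volatility is the
    population standard deviation across runs; ROI is performance
    above baseline; Sharpe is ROI per unit of volatility.
    """
    mu = mean(scores)
    vol = pstdev(scores)
    roi = mu - baseline
    sharpe = roi / vol if vol > 0 else float("inf")
    return {"mean_score": mu, "volatility": vol, "roi": roi, "sharpe": sharpe}
```

For scores of 0.8, 0.9, and 1.0 against a 0.5 baseline, this yields a mean of 0.9, ROI of 0.4, and a Sharpe ratio of roughly 4.9.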
Prompts are ranked by risk-adjusted performance. Confidence allocations weight leaderboard visibility. The result is a transparent discovery layer for high-performing reasoning strategies.
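One way to combine risk-adjusted ranking with capped confidence allocations is the sort below. The cap value and the multiplicative boost are fairness assumptions for illustration, not the production rule:

```python
def rank_prompts(metrics_by_prompt, allocations, cap=0.25):
    """Rank prompt ids by Sharpe ratio, nudged by capped allocations.

    `allocations` maps prompt id -> signal weight in [0, 1]; the cap
    bounds how far allocation can shift leaderboard visibility, so
    confidence influences discovery without overriding performance.
    """
    def key(item):
        pid, metrics = item
        boost = min(allocations.get(pid, 0.0), cap)
        return metrics["sharpe"] * (1.0 + boost)
    ranked = sorted(metrics_by_prompt.items(), key=key, reverse=True)
    return [pid for pid, _ in ranked]
```

With the cap in place, a heavily backed prompt can rise past a near peer, but not past one with substantially better risk-adjusted performance.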
© 2026 Signaldex. All rights reserved.
Designed with intention. Built with precision.