Conversation Dynamics Monitoring

Quality is not a model property.
It is a conversation property.

Horizon monitors the structural dynamics of multi-turn AI conversations — semantic drift, temporal desync, causal light-cone collapse — dimensions every LLM is architecturally blind to.

+15.7% quality lift (A/B, 4 scenarios)
87% fewer hallucination events
<50 ms pipeline latency on CPU
0 LLM calls — fully local

Get started — three paths

Path 1: Hosted MCP

No Python required. Request an alpha key via GitHub Discussions, then:

// ~/.cursor/mcp.json
{
  "mcpServers": {
    "horizon": {
      "url": "https://horizon.leocelis.com/sse",
      "headers": { "Authorization": "Bearer YOUR_KEY_HERE" }
    }
  }
}

Reload the Cursor MCP panel. Done — the new_conversation, process_turn, and configure_session tools appear instantly.

Path 2: pip install
# Install
pip install horizon-monitor

# Verify (5 canonical scenarios, ~25s)
horizon-validate

# Use in Python
from horizon import FidelityMonitor

monitor = FidelityMonitor()
session_id = monitor.new_conversation()
result = monitor.process_turn(
    session_id,
    human_message="How does Python handle memory?",
    agent_response="Python uses reference counting...",
    timestamp="2026-05-06T21:00:00Z",
)
print(result.fidelity_score, result.health_status)
Path 3: Self-hosted
# MCP server via pip
pip install 'horizon-monitor[mcp]'
horizon serve  # stdio — Cursor, Claude Desktop

# Or Docker
cd deploy/docker && docker compose up
# → MCP SSE on localhost:3847/sse

What Horizon sees — vs. other tools

| Tool | Per-response quality | Drift across turns | Temporal desync | Horizon signals |
|---|---|---|---|---|
| LangSmith, Braintrust | ✓ | ✗ | ✗ | ✗ |
| RAGAS, DeepEval | ✓ | ✗ | ✗ | ✗ |
| Human raters | ✓ (subjective) | ✗ | ✗ | ✗ |
| Horizon | intentionally skipped | ✓ | ✓ | ✓ |

Horizon does not replace per-response tools. It adds the fourth dimension they all lack. Read the demand research →

14 event types — all observe-by-default

Alerts
- alert.drift: Fidelity declining for N consecutive turns
- alert.contradiction: Bipredictability below consistency threshold
- alert.verbosity: Token Waste Ratio above verbosity threshold

Checkpoints
- checkpoint.clarification: D_JS above clarification threshold
- checkpoint.comprehension: Consistency drops below threshold

Signals
- signal.convergence: IGT trend consistently low — natural endpoint
- signal.temporal_desync: Gap + retention drop below desync threshold
- signal.light_cone_collapse: Reachable fraction below light-cone threshold
- signal.optimal_length: T* (estimated optimal length) reached
- signal.broken_reference: Reachable fraction drops too low
- signal.horizon_widening: IGT trend strongly positive — expanding
- signal.frame_shift: Spatial constraint shifts significantly
- signal.pace_shift: Conversation acceleration above pace threshold
- signal.session_reset: Large temporal gap with low retention

Validation — all four gates pass

| Gate | Constraint | v0.2.0 |
|---|---|---|
| V1 — proxy correlation | per-conv ρ ≥ 0.6, per-turn ρ ≥ 0.5 | 0.685 / 0.659 |
| V2 — per-event P/R | every event P ≥ 0.7 AND R ≥ 0.7 | all 16 events ≥ 0.70 |
| V3 — beats heuristics | ρ lift > 25%, structural P ≥ 0.6 | +202.4% lift, P = R = 1.00 |
| V5 — cross-domain | per-turn ρ ≥ 0.4 AND per-conv ρ ≥ 0.48 | min 0.517 / 0.718 |
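Gate V1 compares Horizon's fidelity score against a quality proxy by rank correlation. A minimal sketch of such a check — with a from-scratch, tie-free Spearman ρ and invented sample scores, not Horizon's actual validation harness:

```python
# Hedged sketch of a V1-style gate: Spearman rank correlation between
# fidelity scores and proxy quality labels, compared to the 0.6
# per-conversation threshold. Sample data is invented; this simple
# implementation assumes no tied values.

def spearman_rho(xs, ys):
    def ranks(vs):
        order = sorted(range(len(vs)), key=lambda i: vs[i])
        r = [0] * len(vs)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mean = (n - 1) / 2
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    # With no ties, both rank vectors have the same variance,
    # so dividing by one of them equals Pearson correlation of ranks.
    var = sum((a - mean) ** 2 for a in rx)
    return cov / var

fidelity = [0.91, 0.84, 0.62, 0.45, 0.33]  # invented per-conversation scores
proxy = [0.88, 0.70, 0.74, 0.41, 0.25]     # invented proxy quality labels
rho = spearman_rho(fidelity, proxy)
print(rho, rho >= 0.6)  # → 0.9 True
```

In practice a tie-aware implementation (e.g. scipy.stats.spearmanr) would be used; the point is only that the gate reduces to a threshold on a rank correlation.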

Full evidence pack →

Built with IVD — Intent-Verified Development

Horizon was designed, validated, and shipped using IVD — a framework where the AI writes a structured intent artifact with constraints and tests, implements against it, and verifies before you see a line of code.

Every constraint Horizon enforces, every validation gate (V1–V5), and every event type were defined in a single horizon_intent.yaml before implementation began. The result: zero undeclared behavior and four independent validation gates passing on the first attempt.
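As a purely hypothetical illustration — the actual horizon_intent.yaml schema is not shown on this page — an intent artifact of this kind might gather the constraints and gates stated above in one place:

```yaml
# Hypothetical shape of an intent artifact; the real
# horizon_intent.yaml schema may differ.
intent: monitor the structural dynamics of multi-turn AI conversations
constraints:
  - pipeline latency under 50 ms on CPU
  - zero LLM calls, fully local
validation_gates:
  V1: per-conv rho >= 0.6 AND per-turn rho >= 0.5
```

The values here are taken from the figures quoted elsewhere on this page; only the YAML layout is invented.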

IVD — Intent-Verified Development
Stop correcting AI. Make it verify itself. Zero hallucinations, one turn.
Get started with IVD →