Stop optimizing tokens.
Start optimizing outcomes.

Compass is the first model selection engine that balances cost against real business KPIs — so you spend less where outcomes don't suffer, and invest more where they matter most.

Request Early Access · See How It Works ↓
compass dashboard — customer-support workflow

Live Model Decisions ● Auto
Sonnet 4.5 · Billing dispute · CSAT ↑12% · worth 3x cost
GPT-4.1 · Password reset · same KPI · –78% cost
Gemini 2.5 · Tech support · Escalation ↓23% · –31% cost
Llama 4 · FAQ queries · same KPI · –91% cost

Cost ↔ Outcome (30-day) ● Live
CSAT Score: 4.6 (↑ 0.4 vs baseline)
Cost per Resolution: $0.08 (↓ 54% vs single-model)
Escalation Rate: 8.2% (↓ 31% vs baseline)
Cost / CSAT Point: $0.02 (↓ 61% vs GPT-4 only)
The Problem

Choosing the right model is still a manual, never-ending process.

Without Compass

Manual experimentation, blind cost cuts

Teams either overspend by defaulting to the most expensive model for everything, or cut costs blindly and watch quality metrics drop — with no systematic way to find the right balance.

One model for all tasks — overspending on simple ones
Cost cuts that silently degrade business outcomes
No visibility into cost-per-outcome by workflow
Re-evaluation every time a new model ships
With Compass

Automatic, outcome-aware selection

Compass doesn't just cut cost — it finds the optimal tradeoff between what you spend and the business outcomes you get. Less where quality doesn't suffer, more where it matters.

Cost per resolved ticket, not cost per token
CSAT and escalation rate as optimization targets
Spend allocated by outcome impact, not uniformly
Continuous learning as models and costs change
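To make "cost per resolved ticket, not cost per token" concrete, here is a minimal sketch — the types and function names are hypothetical illustrations, not the Compass SDK — showing why a cheaper model can still be the more expensive choice per outcome:

```typescript
// Illustrative only — hypothetical types, not the Compass SDK.
// "Cost per outcome" divides spend by achieved outcomes, so a cheap
// model that resolves fewer tickets can be the costlier choice.
interface TicketRecord {
  costCents: number;  // total LLM spend on this ticket, in cents
  resolved: boolean;  // did the workflow achieve the outcome?
}

function costPerResolvedTicket(tickets: TicketRecord[]): number {
  const spend = tickets.reduce((sum, t) => sum + t.costCents, 0);
  const resolved = tickets.filter((t) => t.resolved).length;
  return resolved === 0 ? Infinity : spend / resolved;
}

// A cheap model: 2¢ per ticket, but only 1 of 4 tickets resolved.
const cheapModel: TicketRecord[] = [
  { costCents: 2, resolved: false },
  { costCents: 2, resolved: false },
  { costCents: 2, resolved: false },
  { costCents: 2, resolved: true },
];
// A frontier model: 5¢ per ticket, all 3 tickets resolved.
const frontierModel: TicketRecord[] = [
  { costCents: 5, resolved: true },
  { costCents: 5, resolved: true },
  { costCents: 5, resolved: true },
];

console.log(costPerResolvedTicket(cheapModel));    // 8 — costlier per outcome
console.log(costPerResolvedTicket(frontierModel)); // 5
```

On a per-token basis the cheap model wins; on a per-outcome basis it loses — which is the tradeoff Compass optimizes.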
How It Works

Three steps to outcome-driven model selection

01

Connect

Drop in our SDK or use the OpenAI-compatible API. Point your LLM calls through Compass. Define the KPIs that matter to your business.

import { Compass } from '@compass/sdk'

const compass = new Compass({
  kpis: ['csat', 'escalation_rate'],
  workflow: 'customer-support'
})
02

Explore

Compass intelligently distributes traffic across models, running controlled experiments to learn which models drive the best business outcomes for each request type.

// Compass explores automatically
// No manual A/B test setup
const response = await compass.complete({
  messages: conversation,
  // model selected automatically
})
03

Converge

As KPI telemetry flows back, Compass converges on optimal selection policies per workflow. It continuously adapts as models improve and your product evolves.

// Report outcomes back
compass.reportOutcome({
  requestId: response.id,
  csat: 4.8,
  escalated: false,
  resolveTime: 240 // seconds
})
The Closed Loop

What makes Compass different

A continuous feedback loop between model decisions and business outcomes — fully automated.

Your App (LLM Requests) → Compass (Selection Engine) → Claude Sonnet 4.5 / GPT-4.1 / Gemini 2.5 Flash / Llama 4 Scout → Output (Response)

← KPI Telemetry: CSAT · Escalation Rate · Time-to-Close · Conversion · Custom Metrics → Selection Policy Update
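The loop above — select a model, observe KPI telemetry, update the selection policy — can be sketched as a simple bandit-style policy. This is a conceptual illustration only: the class and method names here are assumptions for the sketch, not the Compass engine, which this page describes only at the level of the diagram.

```typescript
// Conceptual sketch of a closed selection loop — NOT the Compass
// engine. An epsilon-greedy bandit: explore occasionally, otherwise
// exploit the model with the best observed KPI-minus-cost reward.
type ModelId =
  | "claude-sonnet-4.5"
  | "gpt-4.1"
  | "gemini-2.5-flash"
  | "llama-4-scout";

interface ArmStats { pulls: number; meanReward: number; }

class SelectionPolicy {
  private arms = new Map<ModelId, ArmStats>();

  constructor(models: ModelId[], private epsilon = 0.1) {
    for (const m of models) this.arms.set(m, { pulls: 0, meanReward: 0 });
  }

  // Explore with probability epsilon, otherwise pick the best arm.
  select(): ModelId {
    const ids = [...this.arms.keys()];
    if (Math.random() < this.epsilon) {
      return ids[Math.floor(Math.random() * ids.length)];
    }
    return ids.reduce((best, id) =>
      this.arms.get(id)!.meanReward > this.arms.get(best)!.meanReward
        ? id
        : best
    );
  }

  // KPI telemetry closes the loop: reward trades outcome against
  // cost, folded into a running mean per model.
  reportOutcome(model: ModelId, kpi: number, costUsd: number, costWeight = 1): void {
    const arm = this.arms.get(model)!;
    const reward = kpi - costWeight * costUsd;
    arm.pulls += 1;
    arm.meanReward += (reward - arm.meanReward) / arm.pulls;
  }
}

// Usage: select, observe, report — repeat per request.
const policy = new SelectionPolicy(
  ["claude-sonnet-4.5", "gpt-4.1", "gemini-2.5-flash", "llama-4-scout"]
);
const model = policy.select();          // explore or exploit
policy.reportOutcome(model, 4.6, 0.08); // CSAT and cost flow back
```

A production engine would condition the policy on request features (the "per workflow" and "per request type" learning described above) rather than keeping one global estimate per model; this sketch shows only the feedback mechanism.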
The Insight

We've seen this problem before —
in a different industry.

Compass was born from a pattern we recognized across two seemingly different domains.

Co-founder Alex spent years building ad optimization engines for Return on Ad Spend (ROAS) campaigns. The breakthrough insight: ad platforms that optimized only on click-through rates and impressions consistently underperformed those that closed the loop with actual business outcomes — revenue, customer lifetime value, margin per acquisition.

The ad industry spent years shifting away from pure technical metrics (CTR, CPM) that are easy to measure but often misleading. The campaigns that actually moved the needle were the ones where ad telemetry was blended with downstream business data — creating a closed feedback loop that could learn what "good" really meant for each customer.

Today's LLM model selection typically relies on token cost, latency, and benchmark quality scores — proxies that are easy to measure but don't capture what matters downstream. The model that scores highest on a benchmark isn't necessarily the model that closes more tickets, converts more leads, or reduces churn. Compass brings the same closed-loop approach to model selection that transformed ad optimization.

📡 The Ad Engine Playbook

Ad Tech: 1. Stop optimizing for clicks · 2. Blend ad telemetry + business data · 3. Optimize for ROAS directly

Same playbook, new domain

Compass: 1. Stop optimizing for tokens · 2. Blend model telemetry + KPI data · 3. Optimize for business outcomes
Use Cases

Every AI workflow has a business metric

Compass learns the right model for each one — automatically.

Customer Support

Send complex billing issues to frontier models, simple FAQs to fast/cheap models. Optimize for CSAT and resolution time.

KPI: CSAT + Escalation Rate

Sales Outreach

Maximize reply rates and meeting bookings by learning which models generate the most effective personalized messaging.

KPI: Reply Rate + Meetings Booked

Content Generation

Balance creative quality with production speed. Compass learns which models produce content that drives the most engagement.

KPI: Engagement + Publish Rate

RAG & Search

Optimize retrieval-augmented generation for answer accuracy and user satisfaction across different query complexity levels.

KPI: Answer Accuracy + Click-through

Compliance Review

Direct sensitive regulatory content to models with the lowest error rates while keeping costs manageable for routine checks.

KPI: Error Rate + Processing Time

Product Recommendations

Increase conversion by learning which models generate the most compelling product suggestions for each customer segment.

KPI: Conversion Rate + AOV
Early Access

Spend less where it doesn't matter.
Invest more where it does.

Join our early access program. Compass finds the optimal balance between model cost and business outcomes — automatically.

Request Early Access · Talk to Founders