Semantic cache proxy · BYOK

Cut your LLM bill
40–70% today

Point your base_url at cache.kaissa.ai, keep your own API key, and we cache your LLM calls semantically — zero code changes.

Three steps

How it works

01 — Add

Paste your API key

Sign in and paste your OpenAI or Anthropic key. We store it encrypted and issue you a Kaissa virtual key.

sk-ant-api01-your-key → sk-bf-abc123xyz
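
Curious what that exchange looks like? Here's a conceptual sketch in Python, not Kaissa's actual code: Fernet encryption and the secrets module stand in for whatever runs server-side.

# Conceptual sketch of the BYOK exchange, not Kaissa's implementation.
import secrets
from cryptography.fernet import Fernet

fernet = Fernet(Fernet.generate_key())  # server-side secret (stand-in)
key_vault = {}                          # virtual key -> encrypted provider key

def connect_provider_key(provider_key: str) -> str:
    # Encrypt the provider key at rest, mint a virtual key that maps to it.
    virtual_key = "sk-bf-" + secrets.token_urlsafe(12)
    key_vault[virtual_key] = fernet.encrypt(provider_key.encode())
    return virtual_key

# You paste the left side; Kaissa hands back something like the right side.
print(connect_provider_key("sk-ant-api01-your-key"))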
02 — Point

Change one line

Update your base_url. Use your Kaissa virtual key as the API key. Everything else stays the same.

- OPENAI_BASE_URL=https://api.openai.com/v1
+ OPENAI_BASE_URL=https://cache.kaissa.ai/v1
  OPENAI_API_KEY=sk-bf-abc123xyz
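
In code, the same change looks like this with the official OpenAI Python SDK (v1+), using the example virtual key from step 01:

# Point the official OpenAI SDK at the Kaissa proxy.
from openai import OpenAI

client = OpenAI(
    base_url="https://cache.kaissa.ai/v1",  # was https://api.openai.com/v1
    api_key="sk-bf-abc123xyz",              # your Kaissa virtual key
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize HNSW in one sentence."}],
)
print(response.choices[0].message.content)

The Anthropic Python SDK accepts a base_url parameter the same way.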
03 — Save

Cache hits are free

When a request is semantically similar to one we've already served, it hits our HNSW vector cache in 0.3ms. Only true misses reach the provider and cost tokens.

Cache HIT → 0.3ms · $0.00
Cache MISS → ~300ms · your cost
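
Mechanically, "semantically similar" means prompts are embedded as vectors and the nearest stored neighbor is looked up in an HNSW index. A minimal sketch with hnswlib, where embed dimensionality, the distance threshold, and the vector inputs are illustrative choices, not Kaissa's internals:

# Minimal semantic-cache sketch using hnswlib; not Kaissa's implementation.
# DIM and THRESHOLD are illustrative; vectors come from a prompt embedder.
import hnswlib
import numpy as np

DIM, THRESHOLD = 384, 0.10
index = hnswlib.Index(space="cosine", dim=DIM)
index.init_index(max_elements=100_000, ef_construction=200, M=16)
responses = {}  # cache id -> stored LLM response

def lookup(vec: np.ndarray):
    # Return a cached response if a semantically similar prompt exists.
    if index.get_current_count() == 0:
        return None  # empty cache: guaranteed miss
    labels, distances = index.knn_query(vec, k=1)
    if distances[0][0] <= THRESHOLD:
        return responses[labels[0][0]]  # HIT: served locally, $0.00
    return None                         # MISS: forward to the provider

def store(vec: np.ndarray, response: str):
    cache_id = index.get_current_count()
    index.add_items(vec, np.array([cache_id]))
    responses[cache_id] = response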
Simple pricing

Start free. Scale when you need to.

Free forever
Hack
$0/mo
10,000 requests / month
  • Exact + semantic cache
  • OpenAI + Anthropic
  • 1 API key
  • Community support
Get started free
Enterprise
Fleet
Custom
Unlimited requests
  • Dedicated 80 GB Redis
  • VPC deployment
  • SOC 2 (coming)
  • SLA + dedicated support
  • Unlimited keys
Contact us
Ready to save?

Start in 60 seconds.
No credit card.

Sign in with Google, paste your API key, change one env var. You're caching.

Sign in with Google
[Product mock: the cache.kaissa.ai dashboard. Connect your OpenAI or Anthropic key (stored encrypted), copy your Kaissa virtual key, and track cache performance: total requests, cache hit rate (higher = more savings), tokens saved, cache hits vs. misses, estimated USD savings this period, and connected keys.]