Semantic cache proxy · BYOK

Cut your LLM bill
40–70% today

Point your base_url at cache.kaissa.ai, keep your own API key, and we cache your LLM calls semantically — zero code changes.

Three steps

How it works

01 — Add

Paste your API key

Sign in and paste your OpenAI or Anthropic key. We store it encrypted and issue you a Kaissa virtual key.

sk-ant-api01-your-key → sk-bf-abc123xyz
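
Curious what that exchange looks like? Here's a conceptual sketch in Python, not Kaissa's actual code: Fernet encryption and the secrets module stand in for whatever runs server-side.

# Conceptual sketch of the BYOK exchange, not Kaissa's implementation.
import secrets
from cryptography.fernet import Fernet

fernet = Fernet(Fernet.generate_key())  # server-side secret (stand-in)
key_vault = {}                          # virtual key -> encrypted provider key

def connect_provider_key(provider_key: str) -> str:
    # Encrypt the provider key at rest, mint a virtual key that maps to it.
    virtual_key = "sk-bf-" + secrets.token_urlsafe(12)
    key_vault[virtual_key] = fernet.encrypt(provider_key.encode())
    return virtual_key

# You paste the left side; Kaissa hands back something like the right side.
print(connect_provider_key("sk-ant-api01-your-key"))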
02 — Point

Change one line

Update your base_url. Use your Kaissa virtual key as the API key. Everything else stays the same.

- OPENAI_BASE_URL=https://api.openai.com/v1
+ OPENAI_BASE_URL=https://cache.kaissa.ai/v1
  OPENAI_API_KEY=sk-bf-abc123xyz
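
In code, the same change looks like this with the official OpenAI Python SDK (v1+), using the example virtual key from step 01:

# Point the official OpenAI SDK at the Kaissa proxy.
from openai import OpenAI

client = OpenAI(
    base_url="https://cache.kaissa.ai/v1",  # was https://api.openai.com/v1
    api_key="sk-bf-abc123xyz",              # your Kaissa virtual key
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize HNSW in one sentence."}],
)
print(response.choices[0].message.content)

The Anthropic Python SDK accepts a base_url parameter the same way.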
03 — Save

Cache hits are free

When a request is semantically similar to one we've already served, it hits our HNSW vector cache in 0.3ms. Only true misses reach the provider and cost tokens.

Cache HIT → 0.3ms · $0.00
Cache MISS → ~300ms · your cost
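
Mechanically, "semantically similar" means prompts are embedded as vectors and the nearest stored neighbor is looked up in an HNSW index. A minimal sketch with hnswlib, where embed dimensionality, the distance threshold, and the vector inputs are illustrative choices, not Kaissa's internals:

# Minimal semantic-cache sketch using hnswlib; not Kaissa's implementation.
# DIM and THRESHOLD are illustrative; vectors come from a prompt embedder.
import hnswlib
import numpy as np

DIM, THRESHOLD = 384, 0.10
index = hnswlib.Index(space="cosine", dim=DIM)
index.init_index(max_elements=100_000, ef_construction=200, M=16)
responses = {}  # cache id -> stored LLM response

def lookup(vec: np.ndarray):
    # Return a cached response if a semantically similar prompt exists.
    if index.get_current_count() == 0:
        return None  # empty cache: guaranteed miss
    labels, distances = index.knn_query(vec, k=1)
    if distances[0][0] <= THRESHOLD:
        return responses[labels[0][0]]  # HIT: served locally, $0.00
    return None                         # MISS: forward to the provider

def store(vec: np.ndarray, response: str):
    cache_id = index.get_current_count()
    index.add_items(vec, np.array([cache_id]))
    responses[cache_id] = response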
Simple pricing

Start free. Scale when you need to.

Free forever
Hack
$0/mo
10,000 requests / month
  • Exact + semantic cache
  • OpenAI + Anthropic
  • 1 API key
  • Community support
Get started free
Enterprise
Fleet
Custom
Unlimited requests
  • Dedicated 80 GB Redis
  • VPC deployment
  • SOC 2 (coming)
  • SLA + dedicated support
  • Unlimited keys
Contact us
Ready to save?

Start in 60 seconds.
No credit card.

Sign in with Google, paste your API key, change one env var. You're caching.

Sign in with Google
[Product mock: the cache.kaissa.ai dashboard. Connect your OpenAI or Anthropic key (stored encrypted), copy your Kaissa virtual key, and track cache performance: total requests, cache hit rate (higher = more savings), tokens saved, cache hits vs. misses, estimated USD savings this period, and connected keys.]