tech/cloudflare/ai

AI

Cloudflare AI stack. Use skills in this subdomain when building:

production

improves: tech/cloudflare

Cloudflare AI Stack

Three services that compose into the 2nth.ai AI pipeline:

Service	Role	When to use
`tech/cloudflare/ai/workers-ai`	Edge inference	Classification, routing, embeddings — cheap + zero latency
`tech/cloudflare/ai/ai-gateway`	Proxy + observability	All Claude calls — token metering, caching, fallback
`tech/cloudflare/ai/vectorize`	Vector database	RAG, semantic search, skill discovery

The 2nth AI pattern

Request
  → Workers AI (Llama 3.1 8B) — classify intent at edge (5–20 tokens, ~1ms)
  → If complex: AI Gateway → Claude — domain expert response
  → Vectorize — retrieve relevant skills/context for RAG
  → Response streams back to client
  → AI Gateway logs token usage (→ Penny's token economy)

This pattern minimises Claude API costs by filtering and enriching at the edge before the expensive call.