tech/cloudflare/ai

AI

Cloudflare AI stack. Use skills in this subdomain when building:

production
improves: tech/cloudflare

Cloudflare AI Stack

Three services that compose into the 2nth.ai AI pipeline:

ServiceRoleWhen to use
tech/cloudflare/ai/workers-aiEdge inferenceClassification, routing, embeddings — cheap + zero latency
tech/cloudflare/ai/ai-gatewayProxy + observabilityAll Claude calls — token metering, caching, fallback
tech/cloudflare/ai/vectorizeVector databaseRAG, semantic search, skill discovery

The 2nth AI pattern

Request
  → Workers AI (Llama 3.1 8B) — classify intent at edge (5–20 tokens, ~1ms)
  → If complex: AI Gateway → Claude — domain expert response
  → Vectorize — retrieve relevant skills/context for RAG
  → Response streams back to client
  → AI Gateway logs token usage (→ Penny's token economy)

This pattern minimises Claude API costs by filtering and enriching at the edge before the expensive call.