Skill v1.0.1
currentAutomated scan100/10010 files
version: "1.0.1" name: cloudflare-vectorize description: "Cloudflare Vectorize vector database for semantic search and RAG. Use for vector indexes, embeddings, similarity search, or encountering dimension mismatches, filter errors."
metadata: keywords:
- vectorize
- vector database
- vector index
- vector search
- similarity search
- semantic search
- nearest neighbor
- knn search
- ann search
- RAG
- retrieval augmented generation
- chat with data
- document search
- semantic Q&A
- context retrieval
- bge-base
- "@cf/baai/bge-base-en-v1.5"
- text-embedding-3-small
- text-embedding-3-large
- Workers AI embeddings
- openai embeddings
- insert vectors
- upsert vectors
- query vectors
- delete vectors
- metadata filtering
- namespace filtering
- topK search
- cosine similarity
- euclidean distance
- dot product
- wrangler vectorize
- metadata index
- create vectorize index
- vectorize dimensions
- vectorize metric
- vectorize binding
license: MIT
Cloudflare Vectorize
Complete implementation guide for Cloudflare Vectorize - a globally distributed vector database for building semantic search, RAG (Retrieval Augmented Generation), and AI-powered applications with Cloudflare Workers.
Status: Production Ready ✅ Last Updated: 2025-11-21 Dependencies: cloudflare-worker-base (for Worker setup), cloudflare-workers-ai (for embeddings) Latest Versions: wrangler@4.50.0, @cloudflare/workers-types@4.20251014.0 Token Savings: ~65% Errors Prevented: 8 Dev Time Saved: ~3 hours
What This Skill Provides
Core Capabilities
- ✅ Index Management: Create, configure, and manage vector indexes
- ✅ Vector Operations: Insert, upsert, query, delete, and list vectors
- ✅ Metadata Filtering: Advanced filtering with 10 metadata indexes per index
- ✅ Semantic Search: Find similar vectors using cosine, euclidean, or dot-product metrics
- ✅ RAG Patterns: Complete retrieval-augmented generation workflows
- ✅ Workers AI Integration: Native embedding generation with @cf/baai/bge-base-en-v1.5
- ✅ OpenAI Integration: Support for text-embedding-3-small/large models
- ✅ Document Processing: Text chunking and batch ingestion pipelines
Templates Included
- basic-search.ts - Simple vector search with Workers AI
- rag-chat.ts - Full RAG chatbot with context retrieval
- document-ingestion.ts - Document chunking and embedding pipeline
- metadata-filtering.ts - Advanced filtering examples
Critical Setup Rules
⚠️ MUST DO BEFORE INSERTING VECTORS
# 1. Create the index with FIXED dimensions and metricbunx wrangler vectorize create my-index \--dimensions=768 \--metric=cosine# 2. Create metadata indexes IMMEDIATELY (before inserting vectors!)bunx wrangler vectorize create-metadata-index my-index \--property-name=category \--type=stringbunx wrangler vectorize create-metadata-index my-index \--property-name=timestamp \--type=number
Why: Metadata indexes MUST exist before vectors are inserted. Vectors added before a metadata index was created won't be filterable on that property.
Index Configuration (Cannot Be Changed Later)
# Dimensions MUST match your embedding model output:# - Workers AI @cf/baai/bge-base-en-v1.5: 768 dimensions# - OpenAI text-embedding-3-small: 1536 dimensions# - OpenAI text-embedding-3-large: 3072 dimensions# Metrics determine similarity calculation:# - cosine: Best for normalized embeddings (most common)# - euclidean: Absolute distance between vectors# - dot-product: For non-normalized vectors
Wrangler Configuration
wrangler.jsonc:
{"name": "my-vectorize-worker","main": "src/index.ts","compatibility_date": "2025-10-21","vectorize": [{"binding": "VECTORIZE_INDEX","index_name": "my-index"}],"ai": {"binding": "AI"}}
TypeScript Types
export interface Env {VECTORIZE_INDEX: VectorizeIndex;AI: Ai;}interface VectorizeVector {id: string;values: number[] | Float32Array | Float64Array;namespace?: string;metadata?: Record<string, string | number | boolean | string[]>;}interface VectorizeMatches {matches: Array<{id: string;score: number;values?: number[];metadata?: Record<string, any>;namespace?: string;}>;count: number;}
Common Operations
Quick Reference
| Operation | Method | Key Point | |
|---|---|---|---|
| Insert | insert([...]) | Keeps first if ID exists | |
| Upsert | upsert([...]) | Overwrites if ID exists (use for updates) | |
| Query | query(vector, { topK, filter }) | Returns similar vectors | |
| Delete | deleteByIds([...]) | Remove by ID array | |
| Get | getByIds([...]) | Retrieve specific vectors |
Filter Operators
| Operator | Example | Description | |
|---|---|---|---|
$eq | { category: "docs" } | Equality (implicit) | |
$ne | { status: { $ne: "archived" } } | Not equal | |
$in | { category: { $in: ["a", "b"] } } | In array | |
$nin | { category: { $nin: ["x"] } } | Not in array | |
$gte/$lt | { timestamp: { $gte: 123 } } | Range queries |
📄 Full operations guide: Load references/vector-operations.md for complete insert/upsert/query/delete examples with code.
Embedding Generation
| Model | Provider | Dimensions | Best For | |
|---|---|---|---|---|
@cf/baai/bge-base-en-v1.5 | Workers AI | 768 | Free, general purpose | |
text-embedding-3-small | OpenAI | 1536 | Balance quality/cost | |
text-embedding-3-large | OpenAI | 3072 | Highest quality |
📄 Integration guides:
- Load
references/integration-workers-ai-bge-base.mdfor Workers AI setup - Load
references/integration-openai-embeddings.mdfor OpenAI integration
Metadata Best Practices
Key Limits
| Limit | Value | |
|---|---|---|
| Max metadata indexes | 10 per index | |
| Max metadata size | 10 KiB per vector | |
| String index | First 64 bytes (UTF-8) | |
| Filter size | Max 2048 bytes |
Invalid Key Characters
Keys cannot: be empty, contain . (reserved for nesting), contain ", or start with $.
📄 Complete metadata guide: Load references/metadata-guide.md for cardinality best practices, nested metadata, and advanced filtering patterns.
RAG Pattern (Full Example)
export default {async fetch(request: Request, env: Env): Promise<Response> {const { question } = await request.json();// 1. Generate embedding for user questionconst questionEmbedding = await env.AI.run('@cf/baai/bge-base-en-v1.5', {text: question});// 2. Search vector database for similar contentconst results = await env.VECTORIZE_INDEX.query(questionEmbedding.data[0],{topK: 3,returnMetadata: 'all',filter: { type: "documentation" }});// 3. Build context from retrieved documentsconst context = results.matches.map(m => m.metadata.content).join('\n\n---\n\n');// 4. Generate answer with LLM using contextconst answer = await env.AI.run('@cf/meta/llama-3-8b-instruct', {messages: [{role: "system",content: `Answer based on this context:\n\n${context}`},{role: "user",content: question}]});return Response.json({answer: answer.response,sources: results.matches.map(m => m.metadata.title)});}};
Document Chunking Strategy
Recommended chunk sizes: 300-500 characters for semantic coherence.
Key metadata for chunks:
doc_id: Parent document IDchunk_index: Position in documentcontent: Text for retrieval display
📄 Full chunking implementation: See templates/document-ingestion.ts for complete chunking pipeline.
Common Errors & Solutions
Error 1: Metadata Index Created After Vectors Inserted
Problem: Filtering doesn't work on existing vectorsSolution: Delete and re-insert vectors OR create metadata indexes BEFORE inserting
Error 2: Dimension Mismatch
Problem: "Vector dimensions do not match index configuration"Solution: Ensure embedding model output matches index dimensions:- Workers AI bge-base: 768- OpenAI small: 1536- OpenAI large: 3072
Error 3: Invalid Metadata Keys
Problem: "Invalid metadata key"Solution: Keys cannot:- Be empty- Contain . (dot)- Contain " (quote)- Start with $ (dollar sign)
Error 4: Filter Too Large
Problem: "Filter exceeds 2048 bytes"Solution: Simplify filter or split into multiple queries
Error 5: Range Query on High Cardinality
Problem: Slow queries or reduced accuracySolution: Use lower cardinality fields for range queries, or use seconds instead of milliseconds for timestamps
Error 6: Insert vs Upsert Confusion
Problem: Updates not reflecting in indexSolution: Use upsert() to overwrite existing vectors, not insert()
Error 7: Missing Bindings
Problem: "VECTORIZE_INDEX is not defined"Solution: Add [[vectorize]] binding to wrangler.jsonc
Error 8: Namespace vs Metadata Confusion
Problem: Unclear when to use namespace vs metadata filteringSolution:- Namespace: Partition key, applied BEFORE metadata filters- Metadata: Flexible key-value filtering within namespace
Wrangler CLI Reference
Essential commands:
# Create index (dimensions/metric are PERMANENT)bunx wrangler vectorize create <name> --dimensions=768 --metric=cosine# Create metadata index (MUST be before inserting vectors!)bunx wrangler vectorize create-metadata-index <name> --property-name=category --type=string# Get index infobunx wrangler vectorize info <name>
📄 Full CLI reference: Load references/wrangler-commands.md for all vectorize commands.
Performance Tips
- Batch Operations: Insert/upsert in batches of 100-1000 vectors
- Selective Return: Only use
returnValues: truewhen needed (saves bandwidth) - Metadata Cardinality: Keep indexed metadata fields low cardinality for range queries
- Namespace Filtering: Apply namespace filter before metadata filters (processed first)
- Query Optimization: Use topK=3-10 for best latency (larger values increase search time)
When to Use This Skill
✅ Use Vectorize when:
- Building semantic search over documents, products, or content
- Implementing RAG chatbots with context retrieval
- Creating recommendation engines based on similarity
- Building multi-tenant applications (use namespaces)
- Need global distribution and low latency
❌ Don't use Vectorize for:
- Traditional relational data (use D1)
- Key-value lookups (use KV)
- Large file storage (use R2)
- Real-time collaborative state (use Durable Objects)
When to Load References
| Reference File | Load When... | |
|---|---|---|
references/vector-operations.md | Need full insert/upsert/query/delete code examples | |
references/metadata-guide.md | Setting up metadata indexes, filtering best practices | |
references/wrangler-commands.md | Using Vectorize CLI commands | |
references/integration-workers-ai-bge-base.md | Integrating Workers AI embeddings | |
references/integration-openai-embeddings.md | Integrating OpenAI embeddings | |
references/embedding-models.md | Comparing embedding model options | |
references/index-operations.md | Index lifecycle management |
Templates
| Template | Purpose | |
|---|---|---|
templates/basic-search.ts | Simple vector search | |
templates/rag-chat.ts | Complete RAG chatbot | |
templates/document-ingestion.ts | Document chunking pipeline | |
templates/metadata-filtering.ts | Advanced filtering |
Secure Installation
When installing vector database packages, follow supply chain security best practices:
- Block post-install scripts —
npm config set ignore-scripts true(or Bun: disabled by default) - Cooldown period — Wait 7 days for new package versions to be vetted by the community
- Audit before installing — Run
socket package score npm <pkg>or usesocket npm install <pkg>to check packages
Load the dependency-upgrade skill for full security configuration including Socket CLI integration, cooldown setup, lockfile validation, and CI enforcement.
Official Documentation
Version: 1.0.0 Status: Production Ready ✅ Token Savings: ~65% Errors Prevented: 8 major categories Dev Time Saved: ~2.5 hours per implementation