Vector embeddings — Definition

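A vector embedding is a list of numbers that represents the meaning of a piece of content. In practice, an embedding model (text-embedding-3-large, voyage-3, BGE, etc.) takes text as input and returns a fixed-size vector of floating-point numbers (often 1,024 or 1,536 dimensions, depending on the model). Two texts with similar meaning produce vectors that point in nearly the same direction, that is, vectors separated by a small cosine distance.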
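To make "close under cosine distance" concrete, here is a minimal sketch in plain Python. The three 4-dimensional vectors are made up for illustration and are not output from any real embedding model; real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Cosine distance: 0 for identical direction, 1 for orthogonal vectors."""
    return 1.0 - cosine_similarity(a, b)


# Hypothetical toy "embeddings" (assumed values, for illustration only).
cat = [0.9, 0.1, 0.0, 0.2]
kitten = [0.8, 0.2, 0.1, 0.3]
invoice = [0.0, 0.9, 0.8, 0.1]

# The related pair is closer than the unrelated pair.
print(cosine_distance(cat, kitten) < cosine_distance(cat, invoice))  # True
```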

Vector embeddings are the foundational building block of every semantic retrieval architecture: RAG, tool-RAG, content recommendation, deduplication, and clustering. Accuracy depends on the model used (for example, text-embedding-3-large outperforms the older text-embedding-ada-002), the language (multilingual models like Voyage outperform English-only models on non-English text), and the upstream chunking strategy.

Storage in 2026: pgvector (native Postgres, ideal up to 10M vectors per tenant), Qdrant (higher raw performance, operated separately), Pinecone (managed, premium pricing). For most SaaS products, pgvector is more than sufficient.

FAQ

Why pgvector instead of Pinecone?

Up to a few million vectors per tenant, pgvector is comparably fast, with the added benefit of unified operations: a single database to back up, native row-level security (RLS), and direct SQL joins with your other tables. Pinecone becomes relevant beyond that scale.
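As an illustration of the "direct SQL joins" point, the sketch below builds a top-k similarity query that joins vector rows to an ordinary table and filters by tenant. The schema (tables chunks and documents, and their columns) and the helper to_pgvector_literal are hypothetical names chosen for this example; the `<=>` operator is pgvector's cosine-distance operator. The code only builds the query and its parameters, so it runs without a database.

```python
def to_pgvector_literal(vector):
    """Format a Python sequence as a pgvector text literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in vector) + "]"


# Assumed (hypothetical) schema:
#   documents(id, title)
#   chunks(id, document_id, tenant_id, content, embedding vector(N))
# Placeholders use the %(name)s style accepted by psycopg.
TOP_K_QUERY = """
SELECT d.title,
       c.content,
       c.embedding <=> %(query)s::vector AS distance  -- cosine distance
FROM chunks AS c
JOIN documents AS d ON d.id = c.document_id
WHERE c.tenant_id = %(tenant_id)s
ORDER BY distance
LIMIT %(k)s
"""

# A 3-dimensional query vector stands in for a real embedding here.
params = {
    "query": to_pgvector_literal([0.1, 0.2, 0.3]),
    "tenant_id": 42,
    "k": 5,
}

print(params["query"])  # [0.1,0.2,0.3]
```

With a driver such as psycopg, the query would typically be run as cursor.execute(TOP_K_QUERY, params). With a dedicated vector store, the join to documents would instead require a second lookup in the application.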

Cosine or Euclidean distance?

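Cosine for the vast majority of cases: vector magnitude carries no semantic meaning, only direction matters. Euclidean only for specific cases, such as image search over normalized vectors. Note that on unit-normalized vectors the two metrics produce the same ranking, because the squared Euclidean distance equals twice the cosine distance.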
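The difference between the two metrics can be checked numerically. The sketch below uses small made-up 2-dimensional vectors (not real embeddings) to show that scaling a vector changes its Euclidean distance but not its cosine distance, and that for unit-length vectors the squared Euclidean distance equals twice the cosine distance.

```python
import math


def norm(a):
    return math.sqrt(sum(x * x for x in a))


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (norm(a) * norm(b))


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def normalize(a):
    """Scale a non-zero vector to unit length."""
    n = norm(a)
    return [x / n for x in a]


a = [3.0, 4.0]
b = [4.0, 3.0]
scaled_b = [10 * x for x in b]  # same direction as b, 10x the magnitude

# Cosine distance ignores magnitude; Euclidean distance does not.
print(math.isclose(cosine_distance(a, b), cosine_distance(a, scaled_b)))  # True
print(euclidean_distance(a, scaled_b) > euclidean_distance(a, b))         # True

# On unit vectors: ||u - v||^2 == 2 * (1 - cos(u, v)).
ua, ub = normalize(a), normalize(b)
print(math.isclose(euclidean_distance(ua, ub) ** 2, 2 * cosine_distance(ua, ub)))  # True
```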