
Vector Databases: The Memory of AI

SQL uses exact keyword matching. Vector Databases use Semantic Search. How to build search engines that understand that 'Scarlet' is similar to 'Red'.

Alex B.

The History of Search: From SQL to Vectors

To understand why Vector Databases are necessary, we have to look at the history of Information Retrieval.

Generation 1: Exact Match (1970s)

The SQL LIKE operator. Query: SELECT * FROM products WHERE description LIKE '%dress%'. This is binary: the row either contains the string “dress” or it doesn’t. If the user types “attire”, they get zero results. This forced users to learn “Search-Speak” (typing the specific keywords they knew would work).

Generation 2: Inverted Indexes (1990s)

Lucene, Elasticsearch, Algolia. These split text into tokens (["blue", "dress"]), remove stop words ("the", "a"), and perform stemming ("running" -> "run"). They then score documents using TF-IDF (Term Frequency-Inverse Document Frequency) or BM25. This allowed fuzzy matching and typo tolerance, but it still failed on meaning: “warm coat” does not match “Winter Jacket” in a keyword engine unless someone adds the synonym by hand.

Generation 3: Semantic Search (2020s)

Vectors. We no longer look at words; we look at concepts. The word “King” is no longer a string of four characters (K-i-n-g). It is a point in a 1,536-dimensional space. And guess what? The point for “Queen” is right next to it. Using cosine similarity, we can find documents that are conceptually similar even if they share zero keywords.

Why Maison Code Discusses This

We don’t build search bars; we build Discovery Engines. For our clients with catalogs >1,000 SKUs, standard search is a conversion killer. We recently implemented hybrid vector search for a luxury furniture brand:

  • Challenge: Users searched for “Vibes” (“Cozy aesthetic for reading nook”), not Keywords (“Velvet Armchair”).
  • Action: We vectorized their entire catalog using OpenAI Embeddings.
  • Result: Bounce rate on search pages dropped by 35%. We believe that in 2026, search must be semantic, not syntactic.

Understanding Embeddings (The Math)

How does a computer understand “meaning”? It converts text into numbers. This process is called embedding.

Imagine a simple 2D graph of all concepts.

  • X-Axis: Royalness (How royal is it?)
  • Y-Axis: Gender (Masculine vs Feminine)

Now let’s map some words:

  • “King”: [0.9, 0.9] (Very Royal, Masculine)
  • “Queen”: [0.9, 0.1] (Very Royal, Feminine)
  • “Man”: [0.1, 0.9] (Not Royal, Masculine)
  • “Woman”: [0.1, 0.1] (Not Royal, Feminine)

Here is the magic. You can do math with these words. King - Man + Woman = ?

  • [0.9, 0.9] - [0.1, 0.9] + [0.1, 0.1]
  • = [0.9, 0.1]
  • = Queen.

This is, in essence, how language models “think”: relationships between concepts become vector arithmetic. Modern models like OpenAI’s text-embedding-3-small use 1,536 dimensions (not just 2) to capture nuances like sentiment, color, urgency, and formality, plus hundreds of other linguistic features we can’t even name.
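Here is that arithmetic as runnable code. A minimal sketch using the toy 2D vectors from above (the values are illustrative, and the cosineSimilarity helper is our own, not a library function):

// Toy 2D embeddings from the example above (illustrative values only)
const king = [0.9, 0.9];
const man = [0.1, 0.9];
const woman = [0.1, 0.1];
const queen = [0.9, 0.1];

// Element-wise vector arithmetic: King - Man + Woman
const result = king.map((v, i) => v - man[i] + woman[i]);

// Cosine similarity: 1.0 means "pointing the same way", i.e. the same concept
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, v, i) => sum + v * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

console.log(result);                          // ~[0.9, 0.1]
console.log(cosineSimilarity(result, queen)); // ~1 (same direction) -> Queen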

The Vector Database Stack

You cannot store these vectors in a normal MySQL database efficiently. Standard databases are optimized for B-Tree indexing (sorting alphanumerically). Vector databases are optimized for Approximate Nearest Neighbor (ANN) search. Scanning 1 million vectors to find the closest one would take seconds (a linear scan); vector DBs use HNSW (Hierarchical Navigable Small World) graphs to do it in milliseconds.
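To see what ANN search is replacing, here is the naive linear scan as a sketch (reusing the cosineSimilarity helper from the sketch above; the Doc shape is illustrative):

// Naive O(N) nearest-neighbor search: compare the query against every vector.
// Fine for a few thousand items; hopeless for millions.
type Doc = { id: string; vector: number[] };

function linearScan(query: number[], docs: Doc[], topK = 5): Doc[] {
  return docs
    .map(doc => ({ ...doc, score: cosineSimilarity(query, doc.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

HNSW-based engines return (approximately) the same answer without ever touching most of those vectors.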

Top Players:

  1. Pinecone: Managed, cloud-native. The industry standard. Easy to start.
  2. Weaviate: Open-source, modular. Allows storing objects alongside vectors. Good for self-hosting.
  3. Milvus: Built for very high scale (billions of vectors). Open-source, with a managed version from Zilliz.
  4. pgvector: A plugin for PostgreSQL. Great if you want to keep everything in one place (Supabase supports this).

Comparison Matrix

| Feature            | Pinecone     | pgvector                | Weaviate      |
|--------------------|--------------|-------------------------|---------------|
| Type               | Managed SaaS | Postgres Extension      | OSS / Managed |
| Latency            | Ultra Low    | Medium                  | Low           |
| Complexity         | Low          | Low                     | Medium        |
| Cost               | $$$          | $ (Free if self-hosted) | $$            |
| Metadata Filtering | Excellent    | Good (SQL)              | Excellent     |

Implementation: Building a Semantic Search Engine

Here is how you build “Search by Meaning” using Node.js and Pinecone.

import { Pinecone } from '@pinecone-database/pinecone';
import OpenAI from 'openai';

// Initialize Clients
const pinecone = new Pinecone({ apiKey: process.env.PINECONE_KEY! });
const openai = new OpenAI({ apiKey: process.env.OPENAI_KEY! });

const index = pinecone.index('maison-products');

async function searchProducts(userQuery: string) {
  console.log(`Searching for: ${userQuery}`);

  // 1. Convert User Query to Vector
  const embedding = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: userQuery,
  });
  const vector = embedding.data[0].embedding;

  // 2. Query Vector DB for nearest neighbors
  const results = await index.query({
    vector: vector,
    topK: 5, // Get top 5 matches
    includeMetadata: true,
    // Optional: Filter by category
    filter: {
      price: { $lt: 200 } // Only items under $200
    }
  });

  // 3. Display Results
  results.matches.forEach(match => {
    console.log(`Found: ${match.metadata?.title} (Score: ${match.score})`);
  });
}

// Usage
searchProducts("Something elegant for a summer wedding under $200");

What happens:

  1. OpenAI turns “Something elegant…” into a vector.
  2. Pinecone finds the nearest product vectors in your catalog (approximately, via its ANN index rather than a brute-force scan).
  3. It finds “Silk Floral Dress” and “Linen Suit”.
  4. The word “Wedding” never appears in either product description! The embedding model simply knows that silk and linen are associated with summer weddings.

Vector search is magic, but it isn’t perfect. It sometimes surfaces results that are only loosely related, or misses obvious keyword matches.

  • Query: “Product ID 1234”.
  • Vector Search: Might return “Product ID 1235”, because the embedding treats the digits as semantic features and the two IDs land almost on top of each other in vector space.
  • User: “I typed the exact ID, why did you give me the wrong one?”

Solution: Hybrid Search. You combine Keyword Search (BM25) with Vector Search (Dense).

  • Keyword Score: 1.0 (Exact match)
  • Vector Score: 0.2 (Low match)
  • Weighted Sum: The Keyword match wins.

Pinecone and Weaviate now support Hybrid Search out of the box (“Sparse-Dense” vectors). You pass both the dense vector (embedding) and the sparse vector (keywords) to the query. This is the Gold Standard for E-commerce search.
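A minimal sketch of the blending step, assuming you already have a keyword (BM25) score and a vector score per document (the 0.7/0.3 weighting is illustrative; tune it for your catalog):

// Hybrid scoring: blend sparse (keyword) and dense (vector) relevance.
// alpha = 1.0 is pure vector search; alpha = 0.0 is pure keyword search.
function hybridScore(keywordScore: number, vectorScore: number, alpha = 0.3): number {
  return alpha * vectorScore + (1 - alpha) * keywordScore;
}

// Exact SKU match: the keyword engine is certain, the vector engine is not.
console.log(hybridScore(1.0, 0.2)); // 0.76 -> the keyword match wins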

RAG: Retrieval Augmented Generation

Vector Databases are also the backbone of Enterprise AI (RAG). (See LLM Fine-Tuning.) When you ask ChatGPT a question about your documents, it runs a vector search step behind the scenes to find the right “page numbers” before answering. Without Vector DBs, Enterprise AI is impractical: you cannot feed 10,000 pages into a prompt. You feed the 3 most relevant pages, and vector search finds those 3 pages.
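A compressed sketch of that pipeline, reusing the openai and index clients from the implementation section (the chat model name and prompt wording are our assumptions, not fixed requirements):

// RAG in three steps: embed the question, retrieve context, generate the answer.
async function answerFromDocs(question: string): Promise<string> {
  // 1. Embed the question
  const emb = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: question,
  });

  // 2. Retrieve the 3 most relevant chunks (not all 10,000 pages)
  const { matches } = await index.query({
    vector: emb.data[0].embedding,
    topK: 3,
    includeMetadata: true,
  });
  const context = matches.map(m => m.metadata?.text).join("\n---\n");

  // 3. Generate an answer grounded in the retrieved context
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini", // assumption: any chat model works here
    messages: [
      { role: "system", content: `Answer using only this context:\n${context}` },
      { role: "user", content: question },
    ],
  });
  return completion.choices[0].message.content ?? "";
}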

The Skeptic’s View: When NOT to use Vectors

Vectors are hype right now, but they are expensive and complex. Do NOT use Vector Search if:

  1. You have < 100 items: Just use Array.filter() (see the sketch after this list).
  2. Your users search by Exact SKU: Vector search is terrible at exact codes.
  3. Your budget is $0: Embeddings cost money (OpenAI API). Vector DBs cost money. Algolia is often cheaper and “good enough”.
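For the first case, a plain in-memory filter really is enough. A minimal sketch (the product shape is illustrative):

// Under ~100 items, a substring filter beats any vector pipeline:
// zero latency, zero cost, zero infrastructure.
const products = [{ title: "Velvet Armchair" }, { title: "Linen Suit" }];

function simpleSearch(query: string) {
  const q = query.toLowerCase();
  return products.filter(p => p.title.toLowerCase().includes(q));
}

console.log(simpleSearch("linen")); // [{ title: "Linen Suit" }]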

FAQ

Q: Can I use Vectors for image search? A: Yes! Models like CLIP (Contrastive Language-Image Pre-Training) can embed Images into the same vector space as Text. You can search for “Dog” and find a picture of a dog.

Q: How often should I update the index? A: Real-time. If a product goes out of stock, you should update the metadata in Pinecone immediately so the search doesn’t return OOS items.
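A sketch of that real-time update with the Pinecone client from earlier (the inStock metadata field is an assumption about your schema):

// Product sold out: flag it in the index immediately...
await index.update({
  id: 'prod-1234',
  metadata: { inStock: false },
});

// ...and exclude flagged items at query time:
// filter: { inStock: { $eq: true } }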

Quantization: Reducing Costs

Vectors are heavy (Float32). 1 million vectors * 1,536 dims * 4 bytes = ~6 GB of RAM. That’s expensive. Quantization compresses vectors to Int8 or Binary. We lose some precision (accuracy typically drops around 1%), but size drops 4x-32x. For a catalog of 100M products, quantization is mandatory. Pinecone handles this automatically.
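For intuition, here is scalar quantization, the simplest scheme, as a sketch (production systems calibrate the scale per index or per dimension; this assumes embedding components in [-1, 1]):

// Scalar quantization: map each Float32 component to an Int8 (4x smaller).
function quantize(vector: number[]): Int8Array {
  return Int8Array.from(vector, v =>
    Math.round(Math.max(-1, Math.min(1, v)) * 127)
  );
}

function dequantize(q: Int8Array): number[] {
  return Array.from(q, v => v / 127);
}

// Per vector: 1,536 floats * 4 bytes = 6,144 bytes -> 1,536 bytes as Int8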

Re-ranking: The Second Pass

Vector Search finds “conceptually similar” items, but it doesn’t know your business rules. It might return an out-of-stock item because it’s a perfect semantic match. We use a Re-ranker (a cross-encoder model, such as Cohere Rerank).

  1. Retrieve: Get top 100 candidates from Vector DB (Fast).
  2. Re-rank: Pass them through a heavy model that checks stock status, margin, and exact keyword overlap.
  3. Return: Top 10 to the user (as sketched below).

This “Two-Stage Retrieval” gives the best balance of speed and accuracy.
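A sketch of the pattern, reusing the index from the implementation section and assuming an inStock metadata flag (the filter here stands in for a real cross-encoder or business-rules scorer):

type Candidate = { id: string; score: number; metadata?: Record<string, any> };

// Stage 1: fast and approximate. Cast a wide net with the vector index.
async function retrieveCandidates(vector: number[]): Promise<Candidate[]> {
  const { matches } = await index.query({ vector, topK: 100, includeMetadata: true });
  return matches as Candidate[];
}

// Stage 2: slow and precise. Apply expensive checks to the shortlist only.
function rerank(candidates: Candidate[]): Candidate[] {
  return candidates
    .filter(c => c.metadata?.inStock)  // never show sold-out items
    .sort((a, b) => b.score - a.score)
    .slice(0, 10);                     // top 10 to the user
}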

Under the Hood: The HNSW Algorithm

How does Pinecone search 1 billion vectors in 10ms? It uses Hierarchical Navigable Small World graphs. Imagine a multi-layer map.

  • Top Layer: High-speed highways connecting distant cities.
  • Bottom Layer: Local streets.

The query starts at the top, zooms to the general neighborhood of the target vector, then drops down to the streets to find the exact house. This is $O(\log N)$ complexity; a linear scan is $O(N)$. This is the computer science breakthrough that made vector search viable in production.
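To make that concrete with rough numbers: for $N = 10^9$ vectors, a linear scan performs $10^9$ distance comparisons, while a logarithmic search needs on the order of $\log_2(10^9) \approx 30$ hops through the graph. (In practice HNSW visits more nodes than that, but the scaling is what makes 10ms latencies possible.)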

The Cost of Embeddings

Vectors don’t appear out of thin air: you pay OpenAI to generate them. Pricing is cheap ($0.00013 per 1k tokens), so embedding the entire Harry Potter series (~1M tokens) costs about $0.13. However, for a user-generated-content site with millions of comments per day, this adds up. Optimization: use open-source models from Hugging Face, running on your own GPU/CPU (e.g. via ONNX Runtime), to generate embeddings yourself. Quality is close to OpenAI’s for many tasks, and the marginal cost is zero.
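A back-of-the-envelope estimator, using the rate quoted above (check your provider’s current pricing before budgeting):

// Estimate embedding cost: tokens / 1,000 * price-per-1k-tokens.
const PRICE_PER_1K_TOKENS = 0.00013; // USD; the rate quoted above

function embeddingCost(totalTokens: number): number {
  return (totalTokens / 1000) * PRICE_PER_1K_TOKENS;
}

console.log(embeddingCost(1_000_000));     // 0.13 -> one novel series
console.log(embeddingCost(5_000_000_000)); // 650  -> UGC at scale (5B tokens)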

Conclusion

If your site search returns “No Results Found”, you are leaving money on the table. Users don’t know your exact terminology. They speak in intent (“warm coat”, “cheap gift”, “something for dad”). Vector Search bridges the gap between Human Intent and Database Inventory. It is the upgrade from “Data Retrieval” to “Knowledge Retrieval”.



Zero results found?

We implement Hybrid Vector Search ecosystems to ensure your customers always find what they need.

Hire our Architects.