Alpesh Nakrani
2025 / Free online book · Technical Deep Dives

Embeddings, Honestly

What Vectors Do and Don't Know

Access: Free
Chapters: 28
Read time: 258 min

Embeddings can make fuzzy language computable, power semantic search, cluster messy information, retrieve context for LLMs, and support recommendations. They do not know truth, authority, chronology, identity, permissions, freshness, or business rules. This research-backed edition works through the HONEST framework, retrieval architecture, metadata, reranking, evaluation, security, cost, model choice, and production anti-patterns for teams building semantic systems that must survive contact with real users.

A full field manual on what embeddings capture, what they forget, and why similarity is not truth.
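
To make "similarity is not truth" concrete, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption rather than material from the book: the three hand-made 3-dimensional vectors, the document ids, the `status` field, and the `search` helper are all hypothetical. It only shows that cosine similarity ranks by closeness, while a fact such as "approved" has to come from metadata.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical toy corpus: the vectors are hand-made, not produced by a real model.
DOCS = [
    {"id": "refund-policy-draft", "status": "draft", "vec": [0.9, 0.1, 0.0]},
    {"id": "refund-policy-approved", "status": "approved", "vec": [0.8, 0.2, 0.1]},
    {"id": "holiday-schedule", "status": "approved", "vec": [0.0, 0.1, 0.9]},
]

def search(query_vec, docs, status=None):
    """Rank docs by cosine similarity, optionally filtering on metadata first."""
    candidates = [d for d in docs if status is None or d["status"] == status]
    ranked = sorted(candidates, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["id"] for d in ranked]

QUERY = [1.0, 0.0, 0.0]  # stands in for an embedded "refund policy" question

by_similarity = search(QUERY, DOCS)
approved_only = search(QUERY, DOCS, status="approved")

print(by_similarity)  # the draft ranks first: the vectors carry no approval status
print(approved_only)  # the metadata filter removes the draft before ranking
```

In this toy setup the draft outranks the approved policy on similarity alone, and only the explicit `status` filter changes that.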

This edition is free to read onsite. Each chapter has its own URL, so readers can bookmark, share, and return to the exact section they need.

Table of contents
PRE Preface: What Vectors Do and Don't Know · 10 min
Embeddings are one of the most useful misunderstandings in modern AI.

01 A Vector Is a Shadow · 9 min
A support-ticket search pilot looks impressive until the CTO asks why the system cannot tell the difference between an old draft policy and the approved one. The failure is not that embeddings are useless.

02 What Similarity Can Do · 8 min
A customer support manager wants one queue for refund intent even when users write 'money back', 'reverse charge', or 'cancel my renewal'. The failure is not that embeddings are useless.

03 What Similarity Cannot Know · 8 min
A legal assistant retrieves a confidential merger memo because it is semantically similar to a public policy page. The failure is not that embeddings are useless.

04 Similar Is Not Correct · 8 min
A healthcare policy assistant returns the closest paragraph, but the correct paragraph is farther away because it uses a newer name for the program. The failure is not that embeddings are useless.

05 The Numbers Are Not Labels · 8 min
A product engineer stares at a 1536-dimensional vector and tries to name dimension 812. The mistake is educational.

06 Distance Is a Ranking, Not a Verdict · 8 min
A sales-search feature treats cosine 0.82 as confidence and shows a document that sounds relevant but belongs to another product line. The failure is not that embeddings are useless.

07 The Map Is Learned, Not Universal · 8 min
A generic embedding model fails on internal acronyms that every employee understands, because the model never learned the organization's map. The failure is not that embeddings are useless.

08 Dense, Sparse, and Hybrid · 8 min
A pure vector search misses SKU AX-1042 because the text around it is not semantically rich, while BM25 finds it instantly. The failure is not that embeddings are useless.

09 Semantic Search From First Principles · 9 min
A team builds semantic search over internal markdown docs and learns the pipeline matters more than the database brand. The failure is not that embeddings are useless.

10 Chunking Is Where Retrieval Is Won · 9 min
A PDF manual is chunked by fixed token windows; the answer exists, but the table heading and value land in different chunks. The failure is not that embeddings are useless.

11 Metadata Is Reality · 9 min
A finance assistant must prefer approved Q4 forecasts over similar draft Q3 notes, and the only way is metadata. The failure is not that embeddings are useless.

12 Vector Databases Are Not Magic · 9 min
A startup nearly buys a vector database before discovering its existing Lucene stack can run the first version well enough. The failure is not that embeddings are useless.

13 RAG Still Needs Judgment · 9 min
A RAG assistant retrieves the right source but still hallucinates a confident conclusion not supported by the text. The failure is not that embeddings are useless.

14 Rerank Before You Believe · 9 min
A vector index returns many plausible candidates; a reranker finally chooses the passages that answer the actual question. The failure is not that embeddings are useless.

15 Evaluate or Guess · 9 min
The demo works with five hand-picked questions; the golden dataset exposes that most real user questions fail silently. The failure is not that embeddings are useless.

16 The Failure Modes Nobody Shows · 9 min
Every failure in a support bot looks like 'the LLM hallucinated' until the team separates retrieval failures from generation failures. The failure is not that embeddings are useless.

17 Hybrid Search in Production · 9 min
A production catalog search needs exact SKU, synonym handling, semantic intent, stock status, and freshness scoring at once. The failure is not that embeddings are useless.

18 Graphs Remember Relationships · 9 min
A compliance question needs relationships: which entity owns which policy under which jurisdiction, not merely similar paragraphs. The failure is not that embeddings are useless.

19 Multimodal Embeddings · 9 min
A manufacturing team wants to ask questions over screenshots, inspection photos, PDF pages, and text notes in one system. The failure is not that embeddings are useless.

20 Personalization and Recommendations · 9 min
A recommender learns from user clicks and then traps users inside its own feedback loop unless exploration and privacy controls exist. The failure is not that embeddings are useless.

21 Embeddings for Code · 9 min
A codebase assistant retrieves a similar function but misses the canonical usage pattern because symbols and tests were not indexed. The failure is not that embeddings are useless.

22 Security Starts Before Retrieval · 9 min
A tenant-aware knowledge base leaks a restricted document because the team filtered permissions after retrieval instead of before output. The failure is not that embeddings are useless.

23 The Cost of Meaning · 9 min
The prototype costs $40. The same design at production traffic costs $40,000 because token length and reranking were ignored.

24 Choosing an Embedding Model · 9 min
A model upgrade improves public benchmark scores but breaks the company's golden queries and requires a full reindex. The failure is not that embeddings are useless.

25 The Honest Retrieval Architecture · 9 min
The team stops arguing about vector database brands and designs the retrieval system as a set of explicit trust gates. The failure is not that embeddings are useless.

26 Use Case Playbook · 9 min
Twelve teams ask for embeddings; each use case needs different metadata, evaluation, retrieval strategy, and failure response. The failure is not that embeddings are useless.

27 The Embedding Anti-Patterns · 9 min
A design review finds every classic anti-pattern in one architecture diagram: whole PDFs, no ACL, top-5 only, no eval, no freshness. The failure is not that embeddings are useless.

28 The Honest Checklist Before Shipping · 9 min
The shipping meeting changes from 'does the demo work?' to 'which readiness gate is still red?' The failure is not that embeddings are useless. The failure is that the team expected the embedding to carry information it was never designed to carry.

A Appendix A: Source Map · 3 min
This edition intentionally uses chapter-specific research bases rather than repeating the same source list everywhere.