Design a vector search engine that stores 1 billion embeddings and serves approximate-nearest-neighbour (ANN) queries for semantic search and RAG retrieval.
Requirements:
- Insert, update, and delete vectors (with metadata) in near real time
- k-NN search with metadata filtering (e.g. "top 10 similar docs where tenant = X")
- Tunable trade-off between recall and latency
- Multi-tenant: thousands of isolated indexes
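
To make the requirements concrete, here is a minimal sketch of an exact (brute-force) baseline with upsert, delete, and metadata-filtered k-NN. It is not an ANN index and would not scale to a billion vectors; a baseline like this is typically useful only as ground truth when measuring the recall of a real ANN index. The class name `BruteForceIndex`, the helper `recall_at_k`, the use of cosine similarity, and the exact-match `where` filter are all illustrative assumptions, not part of the problem statement.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class BruteForceIndex:
    """Hypothetical exact index: scans every stored vector on each query."""

    def __init__(self):
        self._rows = {}  # vec_id -> (vector, metadata)

    def upsert(self, vec_id, vector, metadata):
        # Insert and update are the same operation: last write wins.
        self._rows[vec_id] = (list(vector), dict(metadata))

    def delete(self, vec_id):
        self._rows.pop(vec_id, None)

    def search(self, query, k, where=None):
        # Pre-filter: only vectors whose metadata matches every key in
        # `where` are scored, so the result always holds up to k matches.
        where = where or {}
        candidates = (
            (cosine_similarity(query, vec), vec_id)
            for vec_id, (vec, meta) in self._rows.items()
            if all(meta.get(key) == value for key, value in where.items())
        )
        return heapq.nlargest(k, candidates)  # list of (score, vec_id)


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate result also returned."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


index = BruteForceIndex()
index.upsert("a", [1.0, 0.0], {"tenant": "X"})
index.upsert("b", [0.9, 0.1], {"tenant": "Y"})
index.upsert("c", [0.0, 1.0], {"tenant": "X"})

hits = index.search([1.0, 0.0], k=2, where={"tenant": "X"})
ids = [vec_id for _, vec_id in hits]
print(ids)  # ['a', 'c']: "b" is closer than "c" but belongs to another tenant
```

The sketch filters before scoring, which is trivially correct for a full scan. With an ANN index the same choice (filter before, during, or after the graph or cluster traversal) is one of the main design decisions, because filtering after the search can return fewer than k results when the filter is selective.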
What you'll be assessed on:
ANN index choice, sharding billions of vectors, the recall/latency/cost trade-off, filtered search, and real-time updates.
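
A back-of-envelope sizing is a common starting point for the sharding and cost discussion. The sketch below is only arithmetic under stated assumptions: the prompt does not give an embedding dimension, so 768 float32 components, a 64-byte compressed code per vector (as a product-quantization scheme might produce), and 64 GB of usable RAM per node are all hypothetical numbers chosen for illustration. It ignores index overhead, metadata, and replicas.

```python
NUM_VECTORS = 1_000_000_000          # from the problem statement
DIM = 768                            # assumption: dimension is not given
BYTES_PER_FLOAT32 = 4
COMPRESSED_BYTES_PER_VECTOR = 64     # assumption: e.g. a 64-byte quantized code
USABLE_RAM_PER_NODE = 64 * 10**9     # assumption: 64 GB usable per node


def ceil_div(a, b):
    """Integer ceiling division, avoiding float rounding on large values."""
    return -(-a // b)


raw_total = NUM_VECTORS * DIM * BYTES_PER_FLOAT32
compressed_total = NUM_VECTORS * COMPRESSED_BYTES_PER_VECTOR

raw_shards = ceil_div(raw_total, USABLE_RAM_PER_NODE)
compressed_shards = ceil_div(compressed_total, USABLE_RAM_PER_NODE)

print(f"raw float32 vectors: {raw_total / 10**12:.3f} TB -> {raw_shards} nodes")
print(f"compressed codes:    {compressed_total / 10**9:.0f} GB -> {compressed_shards} node(s)")
```

Under these assumptions the uncompressed vectors need about 3 TB of memory across 48 nodes, while the compressed codes fit in 64 GB. That gap is the recall/latency/cost trade-off in miniature: compression cuts cost sharply but loses precision, which is usually recovered by re-ranking a candidate set against the full-precision vectors.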