Data Engineering & Databases

Building the Database for Trillion-Scale AI Search

This talk goes beyond architecture diagrams to share what actually happens when you operate an agentic search engine on trillions of documents. We'll dig into how an object storage-native design allows a small team of engineers to manage an AI search engine that scales to:

* Peak load of 1M+ writes per second and 30k+ searches per second
* 1+ trillion documents
* 5+ PB of logical data
* 400+ tenants
* p90 query latency <100 ms

Topics include:

* How using a modern storage architecture decreases COGS by 10x or more
* Optimizing traditional vector and FTS indexes for the high latency of object storage
* Building search algorithms that are fine-tuned for LLM-initiated searches
* A simple rate-limiting technique that provides strong performance isolation in multi-tenant environments
* Observability, reliability, and performance lessons learned from production incidents

Attendees will leave with a concrete understanding of how separating storage from compute, and treating object storage as the primary database, changes not only the cost structure but the entire operational model of large-scale AI search.

Speakers