TLDR: At OpenSearchCon Europe 2026, Fernando Rejon Barrera from Zeta Alpha challenged standard vector search benchmarks by evaluating Lucene’s HNSW, Faiss, and jVector under realistic production workloads. The tests showed that performance gaps stem from internal index construction, memory management, and concurrency handling rather than raw speed. No single engine wins; architects must choose based on their system’s complexity, filter selectivity, and merge strategies.
Based on the OpenSearchCon Europe 2026 talk by Fernando Rejon Barrera, Zeta Alpha
Every vector search engine has a benchmark page. Every benchmark page shows impressive numbers. And almost none of them tell you what actually happens when you layer in Boolean filters, facets, and hybrid queries on a real production dataset with real access patterns.
That gap is exactly what Fernando Rejon Barrera, researcher at Zeta Alpha, came to Prague to address. His OpenSearchCon Europe 2026 session, “Beyond Benchmarks: Real-World AI Search Performance in OpenSearch,” is the kind of talk that makes you look skeptically at every vendor comparison chart you have ever trusted.
Three engines, one honest test
OpenSearch users today have three meaningful choices for vector search: Lucene’s built-in HNSW implementation, the Faiss engine integration, and jVector, an open-source vector search plugin. All three publish compelling benchmark results. All three claim strong scalability. Most existing comparisons, as Rejon Barrera pointed out, focus on isolated Approximate Nearest Neighbor (ANN) performance, which is not what your production system actually runs.
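To make the engine choice concrete, here is a minimal sketch of how the same vector field could be declared for two of the engines through the OpenSearch k-NN plugin mapping. The field names (`embedding`, `category`), the dimension, and the HNSW parameter values are illustrative assumptions, not settings from the talk. jVector is left out because its plugin-specific method parameters are not described in the session summary.

```python
def knn_index_body(engine: str, dimension: int, field: str = "embedding") -> dict:
    """Build a create-index body for an OpenSearch k-NN vector field.

    Only the "engine" value differs between variants, so everything else
    stays identical across the indexes being compared.
    """
    if engine not in {"lucene", "faiss"}:
        raise ValueError(f"engine not covered by this sketch: {engine}")
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                field: {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",
                        "engine": engine,
                        "space_type": "l2",
                        # Illustrative values (assumption), not tuned settings.
                        "parameters": {"m": 16, "ef_construction": 128},
                    },
                },
                # Hypothetical metadata field used later for Boolean filters.
                "category": {"type": "keyword"},
            }
        },
    }


# One index body per engine, identical apart from the engine name.
bodies = {engine: knn_index_body(engine, 384) for engine in ("lucene", "faiss")}
```

Keeping every other setting fixed is the point: if the mappings drift apart, differences in the results can no longer be attributed to the engine alone.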
The session used identical datasets, identical queries, and controlled filter selectivity to test all three engines under conditions that more closely mirror production reality. What happens to latency when you add Boolean filters? How does recall change as aggregations are introduced? Where does throughput fall apart under hybrid retrieval? These are the questions that actually matter when you are choosing an engine for a production system, and they are the questions that most benchmarks do not answer.
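One way to picture "controlled filter selectivity" and filtered recall is the sketch below. It is not the harness used in the session. It assumes a synthetic corpus, a filter modeled as the set of document IDs that pass it, and an exact brute-force search as ground truth. All function names are hypothetical.

```python
import random


def make_corpus(n, dim, selectivity, seed=0):
    """Random vectors plus the set of doc IDs that pass the filter.

    `selectivity` is the fraction of documents the filter lets through.
    """
    rng = random.Random(seed)
    vectors = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    passing = set(rng.sample(range(n), round(n * selectivity)))
    return vectors, passing


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_filtered_top_k(vectors, passing, query, k):
    """Ground truth: brute-force nearest neighbors among passing docs only."""
    ranked = sorted(passing, key=lambda i: squared_l2(vectors[i], query))
    return ranked[:k]


def recall_at_k(returned_ids, truth_ids):
    """Fraction of the true top-k that the engine actually returned."""
    if not truth_ids:
        return 1.0
    return len(set(returned_ids) & set(truth_ids)) / len(truth_ids)


# Same corpus size and same query at each selectivity level; only the
# share of documents passing the filter changes.
query = [0.0] * 16
for selectivity in (0.5, 0.1, 0.01):
    vectors, passing = make_corpus(1000, 16, selectivity, seed=42)
    truth = exact_filtered_top_k(vectors, passing, query, k=10)
    # In a real evaluation, `returned` would come from the engine under test.
    returned = truth
    print(selectivity, len(passing), recall_at_k(returned, truth))
```

Sweeping selectivity while holding the data and queries fixed isolates how each engine handles restrictive filters, which an unfiltered ANN benchmark never exercises.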
Where the differences actually come from
The results revealed something useful: the performance gaps between engines do not originate where most people expect. The headline latency numbers are one thing, but Rejon Barrera broke down where the differences actually live: in index construction behavior, memory usage patterns, concurrency handling, and merge strategies inside OpenSearch. Understanding those root causes changes how you think about engine selection entirely.
It is not just about which engine is fastest on a warm cache with clean queries. It is about which engine degrades most gracefully when real users hit it with the full complexity of their actual search patterns, and which one gives you the knobs to tune when things go sideways.
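Graceful degradation is usually visible in tail latency under concurrent load rather than in an average. The following is a rough sketch of how that could be measured, assuming a hypothetical `search_fn` callable that wraps whatever client issues the query. It is a generic measurement pattern, not the methodology from the talk.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import quantiles


def timed_call(search_fn, query):
    """Run one query and return its wall-clock latency in milliseconds."""
    start = time.perf_counter()
    search_fn(query)
    return (time.perf_counter() - start) * 1000.0


def latency_profile(search_fn, queries, concurrency):
    """Issue queries from `concurrency` worker threads and summarize latency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda q: timed_call(search_fn, q), queries))
    # 99 cut points; "inclusive" keeps every percentile within observed values.
    cuts = quantiles(latencies, n=100, method="inclusive")
    return {
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
        "max": max(latencies),
    }


def fake_search(query):
    """Stand-in for a real engine call (assumption): sleeps for about 1 ms."""
    time.sleep(0.001)


profile = latency_profile(fake_search, list(range(50)), concurrency=4)
print(profile)
```

Running the same profile at several concurrency levels, and comparing how far p99 pulls away from p50, gives a first view of how an engine behaves once it leaves the warm-cache, single-client case.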
What this means for your architecture decisions
The session did not declare a winner, which is the correct conclusion. Each engine has a context in which it is the right default choice. Lucene’s HNSW is deeply integrated and benefits from Lucene’s merge and segment management. Faiss offers different performance characteristics under certain workloads. jVector brings a different approach that suits specific use cases.
What Rejon Barrera gave attendees was something more valuable than a ranking: a framework for interpreting vendor and community benchmarks critically, and a methodology for designing vector search evaluations that reflect real production systems rather than synthetic ideal conditions. If you are about to make an architecture decision about vector search in OpenSearch, the 32 minutes of this session could save you from a very expensive wrong turn.
The broader point, that benchmark theater in the vector search space does not serve practitioners, is one the community needs to hear more clearly. This talk is a good starting point for that conversation.