Generative AI: OpenSearch's journey as an open-source search engine

The rise of generative AI has transformed search technology, evolving it from simple keyword-based retrieval to intelligent, multimodal search experiences powered by deep learning. In this keynote address at the Linux Foundation Member Summit 2025, I covered the journey of OpenSearch—an Apache 2.0–licensed, open-source search and analytics engine—and how community-led innovation has transformed it into a generative-AI-powered search platform supporting vector search, semantic search, and retrieval-augmented generation (RAG) workflows.

A major catalyst for this evolution was OpenSearch’s transition to the Linux Foundation in September 2024, establishing a vendor-neutral governance model through the OpenSearch Software Foundation. This move accelerated innovation and fostered a diverse ecosystem, attracting contributions from leading organizations such as AWS, SAP, Uber, and, most recently, ByteDance. It is because of open source that we’ve been able to innovate faster and deeper—pushing the boundaries of search technology at an unprecedented pace.

Performance and capability improvements in lexical search

While embracing AI-driven enhancements, OpenSearch remains committed to improving lexical search, recognizing that keyword-based search remains the foundation of all search use cases. Many search applications, including log analytics, e-commerce, and security monitoring applications, rely heavily on precise keyword matching. OpenSearch continues to optimize its BM25-based ranking algorithms, indexing efficiency, and query execution speed to ensure that keyword search remains fast, scalable, and reliable.

Recent performance improvements have focused on reducing latency for text and vector querying, improving ingestion and indexing throughput, and enhancing storage efficiency. The latest OpenSearch version exemplifies the community’s commitment, delivering improvements in query speed, resource utilization, and memory efficiency across text queries, aggregations, range queries, and time-series analytics. These enhancements ensure that even as OpenSearch integrates AI-driven techniques, its core keyword search capabilities remain robust and performant.

The transition to generative-AI-powered search

Two pivotal events shaped OpenSearch’s trajectory as a search engine: the generative AI revolution that began in 2022 and OpenSearch’s transition to the Linux Foundation in 2024, advancing a community-led innovation model. These milestones catalyzed OpenSearch’s transformation from a text-based search engine into an AI-driven platform.

Historically, search engines have relied on lexical search, which ranks documents based on term frequency and exact keyword matches. However, with the advent of large language models (LLMs) and AI-driven search techniques, OpenSearch evolved to support semantic search, hybrid search, and multimodal capabilities. When the generative AI and LLM boom gained momentum, OpenSearch was well positioned to embrace these advancements, largely due to its robust vector database capabilities.

This foundation enabled the seamless integration of AI-driven search techniques, allowing OpenSearch to transition from a keyword-based search engine to an AI-augmented search platform capable of handling complex semantic queries and RAG workflows.

Vector database capabilities and scaling AI search

One of the fundamental shifts in OpenSearch’s architecture has been its transformation into a vector database, enabling AI-powered similarity search. OpenSearch’s vector engine, backed by k-nearest neighbor (k-NN) algorithms, allows for efficient similarity search across diverse data types, including text, audio, video, and documents. The workflow involves generating vector embeddings using transformer models, indexing them in OpenSearch, and making them available for AI-driven search applications.

As AI adoption surged, OpenSearch’s vector database capabilities had to scale from millions to billions of vectors. However, traditional CPU-based processing became a bottleneck, making vector search operations increasingly time consuming and cost intensive. To address this challenge, OpenSearch is collaborating with NVIDIA to introduce GPU-based acceleration for indexing workflows. Additionally, Uber is contributing to the project, leveraging GPU acceleration for its large-scale search platform and enhancing OpenSearch’s ability to handle vast amounts of vectorized data efficiently.

Hybrid search and multimodal capabilities

As AI-powered search advances, OpenSearch has embraced hybrid search, combining the precision of lexical search with the intelligence of semantic search. This means that users no longer have to rely solely on keyword matches; instead, OpenSearch can understand the intent behind a user’s query and return more relevant and intelligent results. Beyond hybrid search, OpenSearch has introduced multimodal search, enabling searches across text, images, audio, and video. This advancement is particularly valuable for media organizations, e-commerce platforms, and enterprise applications, where diverse data types must be searchable in a unified manner.

Sparse models for democratizing AI-powered search

One of OpenSearch’s key contributions to the open-source AI community is its adoption of sparse models. Unlike dense embeddings, which require substantial computational resources, sparse models are lightweight deep learning techniques that make AI-powered search accessible to organizations of all sizes.

This approach ensures that AI-driven search capabilities are not exclusive to companies with deep financial resources, furthering OpenSearch’s mission of democratizing access to advanced search technologies.

LLM integrations and ML Commons

Generative AI’s true power lies in its ability to enhance search relevance. OpenSearch provides seamless integration with LLMs, offering out-of-the-box connectors for Cohere, OpenAI, and DeepSeek. Through ML Commons, an open-source machine learning (ML) library within OpenSearch, developers can easily integrate AI models and leverage community-driven blueprints to build powerful AI applications.

ML Commons allows you to use your own ML models and flexibly integrate various model formats and runtimes for seamless inference within OpenSearch—a testament to OpenSearch’s agility is its ability to quickly adapt to new AI models. For example, when DeepSeek launched, OpenSearch contributors were able to release a connector within just two days, thanks to the platform’s flexible ML Commons framework and strong community collaboration.

Simplifying RAG and conversational search

While AI-powered search offers unprecedented capabilities, implementing RAG workflows remains complex. RAG involves using an LLM to generate responses based on relevant context retrieved from a search engine. OpenSearch aims to simplify this process with the Flow Framework, a no-code/low-code builder that provides an intuitive drag-and-drop interface for constructing AI-powered search workflows.

Additionally, OpenSearch is enhancing conversational search, allowing users to interact with search results in a natural, chat-based manner. Through contributions from Aryn.ai, the platform is evolving beyond static search queries, making search experiences more interactive and intuitive.

Open-Source innovation and the future of OpenSearch

OpenSearch’s transformation into an AI-powered search engine has been made possible by community-led innovation. The open-source community continues to shape the project’s future by contributing features, enhancements, and integrations that push the boundaries of what AI-powered search can achieve.

As OpenSearch moves forward on its journey, it remains committed to several key strategic objectives:

Enhancing vector database scalability
Improving the performance of the core search engine
Expanding LLM and AI model integrations
Improving hybrid and multimodal search capabilities
Democratizing AI search through lightweight sparse models

Join the OpenSearch community

The strength of OpenSearch lies in its open-source contributors. Whether you’re an AI researcher, developer, or search enthusiast, there are many ways to get involved and make a meaningful impact:

Join the OpenSearch Slack channel to engage with the community.
Contribute code or submit proposals on GitHub.
Explore ML Commons and help build AI-powered search applications.
Attend OpenSearch events and hackathons to collaborate with experts.
Learn from the new vector database tutorials to deepen your understanding of AI search technologies.

The future of search is open, intelligent, and AI driven—and OpenSearch invites you to be part of building it. Join us in shaping the next generation of open-source search technology!

Let’s build the future of search together in an open-source way!

Author

Pallavi Priyadarshini

Pallavi Priyadarshini is a Senior Engineering Manager at the OpenSearch Project leading the development of high-performing and scalable technologies for search, security, releases, and dashboards.
View all posts

Generative AI: OpenSearch’s journey as an open-source search engine