
Neural sparse search

Introduced 2.11

Semantic search relies on dense retrieval based on text embedding models. However, dense methods use k-NN search, which consumes a large amount of memory and CPU resources. Neural sparse search, an alternative to dense semantic search, is implemented using an inverted index and is therefore as efficient as BM25. Neural sparse search is facilitated by sparse embedding models, which encode text into a sparse vector (a list of `token: weight` key-value pairs representing a token and its weight). At ingestion time, documents are converted into sparse vectors and stored in a rank features index; at query time, the query text is encoded the same way and matched against that index.
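As a sketch of what such an index might look like (the index and field names here are illustrative, not required), a rank features index maps the sparse vector field to the `rank_features` type:

```json
PUT /my-nlp-index
{
  "mappings": {
    "properties": {
      "passage_text": { "type": "text" },
      "passage_embedding": { "type": "rank_features" }
    }
  }
}
```

The `rank_features` field stores the `token: weight` pairs in the inverted index, which is what allows neural sparse search to score documents without k-NN data structures.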

To further boost search relevance, you can combine neural sparse search with dense semantic search using a hybrid query.

You can configure neural sparse search in the following ways:

  • Generate vector embeddings automatically: Configure an ingest pipeline to generate and store sparse vector embeddings from document text at ingestion time. At query time, input plain text, which will be automatically converted into vector embeddings for search. For complete setup steps, see Generating sparse vector embeddings automatically.
  • Ingest raw sparse vectors: Ingest precomputed sparse vector embeddings and search using sparse vectors directly. For complete setup steps, see Neural sparse search using raw vectors.
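For example, with automatic embedding generation configured, a query-by-text request uses the `neural_sparse` query type. The index name, field name, and model ID below are placeholders for your own setup:

```json
GET /my-nlp-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "Hi world",
        "model_id": "<sparse encoding or tokenizer model ID>"
      }
    }
  }
}
```

The query text is converted into a sparse vector using the specified model, and matching documents are scored against the `rank_features` field.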

To learn more about splitting long text into passages for neural sparse search, see Text chunking.

Starting with OpenSearch version 2.15, you can significantly accelerate the search process by creating a search pipeline with a neural_sparse_two_phase_processor.

To create a search pipeline with a two-phase processor for neural sparse search, use the following request:

PUT /_search/pipeline/two_phase_search_pipeline
{
  "request_processors": [
    {
      "neural_sparse_two_phase_processor": {
        "tag": "neural-sparse",
        "description": "Creates a two-phase processor for neural sparse search."
      }
    }
  ]
}
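If you need to tune the two-phase behavior, the processor also accepts optional parameters. The values below are illustrative; see the Neural sparse query two-phase processor documentation for the authoritative parameter list and defaults:

```json
PUT /_search/pipeline/two_phase_search_pipeline
{
  "request_processors": [
    {
      "neural_sparse_two_phase_processor": {
        "tag": "neural-sparse",
        "description": "Creates a two-phase processor for neural sparse search.",
        "enabled": true,
        "two_phase_parameter": {
          "prune_ratio": 0.4,
          "expansion_rate": 5.0,
          "max_window_size": 10000
        }
      }
    }
  ]
}
```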

Then choose the index you want to configure with the search pipeline and set the index.search.default_pipeline to the pipeline name, as shown in the following example:

PUT /my-nlp-index/_settings 
{
  "index.search.default_pipeline" : "two_phase_search_pipeline"
}
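Alternatively, instead of setting a default pipeline on the index, you can apply the pipeline to an individual request using the `search_pipeline` query parameter. The index name, field name, and model ID are placeholders:

```json
GET /my-nlp-index/_search?search_pipeline=two_phase_search_pipeline
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "Hi world",
        "model_id": "<sparse encoding or tokenizer model ID>"
      }
    }
  }
}
```

This is useful for testing the two-phase processor on a few queries before making it the index default.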

For information about two_phase_search_pipeline, see Neural sparse query two-phase processor.

Text chunking

For information about splitting large documents into smaller passages before generating embeddings, see Text chunking.
