You're viewing version 2.16 of the OpenSearch documentation. This version is no longer maintained. For the latest version, see the current documentation. For information about OpenSearch version maintenance, see Release Schedule and Maintenance Policy.

Neural sparse query

Introduced 2.11

Use the neural_sparse query to search a vector field in neural sparse search. You can provide the query either as raw text or as sparse vector tokens.

Request fields

Include the following request fields in the neural_sparse query:

Example: Query by raw text

"neural_sparse": {
  "<vector_field>": {
    "query_text": "<query_text>",
    "model_id": "<model_id>"
  }
}

Example: Query by sparse vector

"neural_sparse": {
  "<vector_field>": {
    "query_tokens": "<query_tokens>"
  }
}

The top-level vector_field specifies the vector field against which to run a search query. The following table lists the other neural_sparse query fields.

Field	Data type	Required/Optional	Description
`query_text`	String	Optional	The query text from which to generate sparse vector embeddings.
`model_id`	String	Optional	The ID of the sparse encoding model or tokenizer model that will be used to generate vector embeddings from the query text. The model must be deployed in OpenSearch before it can be used in sparse neural search. For more information, see Using custom models within OpenSearch and Neural sparse search. For information on setting a default model ID in a neural sparse query, see `neural_query_enricher`.
`query_tokens`	Map<String, Float>	Optional	The query tokens, sometimes referred to as sparse vector embeddings. Similarly to dense semantic retrieval, you can use raw sparse vectors generated by neural models or tokenizers to perform a semantic search query. Use either the `query_text` option for raw text or the `query_tokens` option for sparse vectors. Either `query_text` or `query_tokens` must be provided in order for the `neural_sparse` query to operate.
`max_token_score`	Float	Optional	(Deprecated) The theoretical upper bound of the score for all tokens in the vocabulary (required for performance optimization). For OpenSearch-provided pretrained sparse embedding models, we recommend setting `max_token_score` to 2 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` and to 3.5 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1`. This field has been deprecated as of OpenSearch 2.12.
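The field rules in the preceding table can be sketched as a small Python helper that assembles the request body. The function name `build_neural_sparse_query` is hypothetical and is not part of any OpenSearch client. The helper also assumes that `query_text` and `query_tokens` are mutually exclusive and that `model_id` is only relevant together with `query_text`; treat both as assumptions of this sketch rather than documented behavior.

```python
from typing import Dict, Optional


def build_neural_sparse_query(
    vector_field: str,
    query_text: Optional[str] = None,
    model_id: Optional[str] = None,
    query_tokens: Optional[Dict[str, float]] = None,
) -> dict:
    """Build a search body with a neural_sparse query (hypothetical helper)."""
    # Assumption of this sketch: exactly one of the two inputs is supplied.
    if (query_text is None) == (query_tokens is None):
        raise ValueError("Provide exactly one of query_text or query_tokens")

    if query_tokens is not None:
        # Tokens are already encoded, so model_id is left out (assumption).
        clause = {
            "query_tokens": {
                token: float(weight) for token, weight in query_tokens.items()
            }
        }
    else:
        clause = {"query_text": query_text}
        # model_id can be omitted when a default model is configured,
        # for example through neural_query_enricher.
        if model_id is not None:
            clause["model_id"] = model_id

    return {"query": {"neural_sparse": {vector_field: clause}}}
```

For example, `build_neural_sparse_query("passage_embedding", query_text="Hi world", model_id="aP2Q8ooBpBj3wT4HVS8a")` returns a body with the same structure as the raw text query pattern shown earlier.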

Example request

Query by raw text

GET my-nlp-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "Hi world",
        "model_id": "aP2Q8ooBpBj3wT4HVS8a"
      }
    }
  }
}

Query by sparse vector

GET my-nlp-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_tokens": {
          "hi" : 4.338913,
          "planets" : 2.7755864,
          "planet" : 5.0969057,
          "mars" : 1.7405145,
          "earth" : 2.6087382,
          "hello" : 3.3210192
        }
      }
    }
  }
}
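Outside of a console, the sparse vector query above could be prepared from Python as in the following minimal sketch. It uses only the standard library. The cluster address `http://localhost:9200`, the absence of authentication, and the helper name `build_search_request` are assumptions made for illustration. The sketch uses `POST`, which the `_search` endpoint accepts in addition to `GET`.

```python
import json
import urllib.request


def build_search_request(base_url: str, index: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) a _search request (hypothetical helper)."""
    url = f"{base_url.rstrip('/')}/{index}/_search"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same query body as the sparse vector example request.
body = {
    "query": {
        "neural_sparse": {
            "passage_embedding": {
                "query_tokens": {
                    "hi": 4.338913,
                    "planets": 2.7755864,
                    "planet": 5.0969057,
                    "mars": 1.7405145,
                    "earth": 2.6087382,
                    "hello": 3.3210192,
                }
            }
        }
    }
}

# Assumed local, unsecured cluster; adjust the URL and add credentials as needed.
request = build_search_request("http://localhost:9200", "my-nlp-index", body)

# Sending requires a reachable cluster, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     hits = json.loads(response.read())["hits"]["hits"]
```

Keeping request construction separate from sending makes it possible to inspect the URL and body before anything reaches the cluster.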
