
You're viewing version 2.16 of the OpenSearch documentation. This version is no longer maintained. For the latest version, see the current documentation. For information about OpenSearch version maintenance, see Release Schedule and Maintenance Policy.

Using raw vectors for neural sparse search

If you’re using self-hosted sparse embedding models, you can ingest raw sparse vectors and use neural sparse search.

Tutorial

This tutorial consists of the following steps:

  1. Ingest sparse vectors
    1. Create an index
    2. Ingest documents into the index
  2. Search the data using a raw sparse vector

Step 1: Ingest sparse vectors

Once you have generated sparse vector embeddings, you can directly ingest them into OpenSearch.

Step 1(a): Create an index

To ingest documents containing raw sparse vectors, create an index whose embedding field is of the rank_features type:

PUT /my-nlp-index
{
  "mappings": {
    "properties": {
      "id": {
        "type": "text"
      },
      "passage_embedding": {
        "type": "rank_features"
      },
      "passage_text": {
        "type": "text"
      }
    }
  }
}
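Conceptually, a `rank_features` value is a map from tokens to positive weights. The following Python sketch (the helper name and token values are illustrative, not part of any OpenSearch API) shows how a list of token/weight pairs can be turned into the map format this field expects:

```python
def to_sparse_map(tokens, weights):
    """Pair tokens with their weights, keeping only positive entries,
    since rank_features fields expect positive values."""
    return {t: w for t, w in zip(tokens, weights) if w > 0}

# Illustrative output of a sparse embedding model: zero-weight tokens are dropped.
embedding = to_sparse_map(["hello", "world", "pad"], [3.32, 1.2, 0.0])
```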

Step 1(b): Ingest documents into the index

To ingest documents into the index created in the previous step, send the following request:

PUT /my-nlp-index/_doc/1
{
  "passage_text": "Hello world",
  "id": "s1",
  "passage_embedding": {
    "hi" : 4.338913,
    "planets" : 2.7755864,
    "planet" : 5.0969057,
    "mars" : 1.7405145,
    "earth" : 2.6087382,
    "hello" : 3.3210192
  }
}
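Sparse embeddings often contain many low-weight tokens that contribute little to relevance but inflate the index. One hedged optimization, sketched below, is to prune such tokens before ingestion (the 0.5 threshold is purely illustrative; an appropriate value depends on your embedding model):

```python
def prune_sparse_vector(vec, threshold=0.5):
    """Drop tokens whose weight falls below the threshold to shrink the
    ingested document. Threshold is illustrative; tune it for your model."""
    return {t: w for t, w in vec.items() if w >= threshold}

doc_embedding = {"hi": 4.338913, "mars": 1.7405145, "noise": 0.01}
pruned = prune_sparse_vector(doc_embedding)
```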

Step 2: Search the data using a sparse vector

To search the documents using a sparse vector, provide the sparse embeddings in the neural_sparse query:

GET my-nlp-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_tokens": {
          "hi" : 4.338913,
          "planets" : 2.7755864,
          "planet" : 5.0969057,
          "mars" : 1.7405145,
          "earth" : 2.6087382,
          "hello" : 3.3210192
        }
      }
    }
  }
}
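Intuitively, the `neural_sparse` query scores a document by combining the weights of tokens that appear in both the query and the document's `rank_features` field, roughly a dot product over shared tokens. The sketch below illustrates that intuition only; it is not the exact Lucene scoring formula:

```python
def neural_sparse_score(query_tokens, doc_embedding):
    """Approximate the neural_sparse relevance score as a dot product
    over tokens shared by the query and the document embedding."""
    return sum(w * doc_embedding[t]
               for t, w in query_tokens.items()
               if t in doc_embedding)
```

A document sharing no tokens with the query scores 0, which is why sparse retrieval can use an inverted index efficiently.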

To learn more about improving retrieval time for neural sparse search, see Accelerating neural sparse search.